
STAT3064 STAT4067


Due date: Tuesday 23 August 10pm

Assignment Questions
Data required for this assignment can be found in the Data Sets folder.
1. (a) What do the terms "reproducible calculations" and "reproducible simulations"
refer to, and when or why should calculations/simulations be reproducible?
(b) When is it a sensible strategy to use the Gaussian model in a simulation? When
is it not and why? (Hint. Your answer could contain an illustrative example.)
(c) Consider a random sample of observations $X = [X_1, \ldots, X_n]$. Why would we
expect that, typically, the variables of the sample X are correlated, but the principal
component scores obtained from these variables are not?
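
As a small illustration of the reproducibility idea raised in part (a), the sketch below fixes the random seed before drawing numbers. The seed value 2022 and the variable names are arbitrary choices made for this example and do not come from the unit materials.

```r
# Illustrative sketch only: the seed value and variable names are assumptions.
set.seed(2022)            # fix the state of the random number generator
first_run <- rnorm(5)

set.seed(2022)            # resetting the same seed restarts the same random stream
second_run <- rnorm(5)

# The two runs contain exactly the same numbers
same_after_reset <- identical(first_run, second_run)

# Without resetting the seed, the stream simply continues with new values
third_run <- rnorm(5)
same_without_reset <- identical(first_run, third_run)

print(c(same_after_reset, same_without_reset))
```

Recording the seed together with the code is one common way to let someone else regenerate the same simulated data.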
2. Consider the aircraft data with the logged variables as in Question 2 of Computer Lab 1.
Divide the data into two period groups consisting of the same number of observations.
We are interested in comparing changes that occur over time.
(a) Show smoothed histograms of logLength and logPower separately for the two
periods. Comment on the shapes of the histograms and how the change over
time affects this shape.
(b) Construct contour plots of the 2D smoothed histograms of the pairs (logPower,
logWeight) and (logSpeed, logLength). Describe the shapes of the density plots
and discuss how they change over time.
(c) For which pair of variables would you expect the largest change in correlation or
shape of their density over time and why?
3. Consider the aircraft data of Q2 of this assignment.
(a) Separately for the two periods selected in Q2, carry out a principal component
analysis using prcomp based on the logged data (without scaling).
(b) Show eigenvalue plots for each of the two periods. Interpret the results.
(c) Show score plots of the first three PCs for each period. Comment on the results.
(d) Which logged variable contributes most to PC 1 for each period? Does this change
across the two periods? Comment on the results.
(e) Based on your analysis, discuss the main changes that have occurred over time.
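
The mechanics of an unscaled `prcomp` call, as asked for in Question 3, can be sketched as follows. The data frame `toy` and its column names are hypothetical stand-ins generated at random; they are not the aircraft data, so the numbers produced carry no meaning for the assignment.

```r
# Illustrative sketch on made-up data; 'toy' and its columns are assumptions.
set.seed(1)
toy <- data.frame(
  logA = rnorm(100),
  logB = rnorm(100, sd = 2),
  logC = rnorm(100, sd = 0.5)
)

# Principal component analysis on centred but unscaled variables
pc <- prcomp(toy, center = TRUE, scale. = FALSE)

# Eigenvalues of the sample covariance matrix are the squared standard deviations
eigenvalues <- pc$sdev^2

# PC scores (one row per observation) and the loadings of the first PC
scores <- pc$x[, 1:3]
pc1_loadings <- pc$rotation[, 1]

# Variable with the largest absolute loading on PC 1
top_var <- names(which.max(abs(pc1_loadings)))

# Eigenvalue (scree) plot and a score plot of the first two PCs
plot(eigenvalues, type = "b", xlab = "PC index", ylab = "Eigenvalue")
plot(scores[, 1], scores[, 2], xlab = "PC 1 score", ylab = "PC 2 score")
```

Running the same steps separately on each period group would give two sets of eigenvalues, scores and loadings that can be compared side by side.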
4. The data set ass2pop.csv is available in the LMS folder ‘Data sets’. It contains
the means and covariance matrices corresponding to two populations. The first and
second columns of ass2pop.csv are the means $\mu_1$ and $\mu_2$ of the first and second
population respectively; columns 3:22 correspond to the covariance matrix $\Sigma_1$ of the
first population, and the remaining columns correspond to the covariance matrix $\Sigma_2$
of the second population. In this question we generate random samples from these
populations as described below.
(a) Read the data into R. What is the dimension of the covariance matrix $\Sigma_1$?
(b) Generate 250 random samples from the Gaussian distribution $N(\mu_1, \Sigma_1)$ and 150
samples from the Gaussian distribution $N(\mu_2, \Sigma_2)$. What is the size of the data
matrix consisting of these random samples? Calculate the sample covariance
matrix S of the random samples, and find eigenvalues of S. Save the vector of
eigenvalues into a file for later analysis.
(c) Repeat part (b) another 49 times, so you have a total of 50 vectors of eigenvalues.
(d) Calculate the mean vector of eigenvalues over the 50 repetitions and list/print
this mean vector.
(e) Display the 50 vectors of eigenvalues and their mean vector in an eigenvalue
or scree plot. How similar are these eigenvalue plots? Where does the largest
deviation from the mean vector occur?
(f) Repeat parts (b) to (e) with 250 samples from the t-distribution $t_{10}(\mu_1, \Sigma_{01})$
and 150 samples from the t-distribution $t_3(\mu_2, \Sigma_{02})$. (Hint. $\Sigma_{0k}$ is the scale matrix,
which is obtained from the covariance matrix $\Sigma_k$ using the relationship
$\Sigma_k = \frac{\nu}{\nu - 2} \Sigma_{0k}$, with $\nu$ the degrees of freedom of the t-distribution
and k = 1, 2 here.)
(g) Compare the results of the two different simulations and comment on interesting
findings and differences between them. Why do we expect differences between
the pairs of simulations?
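
The hint in part (f) relates the covariance matrix to the scale matrix of the t-distribution. A minimal sketch of that conversion, followed by one way to draw multivariate t samples in base R, is given below. The helper names `scale_from_cov` and `rmvt_sketch` are hypothetical, the toy mean and covariance are placeholders rather than the contents of ass2pop.csv, and the sampling construction (a Gaussian vector divided by the square root of an independent chi-squared variable over its degrees of freedom) is a standard one assumed here, not a method prescribed by the assignment.

```r
# Illustrative sketch only: helper names and toy parameters are assumptions.
set.seed(3064)  # arbitrary seed so the draws can be reproduced

# Scale matrix from covariance matrix: Sigma = nu / (nu - 2) * Sigma0, nu > 2
scale_from_cov <- function(Sigma, nu) {
  stopifnot(nu > 2)
  (nu - 2) / nu * Sigma
}

# Draw n observations from a multivariate t with location mu, scale Sigma0
# and nu degrees of freedom, using mu + Z / sqrt(W / nu) with
# Z ~ N(0, Sigma0) and W ~ chi-squared(nu), independent of Z.
rmvt_sketch <- function(n, mu, Sigma0, nu) {
  d <- length(mu)
  R <- chol(Sigma0)                        # upper triangular, t(R) %*% R = Sigma0
  Z <- matrix(rnorm(n * d), n, d) %*% R    # each row is N(0, Sigma0)
  W <- rchisq(n, df = nu)                  # one chi-squared draw per row
  sweep(Z / sqrt(W / nu), 2, mu, "+")      # n x d matrix, one observation per row
}

# Toy parameters (placeholders for the values read from the data file)
mu_toy    <- c(0, 1, 2)
Sigma_toy <- diag(c(4, 2, 1))

Sigma0 <- scale_from_cov(Sigma_toy, nu = 10)
X <- rmvt_sketch(250, mu_toy, Sigma0, nu = 10)

# Sample covariance matrix and its eigenvalues, as in part (b)
S <- cov(X)
eig <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
```

Stacking the rows from both populations with `rbind` before calling `cov` would give a single data matrix, which is one reading of what part (b) asks for.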
