October 10


ECON UN3412: Introduction to Econometrics Fall 2022
Tuesday, 11 October 2022 at 11:59pm

(1) Graff Zivin and Neidell (2012) looked at the impacts of pollution on labor
market outcomes. In their study, they examined the relationship between
the productivity of agricultural workers and ozone pollution. They argue that
diminished lung functioning due to ozone might reduce productivity in
physically demanding agricultural work. Let the dependent variable $Y_i$
denote the measure of labor productivity for worker i and let the main
explanatory variable $X_i$ be the ozone concentration in worker i’s work
environment. In addition, let $W_i$ be an indicator variable for female workers
and let $Z_i$ denote worker i’s experience measured in weeks.
(a) Suppose that you estimate a linear regression model:

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 Z_i^2 + \beta_4 W_i + u_i.$$
What is the interpretation of $\beta_1$ in the regression model? What is the
expected sign of $\beta_1$? Explain your answer.
(b) On one hand, agricultural workers typically work outside, and it is more
difficult to work when the temperature is too high. On the other hand,
ozone formation depends on ambient temperature. The figure below
plots the average hourly ozone and temperature by day.

Is there any concern about omitted variable bias in the regression
model considered in part (a)? Justify your answer and discuss the
direction of the bias.
(2) Allcott and Gentzkow (2017) conducted an online survey of US adults
regarding fake news after the 2016 presidential election. In their survey,
they showed survey respondents news headlines about the 2016 election
and asked whether the news headlines were true or false. Some
of the news headlines were fake and others were true. Their dependent
variable $Y_i$ takes value 1 if survey respondent i correctly identifies whether
the headline is true or false, value 0.5 if the respondent is “not sure”, and value
0 otherwise. Suppose that one conducts a similar survey and obtains the
following regression result:
$$\hat{Y}_i = \underset{(0.02)}{0.65} + \underset{(0.004)}{0.012}\,\text{College} + \underset{(0.003)}{0.015}\,\ln(\text{Daily media time}) + \underset{(0.001)}{0.003}\,\text{Age},$$
where $\bar{R}^2 = 0.14$, $n = 828$, College is a binary indicator that equals 1 if a
survey respondent is a college graduate and 0 otherwise, ln(Daily media time)
is the logarithm of daily time spent consuming media, and Age is age in years.
(a) Suppose that you would like to test that people with higher education
have more accurate beliefs about news at the 1% level. State your null
hypothesis precisely and report your test result.

(b) The estimated coefficient for ln(Daily media time) is significantly
positive. Interpret this result. Explain why this is plausible.

(c) Even if Age is omitted, there will be little concern about the omitted
variable bias problem. Do you agree? Explain briefly.

(d) Suppose that you now conjecture that Republicans may have different
beliefs about news than Democrats. Assume that there are three
groups in the data: Democrats, Republicans and Independents. How
would you change the specification of the linear regression model by
adding or subtracting regressors? Explain briefly.

(3) This question is related to the following paper:

Angrist, Joshua, Daniel Lang, and Philip Oreopoulos. 2009. “Incentives
and Services for College Achievement: Evidence from a
Randomized Trial.” American Economic Journal: Applied Economics,
1 (1): 136–63. https://doi.org/10.1257/app.1.1.

(a) Read the abstract, introduction, and II.A in the paper. Summarize
the main research question in the paper.
(b) Read section II.B called “Student and School Background.” On page
143, the authors remark that
Merit scholarship programs like STAR may affect course
enrollment decisions and/or the selection of courses by
treated students.
In view of this concern, the authors show the empirical results reported
in Table 2. Describe the authors’ findings briefly.
(c) Read section II.C called “Consent Rates and Service Use” and discuss
findings in Table 3.
(d) Study empirical results in Table 5 and discuss differences between men
and women.

=================================================================

ENG1014
11:55PM, Friday Week 11 (14th October)

A wind farm is a group of wind turbines that are used together
to produce electricity. Wind turbines convert the kinetic energy
of wind into mechanical energy of turbine blades, which drives a
generator to generate electricity. Modern wind farms may have
capacities in the order of 100 Megawatts (MW) and are installed either
offshore or on land.
For wind turbines to work effectively over their lifespan, they naturally
need access to favourable atmospheric conditions with suitable wind
speeds. To develop a clear picture of these dynamics at a given
location, attainment of correct data and the capability to rigorously analyse it are essential.

ASSIGNMENT OBJECTIVES
In this assignment you will be investigating wind farm location viability and the performance of three Australian
wind farms located in Ararat, Silverton, and Boco Rock. The assignment aims are structured into two parts:
Part 1 – Data Processing: You will clean and interpret satellite recordings of on-shore wind speed data
for analysis in later parts of the assignment.
Part 2 – Performance Analysis: Calculate power estimates and use this to assess the operational
performance capacity of the turbines using the data prepared in Part 1.

PART 1 – DATA PROCESSING: 20 MARKS
To complete the tasks in Part 1, you will need the wind data recorded at the Ararat, Silverton and Boco
Rock wind farms that have been provided to you in the files named ararat.txt, silverton.txt, and
boco_rock.txt, respectively.
The wind data files hold onshore wind speed measurements taken 100m above ground level by
satellites at regular 10-minute intervals over a year starting at 12:00pm on the 21st of March 2020 (i.e.,
21/03/2020) and ending at 11:50am on the same date in 2021 (i.e., 21/03/2021). The first column in each file
is a 12-digit timestamp formatted as YYYYDDMMhhmm such that,
▪ YYYY is the Year as a four-digit sequence – e.g., 2009 in 15/11/2009
▪ DD is the day of the month e.g., 23 in 23/12/2019
▪ MM is the Month as a number with a leading zero e.g., 07 in 15/07/2022
▪ hh is the Hour of day using a 24-hour format with a leading zero e.g., 03 in 03:15
▪ mm are the Minutes of the hour with a leading zero e.g., 15 in 03:15
For example, 14:25 on 21/03/2020 is written as 202021031425 in the YYYYDDMMhhmm format. The
second column is the wind speed (m/s) recorded at the time and date given in the first column.
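
The unusual field order (day before month) is easy to get wrong when splitting the timestamp. The following is a minimal sketch of one way to decode it, written in Python purely as an illustration (the assignment itself is to be completed in MATLAB); the function name parse_timestamp is hypothetical and not part of the assignment files.

```python
from datetime import datetime


def parse_timestamp(stamp: str) -> datetime:
    """Decode a 12-digit YYYYDDMMhhmm stamp (note: day comes before month)."""
    if len(stamp) != 12 or not stamp.isdigit():
        raise ValueError(f"expected 12 digits, got {stamp!r}")
    # The format string mirrors the YYYYDDMMhhmm field order exactly.
    return datetime.strptime(stamp, "%Y%d%m%H%M")


# The example from the text: 14:25 on 21/03/2020.
example = parse_timestamp("202021031425")
print(example.year, example.month, example.day, example.hour, example.minute)
```

The five printed fields correspond to the year/month/day/hour/minute columns requested in Q1(a).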
Figure 1 – Wind Farm

Q1. (2 + 4 MARKS)
(a) Create a 52560 x 8 matrix that contains the date/time information in the first 5 columns (1 column each
for year/month/day/hour/minute), as well as the wind speeds for each site. Print the first four lines of
this matrix to the screen.
(b) It turns out there are some unrealistic measurements and outliers that need to be removed.
a. Wind speeds of less than zero can be considered to be glitches.
b. From the remaining datapoints (once negative points are cleaned), any points where the
measurement is more than 70% different to the average of the 6 points within 30 minutes
either side (excluding the measured point) can be considered to be unreliable. In other
words, the following equation should apply for all valid data points:
$$\frac{v_{\text{measured}}}{v_{\text{avg}}} \in [0.3,\ 1.7]$$
where $v_{\text{avg}}$ is the moving average of the 3 points on either side of the datapoint, excluding the datapoint
itself.
Clean the data by replacing any unrealistic readings of either type by their hourly average. Create a log
file called “cleaned_data.txt” that records every measurement where this was done, including the
date/time, which site has been edited, and the original and final values.
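
The two validity rules above can be sketched as a single pass over one site's readings. This is an illustrative Python sketch, not a reference solution: the function name flag_unreliable is hypothetical, and two behaviours are assumptions that the brief does not specify, namely that points near the start or end of the series use whatever neighbours exist, and that negative neighbours are left out of the moving average.

```python
def flag_unreliable(speeds, half_window=3, low=0.3, high=1.7):
    """Return indices of readings that are negative or whose ratio to the
    average of the surrounding points falls outside [low, high]."""
    flagged = []
    n = len(speeds)
    for i, v in enumerate(speeds):
        if v < 0:
            flagged.append(i)  # rule (a): negative speeds are glitches
            continue
        start = max(0, i - half_window)
        stop = min(n, i + half_window + 1)
        # Assumption: skip the point itself and any negative neighbours.
        neighbours = [speeds[j] for j in range(start, stop)
                      if j != i and speeds[j] >= 0]
        if not neighbours:
            continue
        v_avg = sum(neighbours) / len(neighbours)
        if v_avg == 0:
            continue  # ratio undefined; left unflagged in this sketch
        if not (low <= v / v_avg <= high):
            flagged.append(i)  # rule (b): too far from the local average
    return flagged


# Made-up readings: one spike (index 3) and one negative glitch (index 7).
sample = [5.0, 5.2, 4.9, 20.0, 5.1, 5.0, 4.8, -1.0, 5.0]
print(flag_unreliable(sample))
```

Replacing the flagged values and writing the log file are left out of the sketch.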

Q2. (2 + 2 + 4 MARKS)
(a) Plot graphs of the data in 3 subplots (3 rows × 1 column)
a. You may wish to use the inbuilt datenum() function to convert the year/month/day vectors
into a single value that can be used on your x axis.
b. Use datetick('x') to display the x axis with sensible labels.
c. Use solid lines with a linewidth of 1 for all plots.
(b) Create a pie chart showing the proportion of time for which each site has the highest wind speed of the three sites.
(c) You want to know if there are any trends in the time of day when different sites receive high winds (e.g. do some sites
typically get stronger winds in the morning, or weaker winds at night, etc). Determine the average wind speeds at
different times of the day across the year, and create a plot showing what you find.

Q3. (4 + 2 MARKS)
The wind speed data were collected at a height, h0, of 100 m; not at the height of the turbine hub of 80 m. It
would be too time-consuming and expensive to re-collect the data at the hub height, and even more so if future
upgrades were to lead to a change in the turbine hub height. Fortunately, the wind speed $W_h$
at a height $h$ above the ground can be estimated from an initial wind speed $W_0$ at an initial height $h_0$ using the model:
$$W_h = W_0 \left(\frac{h}{h_0}\right)^{\alpha} \qquad \text{[Eq. 1]}$$
where α is a surface roughness parameter dependent on the landscape topology of fixed elements such as trees,
hills, and buildings. Typical ranges for α are described in Table 1.

Table 1: Typical surface roughness parameter ranges.
α                Terrain features
10^-4 - 10^-3    Minimal impact, e.g., open water, smooth snow fields, barren terrains.
10^-3 - 10^-2    Featureless terrain, e.g., deserts, flat grass plains, glaciers.
10^-2 - 0.1      Flat terrain, e.g., grass fields, airport runways.
0.1 - 0.5        Elements separated by large distances, e.g., scattered shelters, low-rising crops.
0.5 - 1.0        Landscape with moderate occurrences, e.g., vegetation, bushes, new dense forests.
1.0 - 2.0        Larger elements uniformly distributed, e.g., mature forest, low-rise built-up areas.
> 2.0            Irregular distribution of large elements, e.g., city centres, forests with clearings.

(a) As we do not know what the surface roughness parameter is, your company has made measurements
of the wind at different heights at each of the sites. These measurements are supplied in the file
avg_wind_data.txt.
Import this data into MATLAB. Plot the measured points on a single plot in a new figure, using the
following characteristics:
Ararat: Blue circle markers of size five.
Silverton: Red asterisk markers of size 9.
Boco Rock: Magenta plus sign markers of size 8.
Based on the form of Equation 1, fit a suitable model to the average wind speed data using linear
regression to find an estimate for α at the three wind farm locations. Output these estimates
accurate to three significant figures along with the coefficient of determination $r^2$
(to an appropriate number of decimal places) for each location to the command window using fprintf(). Plot the
fitted curves on the same figure as the raw data, using the following characteristics:
Ararat: Blue solid line;
Silverton: Green dashed line;
Boco Rock: Magenta dashed-dotted line;
Identify the likely terrain at each site using Table 1, and print your answers to the screen using
fprintf().
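
Taking logarithms of a power law of the form W_h = W_0 (h/h_0)^α gives ln(W_h) = ln(W_0) + α ln(h/h_0), which is a straight line in ln(h/h_0) with slope α. The sketch below shows that linearisation with an ordinary least-squares fit, in Python for illustration only (the assignment expects MATLAB). The function name fit_power_law and the sample measurements are hypothetical; the contents of avg_wind_data.txt are not reproduced here.

```python
import math


def fit_power_law(heights, speeds, h0=100.0):
    """Least-squares fit of ln(W) = ln(W0) + alpha * ln(h / h0).

    Returns (alpha, w0, r_squared), where r_squared is computed on the
    log-transformed data.
    """
    xs = [math.log(h / h0) for h in heights]
    ys = [math.log(w) for w in speeds]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = sxy / sxx                    # slope of the log-log line
    intercept = y_bar - alpha * x_bar    # equals ln(W0)
    ss_res = sum((y - (intercept + alpha * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return alpha, math.exp(intercept), 1.0 - ss_res / ss_tot


# Made-up data generated from W0 = 8 m/s and alpha = 0.2.
heights = [20.0, 40.0, 60.0, 80.0, 100.0, 120.0]
speeds = [8.0 * (h / 100.0) ** 0.2 for h in heights]
alpha, w0, r2 = fit_power_law(heights, speeds)
print(f"alpha = {alpha:.3g}, W0 = {w0:.3g} m/s, r^2 = {r2:.4f}")
```

Note that this r² describes the fit in log space; whether to report it on the transformed or the original data is a choice to justify in the submission.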
(b) Create an anonymous function that takes both the initial wind speed W0 and surface roughness α
as inputs and uses this to update the wind speeds in your dataset from initial height h0 = 100m, to
the correct height of h = 80m. Create a new variable for the corrected values. In the original figure
from Q2(a), plot the corrected wind speeds on the 3-by-1 subplot, so that the effect of the
correction can be easily visualised. Use a thicker linewidth so it is clear which plot is the updated
data.
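
As a rough illustration of the height correction, the Python lambda below plays the role of a MATLAB anonymous function. The names H0, H_HUB and correct_to_hub are hypothetical, and the sample inputs are made up.

```python
H0 = 100.0     # measurement height in metres (from the brief)
H_HUB = 80.0   # turbine hub height in metres (from the brief)

# Anonymous function of the initial wind speed and the roughness parameter.
correct_to_hub = lambda w0, alpha: w0 * (H_HUB / H0) ** alpha

# Applying it to a short list of made-up readings for one site.
raw = [6.0, 7.5, 9.0]
corrected = [correct_to_hub(w, 0.2) for w in raw]
print(corrected)
```

Because the hub is below the measurement height, any positive α produces corrected speeds lower than the raw ones.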

PART 2 – PERFORMANCE ANALYSIS: 20 MARKS
Note: If you were unable to complete any or all questions in Part 1, you can try the questions in Part 2 using
the raw satellite data instead.

Figure 2: GE series 1.5 MW turbine, and diagram showing turbine characteristics

The performance index, Cp, indicates the percentage (%) of the mechanical energy in the wind that will be
converted into electrical energy. Not only is there more power in the wind when it blows at higher speeds, but
that power can also be more efficiently turned into electricity. This is described in the equation below:
$$P = \frac{1}{2} C_p \rho A W^3 \qquad \text{[Eq. 2]}$$
where $P$ is the power generated (J/s), $\rho$ is the density of air (= 1.225 kg/m$^3$ at ground level), $A$ is the swept area
of the turbine (= $\pi r^2$, where $r$ is the turbine blade length), and $W$ is the wind speed in m/s.
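
The power relation (half the product of Cp, air density, swept area and the cube of the wind speed) can be checked numerically with a few lines. This is a Python sketch for illustration; the function name wind_power is hypothetical, and the blade length and Cp in the usage line are placeholder values, not figures taken from the brief.

```python
import math

RHO_AIR = 1.225  # air density in kg/m^3 at ground level (from the brief)


def wind_power(cp, blade_length_m, wind_speed):
    """Power in watts from P = 0.5 * Cp * rho * A * W^3, with A = pi * r^2."""
    swept_area = math.pi * blade_length_m ** 2
    return 0.5 * cp * RHO_AIR * swept_area * wind_speed ** 3


# Placeholder inputs: Cp = 0.4, blade length 38.5 m, wind speed 10 m/s.
print(f"{wind_power(0.4, 38.5, 10.0) / 1e6:.2f} MW")
```

The cubic dependence means that doubling the wind speed multiplies the power by eight, which is why small height corrections to the wind speed matter.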

Q4. (4 + 3 MARKS)
The file “performance_measures.txt” contains a set of measurements of the simplified Performance Index (Cp) for
a GE 1.5 MW turbine at different wind speeds. Plot the performance coefficient against the corresponding
wind speeds as red squares (no marks – this is easy). The data should look like the following:

Figure 3: Coefficient of performance data

When the wind speed is too low, there is not enough wind energy for the turbines to start or generate
power; the speed at which generation begins is called the cut-in speed WC. On the other hand, the furling speed, Wf, is the speed at which the wind is
too high and the blades actively start to furl (rotate) to prevent damage. The range of wind speeds
immediately before Wf is where the turbines operate at a constant optimal power and where the performance
coefficient is at its maximum. The cutout speed, Wcutout, is the speed at which the blades are completely furled, held motionless
by a brake, and no longer generating power.
(a) Fit appropriate piecewise functions of your choice to any regions of the graph where Cp is not constant,
and plot these on the graph as a continuous line. You should now have fitted functions that can predict
Cp for any value of the wind speed W. Be prepared to discuss your choice of fitted function. (Note: you do not need to fit
functions for regions of the graph where Cp is clearly constant: before the cut-in speed, after the cutout speed,
or in the region of maximum performance).
(b) Use one of these functions with an appropriate root-finding algorithm to estimate:
• the cut-in speed.
• the furling speed.
• the cutout speed
Show each of these points on the graph as a large blue asterisk.
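
One root-finding option is bisection, which only needs a bracket where the fitted function changes sign. The Python sketch below is illustrative only: bisect and cp_rising are hypothetical names, and the quadratic used for cp_rising is made up rather than fitted to performance_measures.txt.

```python
def bisect(f, lo, hi, tol=1e-8, max_iter=200):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or 0.5 * (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid    # sign change is in the right half
    return 0.5 * (lo + hi)


# Made-up rising-branch fit whose zero crossing stands in for a cut-in speed.
cp_rising = lambda w: 0.04 * (w - 3.0) - 0.001 * (w - 3.0) ** 2
cut_in = bisect(cp_rising, 0.0, 10.0)
print(f"estimated cut-in speed: {cut_in:.3f} m/s")
```

Other methods (for example Newton-Raphson or the secant method) would also qualify; bisection is shown here only because it is simple and cannot diverge once a bracket is found.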
Q5. (1 + 1 + 1 + 1 + 1 + 1 MARKS)
Question 5 consists of a range of questions covering rotational mechanics. Calculate each quantity within
MATLAB, and print your answer for each question to the screen, including units and an appropriate number of
significant figures. You may also print any intermediate answers if you wish (partial marks may be awarded).
(a) Each turbine blade weighs approximately 1.5 tonnes (1500 kg). The furling speed is typically designed
to occur when the tips of the blades are travelling at around 120 m/s. Determine the force exerted on
the central housing by each turbine blade at this speed.
(b) Assuming the moving part of the central housing weighs around 500 kg, estimate the moment of inertia
for each assembled turbine. You can assume the central housing is a cylinder of 1 m diameter, and the
blades can be modelled as slender rods.
(c) The bearings in the housing can be assumed to have an effective coefficient of friction of 0.001 (note
that this is technically a “rolling friction”, but can be treated similarly to kinetic friction). Calculate the
expected friction force on the rotating turbine.
(d) This friction force is exerted steadily on the turbines to produce a torque. The force acts at a radius of
0.20 m. Determine the rotational deceleration of the turbine that is caused by this torque (ignoring
other forces e.g. from the generator).
(e) A turbine is rotating at a speed where its velocity at the tip of the blade is 30 m/s. Determine how long
it would take to decelerate to rest under this torque.
(f) Calculate the total energy that would be dissipated as heat as the turbine decelerates in part (e).

Q6. (4 + 3 MARKS)
(a) Create a function PowerFunc() that takes the following inputs:
• a vector of wind speeds W,
• a scalar cut-in speed WC and scalar furling speed Wf, the cutout speed, and any other endpoints
you’ve manually defined, and,
• the function handles for the equations determined in Q4(a)
• any other variables you need
and outputs a vector P containing the amount of power generated by the GE turbine for each entry of
the input windspeed (W) vector.
When the wind speed satisfies the inequality WC ≤ W ≤ Wcutout, the function should return the expected
power output. Otherwise, the function should return 0 for the corresponding element in the array.
(b) The energy produced by a turbine is given by:
$$E = \int_{t_1}^{t_2} P(t)\,dt \qquad \text{[Eq. 3]}$$
Use an appropriate numerical integration method to determine how much energy would have been
produced by a turbine at the Ararat site in the first full month of data collection (April 2020). Print
your answer to the screen. Plot the monthly power generation from the Ararat site over the entire
year and comment on the results of this plot using an fprintf() statement. Also compare this plot
to your raw data from Q2(a).
(Note 1: you will need to extract these data from your dataset. Note 2: If you are not able to complete
part (a), you may integrate the wind speed data from Part 1 instead of the power data to demonstrate your
skills. Note 3: you should combine data from March 2020 and March 2021 into a single “March” datapoint.)
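
Because the readings are evenly spaced at 10-minute intervals, the energy integral can be approximated with the composite trapezoidal rule. The Python sketch below illustrates that one method (Simpson's rule or another scheme could equally be chosen); the function name trapezoid_energy and the sample power values are hypothetical.

```python
def trapezoid_energy(power_w, dt_s=600.0):
    """Composite trapezoidal rule for evenly spaced power samples.

    power_w: power readings in watts; dt_s: spacing in seconds (10 min = 600 s).
    Returns energy in joules.
    """
    if len(power_w) < 2:
        return 0.0
    # Interior points count fully; the two end points count half each.
    return dt_s * (sum(power_w) - 0.5 * (power_w[0] + power_w[-1]))


# Made-up check: a constant 1 kW over one hour (7 samples, 6 intervals).
energy_j = trapezoid_energy([1000.0] * 7)
print(f"{energy_j / 3.6e6:.2f} kWh")
```

Dividing joules by 3.6e6 converts to kilowatt-hours, which is usually the more readable unit for monthly totals.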

===========================================================================

EFB344 Assignment - Part A
Wednesday the 12th of October, 2022 at 11:59pm

Overview
The task you are given is to estimate the market risk for a holding of 10,000 BHP shares (BHP.ax) and
1,000 CSL shares (CSL.ax), held on September 1, 2022 (you are working out the risk position
assuming that you own these shares before the open of trading that day). You will do this by
estimating the Value-at-Risk for the stock portfolio. This will require you to choose the best VaR
model by backtesting several methods to determine the most reliable for the task at hand.
Description
You will be asked to calculate the following:
- 10-day VaR for the portfolio of shares at a confidence level of 99%.
Note: This risk estimate applies to the next 10 trading days from September 1, 2022 until September
14, 2022 (i.e., it should be a forecast of risk).
Based on what you have learnt from EFB344, you are considering several options for how to
compute this risk measure, a) the normal distribution using the EWMA, b) the normal distribution
using a rolling window, or c) historical simulation based on a rolling window. All methods require
choosing a parameter that assigns weight to past data: a decay factor for the EWMA and the window length for the
other two. You will consider the following choices for each:
Normal EWMA for both volatilities and the covariance.
Normal Rolling Window Rolling window with 252 trading days.
Historical Simulation Rolling window with 252 trading days.

This leaves you with 3 possible models that could be used to provide the VaR measure asked for
above. You must choose the most appropriate model and report the associated VaR. To inform
your decision of which to use, you are going to consider the recent historical performance of the
three models in calculating 1-day VaR at the confidence level of 99%. You will do so by first
examining the frequency of instances when the VaR was exceeded by the observed return over the
prior five years.
You will then evaluate the appropriateness of these frequencies over time relative to the Basel
traffic light levels discussed in lectures. Based on this performance, select the best model and report
the required VaR(10, 99%) for September 1, 2022.
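
To make the normal EWMA option and the backtest concrete, here is a rough Python sketch under stated assumptions. All function names are hypothetical. The decay factor lam = 0.94 is the common RiskMetrics daily value and is an assumption, since the brief's own parameter is not shown; the recursion is seeded with the first observation, returns are treated as zero-mean, and the 10-day figure uses square-root-of-time scaling. The zone thresholds are the standard Basel ones for 250 daily observations at 99%, so they would need adjusting for a five-year backtest.

```python
import math
from statistics import NormalDist

Z_99 = NormalDist().inv_cdf(0.99)  # one-sided 99% standard normal quantile


def ewma_dollar_variance(returns_a, returns_b, value_a, value_b, lam=0.94):
    """One-day variance of portfolio P&L from EWMA variances and covariance.

    Assumptions: zero-mean returns, recursion seeded with the first observation.
    """
    var_a = returns_a[0] ** 2
    var_b = returns_b[0] ** 2
    cov = returns_a[0] * returns_b[0]
    for ra, rb in zip(returns_a[1:], returns_b[1:]):
        var_a = lam * var_a + (1.0 - lam) * ra * ra
        var_b = lam * var_b + (1.0 - lam) * rb * rb
        cov = lam * cov + (1.0 - lam) * ra * rb
    return (value_a ** 2 * var_a + value_b ** 2 * var_b
            + 2.0 * value_a * value_b * cov)


def normal_var(dollar_variance, horizon_days=1, z=Z_99):
    """Normal VaR in currency units, scaled by the square root of the horizon."""
    return z * math.sqrt(dollar_variance * horizon_days)


def basel_zone(exceptions):
    """Traffic-light zone for 250 daily observations of 99% one-day VaR."""
    if exceptions <= 4:
        return "green"
    if exceptions <= 9:
        return "yellow"
    return "red"


# Made-up inputs: two short return series and position values of 100 each.
variance = ewma_dollar_variance([0.01, -0.02, 0.015], [0.005, -0.01, 0.02],
                                100.0, 100.0)
print(f"1-day VaR: {normal_var(variance):.2f}")
print(f"10-day VaR: {normal_var(variance, horizon_days=10):.2f}")
print(basel_zone(3))
```

In practice the position values would be the share counts multiplied by the closing prices on the day before the forecast date.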
