QBUS1040 Foundations of Business Analytics/COMP30026 Models of Computation/ECMT2130/ENSC3012/SLE226 DATA ANALYSIS

1. Average, RMS value, and standard deviation. Use formula (3.5) of VMLS to show that for any vector
x, the following inequalities hold:
(a) (5 points) |avg(x)| ≤ rms(x). Is it possible to have equality in this inequality? If |avg(x)| =
rms(x) is possible, give the conditions on x under which it holds.
(b) (5 points) std(x) ≤ rms(x). Is it possible to have equality in this inequality? If std(x) =
rms(x) is possible, give the conditions on x under which it holds.
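As a numerical sanity check (not a proof), the following sketch computes the three quantities for an arbitrary vector and confirms both inequalities. It assumes that formula (3.5) of VMLS is the identity rms(x)^2 = avg(x)^2 + std(x)^2; the helper names avg, rms, and std are chosen here for illustration.

```python
import math

def avg(x):
    """Mean of the entries of x."""
    return sum(x) / len(x)

def rms(x):
    """Root-mean-square value of x."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def std(x):
    """Standard deviation of x (RMS value of the de-meaned vector)."""
    m = avg(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

x = [1.0, -2.0, 3.5, 0.0]  # arbitrary example vector

# Assumed form of (3.5): rms(x)^2 = avg(x)^2 + std(x)^2
assert math.isclose(rms(x) ** 2, avg(x) ** 2 + std(x) ** 2)
# Both inequalities from parts (a) and (b) hold for this x
assert abs(avg(x)) <= rms(x)
assert std(x) <= rms(x)
```

Trying a constant vector, or a vector whose entries sum to zero, is a quick way to explore when equality can occur.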
2. Suppose x and y are Boolean feature vectors, i.e., each entry is either 0 or 1, that encode the presence
of symptoms in patients Alice and Bob.
(a) (5 points) Use x and y to express the number of symptoms that neither Alice nor Bob has.
(b) (5 points) Use x and y to express the number of symptoms that Alice has but Bob does not.
3. (10 points) Can we generalise the triangle inequality to three vectors? That is, do we have
∥a + b + c∥ ≤ ∥a∥ + ∥b∥ + ∥c∥
for any n-vectors a, b, c? If so, justify it. If it’s not true, give a counter-example, i.e., specific vectors
a, b, and c for which the inequality is false.
4. (10 points) Taylor approximation. Consider the function f : R^2 → R given by
f(x_1, x_2) = x_1^2 + x_1 x_2 + 4x_2 + 1. Find the Taylor approximation f̂ at the point z = (1,1).
Compare f(x) and f̂(x) for the following values of x:
x = (1,1), x = (1.05,0.95), x = (0.85,1.25), x = (−1,2).
Make a brief comment about the accuracy of the Taylor approximation in each case.
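One way to organise the comparison is sketched below. It assumes the function reads f(x_1, x_2) = x_1^2 + x_1 x_2 + 4x_2 + 1 (the exponent and subscript placement is an interpretation of the problem statement), and the names grad_f and taylor are illustrative.

```python
def f(x1, x2):
    # Assumed reading of the problem: f(x) = x1^2 + x1*x2 + 4*x2 + 1
    return x1 ** 2 + x1 * x2 + 4 * x2 + 1

def grad_f(x1, x2):
    # Partial derivatives of the f defined above
    return (2 * x1 + x2, x1 + 4)

def taylor(z):
    """Return the first-order Taylor approximation of f at the point z."""
    fz = f(*z)
    g = grad_f(*z)

    def f_hat(x1, x2):
        return fz + g[0] * (x1 - z[0]) + g[1] * (x2 - z[1])

    return f_hat

f_hat = taylor((1.0, 1.0))
for x in [(1, 1), (1.05, 0.95), (0.85, 1.25), (-1, 2)]:
    # Print the point, the exact value, and the approximation side by side
    print(x, f(*x), f_hat(*x))
```

The printed pairs show how the approximation error behaves as x moves away from z.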
5. General norms. Any real-valued function f that satisfies the four properties given on page 46 of
VMLS (nonnegative homogeneity, triangle inequality, nonnegativity, and definiteness) is called a
vector norm, and is usually written as f(x) = ∥x∥_mn, where the subscript is some kind of identifier
or mnemonic to identify it. The most commonly used norm is the one we use in VMLS, the Euclidean
norm, which is sometimes written with the subscript 2, as ∥x∥_2. Two other common vector norms
for n-vectors are the 1-norm ∥x∥_1 and the ∞-norm ∥x∥_∞, defined as
∥x∥_1 = |x_1| + ··· + |x_n|,   ∥x∥_∞ = max{|x_1|,...,|x_n|}.
These norms are the sum and the maximum of the absolute values of the entries in the vector,
respectively. The 1-norm and the ∞-norm arise in some recent and advanced applications, but we
will not encounter them in this course.
(a) (6 points) Verify that the 1-norm satisfies the four norm properties listed on page 46 of VMLS:
nonnegative homogeneity, triangle inequality, nonnegativity and definiteness.
(b) (6 points) Verify that the ∞-norm satisfies the four norm properties listed on page 46 of VMLS:
nonnegative homogeneity, triangle inequality, nonnegativity and definiteness.
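The two definitions can be written directly in code. The sketch below spot-checks the four properties on example vectors only, so it illustrates the definitions rather than verifying them in general; norm1 and norm_inf are illustrative names.

```python
def norm1(x):
    """1-norm: sum of the absolute values of the entries."""
    return sum(abs(v) for v in x)

def norm_inf(x):
    """Infinity-norm: largest absolute value among the entries."""
    return max(abs(v) for v in x)

x = [1.0, -3.0, 2.0]
y = [0.5, 4.0, -1.0]
beta = -2.0
x_plus_y = [a + b for a, b in zip(x, y)]

for norm in (norm1, norm_inf):
    # Nonnegative homogeneity: ||beta x|| = |beta| ||x||
    assert norm([beta * v for v in x]) == abs(beta) * norm(x)
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    assert norm(x_plus_y) <= norm(x) + norm(y)
    # Nonnegativity, and the zero vector has norm zero
    assert norm(x) >= 0 and norm([0.0, 0.0, 0.0]) == 0
```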
6. (10 points) Weighted norm. On page 51 of VMLS we discuss the importance of choosing the units or
scaling for the individual entries of vectors, when they represent heterogeneous quantities. Another
approach is to use a weighted norm of a vector x, defined as
∥x∥_w = √(w_1 x_1^2 + ··· + w_n x_n^2),
where w_1,...,w_n are given positive weights, used to assign more or less importance to the different
elements of the n-vector x. If all the weights are one, the weighted norm reduces to the usual
(‘unweighted’) norm. It can be shown that the weighted norm is a general norm, i.e., it satisfies
the four norm properties listed on page 46. Following the discussion on page 51, one common rule
of thumb is to choose the weight w_i as the inverse of the typical value of x_i^2 in the application. A
version of the Cauchy-Schwarz inequality holds for weighted norms: For any n-vectors x and y, we
have
|w_1 x_1 y_1 + ··· + w_n x_n y_n| ≤ ∥x∥_w ∥y∥_w.
(The expression inside the absolute value on the left-hand side is sometimes called the weighted
inner product of x and y.) Show that this inequality holds. Hint. Consider the vectors x̃ =
(x_1 √w_1,...,x_n √w_n) and ỹ = (y_1 √w_1,...,y_n √w_n), and use the (standard) Cauchy-Schwarz
inequality.
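The hint can be illustrated numerically. In the sketch below the weights and vectors are made-up values, and weighted_norm and weighted_inner are illustrative names; it shows that the scaled vectors from the hint reproduce the weighted quantities, which is the observation the proof rests on.

```python
import math

def weighted_norm(x, w):
    """Weighted norm: sqrt(w_1 x_1^2 + ... + w_n x_n^2)."""
    return math.sqrt(sum(wi * xi * xi for wi, xi in zip(w, x)))

def weighted_inner(x, y, w):
    """Weighted inner product: w_1 x_1 y_1 + ... + w_n x_n y_n."""
    return sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

# Hypothetical weights and vectors, for illustration only
w = [2.0, 0.5, 1.0]
x = [1.0, -2.0, 3.0]
y = [0.5, 4.0, -1.0]

# Scaled vectors from the hint: entries multiplied by sqrt(w_i)
x_t = [xi * math.sqrt(wi) for xi, wi in zip(x, w)]
y_t = [yi * math.sqrt(wi) for yi, wi in zip(y, w)]

# The ordinary inner product and norm of the scaled vectors match the weighted ones
assert math.isclose(sum(a * b for a, b in zip(x_t, y_t)), weighted_inner(x, y, w))
assert math.isclose(math.sqrt(sum(a * a for a in x_t)), weighted_norm(x, w))

# Weighted Cauchy-Schwarz inequality for this example
assert abs(weighted_inner(x, y, w)) <= weighted_norm(x, w) * weighted_norm(y, w)
```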
7. (15 points) Please refer to the Jupyter Notebook file for this problem.
8. (13 points) Please refer to the Jupyter Notebook file for this problem.

========================================================================================

COMP30026 Models of Computation
16 Sept 2022

Challenge 1 – Predictably Inconsistent Weather
The city of Melbourne, Australia is infamous for its predictably inconsistent weather. The mobile
apps Parrot and Carrot compete to predict the correct weather over the next three days. Melbourne’s
weather has a habit of making a fool of the apps’ developers, such that at any time, only
one of the two apps makes a correct prediction.
Despite this, Harald still can use this information to get accurate weather forecasts for the week,
so that they don’t get wet on their commutes to-and-from university. On a Monday, Harald checks
both the Carrot and Parrot apps, where they make the following predictions:
Carrot: “It will rain on Tuesday and Wednesday.”
Parrot: “If it rains on Monday, it will rain on Wednesday.”

Task 1A. Capture, as a single propositional formula, the information that was thereby available
to Harald. You will need to take into account which app makes each prediction. Use propositional
letters as follows:

C: Carrot’s prediction is correct
P: Parrot’s prediction is correct
M: It rains on Monday
T: It rains on Tuesday
W: It rains on Wednesday
Task 1B. Harald tries to determine the weather forecast for the week from those two predictions,
but realises they do not yet have enough information. Determine which truth assignments to
C, P , M, T , W make your formula from Task 1A true.
Task 1C. Harald opens their window blinds and looks outside to check for any chance of rain.
Based on that information, they now know exactly what the weather will be for Monday, Tuesday
and Wednesday. Given this information, determine, for each of Monday, Tuesday and Wednesday,
whether it rains or not.
Submission and Marking: Your answer should be submitted on Grok. You will find the
submission format explained there. You will receive some feedback from some elementary tests.
These merely check that your input has the correct format; they should not be relied upon for
correctness. We will test your solution comprehensively after the deadline. Task 1A is worth 1
mark, the rest are worth 0.5 marks each.

Challenge 2 – Negative Implications
We have seen that implication can be re-written into an equivalent formula using the following
equivalence

F ⇒ G ≡ ¬(F ∧ ¬G) (2.1)
In this challenge we will generalise this result to rewrite all of the connectives we have seen
using ∧ and ¬. To this effect, we will show that {∧, ¬} is functionally complete as we can represent
all formulas using only ∧ and ¬.
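Equivalence (2.1) can be confirmed by enumerating the truth table. The sketch below is only a check of that one equivalence; the function names implies and rewritten are illustrative.

```python
from itertools import product

def implies(f, g):
    """Truth value of F => G."""
    return (not f) or g

def rewritten(f, g):
    """Truth value of not (F and not G), the right-hand side of (2.1)."""
    return not (f and not g)

# Compare both sides on all four truth assignments to F and G
for f, g in product([False, True], repeat=2):
    assert implies(f, g) == rewritten(f, g)
```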
Task 2A. Using the equivalence defined in (2.1), re-write the following formula to remove all
instances of the ⇒ connective. You must not perform any other transformations.

(¬P ⇒ (¬Q ∧ Q ⇒ f)) ⇒ (((P ⇒ ¬(R ⇒ Q)) ∧ ¬P ) ⇒ ¬(P ⇒ ¬(R ⇒ Q))) (2.2)
Task 2B. The formula (2.2) can be simplified. Using only the equivalences (2.1) and (2.3)–
(2.5) you can simplify your answer for Task 2A. Provide the most simplified formula using (2.1)
and (2.3)–(2.5), with no instances of ⇒. This should contain the smallest number of connectives
possible.

F ∧ G ≡ G ∧ F (2.3)
¬F ∧ (F ⇒ G) ≡ ¬F (2.4)
¬¬F ≡ F (2.5)
Task 2C. Generalising the re-writing rule (2.1), we can re-write all other connectives using only
∧ and ¬. Write a Haskell function that can re-write any formula into an equivalent formula that
uses only ∧, ¬, and any variables. Your function should not produce any double-negatives, such
as ¬¬P.
Submission and Marking: Your answer should be submitted on Grok. You will find the
submission format explained there. You will receive some feedback from some elementary tests.
These merely check that your input has the correct format; they should not be relied upon for
correctness. We will test your solution comprehensively after the deadline. Task 2A and Task 2B
are worth 0.5 marks each for a correct answer; Task 2C is worth 1 mark based proportionally on
the number of passed test cases.

Challenge 3 – Logic on Display

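[Figure: layout of the 8-segment display. Segment i is at the top, j and k are the upper-left and
upper-right segments, l is the middle segment, m and n are the lower-left and lower-right segments,
o is the bottom segment, and p is the diagonal.]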

In this challenge we will consider an unconventional 8-segment display which
is like a 7-segment display, but has an additional diagonal LED from the
top-right to bottom-left of the display. Arrays of such displays are commonly
used to display characters in remote controls, blood pressure monitors, dishwashers,
and other devices. We label each LED i–p, with p being the diagonal
segment, as shown here.
Each LED can be on or off, but in most applications, only a small number
of on/off combinations are of interest (such as the ten combinations that allow
the display of a digit in the range 0–9). In that case, the display can be
controlled through a small number of input wires, with four wires providing
2^4 input combinations, enough to cover the ten different digits.
Here we are interested in using an 8-segment display for some Greek letters. We want it to be
able to show eight different letters, namely Α, Β, Γ, Δ, Ε, Ζ, Η, and Λ. For example, to show Α,
all the display segments, except o and p, should be lit up, giving the pattern A. In detail, we want
the eight different letters displayed respectively as:

A, B, C, D, E, F, H, L

Since there are eight letters, we need three input wires, modelled as propositional variables P, Q,
and R. We will need to decide on a suitable encoding of the eight letters. One possible encoding
of the eight letters is to let A correspond to input 000 (that is, P = Q = R = f), B to 001 (that
is, P = Q = f and R = t), etc. If we do that, we can summarise the behaviour of each input
combination in the table below:

letter P Q R i j k l m n o p display
Α 0 0 0 1 1 1 1 1 1 0 0 A
Β 0 0 1 1 1 1 1 1 1 1 0 B
Γ 0 1 0 1 1 0 0 1 0 0 0 C
Δ 0 1 1 0 0 1 0 0 1 1 1 D
Ε 1 0 0 1 1 0 1 1 0 1 0 E
Ζ 1 0 1 1 0 0 0 0 0 1 1 F
Η 1 1 0 0 1 1 1 1 1 0 0 H
Λ 1 1 1 0 0 1 0 0 1 0 1 L

Each of the eight segments i–p can be considered a propositional function of the variables P, Q,
and R. This kind of display can be physically implemented with logic circuitry, using circuits to
implement a Boolean function for each of the outputs. Here we assume that only three types of logic
gates are available: An and-gate takes two inputs and produces, as output, the conjunction (∧) of
the inputs. Similarly, an or-gate implements disjunction (∨). Finally, an inverter takes a single
input and negates it (¬). We can specify such a circuit by writing down the Boolean equations for
each of the outputs i–p. For example, segment i is turned off (is false) when the input is 011, 110,
or 111. So, i can be captured as (P ∨ ¬Q ∨ ¬R) ∧ (¬P ∨ ¬Q ∨ R) ∧ (¬P ∨ ¬Q ∨ ¬R).

For efficiency reasons, we often want the circuit to use as few gates as possible. For example,
the above equation for i shows that we can implement this output using fifteen gates.
But i = ¬(¬P ∧ Q ∧ R) ∧ ¬(P ∧ Q ∧ ¬R) ∧ ¬(P ∧ Q ∧ R) is an equivalent implementation, using
fewer gates. Moreover, the eight functions might be able to share some circuitry. For example,
if we have already implemented a sub-circuit defined by u = Q ∧ R (introducing u as a name for
the sub-circuit), then we can define i = ¬(¬P ∧ u) ∧ ¬(P ∧ Q ∧ ¬R) ∧ ¬(P ∧ u), and we may be
able to re-use u while implementing the other outputs (rather than duplicating the same gates).
In some cases, it may even be feasible to design a circuit that is not minimal for a given function,
but provides a minimal solution when all eight functions are designed.
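The three formulations of segment i discussed above can be checked against the i column of the table by brute force. This is a sketch of how one might validate a candidate equation offline; it is not the Grok test harness, and the names I_COLUMN, i_cnf, i_alt, and i_shared are illustrative.

```python
from itertools import product

# Column i of the table, listed for inputs PQR = 000, 001, ..., 111
I_COLUMN = [1, 1, 1, 0, 1, 1, 0, 0]

def i_cnf(P, Q, R):
    # (P v -Q v -R) ^ (-P v -Q v R) ^ (-P v -Q v -R)
    return (P or not Q or not R) and (not P or not Q or R) and (not P or not Q or not R)

def i_alt(P, Q, R):
    # -(-P ^ Q ^ R) ^ -(P ^ Q ^ -R) ^ -(P ^ Q ^ R)
    return not (not P and Q and R) and not (P and Q and not R) and not (P and Q and R)

def i_shared(P, Q, R):
    # Same as i_alt, but reusing the sub-circuit u = Q ^ R
    u = Q and R
    return not (not P and u) and not (P and Q and not R) and not (P and u)

# product yields the inputs in the same order as the table rows
for expected, (P, Q, R) in zip(I_COLUMN, product([False, True], repeat=3)):
    for candidate in (i_cnf, i_alt, i_shared):
        assert bool(candidate(P, Q, R)) == bool(expected)
```

The same pattern extends to the other seven outputs by adding one expected column and one candidate function per segment.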

Task 3A. Design such a circuit, using as few gates as possible. You can define any number of
sub-circuits to help you reduce the gate count (simply give each a name).
Submission and Marking: Your answer should be submitted on Grok. Submit a text file
circuit.txt consisting of one line per definition. This file will be tested automatically, so it is
important that you follow the notational conventions exactly. We write ¬ as - and ∨ as +. We
write ∧ as ., or, more simply, we just leave it out, so that concatenation of expressions denotes
their conjunction. Here is an example set of equations (for a different problem):
# An example of a set of equations in the correct format:
i = -Q R + Q -R + P -Q -R
j = u + P (Q + R)
k = P + -(Q R)
l = u + P i
u = -P -Q
# u is an auxiliary function introduced to simplify j and l
Empty lines, and lines that start with ‘#’, are ignored. Input variables are in upper case.
Negation binds tighter than conjunction, which in turn binds tighter than disjunction. So the
equation for i says that i = (¬Q ∧ R) ∨ (Q ∧ ¬R) ∨ (P ∧ ¬Q ∧ ¬R). Note the use of a helper
function u, allowing j and l to share some circuitry. Also note that we do not allow any feedback
loops in the circuit. In the example above, l depends on i, so i is not allowed to depend, directly
or indirectly, on l (and indeed it does not).
To test your equations and count the number of gates used, you can click Terminal and enter
the command test. To stop the Terminal, click Stop.
There is one mark for a correct solution. An additional 0.5 is awarded if a correct solution uses
26 gates or fewer. A further 0.5 is awarded if a correct solution uses 20 gates or fewer.
Optionally, you can submit your circuit design to a leaderboard. Your position on this board is
not reflective of your final grade and can be used anonymously. The leaderboard site can be found
here https://comp30026.ddns.net/leaderboard.

Challenge 4 – Property-Based Testing
Unlike unit tests that only test a single use case of a program, Property-Based Testing allows
programmers to provide a specification of their programs, as logical properties that should hold if
their program is implemented correctly. Two types of properties can be tested: data invariants,
which are light sanity checks, and full functional specifications.
For example, consider the following definition of reverse in Haskell
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]
A nice property of a reverse function is that when composed with itself, it is the identity.
That is, to say

∀a reverse (reverse a) = a (4.1)
This property is not enough to show that reverse is functionally correct, so we say that
this is a data invariant of reverse. For example, if we replaced the definition of reverse
with id (the identity function), property (4.1) would still hold.
For a full specification of the reverse function, we require a stronger property as follows
∀a length (reverse a) = length a
∧ ∀i 0 ≤ i < length a ⇒ reverse a !! i = a !! (length a − i − 1) (4.2)
This specifies that the reversal of a list a has the same length as a, and that for every valid
index i, the element at index i of the reversal equals the element of a at the mirrored index,
length a − i − 1. Take a moment to convince yourself this is
true for a correct implementation of reverse and some example lists a.
In Haskell, the QuickCheck library [1] provides property-based testing. This library is probably
older than many of you, appearing first in 1999, and has been ported to well over 30 languages. In
summary, QuickCheck generates a series of randomised inputs that are tested against the properties
similar to the properties we have previously seen. QuickCheck then reports any test cases that fail
as counter-examples. You do not need to learn or understand QuickCheck to solve this
challenge.
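The idea of generating random inputs and reporting counter-examples can be sketched in a few lines. The following is a simplified Python model, not QuickCheck and not the submission format; the names my_reverse, prop_involution, prop_full_spec, and check are all illustrative. It shows that the identity function passes property (4.1) but is caught by the full specification (4.2).

```python
import random

def my_reverse(xs):
    # Mirrors the recursive Haskell definition of reverse
    return [] if not xs else my_reverse(xs[1:]) + [xs[0]]

def identity(xs):
    return xs

def prop_involution(rev, xs):
    # Property (4.1): reversing twice gives back the input
    return rev(rev(xs)) == xs

def prop_full_spec(rev, xs):
    # Property (4.2): same length, and index i mirrors index len - i - 1
    ys = rev(xs)
    return len(ys) == len(xs) and all(
        ys[i] == xs[len(xs) - i - 1] for i in range(len(xs))
    )

def check(prop, rev, trials=200, seed=0):
    """Return the first randomly generated input that fails prop, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(0, 8))]
        if not prop(rev, xs):
            return xs
    return None

print(check(prop_involution, identity))  # identity satisfies the data invariant
print(check(prop_full_spec, identity))   # a counter-example list is reported
print(check(prop_full_spec, my_reverse))
```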
Through this challenge, we will use a simplified model of property-based testing in Haskell. We
are concerned with the functional correctness of sorting functions. To begin, we will be considering
the following merge sort, msort.
msort :: (Ord a) => [a] -> [a]
msort xs@(_:_:_) = msort (take n xs) `merge` msort (drop n xs)
  where n = length xs `div` 2
        merge [] rs = rs
        merge ls [] = ls
        merge lls@(l:ls) rrs@(r:rs)
          | l < r = l:merge ls rrs
          | otherwise = r:merge lls rs
msort xs = xs
As we may have multiple sorting functions to test, each property-checking function we implement
will take the sorting function under test as a parameter, and so will have the following form
sortProperty :: (Ord a) => ([a] -> [a]) -> [a] -> Bool
sortProperty sort input = {- Boolean expression -}
Testing individual sorting functions is then performed with sortProperty msort, etc.
[1] https://hackage.haskell.org/package/QuickCheck
Task 4A. Implement a function, sortLength, that checks the following property: the result of
sorting some input must have the same length as the input.
Task 4B. Implement a function, sortHead, that checks the following property: if the input is
not empty, then the head element of the sorted input is the least element of the input.
Task 4C. Implement a function, sortIsSorted, that checks the following property: the result of
sorting some input is in sort-order, i.e., the result is a non-decreasing list.
Task 4D. The following is a functional specification of all sorting functions:
sortSpec :: (Ord a) => ([a] -> [a]) -> [a] -> Bool
sortSpec sort input =
  elem (sort input) (permutations input) && sortIsSorted sort input
That is to say, a function sorts its input if the output is a permutation of the input and the
output is in sort-order. [2]
With the following (incorrect) implementation of qsort, provide two values, example and
counterExample, that are an example and a counter-example, respectively, for the sortSpec
functional specification with respect to the qsort function.

qsort :: (Ord a) => [a] -> [a]
qsort [] = []
qsort (pivot:rest) = qsort lesser ++ [pivot] ++ qsort greater
  where lesser = filter (< pivot) rest
        greater = filter (> pivot) rest

That is, sortSpec qsort example should be True and sortSpec qsort counterExample
should be False.
Submission and Marking: Your answer should be submitted on Grok. You will find the
submission format explained there. You will receive some feedback from some elementary tests.
These merely check that your input has the correct format; they should not be relied upon for
correctness. We will test your solution comprehensively after the deadline. Your functions should
hold only for the properties asked. Each of Task 4A, Task 4B, and Task 4C is worth 0.5 marks,
each based proportionally on the number of test cases passed. Task 4D is worth 0.5 marks, with
0.25 marks awarded for each correct value provided.
[2] Note: this specification depends on the definition of the property from Task 4C that you must implement yourself.

Challenge 5 – Interpreting Resolutions
Consider the following predicate logic formulas.

F ∶ (∀x Q(x)) ∨ ∃x((∀y R(y, x) ∨ Q(x)) ⇒ ∃z∀y P (z, y))
G ∶ ∃x∀y (P (x, y) ∨ (∃z R(y, z) ⇒ ∀w Q(w)))

Task 5A. Show that F is non-valid, by providing an appropriate interpretation I.
Task 5B. Show that F ∨ ¬G is valid using resolution, explicitly stating all substitutions used.
Submission and Marking: Your answers to Challenge 5 and Challenge 6 should be submitted
through Gradescope as a single PDF document, no more than 2 MB in size. Marks are primarily
allocated for correctness, but elegance and how clearly you communicate your thinking will also be
taken into account. The process of resolution should be displayed as a tree.

Challenge 6 – Evenness
The notation we use for first-order predicate logic includes function symbols. This allows a very
simple representation of the natural numbers. Namely, for natural numbers, we use terms built
from a constant symbol (here we choose a, but any other symbol would do) and a one-place function
symbol (we will use s, for “successor”). The idea is that 0 is represented by a, 1 by s(a), 2 by
s(s(a)), and so on. In general, s(x) represents the successor of x, that is, x+1. Logicians prefer this
“successor” notation, because it uses so few symbols and supports recursive definition: a natural
number is either ‘a’ (the base case), or it is of the form ‘s(x)’, where x is a term representing a
natural number. (Of course, for practical use, we prefer the positional decimal system.)
With successor notation, we can capture addition by introducing a predicate symbol for the
addition relation, letting P (x, y, z) stand for x + y = z:

∀x P (a, x, x) (Identity element) (6.1)
∀x∀y∀z (P (x, y, z) ⇒ P (s(x), y, s(z))) (Recursive relation) (6.2)
∀x∀y∀z (P (x, y, z) ⇒ P (y, x, z)) (Commutativity) (6.3)
Similarly, using the addition relation we can now define the evenness of a number by introducing
the predicate symbol for evenness, letting E(x) stand for x is even:

∀x∃y (E(x) ⇒ P (y, y, x)) (6.4)
∀x∀y (P (y, y, x) ⇒ E(x)) (6.5)
Task 6A. Now, the goal is to prove the well-known property of natural numbers that if n is an
even number then n + 2 is also even. Use resolution to show that

∀x (E(x) ⇒ E(s(s(x)))) (6.6)

is a logical consequence of the axioms (6.1)–(6.5).
Task 6B. We have defined what an even number is and a theorem about even numbers, but we
still don’t know if even numbers exist! Using resolution, show that
∃x E(s(s(s(x))))

is a logical consequence of the axioms (6.1)–(6.5) and the theorem (6.6). The resolution proof
provides a sequence of most general unifiers, one per resolution step, and when these are composed
in the order they were generated, you have a substitution that solves the constraint E(s(s(s(x)))).
Give that substitution and explain what it means in terms of natural numbers.

==========================================================================================

ECMT2130

1. Portfolio optimisation
Use the data allocated to you to answer the following questions.
Throughout this assignment, all rates of return are simple monthly rates of return (not annualised),
expressed as a decimal rather than a percentage. Thus, for example, a rate of return of 5% would be
expressed in the data set as 0.05.
Unless otherwise stated: the investor can invest in all of the managed funds but cannot invest in the
risk-free asset; and the investor is fully invested.
Use the most recent value of the simple monthly risk-free rate of return as the “current risk-free rate of
return” when analysing the following portfolio optimisation problems.
Compute the excess rates of return for each fund, in each available time period. Add the current risk-free
rate of return to these excess rates of return for each fund to produce simple monthly rates of return
that do not incorporate risk-free rate of return variation as a source of risk.
Compute mean simple monthly rates of return on a fund using these adjusted simple monthly rates of
return (average over the entire available sample). If this is not clear, ask for clarification on Ed!
Also use these adjusted simple monthly rates of return to estimate the variance-covariance matrix for
the funds’ rates of return. Again use the entire available sample.
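The adjustment described above can be sketched as follows. The assignment itself works in R and Excel, so this Python fragment is only an illustration of the arithmetic; the numbers are made up, the function names are illustrative, and it assumes that simple returns are first derived from a total return index as index[t] / index[t-1] − 1.

```python
def simple_returns(index):
    """Simple period-on-period returns from a total return index (assumed convention)."""
    return [index[t] / index[t - 1] - 1 for t in range(1, len(index))]

def adjusted_returns(fund_returns, risk_free, rf_current):
    """Excess return in each period, plus the current risk-free rate."""
    return [r - rf + rf_current for r, rf in zip(fund_returns, risk_free)]

# Hypothetical data for one fund, for illustration only
fund_index = [100.0, 102.0, 101.0, 104.0]
risk_free = [0.002, 0.003, 0.001]  # aligned with the three return periods

returns = simple_returns(fund_index)
rf_current = risk_free[-1]  # most recent risk-free rate
adjusted = adjusted_returns(returns, risk_free, rf_current)

# Mean adjusted return over the whole (toy) sample
mean_adjusted = sum(adjusted) / len(adjusted)
print(adjusted, mean_adjusted)
```

The same adjusted series would then feed the estimate of the variance-covariance matrix.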
(a) (2 points) Estimate the mean and the standard deviation of the simple monthly rate of return for
the Global Minimum Variance Portfolio (GMVP). Report the standard deviation as a percentage,
to 2 decimal places.
(b) (2 points) Estimate the mean and the standard deviation of the simple monthly rate of return for
the portfolio with minimum variance when shorting is allowed but no fund is allowed to have a
weight with an absolute value greater than 20%.
(c) (2 points) Estimate the mean and the standard deviation of the simple monthly rate of return for
the portfolio with minimum variance when no shorting is allowed and no fund can have a weight
above 20%.
(d) (2 points) Estimate the standard deviation of the simple monthly rate of return for the minimum
variance portfolio that has an expected simple monthly rate of return that is double that of the
GMVP.
(e) (2 points) Estimate the slope of the optimal Capital Allocation Line when the investor can invest
in the risk-free asset and faces no portfolio weight restrictions (other than the full investment
requirement). Use the current risk-free rate of return when solving this problem.
(f) (2 points) Estimate the mean and the standard deviation of the rate of return for the optimal
portfolio for an investor with expected utility described by the following equation:

E(U) = E(r_p) − 2σ_p^2    (1)
Assume that the investor can invest in all of the risky funds and in the risk-free asset and assume
that there are no constraints on their asset weights (aside from the full-investment requirement).
(g) (2 points) Using the asset weights for the optimal portfolio of the investor with expected utility
shown in equation (1), estimate the excess kurtosis of the portfolio’s simple monthly rate of return.
Use all of the available historical data to produce this estimate.
(h) (6 points) Critique the optimal portfolio weights found for the investor in part (f), in terms of their
portfolio optimisation methodology. This part must be answered in no more than 300 words.

2. Assessment data
Each dataset is a numbered file that ends with the “.RData” suffix. Each file is an RData file, containing
the name and value for a number of R variables. You need to write an R script to load the variables from
the right dataset file and analyse the data stored in these variables.
All of the RData files are contained in the same ZIP file that is available on Canvas. Download that ZIP
file, extract its contents and then analyse the data in the RData file that is allocated to you. The number of
the data file that you have been allocated is listed beside your name in the assignment page on Canvas.
Download the zip file containing all of the possible data files and extract the contents of the zip file into
the folder where you want to do your work.
Your dataset contains 3 eXtensible Time Series (XTS) variables:
• fundTotalReturnIndices
• riskFreeRateOfReturn
• markexTotalReturnIndex
Each of these variables contains monthly data relating to rates of return from the beginning of 2000, to
the end of 2021.
The total return index for the portfolio managed by each of the different fund managers is contained in the
variable named fundTotalReturnIndices. The total return index for the fund manager “i” has column name:
v_i
The data for the monthly risk-free simple rate of return is stored in the riskFreeRateOfReturn variable
and its column name is:
r_i
The monthly data for the market total return index is stored in the markexTotalReturnIndex variable
and its column name is:
v_m
You can load the variables from your RData file by running the R function:
load("Assignment 1 dataset X.RData")
replacing “X” with the number of the data set that has been allocated to you.
Make sure that the folder containing your data file and your R script is your working directory in RStudio.

Evidence of work
Write an R script that uses the provided data to compute the data you want to export to Microsoft Excel
for portfolio optimisation. Note that you can make your own choices here. Use R or Excel, for example, to
estimate the relevant means, variances and covariances that you need to use to solve the portfolio optimisation
problems.
You can save data to a CSV file using the R command:
write.csv(variableName, "nameOfCSVFile.csv")
The grid of data in the variable called “variableName” will be written out to the named CSV file with
commas separating each item in each row.

================================================================================================

1. Two independent samples have the same sample size (n1 = n2 = 25) and the sample
standard deviations are identical (s1 = s2). The sample information shows that the 95% Confidence
Intervals of the two population means overlap by 1. Is there enough evidence to show the
two population means are different (α = 5%)? (2 marks)
All analysis, figures (if any) and results need to be included in the assignment report in a Word
document.

2. On January 28th, 1986, the Challenger space shuttle exploded one minute into flight, killing all 7
astronauts aboard. On the night before the launch, engineers warned NASA that the rocket booster
O-rings were suspected of failing at low temperatures. The forecast for launch day was an unusually cool
30 °F (−1 °C). The full NASA data set (in chronological order) is shown below.
If this data had been available before launch, would you have aborted the mission? Use the data to explain
how you made your decision (2 marks).
Write your own Python code to answer the question. All figures and analysis need to be included in the
assignment report in a Word document. The Python code (.ipynb file or .py file) needs to be uploaded to
LMS as a separate file.

Launch temperature (F) Number of O-ring failures

66 0
70 1
69 0
68 0
67 0
72 0
73 0
70 0
57 1
63 1
70 1
78 0
67 0
53 3
67 0
75 0
70 0
81 0
76 0
79 0
75 2
76 0
58 1
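A first look at the table might compare how often failures occurred in cooler and warmer launches. The sketch below does this with a 65 °F cut-off, which is an arbitrary threshold chosen here for illustration; failure_rate is an illustrative name. It is a descriptive summary only, and the forecast temperature of 30 °F lies well outside the range of the recorded launches.

```python
# Launch temperature (F) and number of O-ring failures, in the order of the table above
temps = [66, 70, 69, 68, 67, 72, 73, 70, 57, 63, 70, 78,
         67, 53, 67, 75, 70, 81, 76, 79, 75, 76, 58]
failures = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
            0, 3, 0, 0, 0, 0, 0, 0, 2, 0, 1]

def failure_rate(threshold_f, below=True):
    """Fraction of launches with at least one O-ring failure on one side of the threshold."""
    selected = [f for t, f in zip(temps, failures) if (t < threshold_f) == below]
    return sum(1 for f in selected if f > 0) / len(selected)

# Hypothetical 65 F cut-off: compare cooler launches with warmer ones
print("below 65 F:", failure_rate(65, below=True))
print("65 F and above:", failure_rate(65, below=False))
print("coldest recorded launch:", min(temps), "F")
```

A fuller analysis could fit a model of failure probability against temperature and plot it together with the raw data.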

3. Write a summary about the conditions for using the different hypothesis tests that you have learnt
through module 1 of GENG2012 (1 mark).

===========================================================================================================

SLE226 DATA ANALYSIS
5 Sept 2022

Question 1. How dependent are animal heart rates on body weights? (7 Marks)
It has long been speculated that the heart rate of animals is dependent on the body weight
of the animals. You are provided with a dataset consisting of 20 different animals ranging in
size. For each group you also have the heart rate of those animals.
Using the data provided, conduct an analysis that determines the degree of dependence of
the heart rate of animals on their body size.
• What is the null hypothesis of this test?
• What test would you use to test this hypothesis?
  o Why?
• What assumptions did you test (display results)?
• Display the results of this test with the most appropriate graph.
• Provide formulas that would allow anyone to estimate the heart rate (and 95%
  Confidence range) of an animal of a given body weight (they must have all the
  detail in it).
  o What is the estimated heart rate of a 7.5kg animal? Please also provide
    estimates of the upper and lower 95% confidence intervals.

Animal        Heart rate (beats/minute)  Body weight (g)
Human         60                         90000
Cat           150                        2000
Small dog     100                        2000
Medium dog    90                         5000
Large dogs    75                         8000
Hamster       450                        60
Chick         400                        50
Chicken       275                        1500
Monkey        192                        5000
Horse         44                         1200000
Cow           65                         800000
Pig           70                         150000
Rabbit        205                        1000
Elephant      30                         5000000
Giraffe       65                         900000
Large whales  20                         120000000
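One possible starting point is a least-squares line fitted to log-transformed data. The log10 transform of both variables is an assumption made here because the body weights span several orders of magnitude; it is not stated in the question, and the names least_squares and predict_heart_rate are illustrative. The sketch covers only the point estimate, not the assumption checks or confidence limits the question asks for.

```python
import math

# (heart rate in beats/minute, body weight in g), copied from the table above
data = [
    (60, 90000), (150, 2000), (100, 2000), (90, 5000), (75, 8000),
    (450, 60), (400, 50), (275, 1500), (192, 5000), (44, 1200000),
    (65, 800000), (70, 150000), (205, 1000), (30, 5000000),
    (65, 900000), (20, 120000000),
]

# Assumed transform: log10 of both variables
xs = [math.log10(weight) for _, weight in data]
ys = [math.log10(rate) for rate, _ in data]

def least_squares(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = least_squares(xs, ys)

def predict_heart_rate(weight_g):
    """Point estimate of heart rate (beats/minute) from the fitted log-log line."""
    return 10 ** (intercept + slope * math.log10(weight_g))

print(slope, intercept, predict_heart_rate(7500))
```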

Question 2. Is Black rat activity influenced by canopy cover? (5 Marks)
A recent study was undertaken on the Summerland Peninsula of Phillip Island to
assess small mammal distributions and abundance. The Peninsula was chosen as it
supports vast populations of endangered colonial-nesting seabirds and is therefore
of conservation significance.
During the study the introduced black rat (Rattus rattus) was found to be the most
prevalent species. Black rats have been shown to impact endangered seabird
populations, including the Little Penguin (Eudyptula minor) and the Short-tailed
Shearwater (Ardenna tenuirostris) and therefore, an understanding of what is
influencing rodent populations is critical.
Previous studies have found that the percentage of canopy cover has an influence
on rodent activity levels. To determine if this occurs on Phillip Island, 62 camera trap
sites were established and used to determine black rat activity levels
(number of camera events per night), within broad canopy cover treatments (High,
Moderate, Low). You are provided with black rat activity data for each site within the
different canopy cover treatments.
Conduct an analysis which allows you to determine if canopy cover has an influence
on black rat activity levels.
Provide details on:
• What test did you use, and why?
• What is your null hypothesis?
• What assumptions did you test, and why?
• Provide a graph which adequately summarises this data.
Please provide detailed answers and show results from all tests and provide
interpretations on each test.

Canopy cover class  Rat activity  Canopy cover class  Rat activity  Canopy cover class  Rat activity
High 10.83 Moderate 0.75 Low 9.40
High 0.25 Moderate 3.50 Low 2.75
High 9.00 Moderate 20.67 Low 6.56
High 12.00 Moderate 17.80 Low 5.67
High 13.13 Moderate 8.88 Low 0.30
High 8.13 Moderate 3.70 Low 0.50
High 6.50 Moderate 16.20 Low 6.25
High 7.00 Moderate 1.44 Low 4.20
High 13.88 Moderate 12.20 Low 0.38
High 22.60 Moderate 3.29 Low 6.70
High 8.88 Moderate 17.60 Low 7.00
High 4.10 Moderate 8.63 Low 2.13
High 9.50 Moderate 21.60 Low 19.00
High 9.44 Moderate 5.44 Low 13.22
High 23.67 Moderate 14.11 Low 10.33
High 21.00 Moderate 4.50 Low 6.40
High 13.67 Moderate 31.33 Low 19.20
High 9.82 Moderate 17.80 Low 2.55
High 9.27 Moderate 17.80 Low 1.40
High 11.33 Moderate 12.30 Low 0.11
Moderate 12.00 Low 0.13
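
As an illustration for Question 2, the sketch below computes a Kruskal-Wallis H statistic by hand. This is only one candidate test, assumed here as a rank-based option in case the normality or equal-variance checks fail; if those assumptions hold, a one-way ANOVA would be the more usual choice. The value 5.991 is the standard 5% chi-square critical value for 2 degrees of freedom. Where SciPy is available, `scipy.stats.kruskal` performs the same test and also reports a p-value.

```python
from collections import Counter

# Rat activity (camera events per night) by canopy cover class, from the table
activity = {
    "High": [10.83, 0.25, 9.00, 12.00, 13.13, 8.13, 6.50, 7.00, 13.88, 22.60,
             8.88, 4.10, 9.50, 9.44, 23.67, 21.00, 13.67, 9.82, 9.27, 11.33],
    "Moderate": [0.75, 3.50, 20.67, 17.80, 8.88, 3.70, 16.20, 1.44, 12.20,
                 3.29, 17.60, 8.63, 21.60, 5.44, 14.11, 4.50, 31.33, 17.80,
                 17.80, 12.30, 12.00],
    "Low": [9.40, 2.75, 6.56, 5.67, 0.30, 0.50, 6.25, 4.20, 0.38, 6.70, 7.00,
            2.13, 19.00, 13.22, 10.33, 6.40, 19.20, 2.55, 1.40, 0.11, 0.13],
}


def average_ranks(values):
    """Rank values from 1 to N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


pooled = [v for group in activity.values() for v in group]
ranks = average_ranks(pooled)
n_total = len(pooled)

# Sum the ranks that belong to each canopy cover class
rank_sums, start = {}, 0
for name, group in activity.items():
    rank_sums[name] = sum(ranks[start:start + len(group)])
    start += len(group)

h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(group) for name, group in activity.items()
) - 3 * (n_total + 1)

# Standard correction for tied observations
ties = sum(t ** 3 - t for t in Counter(pooled).values())
tie_correction = 1 - ties / (n_total ** 3 - n_total)
h_corrected = h / tie_correction

CHI2_CRIT = 5.991  # 5% critical value, chi-square with 3 - 1 = 2 df
print(f"H = {h_corrected:.3f}, reject H0 at 5%: {h_corrected > CHI2_CRIT}")
```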

Question 3. Diet of the Powerful Owl (Ninox strenua) (4 Marks)
A number of studies have examined the diet of the Powerful Owl in Australia. A
recent Deakin study has examined the diet of the Powerful Owl in the Yarra Valley
corridor. The Powerful Owl appears to be a specialist predator of arboreal
marsupials. From the data set provided, determine if the diet of the Powerful Owl is
similar from season to season. The diet has been determined from regurgitated
pellets (vomit balls). Each pellet contains the remains of only one species.
Therefore, each pellet can be assigned to only one of the different prey items.
Provide:
• A detailed analysis of the data.
• The null hypotheses tested.
• What statistical test did you use, and why?
• If any differences occur, suggest where they are.
• Provide a graph that represents this data.
• Remember to present all answers as scientifically as possible.
Prey item Summer Autumn Winter Spring
Common Ringtail Possum 442 413 411 350
Common Brushtail Possum 118 138 152 208
Sugar Glider 50 48 27 47
Greater Glider 20 31 33 38
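
Because each pellet falls into exactly one prey category and one season, the counts in Question 3 form a contingency table. A minimal sketch of a chi-square test of independence on that table follows; the choice of test is an assumption, and 16.919 is the standard 5% chi-square critical value for 9 degrees of freedom. The Pearson residuals printed at the end are one simple way to suggest where any differences lie. With SciPy installed, `scipy.stats.chi2_contingency` gives the same statistic together with a p-value.

```python
import math

prey = ["Common Ringtail Possum", "Common Brushtail Possum",
        "Sugar Glider", "Greater Glider"]
seasons = ["Summer", "Autumn", "Winter", "Spring"]
observed = [
    [442, 413, 411, 350],
    [118, 138, 152, 208],
    [50, 48, 27, 47],
    [20, 31, 33, 38],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(prey) - 1) * (len(seasons) - 1)
CHI2_CRIT = 16.919  # 5% critical value, chi-square with 9 df

print(f"chi-square = {chi2:.2f} on {dof} df (5% critical value {CHI2_CRIT})")

# Pearson residuals: large absolute values flag the cells driving the result
for name, obs_row, exp_row in zip(prey, observed, expected):
    cells = ", ".join(
        f"{season} {(o - e) / math.sqrt(e):+.2f}"
        for season, o, e in zip(seasons, obs_row, exp_row)
    )
    print(f"{name}: {cells}")
```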

Question 4. Home range size in western ringtail possum. (6 Marks)
Knowing how much space an individual animal requires is an important part of any species
management/conservation strategy. The western ringtail possum (Pseudocheirus
occidentalis) is an endangered species endemic to southwest Western Australia; however,
not much is known about this species. After a few years of radio-tracking 37 possums, a
researcher has estimated their overall home ranges (entire area required by an animal for its
activities). She now wants to find out whether the size of the estimated home range is
influenced by sex and the type of barriers nearby (a road and a drain).
Using the dataset provided, answer the following questions:
• Does sex influence the home range size of the possum?
• Does the type of nearby barrier influence the home range size of the possum?
• Is there any interaction between sex and nearby barrier on the size of home ranges?

Sex Barrier HR_m2 Sex Barrier HR_m2
F Drain 1648 M Drain 5539
F Drain 1335 M Drain 1848
F Drain 2003 M Drain 6375
F Drain 2117 M Drain 7047
F Drain 1323 M Drain 2934
F Drain 971 M Drain 4804
F Drain 3257 M Drain 2425
F Drain 2464 M Road 1199
F Drain 1550 M Road 1279
F Road 1396 M Road 1521
F Road 2755 M Road 1859
F Road 1831 M Road 3162
F Road 713 M Road 4248
F Road 1357 M Road 4143
F Road 1584 M Road 1200
F Road 1054 M Road 2857
F Road 783 M Road 1124
F Road 497 M Road 902
M Road 4767
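
Before fitting any model for Question 4, it can help to look at the four Sex x Barrier cells directly. The sketch below only tabulates cell sizes, means, and standard deviations and computes a simple interaction contrast; it is a descriptive first step, not the two-factor test itself. Note that the design is unbalanced (9, 9, 7, and 12 animals per cell), so a two-way ANOVA on these data would normally be run in statistical software, and the type of sums of squares used should be reported.

```python
from statistics import mean, stdev

# Home range (m^2) for each Sex x Barrier cell, from the table above
home_range_m2 = {
    ("F", "Drain"): [1648, 1335, 2003, 2117, 1323, 971, 3257, 2464, 1550],
    ("F", "Road"): [1396, 2755, 1831, 713, 1357, 1584, 1054, 783, 497],
    ("M", "Drain"): [5539, 1848, 6375, 7047, 2934, 4804, 2425],
    ("M", "Road"): [1199, 1279, 1521, 1859, 3162, 4248, 4143, 1200, 2857,
                    1124, 902, 4767],
}

cell_mean = {cell: mean(values) for cell, values in home_range_m2.items()}

for (sex, barrier), values in home_range_m2.items():
    print(f"{sex} {barrier:<5} n={len(values):2d} "
          f"mean={cell_mean[(sex, barrier)]:7.1f} sd={stdev(values):7.1f}")

# Interaction contrast: does the Drain-vs-Road difference differ between sexes?
# A value near zero would be consistent with no interaction (roughly parallel
# lines on an interaction plot); it is not a significance test.
diff_f = cell_mean[("F", "Drain")] - cell_mean[("F", "Road")]
diff_m = cell_mean[("M", "Drain")] - cell_mean[("M", "Road")]
interaction_contrast = diff_m - diff_f
print(f"Drain - Road: F {diff_f:.1f}, M {diff_m:.1f}, "
      f"contrast {interaction_contrast:.1f}")
```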

Question 5. Can a visual estimation of feather quality determine the level of success
when DNA fingerprinting powerful owls (Ninox strenua)? (4 Marks)
The ability to extract DNA from feather samples has revolutionised population
genetic studies. One of the difficulties, however, is that the DNA in dropped feathers
degrades, which can lead to failed results. Microsatellites are repeat sequences of
genetic code that have proven effective for genetically fingerprinting individuals.
The ability to identify individual birds successfully depends on the number of
microsatellites that work during genetic analysis. The more microsatellites that
work, the better the ability to distinguish individuals.
In a recent study, 13 powerful owl specific microsatellites were developed. It is
extremely expensive to screen feather samples for microsatellites, so the
researchers were looking for a way to determine which feathers had a good chance
of producing a large number of successful microsatellites (the more the better).
The researchers proposed that feathers that appeared to be in good condition (no
obvious environmental damage) would produce better outcomes than feathers which
were slightly poorer in quality (some signs of environmental damage; classed as
moderate quality). If this were the case, they could reduce their costs by only testing
for microsatellites in feathers that were visually of high quality.
You are provided with data from 40 feather samples of powerful owls. The
researchers screened 20 high quality feathers and 20 moderate quality feathers
against 13 microsatellite markers. The data provided represents the number of
microsatellite markers which were successfully amplified during genetic analysis.
Feather quality  Number of successful microsatellites  Feather quality  Number of successful microsatellites
Moderate 2 Good 6
Moderate 3 Good 7
Moderate 3 Good 7
Moderate 4 Good 8
Moderate 4 Good 8
Moderate 4 Good 8
Moderate 4 Good 9
Moderate 5 Good 10
Moderate 7 Good 10
Moderate 7 Good 11
Moderate 8 Good 11
Moderate 8 Good 11
Moderate 8 Good 11
Moderate 9 Good 11
Moderate 9 Good 12
Moderate 9 Good 12
Moderate 10 Good 13
Moderate 10 Good 13
Moderate 11 Good 13
Moderate 12 Good 13

Conduct an analysis which allows you to determine if visually assessing feather
quality has an impact on the number of successful microsatellite amplifications.
Provide details on:
• What test did you use, and why?
• What is your null hypothesis?
• What assumptions did you test, and why?
• Provide a graph which adequately summarises this data.
• Please provide detailed answers and show results from all tests and provide
  interpretations on each test.
Data provided by Dr Fiona Hogan (Federation University)

Fig. 1 Characterizing shed feather condition by visual assessment of degradation. (a)
Good: no visible signs of degradation, transparent calamus, intact barbs on the vane;
(b) Moderate: visible signs of degradation and environmental exposure, yellowing
appearance on the calamus, barbs on the vane separated.
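
For Question 5, the response is a count bounded between 0 and 13 with many tied values, so a rank-based comparison of the two feather classes is one reasonable candidate. The sketch below computes a Mann-Whitney U statistic with a tie-corrected normal approximation; the choice of test is an assumption to be justified by the assumption checks, and no continuity correction is applied. `scipy.stats.mannwhitneyu` offers the same test where SciPy is available.

```python
import math
from collections import Counter

# Number of successfully amplified microsatellites (out of 13) per feather
moderate = [2, 3, 3, 4, 4, 4, 4, 5, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 11, 12]
good = [6, 7, 7, 8, 8, 8, 9, 10, 10, 11, 11, 11, 11, 11, 12, 12, 13, 13, 13, 13]


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


u_good = mann_whitney_u(good, moderate)
u_moderate = mann_whitney_u(moderate, good)

n1, n2 = len(good), len(moderate)
n = n1 + n2

# Normal approximation with the usual variance correction for ties
tie_term = sum(t ** 3 - t for t in Counter(good + moderate).values())
variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
z = (u_good - n1 * n2 / 2) / math.sqrt(variance)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"U(good) = {u_good}, U(moderate) = {u_moderate}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

Because the researchers' proposal is directional (good feathers amplify more markers), a one-sided p-value, half the two-sided value when the difference is in the proposed direction, may also be worth reporting.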
