MAT013 Class Test 2020
Instructions:
- This is an open book test. You are encouraged to use the internet, your notes, etc. However, no communication with any other student is allowed.
- The output of this class test will be: a file containing the SAS program with any comments included and a file containing the R code with any comments included.
Once you have finished the test:
- Call the file for the SAS code STUDENTNUMBER-SAS-lastname (e.g. 123456-SAS-Evans) and call the file for the R code STUDENTNUMBER-R-lastname (e.g. 123456-R-Evans).
- Email both files to Andrey Pepelyshev with ‘MAT013-STUDENTNUMBER-lastname’ as the subject.
- The duration of the class test is one hour.
- The email with the R and SAS code must be sent before 5.05pm on April 21, 2020.
The class test contains 4 questions. Each question contains a few tasks.
Questions for the class test:
Question 1
Please answer this question using R:
- Write code that will obtain \(k\) random points \((x,y)\) where \(x\) is uniformly sampled between 0 and 3, and \(y\) is uniformly sampled between 0 and 2. [5]
- Identify how many of these points satisfy \(1+\sin(x^2+y/49)\leq y\); this number will be referred to as \(N=N(k)\). [5]
- Create a scatter plot with these points satisfying \(1+\sin(x^2+0.1y)\leq y\) for \(k=500\). [5]
- Plot \(\frac{6N(k)}{k}\) for \(100\leq k\leq 5000\) and comment on the result. [10]
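One possible way to approach these tasks in R is sketched below. The helper names (`sample_points`, `in_region`, `count_inside`) and the step of 100 in \(k\) are illustrative assumptions, and the coefficient of \(y\) inside the sine is left as a parameter because the tasks above write it in two different ways. Since the sampling rectangle has area \(3\times 2=6\), the quantity \(6N(k)/k\) is a Monte Carlo estimate of the area of the region, which may help when commenting on the plot.

```r
# Sketch for Question 1. Helper names are hypothetical, not part of the test.
set.seed(1)  # fixed seed so that a run is reproducible

# Draw k points with x ~ U(0, 3) and y ~ U(0, 2).
sample_points <- function(k) {
  data.frame(x = runif(k, min = 0, max = 3),
             y = runif(k, min = 0, max = 2))
}

# TRUE for the points satisfying 1 + sin(x^2 + coef * y) <= y.
# coef is an assumption: the tasks give it as 1/49 and as 0.1.
in_region <- function(pts, coef = 0.1) {
  1 + sin(pts$x^2 + coef * pts$y) <= pts$y
}

# N(k): number of freshly sampled points that fall inside the region.
count_inside <- function(k, coef = 0.1) {
  sum(in_region(sample_points(k), coef))
}

# Points for the scatter plot with k = 500.
pts <- sample_points(500)
inside <- in_region(pts)
n_inside <- sum(inside)

# 6 * N(k) / k on the grid k = 100, 200, ..., 5000.
ks <- seq(100, 5000, by = 100)
estimates <- sapply(ks, function(k) 6 * count_inside(k) / k)

# Plots (uncomment to draw):
# plot(pts$x[inside], pts$y[inside], xlab = "x", ylab = "y")
# plot(ks, estimates, type = "l", xlab = "k", ylab = "6 N(k) / k")
```

Calling `count_inside(k, coef = 1/49)` switches to the other form of the inequality.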
Question 2
Please answer this question using both R and SAS:
- Write code in R and a macro in SAS that will create a separate pdf file of a scatter plot for every data set in the compressed directory scatterdata.zip. All files in the directory contain two columns of numerical data. Use the name of each file as the name of the pdf file. [10 R + 15 SAS]
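For the R part, a minimal sketch could look like the following. It assumes that the files inside scatterdata.zip are comma-separated with no header row, which the test does not state, and the function names `plot_all_to_pdf` and `plot_files_to_pdf` are hypothetical. The SAS macro is not covered here.

```r
# Sketch for Question 2 (R part). Assumes comma-separated files with two
# numeric columns and no header row; adjust read.csv() if the files differ.

# Write one scatter-plot PDF per input file, named after the file.
plot_files_to_pdf <- function(files, out_dir = ".") {
  pdf_paths <- character(0)
  for (f in files) {
    d <- read.csv(f, header = FALSE)
    stem <- tools::file_path_sans_ext(basename(f))  # file name without extension
    pdf_path <- file.path(out_dir, paste0(stem, ".pdf"))
    pdf(pdf_path)                                   # open the PDF device
    plot(d[[1]], d[[2]], xlab = "Column 1", ylab = "Column 2", main = stem)
    dev.off()                                       # close it so the file is written
    pdf_paths <- c(pdf_paths, pdf_path)
  }
  invisible(pdf_paths)
}

# Unpack the archive into a temporary directory and plot every file in it.
plot_all_to_pdf <- function(zip_path, out_dir = ".") {
  data_dir <- file.path(tempdir(), "scatterdata_unzipped")
  unzip(zip_path, exdir = data_dir)
  files <- list.files(data_dir, recursive = TRUE, full.names = TRUE)
  plot_files_to_pdf(files, out_dir)
}

# Usage, with scatterdata.zip in the working directory:
# plot_all_to_pdf("scatterdata.zip")
```

Keeping the plotting loop separate from the unzip step makes it easy to try the loop on a handful of files first.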
Question 3
Please answer this question using both R and SAS:
The data set tournament.csv contains data of the form:
Player 1 Name,Player 2 Name,P1 Score Rep 1,P2 Score Rep 1,P1 Score Rep 2,P2 Score Rep 2,P1 Score Rep 3,P2 Score Rep 3,P1 Score Rep 4,P2 Score Rep 4,P1 Score Rep 5,P2 Score Rep 5
Suspicious Tit For Tat,ALLCorALLD,10,10,32,27,32,27,32,27,32,27
Defector,Win-Stay Lose-Shift,30,5,30,5,30,5,30,5,30,5
This data set corresponds to a tournament between a number of players (the game itself is not important). Every row corresponds to a match between two players in the tournament. The first two columns are the names of the two players. Each match is repeated 5 times, and the subsequent columns give both players' scores for each repetition.
The above example shows that a player named Suspicious Tit For Tat played a player called ALLCorALLD. In their first match they both scored 10; in their second match Suspicious Tit For Tat scored 32 and ALLCorALLD scored 27. This score was then repeated for the next 3 repetitions.
- How many matches are in the data set? How many players are in the data set? [5 R + 5 SAS]
- What is the total score for the first player and second player in each game? [5 R + 5 SAS]
- Obtain a distribution/histogram of total scores for each player over all games. [10 R + 10 SAS]
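For the R part, the tasks above could be sketched as follows. To keep the example self-contained it reads only the two sample rows shown earlier instead of tournament.csv, counts each row as one match, and takes a "game" to be a row (all 5 repetitions summed); these readings are assumptions. The SAS part is not covered here.

```r
# Sketch for Question 3 (R part) on the two sample rows shown above.
# With the real file: read.csv("tournament.csv", check.names = FALSE).
csv_text <- "Player 1 Name,Player 2 Name,P1 Score Rep 1,P2 Score Rep 1,P1 Score Rep 2,P2 Score Rep 2,P1 Score Rep 3,P2 Score Rep 3,P1 Score Rep 4,P2 Score Rep 4,P1 Score Rep 5,P2 Score Rep 5
Suspicious Tit For Tat,ALLCorALLD,10,10,32,27,32,27,32,27,32,27
Defector,Win-Stay Lose-Shift,30,5,30,5,30,5,30,5,30,5"

# check.names = FALSE keeps the column names with spaces unchanged.
tournament <- read.csv(text = csv_text, check.names = FALSE,
                       stringsAsFactors = FALSE)

# Number of matches (one per row) and number of distinct players.
n_matches <- nrow(tournament)
p1_names <- tournament[["Player 1 Name"]]
p2_names <- tournament[["Player 2 Name"]]
n_players <- length(unique(c(p1_names, p2_names)))

# Total score of each player in each game, summed over the 5 repetitions.
p1_cols <- grep("^P1 Score", names(tournament))
p2_cols <- grep("^P2 Score", names(tournament))
tournament$P1Total <- rowSums(tournament[, p1_cols])
tournament$P2Total <- rowSums(tournament[, p2_cols])

# Per-player list of game totals, collected over all games.
scores_by_player <- split(c(tournament$P1Total, tournament$P2Total),
                          c(p1_names, p2_names))

# Histogram for one player (uncomment to draw):
# hist(scores_by_player[["Defector"]], main = "Defector", xlab = "Total score")
```

On the full data set, looping over `names(scores_by_player)` would give one histogram per player.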
Question 4
Please answer this question using R:
- For a time series \(x_1,...,x_n\),
let us define the function
where \(\lambda\in[0,\pi]\) and \(m\) is the mean of \((x_1,...,x_n)\).
The function \(P(\lambda)\) is called a periodogram.
Write a function GetPeriodogram that computes and plots the periodogram and exports the graph to a pdf file with a specified filename.
Apply the function GetPeriodogram to the time series \(x_j=\sin(2\pi j/14)+e_j\), \(j=1,...,100\), where \(e_j\) is Gaussian white noise; that is, \(e_j\) can be computed as a realization of a random variable with Gaussian distribution.
[10]
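A minimal sketch of such a function is given below. The formula for \(P(\lambda)\) is not reproduced in this copy of the test, so the sketch assumes one common form, \(P(\lambda)=\frac{1}{n}\left|\sum_{j=1}^{n}(x_j-m)e^{-i\lambda j}\right|^2\); the normalising constant in particular is an assumption and should be checked against the original definition. The grid size `n_grid`, the output path and the use of standard normal noise are also illustrative choices.

```r
# Sketch for Question 4. ASSUMED definition (not taken from the test):
#   P(lambda) = (1/n) * | sum_j (x_j - m) * exp(-1i * lambda * j) |^2
GetPeriodogram <- function(x, filename, n_grid = 500) {
  n <- length(x)
  m <- mean(x)
  j <- seq_len(n)
  lambda <- seq(0, pi, length.out = n_grid)  # evaluation grid on [0, pi]
  P <- sapply(lambda, function(l) {
    Mod(sum((x - m) * exp(-1i * l * j)))^2 / n
  })
  pdf(filename)                              # export the graph to a PDF file
  plot(lambda, P, type = "l", xlab = expression(lambda),
       ylab = expression(P(lambda)), main = "Periodogram")
  dev.off()
  invisible(data.frame(lambda = lambda, P = P))
}

# Apply it to x_j = sin(2 * pi * j / 14) + e_j, j = 1, ..., 100.
set.seed(1)
j <- 1:100
x <- sin(2 * pi * j / 14) + rnorm(100)  # standard normal noise is an assumption
out_file <- file.path(tempdir(), "periodogram.pdf")
pg <- GetPeriodogram(x, out_file)

# Location of the largest peak; the sine term has frequency 2 * pi / 14.
peak <- pg$lambda[which.max(pg$P)]
```

Under these assumptions the largest peak should sit close to \(\lambda=2\pi/14\), the frequency of the sine component.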