MAT013 Coursework

Deadline: April 28, 2020

Instructions

The outputs of this coursework will be:

  • A written report in doc, pdf or html format describing your code (SAS and R), with commented screenshots, to be submitted to Andrey Pepelyshev by email.
  • A file containing the required SAS code. Name this file STUDENTNUMBER-SAS-lastname (e.g. 123456-SAS-Evans) and email it to Andrey Pepelyshev with MAT013 as the subject. Note that all operations needed to complete the coursework should be included in the SAS code.
  • A file containing the required R code. Name this file STUDENTNUMBER-R-lastname (e.g. 123456-R-Evans) and email it to Andrey Pepelyshev with MAT013 as the subject. Note that all operations needed to complete the coursework should be included in the R code.

Coursework

  1. Using R:

    In number theory, a polite number is a positive integer that can be written as the sum of two or more consecutive positive integers. Other positive integers are impolite.
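
    As a small worked illustration of the definition, 7 and 9 are polite, since

    \[ 7 = 3 + 4, \qquad 9 = 4 + 5 = 2 + 3 + 4, \]

    whereas 8 is impolite: the only runs of two or more consecutive positive integers with sums near 8 are

    \[ 3 + 4 = 7, \qquad 4 + 5 = 9, \qquad 1 + 2 + 3 = 6, \qquad 2 + 3 + 4 = 9, \qquad 1 + 2 + 3 + 4 = 10, \]

    and none of them equals 8.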

    Create a function that will give all impolite numbers less than \(k\) (an input). Demonstrate this function with \(k=70\).

    Furthermore, write another function that also takes as input the name of a file and writes the polite numbers, together with their sum representations, to a csv file with that name. Demonstrate this function with \(k=70\) and the file name “politenumbers.txt”.

    [20]

  2. A perfect number is a natural number that is equal to the sum of its divisors (excluding itself). For example, \(1, 2, 4, 7\) and \(14\) divide \(28\), and \(28=1+2+4+7+14\).
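
    The same check can be carried out by hand for smaller numbers. For instance, 6 is perfect while 12 is not:

    \[ 6 = 1 + 2 + 3, \qquad 1 + 2 + 3 + 4 + 6 = 16 \neq 12. \]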

    Write code in SAS that allows one to write to a csv file a data set with all natural numbers less than a given parameter \(N\) as well as a boolean variable indicating if the number is perfect or not. For example, for \(N=6\) the csv file would contain the following:

     1, False
     2, False
     3, False
     4, False
     5, False
     6, True
    

    [30]

  3. Using R:

    Write a function that will return the \(n\)th Fibonacci number, \(F(n)\).

    Modify the function so that it returns the \(n\)th number of the sequence defined by:

    where \(a,b,\alpha\) and \(\beta\) are input parameters.

    Write another function that writes all numbers of the form \(K(n)\) less than some number \(k\) to a csv file. The name of the csv file must not be an input parameter to the function, but it must include the parameters \(a,b,\alpha\) and \(\beta\) as well as the date on which the code was run. For example: general_fib_a=2_b=3_alpha=10_beta=2_2020-04-24.csv.
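
    One possible way to assemble such a file name in R is sketched below, using base R's paste0 and Sys.Date. The helper name make_fib_filename and its run_date argument are hypothetical and introduced only for illustration; they are not prescribed by the coursework.

```r
# Hypothetical helper: builds the csv file name from the four parameters
# and a date. run_date defaults to the day on which the code is run.
make_fib_filename <- function(a, b, alpha, beta, run_date = Sys.Date()) {
  paste0(
    "general_fib_a=", a,
    "_b=", b,
    "_alpha=", alpha,
    "_beta=", beta,
    "_", format(run_date, "%Y-%m-%d"),
    ".csv"
  )
}

# Passing a fixed date reproduces the example name given in the question.
make_fib_filename(2, 3, 10, 2, run_date = as.Date("2020-04-24"))
```

    Keeping the date as an argument with a default value makes the helper easy to check against a known name while still using the current date in normal use.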

    [10]

  4. Using R:

    Load the dataset Seatbelts from the library datasets in R. The dataset has 192 observations of monthly data on the number of deaths and serious injuries in the UK, for months before and after the seatbelt legislation was introduced. Each row represents a month, and for each month 8 variables are recorded: the number of car drivers killed (DriversKilled), the number of car drivers killed or seriously injured (drivers), the number of deaths and serious injuries in the front seat (front), the number of deaths and serious injuries in the rear seat (rear), distance driven (kms), petrol price (PetrolPrice), the number of van drivers killed (VanKilled) and whether the law was in effect that month (law). Open the help file in R to see more details on the dataset.
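
    A minimal sketch of loading and inspecting the data follows, assuming a standard R installation in which the datasets package is available. The variable name sb is hypothetical. Seatbelts is stored as a multivariate time series, so the sketch converts it to a data frame to make the columns accessible by name.

```r
library(datasets)

# Seatbelts is a multivariate time series; converting it to a data frame
# lets individual columns be accessed with `$`.
sb <- as.data.frame(Seatbelts)

dim(sb)    # number of rows (months) and columns (variables)
names(sb)  # variable names

# The help page can be opened interactively with: ?Seatbelts
```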

    Create a histogram showing the number of drivers who died or were seriously injured in an accident, and histograms for the people in the front and rear seats of the car who died or were seriously injured.

    Compute the average number of people in the front seat who died or were seriously injured, and determine whether it differs between the months in which the law was not in effect and the months in which it was in effect.

    What is the correlation between the variables front and rear?

    [20]

  5. Suppose that we want to compute the integral

    where \(p(x)\) is the density of the Chi-Square distribution with df=3. Consider the sum

    where \(z_i\), \(i=1,\ldots,N\), are independent identically distributed random variables with the standard lognormal distribution. By the Central Limit Theorem (CLT), we have

    Thus, \(S_N\) is an estimator of \(I\).

    Consider the function \(f(x)=\cos(x^4)/(1+|x|)\). Using R, write a function GetSN with argument \(N\) which returns \(S_N\). Write a file containing 40 evaluations of the function GetSN for \(N=5000\). What can you say statistically about \(I\)?

    [20]