MAT013 Class Test 2018

Instructions:

Once you have finished the test:

  1. Name the file containing your SAS code STUDENTNUMBER-SAS-lastname (e.g. 123456-SAS-Evans) and the file containing your R code STUDENTNUMBER-R-lastname (e.g. 123456-R-Evans).
  2. Email both files to Andrey Pepelyshev with ‘MAT013-STUDENTNUMBER-lastname’ as the subject.
  3. The email must be sent before leaving the computer lab.
  4. Show the sent email to the class test invigilator.

The class test contains 3 questions. Each question contains several tasks.

Questions for the class test:

Question 1

Please answer this question using R:

The dataset seatbelts.csv has 192 observations of monthly data on the number of deaths and serious injuries in the UK, covering months before and after the introduction of seatbelt legislation. Each row represents a month, and 8 variables were recorded for each month: the number of car drivers killed (DriversKilled), the number of drivers killed or seriously injured (drivers), the number of deaths and serious injuries in the front seat (front), the number of deaths and serious injuries in the rear seat (rear), the distance driven (kms), the petrol price (PetrolPrice), the number of van drivers killed (VanKilled) and whether the law was in effect that month (law).

  1. Import the dataset seatbelts.csv. Compute the average number of DriversKilled and VanKilled per year. Compute the 95% confidence interval for the number of deaths and serious injuries per month. Please write the computed values as comments in your R program.

    [5]

  2. Create a histogram showing the number of drivers who died or were seriously injured in an accident. Also create two further histograms: (i) for the people in the front of the car who died or were seriously injured, and (ii) for the people in the rear seats of the car who died or were seriously injured. Please describe the shape of the histograms as comments in your R program.

    [7]

  3. Create a scatterplot of the two variables front and rear. Is there a significant correlation between front and rear? Please write your answer as comments in your R program.

    [8]

Question 2

Please answer this question using SAS:

The dataset crabs.csv has 173 observations of horseshoe crabs, with 5 variables measured for each crab. The variables are: (i) color of the shell, with values 1, 2, 3, 4 representing "light medium", "medium", "dark medium" and "dark", respectively; (ii) spin, which describes the condition of the spines, with values 1, 2, 3 representing "both good", "one worn or broken" and "both worn or broken", respectively; (iii) width of the hard shell in cm; (iv) satell, which gives the number of satellite crabs around the crab; (v) weight of the crab in kg.

  1. Import the dataset crabs.csv. Compute the mean and standard deviation for all variables.

    [5]

  2. Plot a histogram for each variable and export the plots to PDF files.

    [7]

  3. Create a scatterplot of the width of the shell against the weight of the crab. Test the significance of the linear relationship between these two variables. Is there a relationship between the width of the shell and the weight of the crab? Please write your answer as comments in your SAS code.

    [8]

  4. Build a regression model of satell on all other variables. Write a short interpretation of the model as comments in your SAS code.

    [10]

Question 3

Download the two datasets fra-15-16.csv and fra-16-17.csv to a folder on your PC. These files contain the football games played in the French Ligue 1 in the 2015/2016 and 2016/2017 seasons, respectively. Values in these files are separated by commas. The meaning of the column names is explained here. You will need to use the columns HomeTeam, AwayTeam, FTHG, FTAG and BbMxH.

Please answer this question using either SAS or R:

  1. Import the two datasets fra-15-16.csv and fra-16-17.csv.
    Create a single table by concatenating these two datasets.
    Use this table for the analysis in all further tasks.

    [10]

  2. Find the team that scored the largest number of goals at home. Please write your answer as comments in your code.

    [7]

  3. Find the team that scored the largest total number of goals, home and away combined. Please write your answer as comments in your code.

    [15]

  4. Find the team that conceded the largest number of goals away. Please write your answer as comments in your code.

    [8]

  5. Consider a betting strategy of placing a 1 pound bet on the home team whenever the odds on the home team (the column BbMxH) are larger than 1.5 and smaller than 2.5. Compute the profit for this strategy. Recall that the balance decreases by 1 pound each time a bet is placed and increases by the BbMxH value (i.e. the odds on the home team) if the game ends in a home win. Please write your answer as comments in your code.

    [10]
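
The balance rule in task 5 can be illustrated on a small made-up table. The sketch below is in R and assumes only the column names FTHG, FTAG and BbMxH from the question. The four rows of data are hypothetical and are not taken from fra-15-16.csv or fra-16-17.csv, and the names games, bets, home_win and profit are illustrative choices.

```r
# Hypothetical toy data (NOT from the real files); one row per game.
games <- data.frame(
  FTHG  = c(2, 0, 1, 3),          # full-time home goals
  FTAG  = c(1, 1, 1, 0),          # full-time away goals
  BbMxH = c(2.0, 1.8, 2.2, 3.0)   # odds on a home win
)

# A bet is placed only when the home odds lie strictly between 1.5 and 2.5.
bets <- games[games$BbMxH > 1.5 & games$BbMxH < 2.5, ]

# The home team wins when it scores strictly more goals than the away team.
home_win <- bets$FTHG > bets$FTAG

# Each bet costs 1 pound; a home win adds the BbMxH value to the balance.
profit <- sum(ifelse(home_win, bets$BbMxH, 0) - 1)
print(profit)
```

On this toy table three bets are placed (odds 2.0, 1.8 and 2.2) and only the first game is a home win, so the balance changes by (2.0 - 1) + (0 - 1) + (0 - 1) = -1 pound. The fourth game is a home win but is skipped because its odds of 3.0 fall outside the range.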