MAT013 Group Coursework

Deadline: April 26, 2019

Instructions

The outputs of this coursework will be:

  • A 20-minute group presentation/demonstration, to be given on May 7, 2019, followed by 10 minutes for discussion. A group that is unavailable on May 7 may arrange another date (say, May 6).
  • All relevant files (code, presentation, notes, websites, demo materials, etc.), to be passed to Andrey Pepelyshev on or before April 26, 2019.

Marking criteria:

  • Difficulty: [30]
  • Accuracy: [30]
  • Originality: [20]
  • Presentation/demonstration: [20]

Coursework

As a group, you are required to present how to solve a particular scientific problem using R. You should use aspects of R that are not covered in the notes. The presentation should be treated as a teaching presentation. Follow this strategy:

  • Choose a scientific problem. Some examples of statistical problems are:
    • Logistic/Binary/Poisson regression
    • Classification/Clustering
    • Discriminant analysis
    • Multidimensional scaling
    • Pattern recognition
    • Time series analysis and forecasting
    You may choose a non-statistical problem.
  • Choose a package for R. Some examples are gbm, xgboost, LiblineaR, cluster, MASS, smacof, superMDS, neuralnet, deepnet, Rssa.
  • Find a dataset to demonstrate how the chosen package can be used to solve the chosen problem. Usually, each package contains references to a few suitable datasets.
  • Your group should post its choice of problem, package and dataset in "Discussion" on Learning Central so that other groups do not make the same choice. Once you have selected a topic, it is advisable to ask Andrey Pepelyshev whether it is suitable.
  • In your group coursework, (i) explain the problem, (ii) explain the relevant technical aspects of the package, and (iii) explain the solution of the problem for the chosen dataset.
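
As a purely illustrative sketch of the three parts in the last step, the R code below uses one possible combination: classification as the problem, linear discriminant analysis via the lda function from the MASS package, and R's built-in iris dataset. This combination is an assumption chosen for brevity, not a recommended or reserved topic. The accuracy it computes is measured on the same data used for fitting, so it is an optimistic estimate.

```r
# Illustrative only: the problem, package and dataset are assumed choices.
library(MASS)  # provides lda(); MASS ships with standard R installations

# (i) The problem: predict the species of an iris flower from four measurements.
data(iris)
str(iris)

# (ii) The package: lda() fits a linear discriminant analysis model.
fit <- lda(Species ~ ., data = iris)
print(fit)

# (iii) The solution for this dataset: predict classes and compare with the truth.
predicted <- predict(fit, newdata = iris)$class
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)

# Proportion of correct predictions on the training data (optimistic estimate).
accuracy <- mean(predicted == iris$Species)
print(accuracy)
```

A fuller treatment would typically hold out part of the data for testing and explain the main arguments of the chosen function.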

You are not restricted to using slides (although you are welcome to use them). Feel free to be imaginative.