MIS41120 Statistical Learning

Academic Year 2019/2020

Broadly speaking, we think of Statistical and Machine Learning as computational methods that use (learn from) experience to improve performance or prediction accuracy. They arose in different research communities but overlap significantly. Statistical Learning focusses more on linear models, for which there is a stronger theoretical foundation, and (to an extent) on inference; Machine Learning focusses more on nonlinear methods, which rest more on experimental evidence, and is more often associated with prediction.
This Statistical Learning course discusses both, and also investigates the foundations of these methods, including how well they work, error estimates and the trade-offs involved. These are the principles underpinning algorithmic learning, that is, the methods used in Knowledge Discovery and Data Mining.
Statistical learning refers to supervised and unsupervised learning, especially regression, classification and clustering, and particularly on structured numerical data. These are the most common techniques used for modelling, with the goals of inference and prediction in business (and elsewhere); hence, their statistical theory is well developed.
This module aims to develop both theory and practice to an expert level.

Curricular information is subject to change

Learning Outcomes:

On completion of the module students should be able to:
● Distinguish between supervised and unsupervised learning and define regression, classification and clustering problems formally
● Describe bias, variance and the bias-variance trade-off
● Describe common loss functions and performance measures
● Define the problem of overfitting and how to overcome it
● Distinguish among common models, from linear regression to artificial neural networks to generalised linear models, and execute them with the help of a software library
● Describe the main ideas of statistical learning theory, including the theory of the VC dimension.

Indicative Module Content:

Topics of the course are drawn from:
● Motivation: goals of prediction and inference/understanding
● Supervised and unsupervised learning: Regression, Classification, Clustering
● Measuring performance: accuracy and interpretability
● Bias, variance and the bias-variance tradeoff
● Generalisation and stability
● Model selection
● Loss functions
● The problem of Overfitting: Regularisation
● Sparse models including the lasso, elastic net and support vector machine
● Generalised linear models
● Artificial neural networks
● Deep nets
● Model capacity, shattering and VC dimension

Student Effort Type | Hours
Lectures | 36
Specified Learning Activities | 40
Autonomous Student Learning | 100
Total | 176

Requirements, Exclusions and Recommendations

Not applicable to this module.


Module Requisites and Incompatibles
Not applicable to this module.  
Assessment Strategy

Description | Timing | Open Book Exam | Component Scale | Must Pass Component | % of Final Grade
Assignment: Review and critically analyse an academic paper in the field | Week 4 | n/a | Graded | No | 25
Examination: Main Examination, held online at end of trimester | Coursework (End of Trimester) | Yes | Standard conversion grade scale 40% | No | 50
Group Project: Project work on data analysis | Week 8 | n/a | Graded | No | 25


Carry forward of passed components: Yes

Resit In | Terminal Exam
Autumn | Yes - 2 Hour

Feedback Strategy/Strategies

• Feedback individually to students, post-assessment
• Group/class feedback, post-assessment

How will my Feedback be Delivered?

Feedback on strengths and weaknesses of assignment submission

Name | Role
Professor Javier Faulin | Subject Extern Examiner
Lecture Offering 51 Week(s) - 38, 39, 40, 41, 42, 43, 44, 45, 46 Fri 10:00 - 11:50
Lecture Offering 51 Week(s) - 38, 40, 41, 42, 43, 44, 45, 46, 47 Mon 10:00 - 11:50