SC4/SM4 Data Mining and Machine Learning
Term: Hilary Term 2017, Jan 16 - Mar 10
Lectures: HT weeks 1-8: Tue 2pm, Thu 12pm, LG.01
MSc Classes: HT weeks 3,5,7,8: Mon 11am, LG.01
MSc Practicals: HT week 5: Fri 2-4pm; HT week 8: Fri 2-4pm (group assessed)
Part C Class Tutors: Jovana Mitrovic and Leonard Hasenclever
Part C Classes: HT weeks 3,5,7,8: Wed 2:30-4pm, Wed 5-6:30pm, Fri 4:30-6pm
Part C Problem Sheet Deadlines: HT weeks 3,5,7,8: Mon 10am
Part C Revision Classes: TT week 3: Thu 11am; TT week 4: Thu 4pm, LG.01

Course Materials

The course materials will consist of slides, summary notes and Jupyter notebooks. Summary notes are not exhaustive and should be used in conjunction with the slides. All materials are frequently updated and are thus best read on screen. Please email me any typos or corrections.

Revision

Much of the material was taught in two earlier courses, Statistical Data Mining and Statistical Data Mining and Machine Learning, so past exam questions from those courses remain relevant.

  • MSc Paper (II) questions on Statistical Data Mining/Statistical Data Mining and Machine Learning.
  • Part C questions for many past years – Paper SC4 (which was called MS1b up to 2014).
  • Some specific questions (to be covered in Part C revision classes in TT weeks 3,4):
    • Part C 2016 Q3, Q2 (without (a-ii))
    • Part C 2015 Q2 (d), Q3 (c)
    • Part C 2014 Q1 (a,c,d), Q3 (c)
    • MSc 2016 Q6

MSc Practicals

Textbooks and Background Reading

  • Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, Springer (ebook).
  • Bishop, Pattern Recognition and Machine Learning, Springer.
  • Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
  • Shalev-Shwartz and Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press.

Background Review Aids:

Software

R

Python