Lectures

The schedule below is tentative and subject to change, depending on time and class interests. We will move at a pace dictated by class discussions. Please check this page often for updates.

Week  Date           Content
1     9/24 – 9/26    Intro to Machine Learning; Nearest Neighbours; Bias-Variance Trade-Off
2     10/1 – 10/3    Cross Validation
3     10/8 – 10/10   Decision Trees; Bagging and Random Forests; Boosting and Boosted Additive Models
4     10/15 – 10/17  Categorical Outcomes and Classification Models
5     10/22 – 10/24  Logistic regression and Intro to Neural networks
6     10/29 – 10/31  Neural Networks
7     11/5 – 11/7    Recommender Systems
8     11/12 – 11/14  Networks
9     11/19 – 11/21  Naive Bayes; Probabilistic Graphical Models
10    12/3 – 12/5    Hidden Markov Models (If time permits: anomaly detection in time series)
11    12/11          Final Project due

Weeks 1-2

Lecture Slides:
Overview
Introduction to Predictive Models and kNN

R code:
docv.R
bias-variance-illustration.R
BostonHousing_KNN_BiasVarTradeOff_CrossValid.Rmd

Python code:
BostonHousing_KNN_BiasVarTradeOff_CrossValid.ipynb

Homework assignments:

Homework 01

Homework 02

Optional textbook reading:
An Introduction to Statistical Learning: Section 2, Section 5.1, Section 8.1

Additional reading:

Machine Learning: Trends, Perspectives, and Prospects
M. I. Jordan and T. M. Mitchell
A Science review article from two leading experts in Machine Learning

Week 3

Lecture Slides:
Trees, Bagging, Random Forests and Boosting

R code:
knn-bagging.R
boosting_demo_1D.R
boosting_demo_2D.R
BostonHousing_Trees_RandomForests_BoostedAdditiveModels.Rmd

Python code:
BostonHousing_Trees_RandomForests_BoostedAdditiveModels.ipynb

Homework assignment: See Problem 9.1 in the lecture notes for this week.

Optional textbook reading: An Introduction to Statistical Learning: Chapter 8

Week 4

Lecture Slides:
Classification
Perceptron
Perceptron -- R Markdown Script to recreate slides

R code:
fglass.R
04_kaggle_logit_rf_boost.R
04_simulation_logit_rf_boost.R
04_tabloid_logit_rf_boost.R
KaggleCreditScoring_usingCaretPackage.Rmd
TabloidMarketing_usingCaretPackage.Rmd

Python code:
KaggleCreditScoring.ipynb
TabloidMarketing.ipynb

Homework assignment:
04_hw.pdf
Start early.

Optional textbook reading: An Introduction to Statistical Learning: Chapter 4 (we will not talk about linear discriminant analysis)

Pedro Domingos: A Few Useful Things to Know about Machine Learning PDF

D. Sculley et al.: Machine Learning: The High Interest Credit Card of Technical Debt PDF

Week 5

Lecture Slides:
Logistic regression
RMarkdown -- Logistic regression

R code:
lr_decision_surface.R
we8there.R

Optional textbook reading: An Introduction to Statistical Learning: Chapter 4, Section 6.2

Midterm Exam

Midterm Exam Questions

Week 6

Lecture Slides:
Neural networks
MNIST example

ALVINN video

R code:

See our GitHub.
We suggest that you clone the folder "Lecture06" or download all of its contents, since the folder contains some pretrained models that may take a long time to train again.

To install the h2o package, go to http://h2o-release.s3.amazonaws.com/h2o/master/3232/index.html, click on "INSTALL IN R", and follow the instructions.

Alternatively, you can type the following in R:

source("https://raw.githubusercontent.com/ChicagoBoothML/HelpR/master/booth.ml.packages.R")

MNISTDigits_NeuralNet.Rmd

Python code:
MNISTDigits_NeuralNet_KerasPackage.ipynb

Homework assignment:

06_hw.pdf
ParseData.R

To load data use:

source("ParseData.R")

data <- parse_human_activity_recog_data()

Due Sunday, November 8.

Optional textbook reading: The Elements of Statistical Learning: Sections 11.3 - 11.5

Some h2o resources:

h2o package
Deep Learning
GLM

Week 7

Lecture Slides:
Recommender Systems

R code:

simpleScript.R: This is a toy example illustrating how to compute similarities between users, recommend items, and predict ratings.
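
For readers following the Python materials, the three steps named above (user similarities, rating prediction, item recommendation) can be sketched roughly as follows. This is a minimal illustration and not a translation of simpleScript.R: the toy ratings, the function names, and the choice of cosine similarity over co-rated items are all assumptions made for the example.

```python
import math

# Hypothetical toy ratings: user -> {item: rating}. Missing items are unrated.
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other in ratings:
        if other != user and item in ratings[other]:
            sim = cosine_similarity(user, other)
            num += sim * ratings[other][item]
            den += abs(sim)
    # None signals that no other user with nonzero similarity rated the item.
    return num / den if den > 0 else None


def recommend(user, n=1):
    """Rank the items `user` has not rated by predicted rating, best first."""
    all_items = {i for r in ratings.values() for i in r}
    unseen = all_items - set(ratings[user])
    scored = [(item, predict_rating(user, item)) for item in unseen]
    scored = [(item, p) for item, p in scored if p is not None]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:n]
```

Calling `recommend("ann")` ranks the one item ann has not rated, using a prediction that leans toward bob's rating because bob's ratings are more similar to ann's than cat's are. Practical recommenders usually also mean-centre ratings per user and restrict the average to the k nearest neighbours; both are omitted here for brevity.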

MovieLens_MovieRecommendation.Rmd

In this lecture, we will be using the recommenderlab package.

recommenderlab: Reference manual
recommenderlab: Vignette

Python code:
MovieLens_LatentFactorRec.ipynb

Homework assignment:

Assignment
Data: videoGames.json.gz
Starter script: starterScript.R

Optional reading:

Amazon.com Recommendations
Cold Start Problem
Matrix Factorization Techniques For Recommender Systems
All Together Now: A Perspective on the Netflix Prize

Week 8

Lecture Slides:
Networks
Ego nets Slides from KDD tutorial on Graph-Based User Behaviour Modeling

R code:

See our GitHub.

Homework assignment:
Assignment
Data: wikipedia.gml
Starter script: starterScript.R

Optional reading:

See Chapters 3 and 4 of "Statistical Analysis of Network Data with R" (PDF available through UChicago library)

Weeks 9-10

Lecture Slides:
Probabilistic Graphical Models
Example PGM
Hidden Markov Models

R code:

NB_reviews.R
The Large Movie Review Dataset can be downloaded from here.
Direct link to data: aclImdb_v1.tar.gz

Homework assignment:
Assignment
Data: emails.csv
Starter script: starterScript.R

Optional reading:

Andrew Moore's basic probability tutorial
Rabiner's Detailed HMMs Tutorial
Text mining package
Graphical Models with R
HMM Tutorial
Animated HMM Tutorial