Inference and Representation (DS-GA-1005, CSCI-GA.2569)

Course staff:

Role | Name | E-mail (@cs.nyu.edu)
Instructor | David Sontag | dsontag
Instructor | Joan Bruna | bruna
TA | Rahul Krishnan | rahul
Grader | Aahlad Manas | apm470 {@/at} nyu.edu
Grader | Alex Nowak | anv273 {@/at} nyu.edu

Syllabus

This graduate-level course presents fundamental tools of probabilistic graphical models, with an emphasis on designing and manipulating generative models and on performing inference with them on various types of data.

We will study latent-variable graphical models (latent Dirichlet allocation, factor analysis, Gaussian processes), state-space models for time series (Kalman filter, HMMs, ARMA), Gibbs models, deep generative models (variational autoencoders, GANs), and causal inference, covering both the methods (exact and approximate inference, sampling algorithms, exponential families) and modeling applications to text, images, and medical data.

Lecture Location

Mondays, 5:10-7:00pm, in Warren Weaver Hall 1302

Recitation/Laboratory (required for all students)

Wednesdays, 7:10-8:00pm in Meyer Hall of Physics 121

Office hours

DS: Mondays, 10:00-11:00am. Location: 715 Broadway, 12th floor, room 1204.

JB: Thursdays, 4:00-5:00pm. Location: 60 5th Ave, 6th floor, room 612.

Grading

Problem sets (45%) + midterm exam (25%) + final project (25%) + participation (5%).

Piazza

We will use Piazza to answer questions and post announcements about the course. Please sign up here. Students' use of Piazza, particularly for adequately answering other students' questions, will contribute toward their participation grade.

Online recordings

Videos of most lectures and labs will be posted to NYU Classes. Note, however, that class attendance is required.

Schedule

Week | Lecture Date | Topic | Reference | Deliverables
2 | 9/12 | Lec1 Intro and Logistics. Bayesian Networks. Slides | Murphy Chapter 1 (optional; review for most); Notes on Bayesian networks (Sec. 2.1); Algorithm for d-separation (optional) | PS1, due 9/19
3 | 9/19 | Lec2 Undirected Graphical Models. Markov Random Fields. Ising Model. Applications to Statistical Physics. Slides | Notes on MRFs (Sec. 2.2-2.4); Notes on exponential families | PS2 [data], due 9/26
4 | 9/26 | Lec3 Introduction to Inference. Variable elimination. Slides | Murphy Sec. 20.3; Notes on variable elimination (optional) | PS3 [data], due 10/3
5 | 10/3 | Lec4 Modeling Text Data. Topic Models. Latent Dirichlet Allocation. Gibbs sampling. Slides | Barber 27.1-27.3.1; Murphy Sec. 24.1-24.2.4; Introduction to Probabilistic Topic Models; Explore topic models of: politics over time, state-of-the-union addresses, Wikipedia | PS4, due 10/17; Project Proposal, due 10/24
6 | 10/10 | No lecture (there is lab). | |
7 | 10/17 | Lec5 Modeling Survey Data. Factor Analysis. PCA. ICA. Slides | Elements of Statistical Learning, Ch. 14; Finding Structure in Randomness (...), Halko, Martinsson, Tropp | PS5, due 10/24
8 | 10/24 | Lec6 Clustering. EM. Markov Chain Monte Carlo (MCMC). Slides | MIT Lecture 18 Notes; Elements of Statistical Learning 14.5 and 8.5; Hamiltonian Monte Carlo (optional) |
9 | 10/31 | Midterm Exam | |
10 | 11/7 | Lec7 Variational Inference. Revisiting EM. Mean Field. Slides | Graphical Models, Exponential Families and Variational Inference, Chapter 3; Variational Inference with Stochastic Search |
11 | 11/14 | Lec8 Modeling Time Series Data. Spatial and Spectral models. GPs, ARMA, HMMs, RNNs. Slides | Shumway & Stoffer, Chapters 2, 3 and 6; Lecture notes Stat 153 JB | PS6, due 11/21
12 | 11/21 | Lec9 Modeling Structured Outputs and Conditional Random Fields (CRFs). Exponential families, moment matching, pseudo-likelihood. Slides. No lab this week | Murphy, Secs. 19.5 & 19.6; Notes on pseudo-likelihood; An Introduction to Conditional Random Fields (section 4; optional); Approximate maximum entropy learning in MRFs (optional) |
13 | 11/28 | Lec10 Structured SVMs and (deep) structured prediction. Dual decomposition for MAP inference. Slides | Murphy Secs. 19.7 and 22.6; Tutorial on Structured Prediction (optional); Original paper introducing structured perceptron (optional); Cutting-Plane Training of Structural SVMs (optional); Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional); Fully Connected Deep Structured Networks (optional); Introduction to Dual Decomposition for Inference (optional) | PS7, due 12/5
14 | 12/5 | Lec11 Causal Inference. Slides | Murphy Sec. 26.6 (learning causal DAGs); Hill & Gelman Ch. 9 (pp. 167-175, 181-188); ICML 2016 tutorial (optional); Jonas Peters causality book (optional) |
15 | 12/12 | Lec12 Modeling Images and high-dimensional data. Boltzmann Machines. Autoencoders. Variational Autoencoders. Slides | References in slides |
15 | 12/13 | Lec13 Modeling Images and high-dimensional data (contd). Deep Auto-regressive Models. Generative Adversarial Networks (GANs). Slides | References in slides | Project writeup, due 12/16
16 | 12/19 | Final Day Poster Presentations of Final Projects. Location: Center for Data Science, 60 5th Ave, in the 7th floor open space | |

Bibliography

There is no required book. Assigned readings will come from freely-available online material.

Core Materials

Background on Probability and Optimization

Further Reading

Academic Honesty

We expect you to try solving each problem set on your own. However, if you get stuck on a problem, we encourage you to collaborate with other students in the class, subject to the following rules:

Late submission policy

During the semester you are allowed at most two extensions on homework assignments. Each extension is for at most 48 hours and carries a penalty of 25% off the grade for that assignment.