Inference and Representation (DS-GA 1005, CSCI-GA.2569)
Course staff:
Role        Name            Email
Instructor  David Sontag    dsontag (@cs.nyu.edu)
Instructor  Joan Bruna      bruna (@cs.nyu.edu)
TA          Rahul Krishnan  rahul (@cs.nyu.edu)
Grader      Aahlad Manas    apm470 {@/at} nyu.edu
Grader      Alex Nowak      anv273 {@/at} nyu.edu
Syllabus
This graduate-level course presents the fundamental tools of probabilistic graphical models, with an emphasis on designing and manipulating generative models and on performing inference when these models are applied to various types of data.
We will study latent variable graphical models (Latent Dirichlet Allocation, factor analysis, Gaussian processes), state-space models for time series (Kalman filters, HMMs, ARMA), Gibbs models, deep generative models (variational autoencoders, GANs), and causal inference, covering both the methods (exact/approximate inference, sampling algorithms, exponential families) and modeling applications to text, images, and medical data.
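To give a flavor of the machinery the course builds up, here is a minimal sketch (with hypothetical numbers, not course code) of a three-variable Bayesian network whose joint distribution factors as P(a) P(b|a) P(c|b), with a marginal obtained by summing out the middle variable:

```python
# Minimal sketch: a chain-structured Bayesian network a -> b -> c.
# All probability tables below are made-up illustrative numbers.
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: {True: 0.8, False: 0.2},
               False: {True: 0.1, False: 0.9}}
p_c_given_b = {True: {True: 0.9, False: 0.1},
               False: {True: 0.2, False: 0.8}}

def joint(a, b, c):
    """Joint probability from the factorization P(a) * P(b|a) * P(c|b)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Marginalize out b: P(a=True, c=True) = sum_b P(a=True, b, c=True)
p_ac = sum(joint(True, b, True) for b in (True, False))
print(p_ac)  # 0.216 + 0.012 = 0.228
```

Variable elimination (Lecture 3) generalizes exactly this sum-out operation to larger graphs.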
Lecture Location
Monday, 5:10-7:00pm, in Warren Weaver Hall 1302
Recitation/Laboratory (required for all students)
Wednesdays, 7:10-8:00pm, in Meyer Hall of Physics 121
Office hours
DS: Mondays, 10:00-11:00am. Location: 715 Broadway, 12th floor, room 1204.
JB: Thursdays, 4:00-5:00pm. Location: 60 5th Ave, 6th floor, room 612.
Grading
Problem sets (45%) + midterm exam (25%) + final project (25%) + participation (5%).
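For concreteness, the weighted final grade works out as follows (a hypothetical helper, assuming each component is scored on a 0-100 scale):

```python
def final_grade(psets, midterm, project, participation):
    """Weighted course grade per the syllabus breakdown:
    problem sets 45%, midterm 25%, project 25%, participation 5%."""
    return (0.45 * psets + 0.25 * midterm
            + 0.25 * project + 0.05 * participation)

# Example: strong homework, weaker midterm.
print(final_grade(90, 80, 85, 100))  # 86.75
```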
Piazza
We will use Piazza to answer questions and post announcements about the course. Please sign up here. Students' use of Piazza, particularly for adequately answering other students' questions, will contribute toward their participation grade.
Online recordings
Videos of most lectures and labs will be posted to NYU Classes. Note, however, that class attendance is required.
Schedule
Week  Lecture Date  Topic  Reference  Deliverables 

2  9/12  Lec 1: Intro and Logistics. Bayesian Networks. Slides
   Readings: Murphy Chapter 1 (optional; review for most); Notes on Bayesian networks (Sec. 2.1); Algorithm for d-separation (optional)
   Deliverables: PS1, due 9/19
3  9/19  Lec 2: Undirected Graphical Models. Markov Random Fields. Ising Model. Applications to Statistical Physics. Slides
   Readings: Notes on MRFs (Sec. 2.2-2.4); Notes on exponential families
   Deliverables: PS2 [data], due 9/26
4  9/26  Lec 3: Introduction to Inference. Variable Elimination. Slides
   Readings: Murphy Sec. 20.3; Notes on variable elimination (optional)
   Deliverables: PS3 [data], due 10/3
5  10/3  Lec 4: Modeling Text Data. Topic Models. Latent Dirichlet Allocation. Gibbs Sampling. Slides
   Readings: Barber Secs. 27.1-27.3.1; Murphy Secs. 24.1-24.2.4; Introduction to Probabilistic Topic Models; explore topic models of: politics over time, state-of-the-union addresses, Wikipedia
   Deliverables: PS4, due 10/17; Project Proposal, due 10/24
6  10/10  No lecture (there is lab).  
7  10/17  Lec 5: Modeling Survey Data. Factor Analysis. PCA. ICA. Slides
   Readings: Elements of Statistical Learning, Ch. 14; Finding Structure in Randomness (...), Halko, Martinsson, Tropp
   Deliverables: PS5, due 10/24
8  10/24  Lec 6: Clustering. EM. Markov Chain Monte Carlo (MCMC). Slides
   Readings: MIT Lecture 18 Notes; Elements of Statistical Learning, Secs. 14.5 and 8.5; Hamiltonian Monte Carlo (optional)
9  10/31  Midterm Exam  
10  11/7  Lec 7: Variational Inference. Revisiting EM. Mean Field. Slides
   Readings: Graphical Models, Exponential Families and Variational Inference, Chapter 3; Variational Inference with Stochastic Search
11  11/14  Lec 8: Modeling Time Series Data. Spatial and Spectral Models. GPs, ARMA, HMMs, RNNs. Slides
   Readings: Shumway & Stoffer, Chapters 2, 3, and 6; lecture notes from Stat 153 (JB)
   Deliverables: PS6, due 11/21
12  11/21  Lec 9: Modeling Structured Outputs and Conditional Random Fields (CRFs). Exponential Families, Moment Matching, Pseudolikelihood. Slides (No lab this week.)
   Readings: Murphy Secs. 19.5 and 19.6; Notes on pseudolikelihood; An Introduction to Conditional Random Fields (Section 4; optional); Approximate maximum entropy learning in MRFs (optional)
13  11/28  Lec 10: Structured SVMs and (Deep) Structured Prediction. Dual Decomposition for MAP Inference. Slides
   Readings: Murphy Secs. 19.7 and 22.6; Tutorial on Structured Prediction (optional); original paper introducing the structured perceptron (optional); Cutting-Plane Training of Structural SVMs (optional); Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional); Fully Connected Deep Structured Networks (optional); Introduction to Dual Decomposition for Inference (optional)
   Deliverables: PS7, due 12/5
14  12/5  Lec 11: Causal Inference. Slides
   Readings: Murphy Sec. 26.6 (learning causal DAGs); Hill & Gelman, Ch. 9 (pp. 167-175, 181-188); ICML 2016 tutorial (optional); Jonas Peters's causality book (optional)
15  12/12  Lec 12: Modeling Images and High-Dimensional Data. Boltzmann Machines. Autoencoders. Variational Autoencoders. Slides
   Readings: references in slides
    12/13  Lec 13: Modeling Images and High-Dimensional Data (cont'd). Deep Autoregressive Models. Generative Adversarial Networks (GANs). Slides
   Readings: references in slides
   Deliverables: Project writeup, due 12/16
16  12/19  Final Day: Poster Presentations of Final Projects. Location: Center for Data Science, 60 5th Ave, 7th-floor open space
Bibliography
There is no required book. Assigned readings will come from freely available online material.
Core Materials
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012. You can read this online for free through NYU Libraries. We recommend the latest (4th) printing, as earlier printings had many typos. You can tell which printing you have as follows: check the inside cover, below the "Library of Congress" information. If it says "10 9 8 ... 4", you have the (correct) fourth printing.
- Daphne Koller and Nir Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
- Mike Jordan's notes on Probabilistic Graphical Models
- MIT lecture notes on algorithms for inference
- Probabilistic Programming and Bayesian Methods for Hackers, by Cam Davidson-Pilon
- Trevor Hastie, Rob Tibshirani, and Jerry Friedman, The Elements of Statistical Learning, Second Edition, Springer, 2009. (Can be downloaded as a PDF.)
- David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012. (Can be downloaded as a PDF.)
Background on Probability and Optimization
- Review notes from Stanford's machine learning class
- Sam Roweis's probability review
- Convex Optimization, by Stephen Boyd and Lieven Vandenberghe
Further Reading
- Mike Jordan and Martin Wainwright, Graphical Models, Exponential Families, and Variational Inference
- Shumway and Stoffer, Time Series Analysis and Its Applications: With R Examples
Academic Honesty
We expect you to try solving each problem set on your own. However, if you get stuck on a problem, we encourage you to collaborate with other students in the class, subject to the following rules:
- You may discuss a problem with any student in this class and work together on solving it. This can involve brainstorming, verbally discussing the problem, and going through possible solutions together, but it should not involve one student telling another a complete solution.
- Once you solve the homework, you must write up your solutions on your own, without looking at other people's writeups or giving your writeup to others.
- In your solution for each problem, you must write down the names of every person with whom you discussed it. This will not affect your grade.
- Do not consult solution manuals or other people's solutions from similar courses.
Late submission policy
During the semester you are allowed at most two extensions on homework assignments. Each extension is for at most 48 hours and carries a penalty of 25% off your assignment score.
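In other words, a late submission within an extension window keeps 75% of its score. A hypothetical helper illustrating the rule (treating work submitted outside the policy limits as receiving no credit, which is our reading of the policy rather than a stated rule):

```python
def late_score(raw_score, extensions_used, hours_late):
    """Apply the late policy: at most two extensions per semester,
    each covering up to 48 hours, each costing 25% of the score."""
    if hours_late == 0:
        return raw_score  # on time: no penalty
    if extensions_used >= 2 or hours_late > 48:
        return 0.0  # assumption: no credit outside the policy limits
    return 0.75 * raw_score  # late within an extension: 25% penalty

print(late_score(80, 0, 24))  # 60.0
```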