COS597C: Advanced Methods in Probabilistic Modeling
Fall, 2011
M/W, 11:00AM - 12:20PM
Friend 004
David M. Blei
Piazza Site (discussion and announcements)
Description
We will study some advanced methods in probabilistic modeling that
are central to modern machine learning and statistics. We will focus
on four subjects:
- posterior inference with variational methods
- hierarchical modeling for grouped data
- model selection, specification, and checking
- Bayesian nonparametric modeling
We will emphasize algorithms and applications as well as the
theoretical underpinnings of these subjects.
Prerequisites and requirements
This course is appropriate for students who have taken COS513,
"Foundations of Probabilistic Modeling," or who are familiar with
the material from that course. Contact David Blei if you are unsure
whether this is the right course for you.
The course will consist of lectures and "practical" lectures.
During practical lectures, we will implement algorithms and explore
their properties as a class. (We will learn and use R.)
The requirements are
- Brief reading response papers (less than one page)
- A midterm report and project proposal due Friday October 28
- A final project due Monday January 16 (latex template)
- Class attendance and participation
Reading assignments
- For the week of 9/26, read one of
- Blei, D.
Graphical models and approximate posterior inference, 2004.
- Jordan, M., Z. Ghahramani, T. Jaakkola, and L. Saul,
An introduction to variational methods for graphical
models. Machine Learning, 37:183–233, 1999.
- Braun, M. and J. McAuliffe,
Variational inference for large-scale models of discrete
choice. Journal of the American Statistical Association,
105:324–335, 2010.
- For the week of 10/3, read one of
- A paper you didn't read for the week of 9/26.
- Beal, M. and Ghahramani, Z. The variational Bayesian EM
algorithm for incomplete data: with application to scoring
graphical model structures. Bayesian Statistics 7,
2002.
- Minka, T. Chapter 3:
Expectation propagation. In A family of algorithms for
approximate Bayesian inference, PhD Dissertation, 2001.
- Wainwright, M. and Jordan, M. Chapter 3: Graphical
models as exponential families. In Graphical models,
exponential families, and variational inference. Foundations
and Trends in Machine Learning, 2008.
- Wainwright, M. and Jordan, M. Chapter 5: Mean field
methods. In Graphical models, exponential families, and
variational inference. Foundations and Trends in Machine
Learning, 2008.
- For the week of 10/10, read
- Gelman, A., Carlin, J., Stern, H., and Rubin, D.
"Chapter 5: Hierarchical Models." In Bayesian
Data Analysis, Chapman and Hall, 2004.
- For the week of 10/17, read one of
- Gelman, A. and Hill, J. "Chapter 11: Multi-level
Structures." In Data Analysis Using Regression and
Multilevel/Hierarchical Models, Cambridge University Press, 2007.
- Blei, D., Ng, A., and Jordan, M. Latent
Dirichlet Allocation. Journal of Machine
Learning Research, 2003. (Skip the appendix.)
- For the week of 10/24, read:
- For the week of 11/7, read one of:
- The latent Dirichlet allocation paper from the week of 10/17.
- Pritchard, J., Stephens, M., and Donnelly, P. Inference of
Population Structure Using Multilocus Genotype Data.
Genetics, 155:945–959, 2000.
- Airoldi, E., Blei, D., Fienberg, S., Xing, E. Mixed Membership
Stochastic Blockmodels. Journal of Machine Learning
Research, 9:1981–2014, 2008.
- Blei, D. and Lafferty, J. A
Correlated Topic Model of Science. Annals
of Applied Statistics, 1(1):17–35, 2007.
- For the week of 11/14, read a paper you didn't read from
last week.
- For each of the weeks of 11/28 and 12/5, read one of
- S. Gershman and D. Blei. A tutorial on Bayesian
nonparametric models. Journal of Mathematical
Psychology, to appear.
- R. Neal. Markov chain
sampling methods for Dirichlet process mixture
models. Journal of Computational and Graphical
Statistics, 9(2):249–265, 2000.
- D. Blei and M. Jordan. Variational inference for
Dirichlet process mixtures. Bayesian
Analysis, 1(1):121–144, 2006.
- J. Sethuraman. A
constructive definition of Dirichlet priors. 
Statistica Sinica, 4:639–650, 1994.
- Y. Teh and M. Jordan. Hierarchical Bayesian
nonparametric models with applications. In Bayesian
Nonparametrics, Cambridge University Press, 2010.
- T. Griffiths and Z. Ghahramani. The Indian buffet process: An
introduction and review. Journal of Machine Learning
Research, 12:1185–1224, 2011.
- S. Goldwater, T. Griffiths, and M. Johnson. Producing power-law
distributions and damping word frequencies with two-stage
language models. Journal of Machine Learning Research,
12:2335–2382, 2011.
- For the final reading report, read one or more of the following.
- Gelman, A., Meng, X., and Stern, H. Posterior predictive
assessment of model fitness via realized
discrepancies. Statistica
Sinica, 6:733–807, 1996.
- Box, G. Sampling and Bayes' inference in
scientific modelling and robustness. Journal of the
Royal Statistical Society, Series A
(General), 143(4):383–430, 1980.
- Rubin, D. Bayesianly
justifiable and relevant frequency calculations for the applied
statistician. The Annals of Statistics,
4:1151–1172, 1984.
- Gelman, A. and Shalizi, C. Philosophy and the practice of
Bayesian statistics. 2010.
- Blei, D. Stochastic
variational inference. 2011.
Syllabus
Introduction and review
- 9/19:
Introduction, course overview, course requirements
- 9/21:
Review (graphical models, posterior inference, computation)
Variational inference
Hierarchical modeling
- [PDF of notes]
- 10/17: Introduction to hierarchical modeling
- 10/19, 10/24: Hierarchical generalized linear models
- 10/26: James-Stein estimation and empirical Bayes
Mixed-membership models
- [PDF of notes]
- 11/7: Introduction to mixed-membership models
- 11/9: Probabilistic topic models
- 11/14: Gibbs sampling in topic models (Guest lecturer: David Mimno)
- 11/16: Variational inference in topic models
Bayesian nonparametrics
- [PDF of notes]
- 11/28: Chinese restaurant processes
- 11/30: Chinese restaurant process mixtures and Gibbs sampling
- 12/5: Gibbs sampling (cont.) and demonstration
- 12/7: Dirichlet processes and random measures
- 12/12: Stick-breaking constructions; hierarchical Dirichlet processes
Scalable inference
- 12/14: Stochastic variational inference
Model assessment
R code