Princeton University
Computer Science 511
Numbers in brackets under "Core reading" refer to chapters or sections of Kearns & Vazirani, available from the library's electronic reserves or on Blackboard. Note that this schedule is continually updated as the semester progresses and is therefore subject to change.
| # | Date | Topic | Core reading | Other (optional) readings and links |
|---|------|-------|--------------|--------------------------------------|
| 1 | M 2/4 | General introduction to machine learning; consistency model | scribe notes | |
| 2 | W 2/6 | Consistency model; review of probability; begin PAC model | scribe notes | For a very brief review of probability, see, for instance, Appendices C.2 and C.3 of Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein. |
| 3 | M 2/11 | PAC model; simple special cases; begin Occam's razor | scribe notes; [1], [2.1-2.2] | Valiant's original "Theory of the Learnable" paper, which first got theoretical computer scientists working on machine learning |
| 4 | W 2/13 | Prove Occam's razor; begin sample complexity for infinite hypothesis spaces | scribe notes; [3.1] | The original "Occam's Razor" paper |
| 5 | M 2/18 | Growth function and sample complexity for infinite hypothesis spaces; VC-dimension | scribe notes; [3.5] | |
| 6 | W 2/20 | Sauer's lemma; upper bound on sample complexity; lower bounds on sample complexity | scribe notes; [3.2-3.4], [3.6] | |
| 7 | M 2/25 | Finish lower bound; handling inconsistent hypotheses; begin Chernoff bounds | scribe notes | |
| 8 | W 2/27 | Finish Chernoff bounds; McDiarmid's inequality; error bounds for inconsistent hypotheses; overfitting | scribe notes | |
| 9 | M 3/3 | Boosting; training error bound | scribe notes; Robert E. Schapire. | |
| 10 | W 3/5 | Generalization error bounds: naive and margins-based | scribe notes; Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee. | |
| 11 | M 3/10 | Finish boosting; begin support-vector machines | scribe notes | |
| 12 | W 3/12 | Support-vector machines | scribe notes; excerpt from Vapnik's The Nature of Statistical Learning Theory | Chris Burges's tutorial on SVMs |
| 13 | M 3/24 | Finish support-vector machines; on-line learning; learning with expert advice; weighted majority algorithm | scribe notes; Avrim Blum. | "Mind-reading" game based on on-line learning |
| 14 | W 3/26 | Finish weighted majority algorithm; perceptron algorithm | scribe notes | |
| | M 3/31 | (no class) | | |
| 15 | W 4/2 | Winnow; begin regression | scribe notes | |
| 16 | M 4/7 | Linear regression; Widrow-Hoff algorithm | scribe notes; Jyrki Kivinen and Manfred K. Warmuth. | |
| 17 | W 4/9 | Finish Widrow-Hoff; EG; converting on-line to batch | scribe notes | |
| 18 | M 4/14 | Modeling probability distributions; maximum likelihood; maximum entropy modeling of distributions | scribe notes; the duality theorem given in class is taken from Stephen Della Pietra, Vincent Della Pietra and John Lafferty. | There are a number of excellent papers and other materials on maximum entropy here. |
| 19 | W 4/16 | Finish maxent; begin on-line log loss | scribe notes | |
| 20 | M 4/21 | On-line log loss and universal compression; Bayes algorithm | scribe notes | |
| 21 | W 4/23 | Shifting experts | scribe notes; Mark Herbster and Manfred K. Warmuth. | |
| 22 | M 4/28 | Portfolio selection | scribe notes; Avrim Blum and Adam Kalai. | For an opposing view to this approach, see the following short paper (and try to detect what is unusual about the writing): Paul A. Samuelson |
| 23 | W 4/30 | Game theory and learning | scribe notes; Yoav Freund and Robert E. Schapire. | For more details, see: Yoav Freund and Robert E. Schapire. |