MIT Course 6.858/18.428: Machine Learning
This machine learning course covered the following topics:
- Formal models of machine learning
- Learning concepts from examples
- Learnable classes of concepts
- PAC learning
- VC-dimension
- Bayesian Inference
- Neural Nets
- Learning from queries
- Learning with noise
- Learning finite automata
- Hidden Markov Models
Available Lecture Notes (Fall 1994)
- Lecture 1:
Introduction to the course. Defining models for machine learning.
Learning conjunctions in the mistake-bounded model.
- Lecture 2:
Review of the on-line mistake-bounded model. The halving algorithm.
Winnow algorithm for learning linearly separable boolean functions
(minimal sketches of the halving and Winnow algorithms appear after this list).
- Lecture 3:
Review of the Winnow algorithm. Perceptron Convergence Theorem.
Relationship between VC-dimension and mistake bounds.
- Lecture 4:
Probably Approximately Correct (PAC) learning. PAC learning conjunctions
(a sketch of the elimination algorithm appears after this list).
- Lecture 5:
Intractability of learning 3-term DNF by 3-term DNF. Occam
Algorithms. Learning 3-term DNF by 3-CNF.
- Lecture 6:
Learning k-decision lists. Occam's Razor (general case). Learning
conjunctions with few literals.
- Lecture 7:
VC-dimension and PAC learning.
- Lecture 8:
PAC learnability of infinite concept classes with finite VC-dimension.
- Lecture 9:
Estimating error rates. Uniform convergence and VC-dimension.
- Lecture 10:
Lower bound on sample complexity for PAC learning. Cover's coin problem.
- Lecture 11:
Bayesian learning. Minimum description length principle.
- Lecture 12:
Definition of weak learning. Confidence boosting. Accuracy boosting.
- Lecture 13:
Finish proof that weak learnability implies strong learnability.
- Lecture 14:
Finish discussion of weak learnability. Freund's boosting method.
Introduction to neural networks.
- Lecture 15:
Neural networks and back propagation.
- Lecture 16:
Applications of neural nets. Sejnowski and Rosenberg's NETtalk system
for learning to pronounce English text. Gorman and Sejnowski's network
for learning how to classify sonar targets.
- Lecture 17:
Computational complexity of training neural nets. Training a 3-Node
Neural Network is NP-complete. Expressive power of continuous-valued
neural networks with only two hidden layers.
- Lecture 18:
VC-dimension of neural networks. Asymptotic error rates of neural
networks.
- Lecture 19:
Learning in the presence of noise. Malicious noise. Classification
noise. Minimizing disagreements to handle classification noise.
Minimizing disagreements for conjunctions is NP-hard.
- Lecture 20:
Statistical query model. Statistical query algorithm for conjunctions.
Statistical query learnability implies PAC learnability in the
presence of classification noise.
- Lecture 21:
Learning decision trees using the Fourier spectrum (in the membership
query model, with respect to the uniform distribution).
- Lecture 22:
Finish algorithm for learning decision trees.
Learning DNF with membership queries with respect to the uniform
distribution.
- Lecture 23:
Learning finite automata with membership and equivalence queries.
- Lecture 24:
Finish algorithm for learning finite automata with membership and
equivalence queries. Learning finite automata without resets using
homing sequences.
- Lecture 25:
Hidden Markov models and an application to speech recognition (a sketch
of the forward algorithm appears after this list).
- Lecture 26:
Piecemeal learning of unknown environments.
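The halving algorithm from Lecture 2 is easy to state concretely: keep the set of hypotheses consistent with all examples seen so far, predict by majority vote, and note that every mistake removes at least half of that set, so the number of mistakes is at most log2 of the class size. The sketch below is a minimal illustration, assuming a small toy class (all monotone disjunctions over n = 6 variables), a particular target, and a random example stream; none of these choices come from the lecture notes.

```python
# A minimal sketch of the halving algorithm in the on-line mistake-bounded
# model.  The concept class, target, and example stream below are
# illustrative assumptions.
import itertools, math, random

n = 6
# Concept class C: all monotone disjunctions over n variables (|C| = 2^n).
concepts = [frozenset(s) for r in range(n + 1)
            for s in itertools.combinations(range(n), r)]
evaluate = lambda c, x: 1 if any(x[i] for i in c) else 0

target = frozenset({0, 3})            # hidden target: x0 OR x3
version_space = list(concepts)
mistakes = 0
random.seed(2)
for _ in range(100):
    x = [random.randint(0, 1) for _ in range(n)]
    votes = sum(evaluate(c, x) for c in version_space)
    y_hat = 1 if 2 * votes >= len(version_space) else 0   # majority vote
    y = evaluate(target, x)
    if y_hat != y:
        mistakes += 1
    # Keep only the hypotheses consistent with the revealed label.
    version_space = [c for c in version_space if evaluate(c, x) == y]

print("mistakes:", mistakes, " bound log2|C| =", int(math.log2(len(concepts))))
```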
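Lectures 2 and 3 cover Winnow, a multiplicative-update on-line learner. Below is a minimal sketch of Littlestone's Winnow1 variant for monotone disjunctions; the threshold, promotion factor, target concept, and example distribution are illustrative assumptions rather than details taken from the notes.

```python
# A minimal sketch of Winnow (Winnow1) for learning a monotone disjunction
# over n boolean attributes in the on-line mistake-bounded model.
import random

def winnow_predict(w, x, theta):
    """Predict 1 iff the weighted sum of the active attributes reaches theta."""
    return 1 if sum(wi for wi, xi in zip(w, x) if xi) >= theta else 0

def winnow_update(w, x, y, y_hat, alpha=2.0):
    """Promote on a false negative; eliminate (zero out) on a false positive."""
    if y == 1 and y_hat == 0:          # false negative: multiply active weights
        return [wi * alpha if xi else wi for wi, xi in zip(w, x)]
    if y == 0 and y_hat == 1:          # false positive: zero out active weights
        return [0.0 if xi else wi for wi, xi in zip(w, x)]
    return w                           # correct prediction: no change

# Toy run: target disjunction x0 OR x3 over n = 8 attributes (an assumption).
n = 8
target = lambda x: 1 if (x[0] or x[3]) else 0
w, theta, mistakes = [1.0] * n, float(n), 0
random.seed(0)
for _ in range(200):
    x = [random.randint(0, 1) for _ in range(n)]
    y, y_hat = target(x), winnow_predict(w, x, theta)
    if y_hat != y:
        mistakes += 1
        w = winnow_update(w, x, y, y_hat)
print("mistakes:", mistakes)           # O(k log n) for a k-literal disjunction
```

The multiplicative updates are what let the mistake bound scale with log n rather than n, which is the contrast with the Perceptron bound drawn in Lecture 3.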
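Lecture 4's PAC algorithm for conjunctions is the standard elimination procedure: start with the conjunction of all literals and delete any literal contradicted by a positive example. The sketch below pairs it with the usual consistent-hypothesis sample bound m >= (1/eps)(ln|H| + ln(1/delta)); the restriction to monotone conjunctions (so |H| = 2^n), the target concept, and the uniform example distribution are assumptions made only for illustration.

```python
# A minimal sketch of PAC learning monotone conjunctions by elimination,
# with the standard consistent-hypothesis sample-size bound.
import math, random

def learn_conjunction(samples, n):
    """Keep only the variables that are 1 in every positive example."""
    literals = set(range(n))
    for x, y in samples:
        if y == 1:
            literals &= {i for i in range(n) if x[i] == 1}
    return literals

def sample_size(n, eps, delta):
    """m >= (1/eps)(ln|H| + ln(1/delta)) with |H| = 2^n monotone conjunctions."""
    return math.ceil((1.0 / eps) * (n * math.log(2) + math.log(1.0 / delta)))

# Toy run: target conjunction x1 AND x4 over n = 10 variables (assumptions).
random.seed(1)
n, eps, delta = 10, 0.1, 0.05
target = {1, 4}
m = sample_size(n, eps, delta)
samples = []
for _ in range(m):
    x = [random.randint(0, 1) for _ in range(n)]
    y = 1 if all(x[i] == 1 for i in target) else 0
    samples.append((x, y))
h = learn_conjunction(samples, n)
print("sample size:", m, "hypothesis:", sorted(h))
# The hypothesis always contains the target literals, so it only errs on
# positive examples, and with m examples it is eps-accurate w.p. 1 - delta.
```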
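For Lecture 25, the basic HMM computation behind speech-recognition applications is the forward algorithm, which evaluates the probability of an observation sequence by dynamic programming over hidden states. The two-state model and observation sequence in this sketch are illustrative assumptions, not taken from the lecture.

```python
# A minimal sketch of the forward algorithm for a discrete hidden Markov
# model.  The model parameters and observations below are toy assumptions.

def forward(pi, A, B, obs):
    """Return P(obs | model) by summing over all hidden state paths."""
    n_states = len(pi)
    # alpha[i] = P(o_1..o_t, state_t = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n_states)) * B[i][o]
                 for i in range(n_states)]
    return sum(alpha)

# Toy model: 2 hidden states, 2 observation symbols.
pi = [0.6, 0.4]                       # initial state distribution
A = [[0.7, 0.3], [0.4, 0.6]]          # transition probabilities A[i][j]
B = [[0.9, 0.1], [0.2, 0.8]]          # emission probabilities B[i][symbol]
print(forward(pi, A, B, [0, 1, 0]))   # likelihood of observing 0, 1, 0
```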
Ron Rivest (rivest@theory.lcs.mit.edu)
Mona Singh (mona@cs.princeton.edu)