Princeton University
Computer Science 511
Course summary
Staff, lectures, office hours
Prerequisites
Attendance and the use of electronic devices
Textbook
Grading and workload
Turning in assignments
Late policy
Collaboration
Scribe notes
How to be a scribe
Auditors
Getting help
Communication
Machine learning studies automatic methods for learning to make accurate predictions or useful decisions based on past observations. This course introduces theoretical machine learning, including mathematical models of machine learning, and the design and rigorous analysis of learning algorithms. Likely topics include: bounds on the number of random examples needed to learn; learning from non-random examples in the on-line learning model; how to boost the accuracy of a weak learning algorithm; support-vector machines; maximum-entropy modeling; portfolio selection; game theory.
Here is a tentative list of topics, which is subject to change. (Bullets do not correspond precisely to lectures.)
Lectures:
Monday and Wednesday, 11-12:20; Computer Science Building, room 104 (large auditorium).
Instructor:
Rob Schapire: 421 CS Building (Y. Singer's office), schapire@cs
Office hours: To schedule an appointment, visit wase.princeton.edu (okay to schedule more than one slot if you think you'll need it). Send me email if no convenient times are available.
Assistant Instructors:
Yichen Chen: 219 Sherrerd, yichenc@cs
Office hours: Tuesday 11am-12
Zhiyuan Li: 315 CS Building, zhiyuanli@cs
Office hours: Friday 2:30-3:30
Seyed Sobhan Mir Yoosefi: 431 CS Building, syoosefi@cs
Office hours: Monday 2-3
Graduate Coordinator:
Nicki Mahler: 310 CS Building, x8-5387, ngotsis@cs
This class has no formal prerequisites. However, it is assumed that students will have a general background in computer science (including theoretical computer science), and experience with rigorous problem solving and mathematical proof techniques, since these are central to the course. Background in basic probability is recommended (at least at the level used in COS340). Some calculus and linear algebra and perhaps a tiny bit of analysis will be used.
Attendance is expected at all lectures. The use of laptops, smartphones, tablets, etc. creates a distraction for yourself and your classmates, as well as for me as your instructor. I therefore would be grateful if you did not use these during class. If you must, you can use a laptop or other device for note-taking, although I still recommend turning off your connection to the internet. But because the material is highly mathematical, you will probably find it easier just to use an ordinary pen and notebook.
The core readings for this course will follow two different tracks. For the "textbook" track, the readings will mainly come from this optional textbook:
Foundations of Machine Learning
by Mehryar Mohri, Afshin Rostamizadeh and Ameet Talwalkar
MIT Press, 2018 (second edition)
The readings for the alternative "eclectic" track will come from more of a mix of materials that I will provide (including a different textbook and some primary sources). You can choose to do readings from either one of the tracks. As these readings tend to heavily overlap one another, there is usually no reason to do readings from both tracks, although you are always welcome to do additional readings from both tracks for further enrichment. Or you can switch back and forth between the tracks. In general, the textbook provides consistency in terminology, notation, approach, etc., but the eclectic track will provide more variety, and might include more primary-source materials. It is your choice.
In some cases, especially toward the end of the course, these tracks are likely to merge since some of the topics are not covered in the textbook.
I may also post additional optional readings.
Note that, to access some of the readings, it may be necessary to be on the Princeton intranet. Some readings are also available through Blackboard by clicking on "reserves".
The Mohri et al. textbook (second edition) is available at Labyrinth Books, but will mostly be returned to the publisher over spring break. Other buying options as well as a free PDF version of the book are available here.
Grading will be based on homework assignments (65-70%) and a final project (30-35%). There will be 7 or 8 homework assignments, which will be given roughly every week or two, and will each consist of a small number of problems (no programming). In all cases, failure to complete any significant component of the course may result in a grade of D or F, regardless of performance on the other components. Final grades may be adjusted slightly upward for regular and positive class participation, or slightly downward for absence from lecture.
For the final project, you can pick any topic you want for further study. Your project could involve implementing an algorithm, running experiments, doing further reading, or trying to theoretically extend a result of interest. In all cases, the project must have a theoretical component, and the end product will be a written report. More details will be given later in the semester.
Some of the homeworks will include optional problems that can be completed for extra credit, intended as an opportunity for students who are looking for additional challenges. These are entirely optional, and are in no way required for completing the course; furthermore, I guarantee that it will be possible to get an A in the course without doing any extra credit. Note that extra credit points will not necessarily be treated on the same scale as the "regular" points used in grading homeworks.
You can track how well you are doing using blackboard, bearing in mind that these posted grades do not include late penalties or extra credit points. As a rough guide, if the total number of points you get on your homeworks falls close to or below 65% of the total points possible, then you may be heading for a final grade of D or F, and you should certainly seek assistance.
Homeworks can be submitted either electronically or in hard copy.
If submitting electronically, follow the "submit" link by each individual homework on the assignments page to submit using the CS department's "TigerFile" upload system. You must submit a single PDF file called "homework.pdf".
Hard-copy homeworks can be submitted immediately before or after class, handed personally to a TA in charge at another time, or placed in the appropriate box located on a shelf near the vending machines in the basement of the Computer Science Building, right under the main entrance. If placing your homework in a box, be sure to write down the day and time of submission (see late policy below). It is strongly recommended that you keep a copy of your work; if handwritten, you might, for instance, take a quick picture of your homework before submitting.
In all cases, be sure to include your name and netid as part of your write-up.
It is certainly okay to handwrite your homeworks (even if submitting electronically), but they must be neat and easily readable by the graders. Please do not write in cursive. Illegible homeworks may need to be rewritten.
All assignments are due at 11:59pm on the due date.
Each student will be allotted seven free days which can be used throughout the semester to turn in homework assignments late without penalty. For instance, you might choose to turn in HW#1 two days late, HW#4 three days late and HW#6 two days late. Once your free days are used up, late homeworks will be penalized 20% per day. (For instance, a homework turned in two days late will receive only 60% credit.) Homeworks will not receive credit more than five days past the deadline, whether or not free days are being used. Even so, all homeworks must be completed and turned in, even if this five-day limit has passed. As noted above, failure to do so may result in a final grade of D or F.
Exceptions to these rules will of course be made for serious illness or other emergency circumstances; in these cases, please contact me as soon as you are aware of the problem.
In counting late days, a weekend (that is, Saturday and Sunday together) counts as a single "day". For instance, a homework that is due on Thursday but turned in on Sunday would be considered two days late, rather than three.
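As a rough illustration of the weekend rule, here is a minimal sketch in Python. The function name late_days is hypothetical, and the sketch assumes that lateness is measured in whole calendar days after the due date; it also assumes that a Sunday is counted on its own when the preceding Saturday was the due date itself, a case the policy above does not spell out.

```python
from datetime import date, timedelta


def late_days(due: date, submitted: date) -> int:
    """Count late days, treating Saturday and Sunday together as one day.

    Assumption: each calendar day after the due date, up to and including
    the submission date, is one late day, except that a Sunday adds nothing
    when the Saturday just before it was already counted as late.
    """
    count = 0
    day = due + timedelta(days=1)
    while day <= submitted:
        saturday_already_counted = (
            day.weekday() == 6  # Sunday
            and day - timedelta(days=1) > due
        )
        if not saturday_already_counted:
            count += 1
        day += timedelta(days=1)
    return count


# The example from the policy: due on a Thursday, turned in on Sunday.
thursday = date(2019, 2, 14)
sunday = date(2019, 2, 17)
print(late_days(thursday, sunday))
```

Under these assumptions, the Thursday-to-Sunday example counts Friday as one day and the weekend as one more.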
Late days cannot be used for anything related to the final project, nor can they be used to turn in anything beyond dean's date.
If you are turning in a late homework after hours when no one is around to accept it, please indicate at the top that it is late, and clearly mark the day and time when it was turned in. Failure to do so may result in us considering the homework to be submitted at the time when we pick it up (which might be many hours, or even a day or two, after you actually submitted it).
It is your own responsibility to keep track of how many late days you have used. We will post our own record of late days used on Blackboard, but the numbers posted will not always be fully up-to-date.
Note that we do not generally mark late penalties on your returned assignments, but will deduct them from the recorded grade in computing your final grade for the semester. Knowing how many late days you have used, you can do the same calculation on your own.
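For those who want to check the arithmetic, the late-policy calculation might be sketched as follows. This is only an illustration under stated assumptions, not the official procedure: the function name credit_percentages and its parameters are hypothetical, free days are assumed to be applied automatically in the order homeworks are turned in, and a homework more than five days late is assumed to receive no credit and to use up no free days.

```python
def credit_percentages(late_days_per_hw, free_days=7, penalty_per_day=20, max_late=5):
    """Return the percent credit each homework keeps under the late policy.

    late_days_per_hw: late days for each homework, in submission order
    (weekends already counted as a single day).
    """
    remaining_free = free_days
    credits = []
    for late in late_days_per_hw:
        if late > max_late:
            # Past the five-day limit: no credit, whether or not free days remain.
            # Assumption: no free days are consumed in this case.
            credits.append(0)
            continue
        covered = min(late, remaining_free)   # late days absorbed by free days
        remaining_free -= covered
        penalized = late - covered            # late days that incur the penalty
        credits.append(max(0, 100 - penalty_per_day * penalized))
    return credits


# HW#1 two days late, HW#4 three days late, HW#6 two days late uses all seven
# free days; a later homework turned in two days late then keeps 60% credit.
print(credit_percentages([2, 0, 0, 3, 0, 2, 2]))
```

The percentages returned would multiply the recorded homework grades; extra credit is not modeled here.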
Collaboration on the problem sets is allowed in this course, subject to the following guidelines:
Before working with someone else, you should first spend a substantial amount of time trying to arrive at a solution by yourself. Some of the problems are easier and should be solved individually from start to finish.
Discussing and solving harder problems with fellow students (who have not already solved them) is allowed to the extent that it leads all participants to a better understanding of the problem and the material. Following such discussions, you should only take away your understanding of the problem; you should not take notes, particularly on anything that might have been written down. This is meant to ensure that you understand the discussion well enough to reproduce its conclusions on your own. For the same reason, written "discussions" (via text, email, etc.) with other students should generally be avoided; if they do occur, you should delete them or refrain from referring back to them. In all cases, please be sure to note on your solution who you worked with.
Needless to say, simply telling the solution to someone else is prohibited, as is showing someone a written solution.
The final writing of the problem set must be done individually and strictly on your own.
If you happen to have already seen one of the assigned problems elsewhere, you should nevertheless re-solve it without referring to earlier solutions, even if they are your own. Also, please indicate that you had previously seen the problem when you write up your solution; we won't take off credit, we just would like to know.
Although background reading is always encouraged, you should not attempt to "solve" these problems by looking for the solutions in the literature. Try to solve the problems on your own, or ask an instructor or fellow students for help if you can't.
Copying of any sort is not allowed. You may not use or consult solutions taken from any student, from the web, from prior year solutions, from any other course, or from any other source. Consulting any kind of website, blog, forum, mailing list or other service for discussing, purchasing, downloading or otherwise obtaining or sharing homework solutions is strictly forbidden (except for Piazza, as discussed below).
Because there is no textbook or set of readings that perfectly fits this course, students will be asked to take turns preparing "scribe notes" for posting on the course web site (specifically, on the Schedule and Readings page). Each class, one student will be the designated "scribe", taking careful notes during class, writing them up, and sending them to me for posting on the web. Here is more information on how to be a scribe. I will not grade the scribe notes, but I will give credit for doing them (about equal to having completed perfectly one additional problem set).
Auditors are welcome (provided the room is large enough), and are encouraged to take part in class discussions. If you wish to receive official "credit" for auditing this course, you must attend the vast majority (say, at least 85%) of the lectures.
Each homework will be assigned to one or more TAs, as listed on the assignments page. These TAs are in charge of all aspects of that homework (including grading it), and are the ones you should contact with all questions regarding that particular homework (whether before or after it has been turned in or graded). Of course, you can also contact me with questions about the homework (although questions about grading should be directed first to the TAs in charge). General questions regarding material covered in class or the readings can be directed to any of us. If you cannot make it to office hours, you can set up an appointment via email. In general, don't hesitate to contact us or come see us if you are having trouble.
An email list is maintained by the university for all students who are registered for the course. This list will be used by the course staff for important general announcements, such as last minute corrections to the homeworks, changes in due date, etc. To avoid missing these announcements, it is imperative that you be registered for the course.
We will also use "Piazza", an online forum that can be used by students for discussing course material, general-interest questions about the homeworks, etc. The course staff will attempt to monitor and respond to questions posted on Piazza. Students should also monitor these discussions since clarifications and corrections of problems may be posted as part of our responses. With regard to homeworks, Piazza is best used for clarifications and general understanding, rather than for discussing the specifics of how to solve particular problems. Certainly, you should avoid giving away the solution to a problem in a general posting. If you have a question that is specific to your own work, you will probably want to contact a staff member directly. To sign up for Piazza, visit: https://piazza.com/princeton/spring2019/cos511.