COS 484: Natural Language Processing
Fall 2019
- Lectures: Tue/Thurs 1:30pm-2:50pm in Computer Science Building 104
- Office hours:
Danqi Chen: Tuesday 3-4pm, Computer Science 412
Karthik Narasimhan: Thursday 3-4pm, Computer Science 422
Willie Chang: Wednesday 1-2pm, Computer Science 402 (except Sept 18 12:30-1:30pm, CS301)
Pranay Manocha: Monday 4-5pm, Computer Science 402
Runzhe Yang: Monday 4-5pm, Computer Science 402
You can check this
calendar for updated times of all lectures, sections, OHs and due dates.
How to contact us:
Please use
Piazza
for all questions related to lectures, homeworks, and projects, and to
find announcements. For external queries, emergencies, or personal matters, you can use a private Piazza post visible only to Instructors.
Recent advances have ushered in exciting developments in natural language processing (NLP), resulting in systems that can translate text, answer questions and even hold spoken conversations with us. This course will introduce students to the basics of NLP, covering standard frameworks for dealing with natural language as well as algorithms and techniques to solve various NLP problems, including recent deep learning approaches. Topics covered include language modeling, representation learning, text classification, sequence tagging, syntactic parsing, machine translation, question answering and others.
- Required: COS 226, knowledge of probability, linear algebra, multivariate calculus.
- COS 324 (or similar Machine Learning class) is strongly recommended.
- Proficiency in Python: programming assignments and projects will require use of Python, NumPy and PyTorch.
There is no required textbook for this class, and you should be able to
learn everything from the lectures and assignments.
However, if you would like to pursue more advanced topics or get another
perspective on the same material, here are some books (all of them can be read free online):
Readings for future lectures are tentative and subject to change.
*: All assignments are due 11:59pm.
- Assignments (40%):
There will be four assignments with both written and programming parts.
Each homework is centered around an application and will also deepen your understanding of the theoretical concepts.
- Mid-term exam (25%): The exam is an in-class written exam that will
test your knowledge and problem-solving skills on all preceding lectures and assignments.
You cannot use any external aids except one single-sided page of notes.
Date: Thursday, Oct 24
Location: COS 104
- Final project (35%): The final project offers you a chance to apply your newly acquired skills towards an in-depth application. All the final projects will be completed in teams of 2-3 students (find your teammates early!).
You are required to turn in a project proposal (due on Nov 11) and complete a paper written in the style of a conference (e.g., ACL) submission (due on Jan 14). There will also be project presentations at the end of the semester.
- Extra credit (5%): Awarded for participation in class and on Piazza, and for bonus points on assignments. The overall score is capped at 100%.
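As a rough illustration of how the weights above combine, here is a minimal sketch in Python. The function name `overall_score` and the convention of passing each component as a fraction between 0 and 1 are assumptions made for this example, not part of the official grading procedure.

```python
# Course weights in percentage points, taken from the grading breakdown above.
WEIGHTS = {"assignments": 40.0, "midterm": 25.0, "project": 35.0}
EXTRA_CREDIT_MAX = 5.0  # extra credit is worth at most 5 points


def overall_score(fractions, extra_credit=0.0):
    """Combine component scores into an overall percentage.

    `fractions` maps each component name to the fraction earned (0.0 to 1.0).
    `extra_credit` is in percentage points. Both conventions are assumptions
    for this sketch.
    """
    extra = min(max(extra_credit, 0.0), EXTRA_CREDIT_MAX)
    base = sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)
    # The overall score is capped at 100%.
    return min(base + extra, 100.0)


if __name__ == "__main__":
    print(overall_score({"assignments": 0.9, "midterm": 0.8, "project": 1.0}, 3.0))
```

Note that extra credit can make up for lost points but cannot push the total above 100.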
All assignments are due at 11:59pm on the due date (usually Mondays). There are no late days. Late submissions incur a penalty of 10% for each day late, up to a maximum of 4 days, beyond which submissions will not be accepted. The only exception to this rule is if you have a note from your Dean of Studies. In this case, you must notify the instructors via email. For homeworks 1, 2 and 3, if you have a dean’s note, the weight of your missed/penalized assignment will be added to the midterm and your midterm score will be scaled accordingly (e.g. if you are penalized 2 points overall, your midterm will be worth 27 and your score will be multiplied by 27/25). Missing homework 4 (after the midterm) can only be compensated by arranging an oral exam on the pertinent material.
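The arithmetic in the late policy can be sketched as follows. This is only an illustration: the function names are hypothetical, and the policy does not say whether the 10% is taken from the earned score or from the maximum score, so the first function returns only the fraction of credit lost.

```python
MAX_LATE_DAYS = 4       # submissions later than this are not accepted
PENALTY_PER_DAY = 0.10  # 10% for each day late
MIDTERM_WEIGHT = 25.0   # midterm weight in percentage points


def late_penalty_fraction(days_late):
    """Fraction of credit lost for a submission that is `days_late` days late."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_LATE_DAYS:
        return 1.0  # not accepted, so no credit
    return PENALTY_PER_DAY * days_late


def reweighted_midterm(midterm_points, penalty_points):
    """Apply the dean's-note rule for homeworks 1, 2 and 3.

    `midterm_points` is the score out of 25; `penalty_points` is the number of
    course points lost on the missed/penalized assignment.
    Returns (new midterm weight, scaled midterm score).
    """
    new_weight = MIDTERM_WEIGHT + penalty_points
    return new_weight, midterm_points * new_weight / MIDTERM_WEIGHT


if __name__ == "__main__":
    # The syllabus example: 2 points penalized, so the midterm is worth 27
    # and the score is multiplied by 27/25.
    print(reweighted_midterm(20.0, 2.0))
```

With a penalty of 2 points, a midterm score of 20 out of 25 is rescaled to be out of 27, so performance on the midterm stands in for the lost assignment credit.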
Writeups:
Homeworks should be written up clearly and succinctly; you may lose points if your answers
are unclear or unnecessarily complicated.
Using LaTeX is recommended (here's a
template), but not a requirement. Hand-written assignments must be scanned and uploaded as a pdf.
Collaboration policy and honor code:
You are free to form study groups and discuss homeworks and projects.
However, you must write up homeworks and code from scratch independently, and you must acknowledge in your submission all the students you discussed with.
The following are considered to be honor code violations (in addition to the Princeton honor code):
- Looking at the writeup or code of another student.
- Showing your writeup or code to another student.
- Discussing homework problems in such detail that your solution (writeup or code) is almost identical to another student's answer.
- Uploading your writeup or code to a public repository (e.g. GitHub, Bitbucket, Pastebin) so that it can be accessed by other students.
When debugging code together, you are only allowed to look at the input-output behavior
of each other's programs (so you should write good test cases!).
It is important to remember that even if you didn't copy but just gave
another student your solution, you are still violating the honor code, so please be careful.
If you feel like you made a mistake (it can happen, especially under time
pressure!), please reach out to Danqi/Karthik;
the consequences will be much less severe than if we approach you.
The final project offers you the chance to apply your newly acquired skills towards an in-depth NLP application. Students are required to complete the final project in teams of
2-3. You can use Piazza to
search for teammates.
Deliverables: The final project is worth 35% of your course grade. The deliverables include:
- Proposal (0%): You need to turn in a one-page proposal on Nov 11. The proposal should outline what you propose to do and a rough plan for how you will pursue the project. We will then provide feedback and guidance on the direction to maximize the project’s chance
of succeeding. The proposal is not graded.
- Project presentation (10%): At the end of the semester, we will schedule project presentations for all the projects in the class. More details TBA.
- Final paper (25%): You need to complete a final report in the style of a conference (e.g. ACL) submission. It should begin with an abstract and introduction, clearly describe the proposed idea or exploration, present technical details, give results, compare to baselines, provide analysis and discussion of the
results, and cite any sources you used.
Policy and honor code:
- Final projects must be implemented in Python. You can use any deep learning framework, such as PyTorch or TensorFlow.
- More generally, you may use any existing code, libraries, etc. and consult any papers, books, online references, etc. for your project. However, you must cite your sources in your writeup and clearly indicate which parts of the project are your contribution and which parts were implemented by others.
- You are free to discuss ideas and implementation details with other teams. However, under no circumstances may you look at another team's code,
or incorporate their code into your project.
- Do not share your code publicly (e.g. in a public GitHub repo) until after the class has finished.
Electronic Submission:
Assignments and project proposal/paper are to be submitted as pdf files through
Gradescope (we will send you the signup code for the class through Blackboard).
If you need to sign up for a Gradescope account, please use your @princeton.edu email address.
You can submit as many times as you'd like until the deadline: we will only grade the last submission.
Submit early to make sure your submission uploads/runs properly on the Gradescope servers.
If anything goes wrong, please ask a question on Piazza or contact a TA.
Do not email us your submission.
Partial work is better than not submitting any work.
For assignments with a programming component, we may automatically
sanity check your code with some basic test cases, but we will grade
your code on additional test cases.
Important: just because you pass the basic test cases, you are by no
means guaranteed to get full credit on the other, hidden test cases, so you should
test the program more thoroughly yourself!
Regrades: If you believe that the course
staff made an objective error in grading, then you may submit a
regrade request. Remember that even if the grading seems harsh to
you, the same rubric was used for everyone for fairness, so this
is not sufficient justification for a regrade.
It is also helpful to cross-check your answer against the released solutions.
If you still choose to submit a regrade request, click the corresponding
question on Gradescope, then click the "Request Regrade" button at the
bottom.
Any requests submitted over email or in person will be ignored. Regrade requests for a
particular assignment are due by Sunday 11:59pm, one week after the
grades are returned. Note that we may regrade your entire submission,
so depending on your submission you may actually lose more points
than you gain.