Feb 4 | Introduction
1. Computational Linguistics and Deep Learning
Additional reading:
1. A case for deep learning in semantics
Presenter: Danqi Chen [slides]

Feb 6 | Word Embeddings
1. Distributed Representations of Words and Phrases and their Compositionality
2. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors
Additional reading:
1. J & M, Chapter 6
2. Notes on Noise Contrastive Estimation and Negative Sampling
3. GloVe: Global Vectors for Word Representation
4. Improving Distributional Similarity with Lessons Learned from Word Embeddings
Presenter: Danqi Chen [slides]

Feb 11 | Contextualized Word Embeddings
1. Deep contextualized word representations
2. Learned in Translation: Contextualized Word Vectors
Additional reading:
1. Contextual Word Representations: A Contextual Introduction
Presenter: Danqi Chen [slides]

Feb 13 | Pre-training and fine-tuning I
1. Improving Language Understanding by Generative Pre-Training
2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Additional reading:
1. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
2. Universal Language Model Fine-tuning for Text Classification
Presenters: Haochen Li, Daniel Wang [slides] [discussion]
Discussants: Zexuan Zhong, Jace Lu, Jinyuan Qi

Feb 18 | Pre-training and fine-tuning II
1. XLNet: Generalized Autoregressive Pretraining for Language Understanding
2. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Additional reading:
1. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
2. RoBERTa: A Robustly Optimized BERT Pretraining Approach
Presenters: Andrew Or, Ksenia Sokolova [slides]
Discussants: Liwei Song, Haochen Li, Xinyi Chen

Feb 20 | Semantic Role Labeling
1. Deep Semantic Role Labeling: What Works and What’s Next
2. Linguistically-Informed Self-Attention for Semantic Role Labeling
Additional reading:
1. J & M, Chapter 20
Presenters: Zhongqiao Gao, Chong Xiang [slides]
Discussants: Elisabetta Cavallo, Seyoon Ragavan, Kun Lu

Feb 25 | Machine Translation
1. Sequence to Sequence Learning with Neural Networks
2. Non-Autoregressive Neural Machine Translation
Additional reading:
1. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
2. Mask-Predict: Parallel Decoding of Conditional Masked Language Models
Presenters: Elisabetta Cavallo, Ben Dodge [slides]
Discussants: Hao Lu, Daniel Wang, Paula Gradu

Feb 27 | NO CLASS

Mar 3 | Semantic Parsing
1. Language to Logical Form with Neural Attention
2. Learning to Map Context-Dependent Sentences to Executable Formal Queries
Additional reading:
1. Data Recombination for Neural Semantic Parsing
2. A Syntactic Neural Model for General-Purpose Code Generation
Presenters: Hsuan-Tung Peng, Andy Su [slides]
Discussants: Eve Fleisig, Zhongqiao Gao, Ben Dodge

Mar 5 | Reading Comprehension
1. Teaching Machines to Read and Comprehend
2. Bidirectional Attention Flow for Machine Comprehension
Additional reading:
1. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
2. Natural Questions: a Benchmark for Question Answering Research
3. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
Presenters: Arjun Sai Krishnan, Seyoon Ragavan [slides]
Discussants: Zhenyu Song, Hao Gong, Andy Su

Mar 10 | Open-domain Question Answering
1. Reading Wikipedia to Answer Open-Domain Questions
2. Latent Retrieval for Weakly Supervised Open Domain Question Answering
Additional reading:
1. Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
2. REALM: Retrieval-Augmented Language Model Pre-Training
Presenters: Kun Lu, Chris Sciavolino [slides]
Discussants: Chong Xiang, Ameet Deshpande, Michael Hu

Mar 10 | Project proposal due

Mar 12 | Relation Extraction
1. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures
2. Matching the Blanks: Distributional Similarity for Relation Learning
Additional reading:
1. Graph Convolution over Pruned Dependency Trees Improves Relation Extraction
2. A General Framework for Information Extraction using Dynamic Span Graphs
Presenters: Shunyu Yao, Zexuan Zhong [slides]
Discussants: Arjun Sai Krishnan, Chris Sciavolino, Andrew Or

Mar 17 | NO CLASS (Spring Recess)

Mar 19 | NO CLASS (Spring Recess)

Mar 24 | Summarization I
1. Get To The Point: Summarization with Pointer-Generator Networks
2. A Deep Reinforced Model for Abstractive Summarization
Additional reading:
1. A Neural Attention Model for Abstractive Sentence Summarization
2. Neural Text Summarization: A Critical Evaluation
Presenter: Ameet Deshpande [slides]
Discussants: Shunyu Yao, Sonia Murthy, May Jiang

Mar 26 | Summarization II
1. Neural Summarization by Extracting Sentences and Words
2. Generating Wikipedia by Summarizing Long Sequences
Additional reading:
1. Text Summarization with Pretrained Encoders
2. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Presenters: Sonia Murthy, May Jiang [slides]
Discussants: Hsuan-Tung Peng, Ksenia Sokolova, Daniel Wang

Mar 31 | Dialogue I
1. A Neural Conversational Model
2. Deep Reinforcement Learning for Dialogue Generation
Additional reading:
1. Towards a Human-like Open-Domain Chatbot
2. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Presenters: Paula Gradu, Xinyi Chen [slides]
Discussants: Hsuan-Tung Peng, Zhongqiao Gao, Hao Gong

Apr 2 | Dialogue II
1. Personalizing Dialogue Agents: I have a dog, do you have pets too?
2. What makes a good conversation? How controllable attributes affect human judgments
Additional reading:
1. The Second Conversational Intelligence Challenge (ConvAI2)
Presenters: Jace Lu, Zhenyu Song [slides]
Discussants: Ameet Deshpande, Paula Gradu, Haochen Li

Apr 7 | Task-oriented Dialogue
1. A Network-based End-to-End Trainable Task-oriented Dialogue System
2. Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
Additional reading:
1. POMDP-based Statistical Spoken Dialogue Systems: a Review
2. Global-Locally Self-Attentive Encoder for Dialogue State Tracking
Presenters: Michael Hu, Jinyuan Qi [slides]
Discussants: Zexuan Zhong, Zhenyu Song, Seyoon Ragavan

Apr 9 | Bias in Language
1. Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
2. On Measuring Social Biases in Sentence Encoders
Additional reading:
1. Gender Bias in Coreference Resolution
2. Mitigating Gender Bias in Natural Language Processing: Literature Review
Presenters: Eve Fleisig, Liwei Song [slides]
Discussants: Elisabetta Cavallo, Michael Hu, Arjun Sai Krishnan

Apr 14 | Annotation Artifacts in NLP Tasks
1. Annotation Artifacts in Natural Language Inference Data
2. Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference
Additional reading:
1. How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks
2. The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task
Presenters: Hao Lu, Hao Gong [slides]
Discussants: Eve Fleisig, Ksenia Sokolova, Jinyuan Qi, Shunyu Yao

Apr 16 | Adversarial Examples
1. Adversarial Examples for Evaluating Reading Comprehension Systems
2. Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Additional reading:
1. HotFlip: White-Box Adversarial Examples for NLP
2. Generating Natural Adversarial Examples
3. Semantically Equivalent Adversarial Rules for Debugging NLP Models
Presenters: Elisabetta Cavallo, Seyoon Ragavan [slides]
Discussants: Chong Xiang, Liwei Song, Andy Su, Jace Lu

Apr 21 | Interpretability
1. AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
2. Designing and Interpreting Probes with Control Tasks
Additional reading:
1. Pathologies of neural models make interpretations difficult
2. Attention is not Explanation
3. Attention is not not Explanation
Presenters: Ksenia Sokolova, Michael Hu [slides]
Discussants: Kun Lu, Andrew Or, Ben Dodge, Hao Lu

Apr 23 | Generalization
1. BAM! Born-Again Multi-Task Networks for Natural Language Understanding
2. Learning and Evaluating General Linguistic Intelligence
Additional reading:
1. The Natural Language Decathlon: Multitask Learning as Question Answering
2. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
3. MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
Presenters: Changyan Wang, Ben Dodge [slides]
Discussants: Chris Sciavolino, Sonia Murthy, Xinyi Chen, May Jiang

Apr 28 | Guest lecture: Jesse Thomason - Language Grounding with Robots
Presenter: Jesse Thomason [slides]

Apr 30 | Guest lecture: Diyi Yang - Language Understanding in Social Context
Presenter: Diyi Yang [slides]

May 12 | Final paper due