COS597C: Machine Learning for Health Care (Fall 2018)
Machine learning is quickly becoming a powerful tool in health care to analyze and personalize treatments, assist in diagnoses, understand the underlying biology of disease, and support decision making and medical interventions. In this seminar, we will read papers on the topic of ML for health care and discuss these papers in class, focusing on the models and methods and on the adaptations necessary to apply these methods to health care data. There will be a final project, which is an opportunity to apply machine learning approaches to an existing health care data set.
Course Logistics
- Lectures: Monday, Wednesday, 11:00-12:20
- Instructor: Barbara Engelhardt (bee@princeton.edu)
  - Office hours: Mondays 12:30-1:30 in COS 322
- TA: Diana Cai (dcai@cs.princeton.edu)
  - Office hours: Thursdays 11-12 in COS Tea Room
- Piazza webpage
  - Up-to-date reading for the following week
  - Discussion questions for specific readings
  - Course project materials
  - Course announcements
Grading
- The course grade will be made up of:
  - 70% class participation, including leading and participating in discussions of papers
  - 30% final project, which is an eight-page write-up of your application of ML approaches to a health care data set
Lectures
- L1 W Sept 12: Welcome and Introduction (bee leads discussion)
- L2 M Sept 17: Ethics 1 GL (Arvind Narayanan)
- L3 W Sept 19: Ethics 2 GL (?)
- L4 M Sept 24: Survival analysis 1 (Uthser Chitra)
- L5 W Sept 26: Survival analysis 2 (Archit Verma)
- L6 M Oct 1: Computer vision and healthcare 1 (Antti Valkonen)
- L7 W Oct 3: Computer vision and healthcare 2 (Felix Yu)
- L8 M Oct 8: Time series modeling 1 (bee)
- L9 W Oct 10: Time series modeling 2 (Yuan Wang)
- L10 M Oct 15: Natural language processing and healthcare 1 (Sonali Mahendran)
- L11 W Oct 17: Natural language processing and healthcare 2 (Daniel Suo)
- L12 M Oct 22: Causal inference and interventions 1 (Alexander Strzalkowski)
- L13 W Oct 24: Interpretability in health care (Yomjinda)
- L14 M Nov 5: Missing data in EHR (Matthew Yeh)
- L15 W Nov 7: Reinforcement learning in healthcare 1 (Qasim Nadeem)
- L16 M Nov 12: Reinforcement learning in healthcare 2 (Sayan Hassantabar)
- L17 W Nov 14: Reinforcement learning in healthcare 3 (Allison Chang)
- L18 M Nov 19: Reinforcement learning in healthcare 4 (Sinong Geng)
- L19 M Nov 26: ML and safety (Jay Lee)
- L20 W Nov 28: Generalization and transfer learning (Jonathan Lu)
- L21 M Dec 3: Adaptive learning in healthcare (Greg Gundersen)
- L22 W Dec 5: Policy, privacy, and access (Mohamed El-Dirany)
- L23 M Dec 10: Final project presentations
- L24 W Dec 12: Final project presentations
Presentation questions
When preparing to present a paper for the class, consider the following questions:
- what problem in healthcare is this paper addressing?
- what is the corresponding category of problem in ML that we can map this problem onto?
- what machine learning approaches are the authors proposing/using?
- do standard ML methods do the trick?
- what is special about the health care problem that the methods need to adapt to?
- were appropriate checks in place for model testing or avoiding model misspecification?
- what types of confounders are present in the health care data? how does the model address those confounders?
- how does the approach quantify uncertainty? Is uncertainty important here?
- are there limited numbers of samples? how are those addressed?
- how can I determine whether a specific patient's sample is similar to others, or unique? (the n of 1 problem)
- how is patient/patient group heterogeneity addressed?
- how can you quantify the most important features of the data? Are the methods interpretable?
- how are doctor/caregiver mistakes accounted for?
- are there possible biases introduced by the data or the ML method?
- what are ways to combat those biases?
- how would you explain to a doctor how the method worked, or why it arrived at a class label/decision?
- what is the distance that needs to be covered before this method is deployed in a healthcare setting?