Robustness in Machine Learning (CSE 599-M)
- Instructor: Jerry Li
- TA: Haotian Jiang
- Time: Tuesday and Thursday, 10:00-11:30 AM
- Room: Gates G04
- Office hours: by appointment, CSE 452
As machine learning is applied to increasingly sensitive tasks and to increasingly noisy data, it has become important that the algorithms we develop for ML be robust to potentially worst-case noise.
In this class, we will survey a number of recent developments in the study of robust machine learning, from both a theoretical and empirical perspective.
Tentatively, we will cover a number of related topics, including:
- Learning in the presence of outliers. Techniques for learning when our training dataset is corrupted by worst-case noise. This includes robust statistics, list learning, and watermarking and data poisoning attacks.
- Adversarial examples. Famously, neural network image classifiers can be fooled at test time by perturbing a test image by an imperceptible amount. We will discuss how such attacks work, empirical defenses against these attacks (e.g. adversarial training with PGD), and certifiable defenses that yield provable robustness.
- Private machine learning. How can we develop algorithms for ML that respect the privacy of the users providing the data?
Our goal (though we will often fall short of it) is to devise theoretically sound algorithms for these tasks that transfer well to practice.
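To make the adversarial examples topic above concrete, here is a minimal sketch of a sign-gradient (FGSM-style) perturbation against a linear classifier. It is an illustration only, not necessarily how the lectures present the material: the toy weights `w`, bias `b`, input `x`, budget `eps`, and all function names are hypothetical. For a linear model the gradient of the score with respect to the input is just `w`, so shifting every coordinate by `eps` against the sign of `w` is the worst-case perturbation under an l-infinity budget.

```python
# Sketch: sign-gradient (FGSM-style) perturbation of a linear classifier.
# All names and numbers are hypothetical, chosen only for illustration.

def score(w, b, x):
    """Linear score; the predicted label is +1 if the score is positive, else -1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else -1

def sign(v):
    return (v > 0) - (v < 0)

def linf_attack(w, x, label, eps):
    """Shift every coordinate by eps in the direction that lowers label * score.

    The gradient of the score with respect to x is w, so the worst-case
    perturbation with l-infinity norm at most eps is -label * eps * sign(w).
    """
    return [xi - label * eps * sign(wi) for wi, xi in zip(w, x)]

# A toy 100-dimensional "image" whose pixels are all 0.5.
d = 100
w = [0.1 if i % 2 == 0 else -0.1 for i in range(d)]
b = 0.2
x = [0.5] * d

eps = 0.05  # per-pixel budget: 10% of the pixel value
x_adv = linf_attack(w, x, predict(w, b, x), eps)

print("clean prediction:    ", predict(w, b, x))
print("perturbed prediction:", predict(w, b, x_adv))
print("largest pixel change:", max(abs(a - c) for a, c in zip(x_adv, x)))
```

In this toy setup no pixel moves by more than `eps`, yet the score shifts by `eps` times the sum of the absolute weights (from about 0.2 to about -0.3), which is enough to flip the predicted label. The effect grows with the dimension, which is one common intuition for why high-dimensional image classifiers can be fragile.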
The intended audience for this class is CS graduate students in Theoretical Computer Science and/or Machine Learning, who are interested in doing research in this area.
However, interested undergraduates and students from other departments are welcome to attend as well.
The coursework will be light and consist of some short problem sets as well as a final project.
For non-CSE students/undergraduates: If you are interested in this class, please attend the first lecture. If the material suits your interests and background, please request an add code from me afterwards.
We will assume mathematical maturity and comfort with algorithms, probability, and linear algebra. Background in machine learning will be helpful but should not be necessary.
Please turn in (or email) a one-page project proposal by November 12th.
Projects can be reading projects, where you survey the literature on some area that we didn't cover, or research projects, where you attempt (but need not succeed at) tackling an open problem in the area.
Projects can be either theoretical or applied.
If you're short of ideas, please feel free to ask the instructor!
Homework problems will be added as we cover the appropriate material. Each homework will be due one to two weeks after we finish the corresponding unit.
- Homework 1. Due date: Nov. 12 (previously Nov. 5) pdf
- Homework 2. Due date: Dec. 6 pdf
Lecture notes are work in progress. Feedback is welcome!
- Lecture 0: Syllabus / administrative stuff (slightly outdated). notes
- Lecture 1 (9/26): Introduction to robustness. notes
- Lecture 2 (10/1): Total variation, statistical models, and lower bounds. notes
- Lecture 3 (10/3): Robust mean estimation in high dimensions. notes
- Lecture 4 (10/8): Spectral signatures and efficient certifiability. notes
- Lecture 5 (10/10): Efficient filtering from spectral signatures. notes
- Lecture 6 (10/15): Stronger spectral signatures for Gaussian datasets. notes
- Lecture 7 (10/17): Efficient filtering from spectral signatures for Gaussian data. notes
- Lecture 8 (10/22): Additional topics in robust statistics. notes
- Lecture 9 (10/24): Introduction to adversarial examples. notes
- Lecture 10 (10/29): Empirical defenses for adversarial examples. notes
- Lecture 11 (10/31): The four worlds hypothesis: models for adversarial examples. notes
- NO CLASS (11/05) to recover from the STOC deadline
- Lecture 12 (11/07): Certified defenses I: Exact certification. notes
- Lecture 13 (11/12): Certified defenses II: Convex relaxations. notes
- Lecture 14 (11/14): Certified defenses III: Randomized smoothing. notes
- Lecture 15 (11/19): Additional topics in robust deep learning. notes
- Lecture 16 (11/21): Basics of differential privacy. notes
- Lecture 17 (11/26): Differentially private estimation I: univariate mean estimation. notes
- NO CLASS (11/28): Thanksgiving
- Lecture 18 (12/3): (Guest lecture by Sivakanth Gopi) Differentially private estimation II: high dimensional estimation. notes
- Lecture 19 (12/5): Additional topics in private machine learning. notes
Please refer to university policies regarding