- Instructor: Jerry Li
- TA: Haotian Jiang
- Time: Tuesday, Thursday 10:00–11:30 AM
- Room: Gates G04
- Office hours: by appointment, CSE 452

As machine learning is applied to increasingly sensitive tasks, and to increasingly noisy data, it has become important that the algorithms we develop are robust to potentially worst-case noise. In this class, we will survey a number of recent developments in the study of robust machine learning, from both a theoretical and empirical perspective. Tentatively, we will cover the following related topics:

- **Learning in the presence of outliers.** Techniques for learning when our training dataset is corrupted by worst-case noise. This includes robust statistics, list learning, and watermarking and data-poisoning attacks.
- **Adversarial examples.** Famously, neural network image classifiers can be fooled at test time by perturbing a test image by an imperceptible amount. We will discuss how such attacks work, empirical defenses against them (e.g. adversarial training with PGD), and certified defenses which yield provable robustness.
- **Model misspecification.** Understanding when algorithms designed for a specific generative model still work when the true data comes from a different distribution. This includes topics such as distribution shift and semi-random adversaries.
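As a one-dimensional warm-up for the outlier-robustness topic (an illustrative sketch of my own, not material from the course): a single worst-case outlier can move the empirical mean arbitrarily far from the truth, while the median barely moves. Robust mean estimation asks for estimators with this kind of stability in high dimensions, where the median no longer suffices.

```python
# Illustrative toy example: sensitivity of the mean vs. the median
# to a single adversarially placed outlier.
from statistics import mean, median

clean = [1.0, 1.1, 0.9, 1.05, 0.95]   # samples near the true mean 1.0
corrupted = clean + [1000.0]          # one worst-case corruption

# The mean of the corrupted sample is dragged far from 1.0,
# while the median stays close to it.
print(mean(clean), median(clean))         # both near 1.0
print(mean(corrupted), median(corrupted))
```

Here the corrupted mean is about 167.5, while the corrupted median is 1.025; the lectures on robust mean estimation develop high-dimensional analogues of this phenomenon.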

Our goal, though we will often fall short of it, is to devise theoretically sound algorithms for these tasks that transfer well to practice.
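The adversarial-examples topic above already shows up in the simplest possible model. The following toy sketch (my own illustration, with made-up weights `w`, input `x`, and budget `eps`, none of which come from the course) perturbs each coordinate of an input against the sign of the corresponding weight of a linear classifier; this shifts the score by `eps * ||w||_1`, enough to flip the prediction whenever the margin is small.

```python
# Hypothetical toy example: an adversarial perturbation against a
# linear classifier that predicts sign(w . x).
w = [0.5, -1.0, 0.25, 0.75]   # fixed classifier weights (made up)
x = [1.0, -0.2, 0.4, 0.1]     # input with a positive score

score = sum(wi * xi for wi, xi in zip(w, x))  # 0.875 > 0

# Move each coordinate by eps against the sign of its weight;
# the score drops by exactly eps * ||w||_1 = 0.4 * 2.5 = 1.0.
eps = 0.4
sign = lambda v: 1.0 if v > 0 else -1.0
x_adv = [xi - eps * sign(wi) for wi, xi in zip(w, x)]

adv_score = sum(wi * xi for wi, xi in zip(w, x_adv))  # -0.125 < 0
```

Each coordinate moves by at most 0.4, yet the predicted label flips; for neural networks the attacks discussed in lecture (e.g. PGD) play the same role, using gradients in place of the weight signs.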

The intended audience for this class is CS graduate students in Theoretical Computer Science and/or Machine Learning, who are interested in doing research in this area. However, interested undergraduates and students from other departments are welcome to attend as well. The coursework will be light and consist of some short problem sets as well as a final project.

**For non-CSE students/undergraduates:** If you are interested in this class, please attend the first lecture. If the material suits your interests and background, please request an add code from me afterwards.

We will assume mathematical maturity and comfort with algorithms, probability, and linear algebra. Background in machine learning will be helpful but should not be necessary.

- Lecture 0: Syllabus / administrative stuff. notes
- Lecture 1 (9/26): Introduction to robustness. notes
- Lecture 2 (10/1): Total variation, statistical models, and lower bounds. notes
- Lecture 3 (10/3): Robust mean estimation in high dimensions. notes
- Lecture 4 (10/8): Spectral signatures and efficient certifiability. notes
- Lecture 5 (10/10): Efficient filtering from spectral signatures. notes
- Lecture 6 (10/15): Stronger spectral signatures for Gaussian datasets. notes
- Lecture 7 (10/17): Efficient filtering from spectral signatures for Gaussian data. notes
- Lecture 8 (10/22): Additional topics in robust statistics. notes
- Lecture 9 (10/24): Introduction to adversarial examples. notes
- Lecture 10 (10/29): Empirical defenses for adversarial examples. notes
- Lecture 11 (10/31): The four worlds hypothesis: models for adversarial examples. notes
- NO CLASS (11/05) to recover from the STOC deadline
- Lecture 12 (11/07): Certified defenses I: Exact certification. notes
- Lecture 13 (11/12): Certified defenses II: Convex relaxations. notes
- Lecture 14 (11/14): Certified defenses III: Randomized smoothing. notes

- Principled Approaches to Robust Machine Learning and Beyond. My Ph.D. thesis.
- Robust Learning: Information Theory and Algorithms. Jacob Steinhardt's Ph.D. thesis.
- Jacob is also teaching a similar class at Berkeley this semester: link