Article Source
Adversarial Examples and Human-ML Alignment
- Speaker: Shibani Santurkar, MIT
Abstract
Machine learning models today achieve impressive performance on challenging benchmark tasks. Yet these models remain remarkably brittle: small perturbations of natural inputs, known as adversarial examples, can severely degrade their behavior. Why are our models so sensitive to such perturbations?
In this tutorial, we take a closer look at this question and demonstrate that the observed brittleness can be largely attributed to the fact that our models tend to solve classification tasks quite differently from humans. Specifically, viewing neural networks as feature extractors, we study how the features they extract can diverge from those used by humans, and how adversarially robust models can help bridge this gap.
Additional tutorial info:
The tutorial will include demos; we will use Colab notebooks, so please bring your laptops along. In these demos, we will explore the brittleness of standard ML models by crafting adversarial perturbations, and then use those perturbations as a lens to inspect the features the models rely on (a minimal example of crafting such a perturbation is sketched after the link below).
Github link for demos: https://github.com/MadryLab/AdvEx_Tut…
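As a rough illustration of what crafting an adversarial perturbation looks like, here is a minimal sketch of the fast gradient sign method (FGSM) from the first paper in the suggested reading below. It is not taken from the tutorial notebooks; `model`, `x`, and `y` are placeholder names for a PyTorch classifier, an image batch with pixel values in [0, 1], and its labels, and `eps` is an assumed perturbation budget.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """Return x plus an L-infinity perturbation of size eps that
    increases the classification loss (an adversarial example)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction of the sign of the loss gradient, then
    # clamp back to the valid pixel range.
    x_adv = x_adv.detach() + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0)
```

Even a single gradient-sign step of this kind is often enough to flip the predictions of a standard (non-robust) classifier while leaving the image visually unchanged.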
Suggested reading (in order of importance):
- Adversarial examples https://arxiv.org/abs/1412.6572
- Training robust models https://arxiv.org/abs/1706.06083 (a short training sketch follows this list)
- ML models rely on imperceptible features https://arxiv.org/abs/1905.02175
- Robustness as a feature prior https://arxiv.org/abs/1805.12152, https://arxiv.org/abs/1906.00945
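For the "Training robust models" reference, the sketch below illustrates PGD-based adversarial training in the spirit of that paper. It is an assumption-laden illustration rather than the tutorial's actual code: `model`, `x`, `y`, and `opt` are placeholders, pixel values are assumed to lie in [0, 1], and the attack hyperparameters are typical choices, not prescribed ones.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, step=2 / 255, iters=7):
    """Multi-step L-infinity attack: repeated signed-gradient ascent,
    projecting back into the eps-ball around the clean input each step."""
    # Start from a random point inside the eps-ball.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x_adv + step * x_adv.grad.sign()   # ascent step
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project into eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)              # keep valid pixel range
    return x_adv.detach()

def train_step(model, x, y, opt):
    """One adversarial-training step: fit the model on worst-case inputs."""
    x_adv = pgd_attack(model, x, y)
    opt.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    opt.step()
```

Training on these approximately worst-case inputs is what the last two readings examine as a prior on the features the model learns.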