Abstract
We consider the problem of improving the adversarial robustness of neural networks while retaining natural accuracy. Motivated by the multi-view information bottleneck formalism, we seek to learn a representation that captures the shared information between clean samples and their corresponding adversarial samples while discarding these samples’ view-specific information. We show that this approach leads to a novel multi-objective loss function, and we provide mathematical motivation for its components towards improving the robust vs. natural accuracy tradeoff. We demonstrate enhanced tradeoff compared to current state-of-the-art methods with extensive evaluation on various benchmark image datasets and architectures. Ablation studies indicate that learning shared representations is key to improving performance.
Original language | English (US) |
---|---|
Article number | 109054 |
Journal | Pattern Recognition |
Volume | 134 |
DOIs | |
State | Published - Feb 2023 |
Keywords
- Adversarial robustness
- Information bottleneck
- Multi-view learning
- Shared information
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence