
EigenShield: Fortifying Vision-Language Models
A Novel Defense Against Adversarial Attacks Using Random Matrix Theory
EigenShield introduces a computationally efficient, architecture-independent defense mechanism that protects Vision-Language Models (VLMs) from adversarial attacks by filtering adversarial disruptions out of the model's internal representations.
- Leverages Random Matrix Theory (RMT) to identify and neutralize adversarial disruptions in model representations (see the sketch after this list)
- Works at inference time without requiring expensive retraining or model modifications
- Outperforms existing defenses, including under adaptive attacks
- Provides a theoretical foundation for understanding adversarial vulnerabilities in multimodal models
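To make the RMT idea concrete, here is a minimal, hypothetical sketch of eigenspectral filtering, not EigenShield's exact algorithm: eigenvalues of the representation covariance that fall inside the Marchenko-Pastur noise bulk are treated as uninformative, and representations are reconstructed from the spiked (signal) eigenvectors only. The function names, the noise-variance heuristic, and the synthetic data below are all illustrative assumptions.

```python
import numpy as np


def marchenko_pastur_edge(n_samples: int, n_dims: int, sigma2: float) -> float:
    """Upper edge of the Marchenko-Pastur noise bulk: sigma^2 * (1 + sqrt(p/n))^2."""
    gamma = n_dims / n_samples
    return sigma2 * (1.0 + np.sqrt(gamma)) ** 2


def rmt_filter(reps: np.ndarray) -> np.ndarray:
    """Keep only the 'spiked' eigen-directions of the representation covariance.

    reps: (n_samples, n_dims) matrix of representation vectors.
    Eigenvalues inside the Marchenko-Pastur bulk are treated as noise,
    where low-amplitude adversarial perturbations tend to concentrate,
    and the corresponding directions are projected out.
    """
    n, p = reps.shape
    mean = reps.mean(axis=0, keepdims=True)
    centered = reps - mean
    cov = centered.T @ centered / n          # (p, p) empirical covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

    # Crude noise-variance estimate: the median eigenvalue is robust to a
    # handful of large signal spikes (an illustrative heuristic, not the
    # paper's estimator).
    sigma2 = float(np.median(eigvals))
    edge = marchenko_pastur_edge(n, p, sigma2)

    keep = eigvals > edge                    # spiked / informative directions
    basis = eigvecs[:, keep]                 # (p, k) signal eigenvectors
    # Reconstruct each representation from the signal subspace only.
    return centered @ basis @ basis.T + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "representations": rank-5 signal plus isotropic noise.
    clean = rng.normal(size=(512, 5)) @ rng.normal(size=(5, 256))
    noisy = clean + 0.5 * rng.normal(size=(512, 256))
    print(rmt_filter(noisy).shape)           # (512, 256)
```

Because a filter of this kind only manipulates representations at inference time, it requires no retraining and is agnostic to the backbone architecture, consistent with the properties listed above.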
By addressing critical vulnerabilities in increasingly deployed multimodal systems, this research advances AI security and offers a practical defense that can be implemented across diverse model architectures.