EigenShield: Fortifying Vision-Language Models

A Novel Defense Against Adversarial Attacks Using Random Matrix Theory

EigenShield introduces a computationally efficient, architecture-independent defense mechanism that protects Vision-Language Models (VLMs) from adversarial attacks by filtering adversarial perturbations out of the model's internal representations.

  • Leverages Random Matrix Theory to identify and neutralize adversarial disruptions in model representations (a minimal sketch of this idea follows the list)
  • Operates at inference time, requiring no expensive retraining or model modifications
  • Outperforms existing defenses against adaptive attacks
  • Provides a theoretical foundation for understanding adversarial vulnerabilities in multimodal models
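
To make the filtering idea concrete, here is a minimal Python sketch of one standard Random Matrix Theory recipe: threshold the eigenvalues of the representation covariance at the Marchenko-Pastur upper edge, then project embeddings onto the surviving eigen-directions. The function names (`mp_upper_edge`, `eigen_filter`), the noise-variance parameter `sigma2`, and the choice of the Marchenko-Pastur edge as the cutoff are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
import numpy as np


def mp_upper_edge(n: int, d: int, sigma2: float = 1.0) -> float:
    """Upper edge of the Marchenko-Pastur bulk: eigenvalues of a pure-noise
    (variance sigma2) sample covariance concentrate below this value."""
    return sigma2 * (1.0 + np.sqrt(d / n)) ** 2


def eigen_filter(tokens: np.ndarray, sigma2: float = 1.0) -> np.ndarray:
    """Project (n, d) token embeddings onto the eigen-directions whose
    covariance eigenvalues rise above the Marchenko-Pastur noise bulk."""
    n, d = tokens.shape
    mean = tokens.mean(axis=0, keepdims=True)
    X = tokens - mean
    cov = X.T @ X / n                        # (d, d) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    signal = eigvecs[:, eigvals > mp_upper_edge(n, d, sigma2)]
    # Rank-restricted reconstruction: directions inside the noise bulk are zeroed.
    return X @ signal @ signal.T + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Low-rank "clean" structure plus an isotropic adversarial-style perturbation.
    clean = rng.normal(size=(256, 8)) @ rng.normal(size=(8, 64))
    noisy = clean + 0.3 * rng.normal(size=(256, 64))
    filtered = eigen_filter(noisy, sigma2=0.3 ** 2)
    print("error before filtering:", np.linalg.norm(noisy - clean))
    print("error after filtering: ", np.linalg.norm(filtered - clean))
```

On this synthetic example, the reconstruction error shrinks after filtering because the projection keeps only the few high-variance "signal" directions and discards the noise bulk, which is the intuition behind the inference-time defense.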

This research significantly advances AI security by addressing critical vulnerabilities in increasingly deployed multimodal systems, offering a practical defense that can be implemented across various model architectures.

Original Paper: EigenShield: Causal Subspace Filtering via Random Matrix Theory for Adversarially Robust Vision-Language Models
