Backdoor Vulnerabilities in AI Vision Systems

Detecting poisoned samples in CLIP models with 98% accuracy

This research reveals critical security vulnerabilities in CLIP (Contrastive Language-Image Pretraining) models and introduces a detection method that identifies backdoor-poisoned samples in the training data.

  • CLIP models can be backdoored by poisoning as little as 0.01% of their training data (a hypothetical illustration follows this list)
  • The researchers identified distinctive patterns in the learned representations of poisoned samples
  • Building on those patterns, their detection framework identifies backdoor samples with 98% accuracy (see the second sketch below)
  • The work highlights serious security concerns for large-scale AI models trained on unscreened web data
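
How such contamination works is not spelled out in this summary. The sketch below is a hypothetical illustration of the patch-based ("BadNets-style") poisoning commonly studied in this literature, not necessarily the attack evaluated in the paper; the trigger size, patch color, target caption, and the poison_pair helper are all invented for illustration.

```python
# Hypothetical sketch of patch-based poisoning of an image-caption pair.
# All specifics (trigger size, color, target caption) are invented here.
from PIL import Image

TRIGGER_SIZE = 16                        # small patch, easy to hide in web images
TARGET_CAPTION = "a photo of a banana"   # attacker-chosen target concept

def poison_pair(image: Image.Image, caption: str) -> tuple[Image.Image, str]:
    """Stamp a trigger patch onto the image and swap in the attacker's
    target caption. If such pairs slip into even a tiny fraction of a
    web-scraped training set, the model can learn to associate the
    patch with the target text."""
    poisoned = image.copy()
    patch = Image.new("RGB", (TRIGGER_SIZE, TRIGGER_SIZE), (255, 255, 255))
    poisoned.paste(patch, (poisoned.width - TRIGGER_SIZE,
                           poisoned.height - TRIGGER_SIZE))
    return poisoned, TARGET_CAPTION
```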
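
The summary also leaves the detection framework unspecified. As a minimal sketch of the general idea, assuming detection operates on the geometry of the learned embeddings: score each sample by its mean distance to its k nearest neighbors in embedding space, then flag extreme scores. The local_density_scores function, the choice of k, and the synthetic data are assumptions for illustration, not the paper's published detector.

```python
# Hypothetical representation-space detector: poisoned samples that share a
# trigger tend to cluster unusually tightly, so their nearest-neighbor
# distances fall in the extreme low tail of the score distribution.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_density_scores(embeddings: np.ndarray, k: int = 8) -> np.ndarray:
    """Mean distance from each sample to its k nearest neighbors."""
    # L2-normalize so Euclidean distance tracks cosine similarity,
    # matching the geometry CLIP embeddings are trained with.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(emb)  # +1: query includes self
    dists, _ = nn.kneighbors(emb)
    return dists[:, 1:].mean(axis=1)  # drop the zero self-distance column

# Demo on synthetic stand-ins for CLIP image features: a tight, shifted
# cluster plays the role of trigger-stamped (poisoned) samples.
rng = np.random.default_rng(0)
clean = rng.normal(size=(1000, 512))
poisoned = 0.05 * rng.normal(size=(10, 512)) + 3.0
scores = local_density_scores(np.vstack([clean, poisoned]))
suspects = np.argsort(scores)[:10]  # lowest scores = densest neighborhoods
print("flagged indices:", sorted(suspects))  # indices >= 1000 are the plants
```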

Why it matters: As organizations deploy vision-language models trained on public web data, adversaries could exploit these backdoor vulnerabilities to manipulate model behavior in targeted ways.

Detecting Backdoor Samples in Contrastive Language Image Pretraining
