
Security Vulnerabilities in Structured State Space Models
Clean-Label Poisoning Can Undermine Generalization
Research reveals how structured state space models (SSMs), increasingly used as transformer alternatives in LLMs, can be compromised through clean-label poisoning attacks.
- SSMs have an implicit bias that typically helps with generalization
- Adversaries can exploit this bias using strategically crafted poisoned examples
- Even though the poisoned examples carry correct labels, they can steer training away from the generalizing solution and toward memorizing spurious patterns
- The vulnerability persists even when poisoned examples make up only a small fraction of the training data (a toy illustration of the setup follows this list)
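To make the setting concrete, here is a minimal NumPy sketch of a clean-label poisoning experiment on a toy diagonal SSM, whose input-output map is a linear filter with impulse response k_t = sum_i c_i * a_i^t. The teacher, the student, the data sizes, and especially the hand-picked poison inputs are illustrative assumptions rather than the paper's construction; the sketch only shows the shape of the experiment, in which poisoned sequences receive correct teacher labels yet can change which interpolating solution gradient descent settles on.

```python
"""Toy clean-label poisoning harness for a diagonal linear SSM (NumPy only).

Illustrative sketch: the teacher, the student, the data sizes, and above all
the naive "poison" inputs are placeholders, not the paper's construction.
It only shows the shape of the experiment: every poisoned example is labeled
by the teacher (clean labels), yet its presence can change which solution
gradient descent converges to in the under-determined regime.
"""
import numpy as np

rng = np.random.default_rng(0)
T, d = 32, 16                 # sequence length, student state dimension

# Teacher: fixed exponentially decaying impulse response.
k_teacher = 0.8 ** np.arange(T)

def label(X):
    """Clean labels: the teacher's output for each input sequence."""
    return X @ k_teacher

def train_student(X, y, steps=8000, lr=5e-4):
    """Gradient descent on a diagonal SSM: impulse response k_t = sum_i c_i * a_i^t."""
    a = rng.uniform(0.1, 0.9, d)          # diagonal state-transition entries
    c = rng.normal(0.0, 0.1, d)           # readout weights
    t = np.arange(T)
    for _ in range(steps):
        P = a[:, None] ** t[None, :]      # P[i, t] = a_i^t
        k = c @ P                         # current impulse response
        r = X @ k - y                     # residuals on the training set
        grad_k = 2.0 * X.T @ r / len(y)   # dL/dk
        grad_c = P @ grad_k               # chain rule through k = c @ P
        dPda = t[None, :] * a[:, None] ** np.maximum(t[None, :] - 1, 0)
        grad_a = c * (dPda @ grad_k)
        c -= lr * grad_c
        a = np.clip(a - lr * grad_a, -0.95, 0.95)   # keep the SSM stable
    return c @ (a[:, None] ** t[None, :])

def test_mse(k, X):
    return np.mean((X @ k - label(X)) ** 2)

# Few clean training sequences (under-determined: 8 examples, 32 unknowns),
# so the implicit bias of training decides which interpolant is learned.
X_clean = rng.normal(size=(8, T))
X_test = rng.normal(size=(512, T))

# Naive placeholder "poison": spike sequences, still labeled by the teacher.
X_poison = np.zeros((4, T))
X_poison[np.arange(4), -(np.arange(4) + 1)] = 5.0

k_base = train_student(X_clean, label(X_clean))
X_mix = np.vstack([X_clean, X_poison])
k_pois = train_student(X_mix, label(X_mix))
print(f"test MSE, clean-only training : {test_mse(k_base, X_test):.4f}")
print(f"test MSE, clean + poison      : {test_mse(k_pois, X_test):.4f}")
```

Crafting poison inputs that reliably skew the implicit bias is the paper's actual contribution; this harness only provides a place to plug such constructions in and compare the two printed test errors.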
This research is crucial for AI security as it exposes fundamental weaknesses in models positioned as efficient replacements for transformers. Organizations deploying SSM-based systems should implement robust defenses against these subtle but damaging attacks.
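What such a defense looks like depends on the deployment, but as one crude, generic example (not a method from the paper), the sketch below screens training sequences for statistical outliers using a median-absolute-deviation rule on input norms. Clean-label poisons carry correct labels, so label auditing alone cannot catch them; input-level screening is a cheap first check, though carefully crafted poisons need not look anomalous, so this is a starting point rather than a guarantee.

```python
"""Crude input-level sanitization heuristic (illustrative, not from the paper):
flag training sequences whose norm is a robust outlier under a median
absolute deviation (MAD) rule. Clean-label poisons are correctly labeled,
so this check looks only at the inputs."""
import numpy as np

def flag_input_outliers(X, n_mads=6.0):
    """Return a boolean mask marking sequences with outlying norms.

    X: (n, T) array of training sequences.
    """
    norms = np.linalg.norm(X, axis=1)
    med = np.median(norms)
    mad = np.median(np.abs(norms - med)) + 1e-12   # avoid division by zero
    return np.abs(norms - med) > n_mads * mad

# Tiny demo: 64 ordinary Gaussian sequences plus 4 large spike sequences.
rng = np.random.default_rng(1)
T = 32
X_clean = rng.normal(size=(64, T))
X_spike = np.zeros((4, T))
X_spike[:, -1] = 25.0
X_all = np.vstack([X_clean, X_spike])
print(np.nonzero(flag_input_outliers(X_all))[0])   # spike rows (64-67) should be flagged
```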
Paper: The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels