
Security Vulnerabilities in Structured State Space Models
Clean-Label Poisoning Can Undermine Generalization
Research reveals how structured state space models (SSMs), increasingly used as transformer alternatives in LLMs, can be compromised through clean-label poisoning attacks.
- SSMs have an implicit bias that typically helps with generalization
- Adversaries can exploit this bias using strategically crafted poisoned examples
- Even though the poisoned examples carry correct labels, they can steer training away from the generalizing solution and toward memorizing spurious patterns
- The vulnerability persists even when poisoned examples make up only a small fraction of the training data (a toy illustration of the setup follows this list)
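To make the setting concrete, here is a minimal NumPy sketch of a clean-label poisoning experiment on a toy diagonal SSM, whose input-output map is a linear filter with impulse response k_t = sum_i c_i * a_i^t. The teacher, the student, the data sizes, and especially the hand-picked poison inputs are illustrative assumptions rather than the paper's construction; the sketch only shows the shape of the experiment, in which poisoned sequences receive correct teacher labels yet can change which interpolating solution gradient descent settles on.

```python
"""Toy clean-label poisoning harness for a diagonal linear SSM (NumPy only).

Illustrative sketch: the teacher, the student, the data sizes, and above all
the naive "poison" inputs are placeholders, not the paper's construction.
It only shows the shape of the experiment: every poisoned example is labeled
by the teacher (clean labels), yet its presence can change which solution
gradient descent converges to in the under-determined regime.
"""
import numpy as np

rng = np.random.default_rng(0)
T, d = 32, 16                 # sequence length, student state dimension

# Teacher: fixed exponentially decaying impulse response.
k_teacher = 0.8 ** np.arange(T)

def label(X):
    """Clean labels: the teacher's output for each input sequence."""
    return X @ k_teacher

def train_student(X, y, steps=8000, lr=5e-4):
    """Gradient descent on a diagonal SSM: impulse response k_t = sum_i c_i * a_i^t."""
    a = rng.uniform(0.1, 0.9, d)          # diagonal state-transition entries
    c = rng.normal(0.0, 0.1, d)           # readout weights
    t = np.arange(T)
    for _ in range(steps):
        P = a[:, None] ** t[None, :]      # P[i, t] = a_i^t
        k = c @ P                         # current impulse response
        r = X @ k - y                     # residuals on the training set
        grad_k = 2.0 * X.T @ r / len(y)   # dL/dk
        grad_c = P @ grad_k               # chain rule through k = c @ P
        dPda = t[None, :] * a[:, None] ** np.maximum(t[None, :] - 1, 0)
        grad_a = c * (dPda @ grad_k)
        c -= lr * grad_c
        a = np.clip(a - lr * grad_a, -0.95, 0.95)   # keep the SSM stable
    return c @ (a[:, None] ** t[None, :])

def test_mse(k, X):
    return np.mean((X @ k - label(X)) ** 2)

# Few clean training sequences (under-determined: 8 examples, 32 unknowns),
# so the implicit bias of training decides which interpolant is learned.
X_clean = rng.normal(size=(8, T))
X_test = rng.normal(size=(512, T))

# Naive placeholder "poison": spike sequences, still labeled by the teacher.
X_poison = np.zeros((4, T))
X_poison[np.arange(4), -(np.arange(4) + 1)] = 5.0

k_base = train_student(X_clean, label(X_clean))
X_mix = np.vstack([X_clean, X_poison])
k_pois = train_student(X_mix, label(X_mix))
print(f"test MSE, clean-only training : {test_mse(k_base, X_test):.4f}")
print(f"test MSE, clean + poison      : {test_mse(k_pois, X_test):.4f}")
```

Crafting poison inputs that reliably skew the implicit bias is the paper's actual contribution; this harness only provides a place to plug such constructions in and compare the two printed test errors.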
This research is crucial for AI security as it exposes fundamental weaknesses in models positioned as efficient replacements for transformers. Organizations deploying SSM-based systems should implement robust defenses against these subtle but damaging attacks.
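What such a defense looks like depends on the deployment, but as one crude, generic example (not a method from the paper), the sketch below screens training sequences for statistical outliers using a median-absolute-deviation rule on input norms. Clean-label poisons carry correct labels, so label auditing alone cannot catch them; input-level screening is a cheap first check, though carefully crafted poisons need not look anomalous, so this is a starting point rather than a guarantee.

```python
"""Crude input-level sanitization heuristic (illustrative, not from the paper):
flag training sequences whose norm is a robust outlier under a median
absolute deviation (MAD) rule. Clean-label poisons are correctly labeled,
so this check looks only at the inputs."""
import numpy as np

def flag_input_outliers(X, n_mads=6.0):
    """Return a boolean mask marking sequences with outlying norms.

    X: (n, T) array of training sequences.
    """
    norms = np.linalg.norm(X, axis=1)
    med = np.median(norms)
    mad = np.median(np.abs(norms - med)) + 1e-12   # avoid division by zero
    return np.abs(norms - med) > n_mads * mad

# Tiny demo: 64 ordinary Gaussian sequences plus 4 large spike sequences.
rng = np.random.default_rng(1)
T = 32
X_clean = rng.normal(size=(64, T))
X_spike = np.zeros((4, T))
X_spike[:, -1] = 25.0
X_all = np.vstack([X_clean, X_spike])
print(np.nonzero(flag_input_outliers(X_all))[0])   # spike rows (64-67) should be flagged
```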
Paper: The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels