
Emoji-Based Attacks on Language Models
Invisible Vulnerabilities in Modern NLP Systems
This research reveals how zero-perturbation attacks using emoji sequences can manipulate NLP systems without altering the original text content.
- Demonstrates emoji sequences can be appended to legitimate text to cause misclassification
- Achieves high success rates across multiple models while remaining inconspicuous to human readers
- Bypasses traditional defense mechanisms designed for text-based attacks
- Highlights critical security gaps in widely-deployed language models
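The append-only idea behind these attacks can be sketched as a simple search: leave the original text untouched, append candidate emoji sequences, and keep the first one that flips the victim model's prediction. This is a minimal illustration, not the paper's actual method; `toy_classifier` and `emoti_attack` below are hypothetical stand-ins for the victim model and the attack search.

```python
from itertools import product

def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a victim model: naive cue counting."""
    # Counts negative cues, including emoji codepoints (e.g. U+1F620 angry face).
    negative_cues = ("bad", "terrible", "\U0001F620", "\U0001F4A9")
    score = sum(text.count(cue) for cue in negative_cues)
    return "negative" if score > 0 else "positive"

def emoti_attack(text: str, emoji_pool: list, max_len: int = 2):
    """Search emoji sequences (up to max_len) whose append flips the label.

    The original text is never modified -- the "zero-perturbation" property --
    so only appended emoji sequences carry the adversarial signal.
    """
    original_label = toy_classifier(text)
    for length in range(1, max_len + 1):
        for seq in product(emoji_pool, repeat=length):
            candidate = text + " " + "".join(seq)  # original text unchanged
            if toy_classifier(candidate) != original_label:
                return candidate  # first label-flipping append found
    return None  # no flipping sequence found in this pool

adversarial = emoti_attack("The service was fine.", ["\U0001F620", "\U0001F60A"])
```

In this toy setup the appended angry-face emoji alone flips the prediction from positive to negative, while the visible text a human reads is untouched; a real attack would query a deployed classifier instead of a keyword counter.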
This work exposes significant security concerns for real-world NLP applications, suggesting an urgent need for new defense strategies against these stealthy, Unicode-based attack vectors.
Emoti-Attack: Zero-Perturbation Adversarial Attacks on NLP Systems via Emoji Sequences