
Breaking Black-Box AI Models
A simple attack approach achieving over 90% success rate against GPT-4.5/4o/o1
This research introduces a surprisingly simple yet highly effective method for crafting adversarial examples against commercial large vision-language models (LVLMs).
- Achieves success rates above 90% against leading black-box models such as GPT-4.5, GPT-4o, and o1
- Reveals a critical vulnerability: injecting semantic information into the perturbation dramatically improves attack success (see the sketch after this list)
- Demonstrates that even the most advanced commercial LVLMs have exploitable security weaknesses
- Provides a baseline attack that outperforms more complex approaches despite a simpler implementation
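
To make the "semantic information in perturbations" idea concrete, here is a minimal, hypothetical sketch of a transfer-style targeted attack: a perturbation is optimized on a white-box surrogate image encoder (CLIP ViT-B/32 via Hugging Face `transformers` is an illustrative choice, not necessarily the paper's setup) so that the image's embedding moves toward a target caption's embedding, in the hope that the perturbation transfers to black-box LVLMs. The function name, hyperparameters, and surrogate model are all assumptions for illustration, not the authors' exact method.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Illustrative surrogate encoder; the paper's actual surrogate(s) may differ.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# CLIP's standard normalization constants.
MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
STD = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)


def semantic_targeted_attack(image, target_text, eps=8 / 255, alpha=1 / 255, steps=100):
    """PGD-style targeted attack (hypothetical sketch).

    `image` is a (1, 3, 224, 224) tensor in [0, 1]. The perturbation is pushed
    toward the embedding of `target_text`, i.e. semantic content guides the noise.
    """
    with torch.no_grad():
        text_inputs = processor(text=[target_text], return_tensors="pt", padding=True)
        target_emb = model.get_text_features(**text_inputs)
        target_emb = target_emb / target_emb.norm(dim=-1, keepdim=True)

    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        adv = (image + delta).clamp(0, 1)
        img_emb = model.get_image_features(pixel_values=(adv - MEAN) / STD)
        img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
        # Maximize cosine similarity between the image and the target caption.
        loss = -(img_emb * target_emb).sum()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # gradient step toward the target semantics
            delta.clamp_(-eps, eps)              # keep the perturbation imperceptible
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```

The resulting adversarial image would then be submitted to the black-box model (e.g., via the GPT-4o API) to test whether the targeted semantics transfer; nothing in this sketch queries the victim model during optimization, which is what makes the setting black-box.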
These findings highlight urgent security concerns for deploying AI in sensitive applications, suggesting that even the most sophisticated commercial LVLMs remain vulnerable to adversarial manipulation.