
Breaking Black-Box AI Models
A simple attack approach achieving over 90% success rate against GPT-4.5/4o/o1
This research introduces a surprisingly simple yet highly effective method for crafting adversarial examples against commercial large vision-language models (LVLMs).
- Achieves success rates above 90% against leading black-box models such as GPT-4.5, GPT-4o, and o1
- Reveals a critical vulnerability: injecting semantic information into the perturbation dramatically improves attack success (see the sketch after this list)
- Demonstrates that even the most advanced commercial LVLMs have exploitable security weaknesses
- Provides a baseline attack that outperforms more complex approaches despite a simpler implementation
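
To make the "semantic information in perturbations" idea concrete, here is a minimal, hypothetical sketch of a transfer-style targeted attack: a perturbation is optimized on a white-box surrogate image encoder (CLIP ViT-B/32 via Hugging Face `transformers` is an illustrative choice, not necessarily the paper's setup) so that the image's embedding moves toward a target caption's embedding, in the hope that the perturbation transfers to black-box LVLMs. The function name, hyperparameters, and surrogate model are all assumptions for illustration, not the authors' exact method.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Illustrative surrogate encoder; the paper's actual surrogate(s) may differ.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# CLIP's standard normalization constants.
MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
STD = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)


def semantic_targeted_attack(image, target_text, eps=8 / 255, alpha=1 / 255, steps=100):
    """PGD-style targeted attack (hypothetical sketch).

    `image` is a (1, 3, 224, 224) tensor in [0, 1]. The perturbation is pushed
    toward the embedding of `target_text`, i.e. semantic content guides the noise.
    """
    with torch.no_grad():
        text_inputs = processor(text=[target_text], return_tensors="pt", padding=True)
        target_emb = model.get_text_features(**text_inputs)
        target_emb = target_emb / target_emb.norm(dim=-1, keepdim=True)

    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        adv = (image + delta).clamp(0, 1)
        img_emb = model.get_image_features(pixel_values=(adv - MEAN) / STD)
        img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
        # Maximize cosine similarity between the image and the target caption.
        loss = -(img_emb * target_emb).sum()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # gradient step toward the target semantics
            delta.clamp_(-eps, eps)              # keep the perturbation imperceptible
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```

The resulting adversarial image would then be submitted to the black-box model (e.g., via the GPT-4o API) to test whether the targeted semantics transfer; nothing in this sketch queries the victim model during optimization, which is what makes the setting black-box.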
These findings highlight urgent security concerns for deploying AI in sensitive applications, suggesting that even the most sophisticated commercial LVLMs remain vulnerable to adversarial manipulation.