
Cognitive Biases in Large Language Models
Comparative analysis of bias patterns across GPT-4o, Gemma 2, and Llama 3.1
This research systematically evaluates cognitive biases in leading LLMs through 1,500 experiments across nine established bias categories.
- GPT-4o demonstrated superior performance in recognizing and mitigating cognitive biases
- Gemma 2 showed inconsistent results, with a notable strength in handling the sunk cost fallacy but varied performance elsewhere
- Llama 3.1 consistently underperformed, relying heavily on flawed heuristics
- Results highlight critical safety implications for responsible AGI development
Understanding these biases is essential for developing safer AI systems that can make reliable decisions without perpetuating human cognitive errors.
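To make the evaluation setup concrete, the sketch below shows one way such a comparative bias evaluation could be structured: a set of bias categories, a set of models, and a per-category scoring loop. This is a minimal illustration only; the category names, model identifiers, `query_model`, and `resists_bias` are stand-ins and do not reflect the authors' actual prompts or scoring protocol.

```python
# Illustrative sketch of a comparative bias-evaluation harness (not the paper's code).
from collections import defaultdict

# Stand-ins for the paper's nine bias categories and three models under test.
BIAS_CATEGORIES = [
    "anchoring", "framing", "sunk cost", "confirmation",
    "availability", "overconfidence", "loss aversion",
    "base-rate neglect", "gambler's fallacy",
]
MODELS = ["GPT-4o", "Gemma 2", "Llama 3.1"]


def query_model(model: str, prompt: str) -> str:
    """Stub: replace with a real API call to the model under test."""
    return "DEBIASED"  # toy response so the harness runs end to end


def resists_bias(response: str) -> bool:
    """Stub scoring rule: did the response avoid the biased answer?"""
    return "DEBIASED" in response


def run_evaluation(prompts_per_category: int = 10) -> dict:
    """Score each model on each bias category; return bias-resistance rates."""
    rates = defaultdict(dict)
    for model in MODELS:
        for category in BIAS_CATEGORIES:
            hits = 0
            for i in range(prompts_per_category):
                prompt = f"[{category} scenario #{i}]"  # placeholder prompt text
                hits += resists_bias(query_model(model, prompt))
            rates[model][category] = hits / prompts_per_category
    return dict(rates)


if __name__ == "__main__":
    for model, scores in run_evaluation().items():
        print(model, scores)
```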
Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI