
The Hidden Pattern of AI Bias
Revealing surprising similarities in bias across different LLM families
This study examines bias patterns across 13 large language models (LLMs) and finds that similarities in bias persist across model families and survive fine-tuning.
- Models from the same family show highly similar bias patterns (correlations of up to 0.99)
- Even models from different families show significant bias correlations
- Fine-tuning has minimal impact on the underlying bias distribution
- Models with similar architectures exhibit similar bias patterns regardless of training data
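
To make the correlation figures above concrete, here is a minimal sketch of how bias patterns might be compared between models. It assumes each model's bias pattern is summarized as a vector of per-category scores and that similarity is measured with Pearson correlation; the study's actual metric and data are not given here, so the function names, model names, and numbers below are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def pairwise_bias_correlation(profiles):
    """Correlate every unordered pair of models.

    `profiles` maps a model name to its bias-score vector (an assumed
    representation, one score per bias category).
    """
    names = sorted(profiles)
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up scores for illustration only; these are NOT results from the study.
profiles = {
    "family_a_base": [0.10, 0.40, 0.30, 0.20],
    "family_a_tuned": [0.12, 0.38, 0.31, 0.19],
    "family_b_base": [0.30, 0.20, 0.10, 0.40],
}

for (a, b), r in pairwise_bias_correlation(profiles).items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

With real data, each vector would come from running the same bias benchmark on every model, so that entries line up category by category. A rank-based measure such as Spearman correlation could be swapped in if only the ordering of categories matters.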
From a security perspective, these findings suggest that bias transfer between models creates persistent vulnerabilities that current debiasing techniques fail to address effectively.