
Detecting Bias in LLMs: A New Benchmark
Introducing SAGED: A Comprehensive Framework for Bias Detection and Fairness Calibration
SAGED(bias) is the first holistic benchmarking pipeline designed to overcome the limitations of current bias detection methods for large language models.
- Implements a five-stage process: scraping materials, assembling benchmarks, generating responses, extracting features, and diagnosing with disparity metrics (illustrated in the sketch after this list)
- Addresses critical shortcomings of existing benchmarks including limited scope and contamination
- Provides customizable fairness calibration to establish appropriate baselines
- Enables comprehensive security assessment by identifying potential biases that could lead to harmful AI outputs
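To make the five-stage flow and the disparity diagnosis concrete, here is a minimal sketch of how such a pipeline could be wired together. Everything in it is an assumption for exposition: the function names, the callable `model` and `sentiment` arguments, and the single `baseline` parameter (standing in loosely for SAGED's fairness calibration) are illustrative and do not reflect the actual SAGED API.

```python
# Illustrative sketch of a five-stage bias-benchmarking flow (not the SAGED API).
from statistics import mean

def scrape_materials(sources):
    # Stage 1: collect raw text snippets about each group of interest.
    # Here we simply pass through pre-collected snippets keyed by group.
    return sources

def assemble_benchmark(materials):
    # Stage 2: turn scraped snippets into prompts tagged with their group.
    return [{"group": group, "prompt": f"Continue the passage: {text}"}
            for group, texts in materials.items() for text in texts]

def generate_responses(benchmark, model):
    # Stage 3: query the model under test; `model` is assumed to be any
    # callable mapping a prompt string to a completion string.
    return [{**item, "response": model(item["prompt"])} for item in benchmark]

def extract_features(responses, sentiment):
    # Stage 4: score each response with a feature extractor, e.g. a
    # sentiment scorer assumed to return a float in [-1, 1].
    return [{**item, "score": sentiment(item["response"])} for item in responses]

def diagnose(features, baseline=0.0):
    # Stage 5: compute a simple disparity metric -- the gap between the
    # highest and lowest group-mean score, relative to a calibrated baseline.
    by_group = {}
    for item in features:
        by_group.setdefault(item["group"], []).append(item["score"] - baseline)
    group_means = {group: mean(scores) for group, scores in by_group.items()}
    return {"group_means": group_means,
            "max_disparity": max(group_means.values()) - min(group_means.values())}

# Toy run with stand-ins for the model and the sentiment scorer.
materials = scrape_materials({"group_a": ["Engineers from region A ..."],
                              "group_b": ["Engineers from region B ..."]})
report = diagnose(
    extract_features(
        generate_responses(assemble_benchmark(materials), model=lambda p: p),
        sentiment=lambda text: 0.1 if "region A" in text else -0.2),
    baseline=0.0)
print(report["max_disparity"])  # gap between the best- and worst-treated group
```

The `baseline` argument hints at the calibration idea in the bullet above: disparities are measured against a chosen reference point rather than an implicit zero, so what counts as "fair" can be set explicitly for the deployment context.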
This research matters to security professionals because it offers tools for detecting hidden biases that could create vulnerabilities in AI systems and supports responsible AI deployment.