Building Value Systems in AI

This research introduces a generative psycho-lexical framework for constructing and understanding value systems in Large Language Models.

Applies well-established psychological theories, such as Schwartz's Theory of Basic Human Values, to analyze AI systems
Creates a structured approach to identify, measure, and understand LLM value hierarchies
Supports LLM alignment through a clearer understanding of the values a model holds
Improves safety prediction by clarifying how LLMs prioritize different values

Security Implications: Making LLM values explicit and measurable addresses core security concerns around AI alignment and helps organizations develop more predictable, transparent, and safer AI systems.

Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models