AI ALIGNMENT
Researchers propose a roadmap for pluralistic alignment in AI systems
New research from the University of Washington, Stanford, MIT, and AllenAI lays out a framework for 'Pluralistic Alignment.' The motivating idea is that as a broader set of people rely on AI, systems need to be capable of representing a diverse set of human values and perspectives rather than a single moral lens. The researchers define three types of alignment and three distinct evaluation approaches to measure how well AI systems cater to diverse needs.
- Overton pluralistic: AI provides comprehensive responses acknowledging multiple viewpoints.
- Steerably pluralistic: AI can be faithfully steered to represent particular attributes or normative frames.
- Distributionally pluralistic: AI's outputs mirror the distribution of values held by a specific target population or group.
- Evaluation methods include multi-objective measures, trade-off steerable frontiers, and jury-pluralistic benchmarks.
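To make the jury-pluralistic idea concrete, here is a minimal sketch, not taken from the paper: the function name, the 1–5 rating scale, and the uniform-weight jury are illustrative assumptions. It scores a single model response by averaging the welfare ratings of a jury of annotators with differing values, optionally re-weighting jurors.

```python
# Hypothetical jury-pluralistic scoring: each juror assigns a welfare
# rating (e.g., 1-5) to one model response. Optional weights let a
# benchmark designer emphasize under-represented perspectives.
def jury_pluralistic_score(ratings, weights=None):
    """Weighted average welfare across jurors for a single response."""
    if weights is None:
        weights = [1.0] * len(ratings)  # assumption: uniform jury
    total_weight = sum(weights)
    return sum(r * w for r, w in zip(ratings, weights)) / total_weight

# Example: three jurors with differing values rate the same response.
ratings = [4, 2, 5]
print(jury_pluralistic_score(ratings))             # uniform jury -> ~3.67
print(jury_pluralistic_score(ratings, [1, 2, 1]))  # up-weight juror 2 -> 3.25
```

A multi-objective benchmark would instead report each juror's (or each objective's) score separately rather than collapsing them into one number.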
Why this matters:
AI systems are political artifacts, so we need to measure their politics. Frameworks like this help us examine the political tendencies of AI systems—an increasingly difficult and important task, especially as AI systems are deployed more widely.