A Taxonomy for More Comprehensive AI Hazard Identification
Identifying the diverse and complex ways advanced AI systems might cause harm requires moving beyond selective or ad-hoc lists of potential risks. The Aspect-Oriented Taxonomy of AI Hazards provides a structure for systematically exploring the hazard space and identifying consequential threats, so that hazard discovery is more comprehensive and tailored to the AI system under study.
A First-Principles Approach
This taxonomy decomposes AI risk based on a systems-thinking perspective, analyzing the AI entity, its environment, and their interaction. It organizes potential hazards across four high-level aspect categories (TL0):
- Capabilities: The inherent abilities of the AI system (e.g., reasoning, learning, agency).
- Domain Knowledge: Specific expertise possessed by the AI that could enable harm (e.g., cybersecurity vulnerabilities, biology).
- Affordances: How the AI interacts with its environment (e.g., API access, deployment context, system interfaces).
- Impact Domains: The sociotechnical areas where harms ultimately manifest (e.g., individuals, society, the biosphere).
Hierarchical Structure
These categories are broken down hierarchically through five Taxonomy Levels (TL0 to TL4), moving from broad categories to specific aspect-adjacent hazards:
- Aspect Categories (TL0): The four main categories (Capabilities, Domain Knowledge, Affordances, Impact Domains).
- Aspect Groups (TL1): Major subdivisions within the Aspect Categories.
- Aspects (TL2): Specific system characteristics or elements within each group.
- Hazard Clusters (TL3): Related groups of hazards, potentially spanning multiple Aspects.
- AI Hazards (TL4): Individual potential harms or vulnerabilities identified through analysis.
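To make the five levels concrete, the hierarchy can be sketched as a small tree of typed nodes. This is a minimal illustration only, not part of the taxonomy or the Workbook Tool: the `TaxonomyLevel` and `TaxonomyNode` names are assumptions, the group, aspect, cluster, and hazard names are hypothetical placeholders, and the sketch treats the hierarchy as a strict tree even though Hazard Clusters may span multiple Aspects.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Iterator, List, Tuple


class TaxonomyLevel(IntEnum):
    """The five Taxonomy Levels, TL0 (broadest) to TL4 (most specific)."""
    ASPECT_CATEGORY = 0  # TL0
    ASPECT_GROUP = 1     # TL1
    ASPECT = 2           # TL2
    HAZARD_CLUSTER = 3   # TL3
    AI_HAZARD = 4        # TL4


@dataclass
class TaxonomyNode:
    """Hypothetical node type: a name, its level, and its children."""
    name: str
    level: TaxonomyLevel
    children: List["TaxonomyNode"] = field(default_factory=list)

    def add(self, name: str) -> "TaxonomyNode":
        """Create a child exactly one level below this node and return it."""
        if self.level == TaxonomyLevel.AI_HAZARD:
            raise ValueError("TL4 hazards are leaves and cannot have children")
        child = TaxonomyNode(name, TaxonomyLevel(self.level + 1))
        self.children.append(child)
        return child

    def walk(self, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], "TaxonomyNode"]]:
        """Yield (path, node) for this node and every descendant, depth first."""
        path = path + (self.name,)
        yield path, self
        for child in self.children:
            yield from child.walk(path)


# Only "Capabilities" comes from the taxonomy; every other name is a placeholder.
root = TaxonomyNode("Capabilities", TaxonomyLevel.ASPECT_CATEGORY)
group = root.add("Agency (hypothetical group)")
aspect = group.add("Long-horizon planning (hypothetical aspect)")
cluster = aspect.add("Hypothetical hazard cluster")
hazard = cluster.add("Hypothetical hazard")

for path, node in root.walk():
    print(f"TL{int(node.level)}: {' > '.join(path)}")
```

Deriving each child's level from its parent keeps the levels consistent by construction. A fuller model would let one TL3 cluster hang under several TL2 aspects.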
Linking System Properties to Real-World Harms
Conceptually, the taxonomy connects the system's inherent properties—termed 'source aspects' (within Capabilities, Domain Knowledge, Affordances)—to the contexts where harms ultimately manifest, known as 'terminal aspects' (within Impact Domains).
This structure guides the identification of specific 'aspect-adjacent hazards'. These are potential harms or vulnerabilities that either originate directly from, or are enabled by, the system's source aspects, or that sit within terminal aspects as the vulnerabilities through which risks manifest before final impact. By explicitly mapping these connections, the taxonomy lets assessors who use bottleneck analysis and risk pathway modeling probe potential failure modes systematically from both ends of the causal chain: forward from system capabilities towards societal impacts, and backward from potential impacts to the system characteristics that enable them.
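The two directions of analysis described above can be sketched as lookups over a small index of hazard records. This is an illustrative sketch under stated assumptions, not the taxonomy's own data model: the `Aspect`, `HazardLink`, and `HazardMap` names are hypothetical, the example aspects and hazard are placeholders, and it assumes each hazard record names exactly one source aspect and one terminal aspect, which simplifies the real mapping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List

# Category names are taken from the taxonomy; everything else is illustrative.
SOURCE_CATEGORIES = {"Capabilities", "Domain Knowledge", "Affordances"}
TERMINAL_CATEGORY = "Impact Domains"


@dataclass(frozen=True)
class Aspect:
    category: str
    name: str

    @property
    def is_source(self) -> bool:
        return self.category in SOURCE_CATEGORIES

    @property
    def is_terminal(self) -> bool:
        return self.category == TERMINAL_CATEGORY


@dataclass(frozen=True)
class HazardLink:
    """One hazard connecting a source aspect to a terminal aspect (assumed shape)."""
    hazard: str
    source: Aspect
    terminal: Aspect


class HazardMap:
    """Index hazard links by both ends so they can be traced in either direction."""

    def __init__(self) -> None:
        self._by_source: DefaultDict[Aspect, List[HazardLink]] = defaultdict(list)
        self._by_terminal: DefaultDict[Aspect, List[HazardLink]] = defaultdict(list)

    def link(self, hazard: str, source: Aspect, terminal: Aspect) -> HazardLink:
        if not source.is_source:
            raise ValueError(f"{source.name!r} is not a source aspect")
        if not terminal.is_terminal:
            raise ValueError(f"{terminal.name!r} is not a terminal aspect")
        entry = HazardLink(hazard, source, terminal)
        self._by_source[source].append(entry)
        self._by_terminal[terminal].append(entry)
        return entry

    def trace_forward(self, source: Aspect) -> List[HazardLink]:
        """From a system property towards the impact domains it may reach."""
        return list(self._by_source.get(source, []))

    def trace_backward(self, terminal: Aspect) -> List[HazardLink]:
        """From an impact domain back to the system properties that enable harm."""
        return list(self._by_terminal.get(terminal, []))


# Hypothetical example entries, for illustration only.
hazards = HazardMap()
api = Aspect("Affordances", "API access (hypothetical aspect)")
society = Aspect("Impact Domains", "Society (hypothetical aspect)")
hazards.link("Hypothetical hazard: automated misuse at scale", api, society)

forward = hazards.trace_forward(api)
backward = hazards.trace_backward(society)
```

Indexing each link under both its source and its terminal aspect is what allows the same record to be found whether the assessor starts from the system or from the impact.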
Benefits
This taxonomy:
- Provides a comprehensive index for exploring the hazard space.
- Guides assessors systematically, reducing blind spots.
- Facilitates the integration of findings from specialized analyses (e.g., bias, security, misuse).
- Enables structured comparison of risks across different systems and contexts.
- Forms the backbone of the Risk Detail Table used in the Workbook Tool.
Explore the Taxonomy (TL0-TL2)
The Aspect-Oriented Taxonomy of AI Hazards is presented below. In this visualization:
- Dark-colored boxes represent Aspect Categories (TL0)
- Medium-colored boxes indicate Aspect Groups (TL1)
- Light-colored boxes show individual Aspects (TL2)
Examples of Hazard Clusters (TL3) and AI Hazards (TL4) are expected to be added to this webpage over time.