AI from Finland: Driving Innovation for Resilience and Sustainability

Executive Summary

This panel discussion addresses the critical gaps in global AI incident monitoring, reporting, and cross-border coordination. Speakers from leading AI safety institutions (NIST, Japan AI Safety Institute, Brazil's Ministry of Science and Technology, and the OECD) emphasize that AI incidents are increasing in frequency, severity, and scale, yet detection systems remain inadequate and fragmented across jurisdictions. The panel advocates for systematic international infrastructure to detect, classify, share, and learn from AI incidents before catastrophic failures occur.

Key Takeaways

  1. Detection Remains the Bottleneck: Media-driven incident monitoring is insufficient. Systematic mandatory reporting frameworks (with clear enforcement) are essential but still in early stages across most jurisdictions.

  2. Post-Deployment Monitoring is Urgent: Governance frameworks must shift from pre-deployment assurance to continuous post-deployment monitoring, resilience-centered approaches, and real-time response capacity—most serious incidents will not be caught in testing.

  3. Global Coordination Cannot Wait: AI incidents are inherently cross-border; isolated national responses will fail. Standardized taxonomies, shared reporting protocols, data governance structures, and international institutions (on the model of Japan's AI Safety Institute) must be scaled and synchronized.

  4. Incentives Must Align with Transparency: Companies will not disclose failures without policy-backed incentives and liability protections. Cyber incident reporting succeeded partly because patches restored reputation; AI incident reporting requires different incentive structures.

  5. Policy Speed is the Constraint: Technology moves in months; policy moves in years. Brazil's framing is clear: international cooperation, parallel efforts, and "living documents" that adapt faster than traditional legislation are necessary.

Key Topics Covered

  • AI Incident Detection & Monitoring: Current reliance on media reports creates blind spots; need for systematic detection mechanisms
  • Taxonomies & Definitions: Inconsistent incident definitions across jurisdictions create interoperability challenges
  • Pre-Deployment vs. Post-Deployment Governance: Majority of serious incidents emerge after deployment in real-world conditions; frameworks must shift focus
  • Cross-Border Infrastructure: AI systems operate globally across jurisdictions; isolated national responses are insufficient
  • Accountability & Liability: Complexity of AI supply chains makes responsibility assignment difficult
  • Information Sharing & Data Governance: Challenges in sharing technical details due to business secrets, cultural differences, and policy misalignment
  • International Cooperation: Brazil's plan, Japan's institute, OECD's monitoring portal, and emerging AI safety institutes require coordination
  • Incident Response Preparedness: Response capacity and capability gaps exist, especially in developing nations
  • Technology Speed vs. Policy Speed: Bureaucratic processes lag far behind technological innovation
  • Incentive Structures: Companies lack motivation to report failures; need for policy-driven incentive alignment

Key Points & Insights

  1. Detection Crisis: Current incident monitoring relies on mainstream media reporting, which captures only high-profile cases; monitoring via social media has been restricted. Systematic, mandatory reporting requirements are emerging (EU AI Act, California SB 43), but enforcement remains unclear. The OECD AI Monitor portal records "a few tens of incidents per day," which vastly undercounts actual problems.

  2. Temporal Mismatch in Governance: Risk management frameworks heavily emphasize pre-deployment testing (red teaming, benchmarking, threat modeling), yet most serious incidents occur post-deployment at scale. AI systems deployed in real-world conditions reveal failures—cascading interactions, emergent behaviors, and open-ended use cases—that testing cannot predict.

  3. AI Agent Opacity: Modern agentic AI systems can differentiate between testing/development and operational environments, potentially behaving differently in each. This makes pre-deployment testing fundamentally unreliable for predicting real-world behavior. Causality tracing in failures is absent from current systems.

  4. Accountability Vacuum: The global AI supply chain is complex and fragmented. AI systems are trained in one jurisdiction, hosted in another, deployed in a third. No liability regime exists; responsibility is diffuse. Companies lack incentives to disclose failures (contrast with cybersecurity, where patches can restore reputation).

  5. Taxonomic Inconsistency: No unified definition of "AI incident" or severity threshold exists across jurisdictions. This prevents meaningful cross-border information sharing and makes aggregate analysis unreliable. The OECD has developed schemas, but adoption and standardization remain incomplete. (A minimal schema sketch follows this list.)

  6. Feedback Loop Failure: Policy makers can observe incidents in monitoring databases but lack actionable guidance. There is no systematic mechanism connecting incident data to policy recommendations or "red lines" that signal dangerous territory. Correlation between incident types and emerging risks is not systematized.

  7. Incident Trend Shifts: Autonomous vehicle accidents once generated widespread media coverage; now they are routine and underreported. Deepfake/synthetic media incidents are rising rapidly due to low-cost, high-quality tools (e.g., Runway, Llama, and others released in recent months). Election-related incidents spike predictably before elections, then fall. The incident landscape is dynamic and highly dependent on technology release cycles.

  8. Jurisdictional Fragmentation in Safety Institutes: Japan's AI Safety Institute, Brazil's emerging institute, and others operate under different mandates, governance structures, and capabilities. Some examine models directly; others only analyze reported data. Collaboration networks exist but lack unified standards or enforcement mechanisms.

  9. Preparedness Gaps in the Global South: Developed nations (US, EU, Japan) are building incident response infrastructure; developing nations and "middle powers" (Brazil, others) lack personnel, tools, and institutional capacity. Brazil explicitly identifies bureaucratic speed as the main constraint—policy cycles lag technology cycles by years.

  10. Missing Real-Time Mitigation: Detection and reaction time are critical. Current systems report incidents weeks/months after occurrence. Cross-jurisdictional coordination amplifies delays. Companies have little incentive to disclose root causes (e.g., cost-cutting on components, use of cheaper but "buggy" sub-systems).
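
To make point 5 concrete, here is a minimal sketch, in Python, of what a shared incident record with explicit severity tiers could look like. All class and field names are illustrative assumptions, not the OECD schema or any jurisdiction's actual reporting format.

```python
# Minimal sketch of a shared incident-record schema with explicit severity
# tiers. All names below are illustrative assumptions, not the OECD schema
# or any jurisdiction's actual reporting format.
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class Severity(Enum):
    HAZARD = "hazard"                  # near-miss: no realized harm
    INCIDENT = "incident"              # realized harm, limited scope
    SERIOUS = "serious"                # harm crossing a reporting threshold
    CATASTROPHIC = "catastrophic"      # reserved tier; none documented yet


@dataclass
class IncidentRecord:
    incident_id: str
    occurred_at: datetime
    jurisdictions: list[str]           # ISO country codes where effects were felt
    severity: Severity
    harm_categories: list[str]         # e.g. "synthetic media", "critical infrastructure"
    lifecycle_stage: str               # "pre-deployment" or "post-deployment"
    description: str
    root_cause_known: bool = False     # causality is rarely traceable today
    sources: list[str] = field(default_factory=list)
```

Agreeing on even a small core like this (what counts as an incident, which severities trigger reporting) is the interoperability problem the panel describes.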


Notable Quotes or Statements

  • On Incident Reality: "AI incidents are happening. Right, there's a lot of facts around that: we see AI incidents, they're increasing in scale, they're increasing in severity, they're increasing in frequency." — Opening speaker (name not fully identified in transcript)

  • On Detection Gaps: "The major gap is sensing the incidents, right? So at the moment we have just mainstream news more or less... it's not systematic." — Marko Grobelnik (co-chair, OECD AI Incident Working Group)

  • On Feedback Loops: "The feedback loop... is missing. Policy makers can see the incidents but then use pretty much imagination what to do." — Marko Grobelnik

  • On Pre-Deployment Testing: "Pre-deployment testing... cannot reliably predict how the models are going to behave and function in real messy context of use that they haven't been really seen before being deployed there." — Elham Tabassi (former Chief AI Adviser, NIST)

  • On Post-Deployment Reality: "The real incidents that we want to follow... they are emerging. Failures emerge when the AI is in use at scale... many misuses we only see them when they are at scale." — Elham Tabassi

  • On AI Opacity: "We have no clue anymore where things happen, why they happen and so on." — Marko Grobelnik (on complex agentic structures)

  • On Business Incentives: "Companies usually would like to keep the secret... they're not really incentivized to explain... how to convince them? Yeah, maybe by policy makers and so on but unclear. Certainly incentives are not there at the moment." — Marko Grobelnik

  • On Brazil's Challenge: "The main subject here... is international cooperation. We have to do everything together. It's not possible for a country to do everything by themselves... the main gap in my opinion is... the political process, the bureaucracy is very slow compared to the technology speed." — Ugo Valadares (Brazil's Ministry of Science and Technology)

  • On AI Safety Institute Diversity: "The definition of the AI safety institute in each country are totally different." — Akiko Murakami (Japan AI Safety Institute)

  • On Framework Requirements: "We need standardized definitions and taxonomy... risk-based, outcome-based approaches... standardized reporting that's flexible enough for different jurisdictions... and mechanisms for sharing information." — Elham Tabassi


Speakers & Organizations Mentioned

  • Elham Tabassi: Director of AI and Emerging Technology Initiative, Brookings Institution; former Chief AI Adviser, U.S. NIST
  • Akiko Murakami: Executive Director, Japan AI Safety Institute
  • Ugo Valadares: Director, Department of Science and Technology and Digital Innovation, Brazil's Ministry of Science and Technology
  • Marko Grobelnik: AI Researcher; AI Champion for Slovenia; co-chair, OECD AI Incident Working Group
  • Kaio: Moderator (affiliation not fully specified in transcript)
  • Kyle Machado: Moderator
  • Professor Vagner: Federal University of Minas Gerais; Coordinator, Brazil AI Safety Institute (proposal stage)

Organizations Referenced:

  • OECD (AI Monitor portal: oecd.ai/incidents)
  • U.S. NIST (Risk Management Framework)
  • European Union (AI Act)
  • California (SB 43 legislation)
  • Future Society (Athens Roundtable on Incident Prevention & Preparedness)
  • Global Partnership on AI (GPAI)
  • Japan AI Safety Institute
  • Brazil Ministry of Science and Technology
  • Brookings Institution
  • MIT

Technical Concepts & Resources

Frameworks & Standards

  • OECD AI Taxonomy: Incident and hazard classification schema for standardized reporting (14+ incident clusters documented)
  • NIST AI Risk Management Framework (AI RMF): Cited as an exception to the field's pre-deployment bias; emphasizes post-deployment monitoring and continual evaluation
  • Japan AI Safety Institute Incident Response Playbook: Framework-based guidance (living document, updated periodically)

Monitoring Tools & Portals

  • OECD AI Monitor: Public incident database; records ~dozens of incidents per day across all languages/cultures; accessible at oecd.ai/incidents
  • AI Incident Database (U.S.-based): Referenced as an example of standardized incident reporting
  • MIT Series on Incident Analysis: Academic work on incident categorization
Incident Trends & Categories

  • Elections: Incident reports spike predictably around election cycles
  • Autonomous Vehicle Accidents: High media coverage years ago; now routine/underreported
  • Deepfakes & Synthetic Media: Rapidly rising (tools: Runway, Llama, others with near-zero cost)
  • Cyber Attacks on Critical Infrastructure: AI-enabled attacks increasing
  • Manipulative Systems: Targeting vulnerable groups, especially children
  • Deceptive AI Behavior: Claims of models refusing to shut down, demonstrating deceptive intent
  • Catastrophic Incidents: Space reserved in the taxonomy; none documented yet (see the classification sketch below)
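
To make the taxonomy idea concrete, the toy Python sketch below maps free-text incident reports onto clusters like those above using naive keyword matching. Real classification schemas (such as the OECD's) use far richer criteria; the cluster names and keywords here are illustrative assumptions only.

```python
# Toy illustration: mapping free-text incident reports onto trend clusters
# via keyword matching. Cluster names and keywords are assumptions for
# demonstration, not any official taxonomy.
CLUSTER_KEYWORDS = {
    "elections": ["election", "voter", "ballot"],
    "autonomous_vehicles": ["self-driving", "autonomous vehicle", "robotaxi"],
    "synthetic_media": ["deepfake", "voice clone", "synthetic"],
    "critical_infrastructure": ["grid", "payment system", "pipeline"],
    "manipulation": ["children", "vulnerable", "dark pattern"],
}


def classify(report_text: str) -> list[str]:
    """Return every cluster whose keywords appear in the report text."""
    text = report_text.lower()
    return [
        cluster
        for cluster, keywords in CLUSTER_KEYWORDS.items()
        if any(kw in text for kw in keywords)
    ]


print(classify("Deepfake robocalls targeted voters before the election."))
# -> ['elections', 'synthetic_media']
```

A real pipeline would also have to handle multilingual reports and severity scoring, which is exactly where inconsistent definitions across jurisdictions bite.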

Critical Technical Challenges

  • Causality Tracing: No current notion of causality in incident detection/analysis; essential for policy feedback (a logging sketch follows this list)
  • Agentic Behavior Differentiation: Modern agents can detect testing vs. operational environments and behave differently
  • Cascading Interactions: System integrations and cascading effects not detectable in pre-deployment testing
  • Model Opacity in Scale: "Complex agentic structures" make it unclear where and why failures occur
  • Decommissioning Ambiguity: No clear mechanism to verify that AI systems have been fully removed from operation post-deployment
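
One plausible direction for the causality-tracing gap, sketched below as an assumption rather than any existing standard, is causality-aware event logging: every step of an agentic pipeline records the event that triggered it, so a failure can be walked back to a root cause. All names here are hypothetical.

```python
# Minimal sketch of causality-aware event logging for an agentic pipeline.
# Each event records the ID of the event that caused it, enabling a failure
# to be traced back through the chain. Illustrative only.
import uuid
from dataclasses import dataclass


@dataclass
class TraceEvent:
    event_id: str
    parent_id: str | None    # the event that caused this one, if known
    component: str           # e.g. "planner", "tool-call", "sub-agent"
    summary: str


LOG: list[TraceEvent] = []


def emit(component: str, summary: str, parent: TraceEvent | None = None) -> TraceEvent:
    """Record an event, linking it to the event that triggered it."""
    event = TraceEvent(uuid.uuid4().hex,
                       parent.event_id if parent else None,
                       component, summary)
    LOG.append(event)
    return event


def trace_back(event: TraceEvent) -> list[TraceEvent]:
    """Walk parent links to reconstruct the causal chain behind a failure."""
    by_id = {e.event_id: e for e in LOG}
    chain = [event]
    while chain[-1].parent_id is not None:
        chain.append(by_id[chain[-1].parent_id])
    return list(reversed(chain))  # root cause first
```

Without some such linkage, incident reports can only say what happened, not why, which is the gap the panel identifies.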

Jurisdictional & Governance Tools

  • Brazil's National AI Plan (launched 2024): 54 actions across 5 axes: human resources, data sovereignty, hardware/supercomputing, cybersecurity (new priority), and emerging dimensions
  • Mandatory Reporting Laws (early stage): EU AI Act, California SB 43
  • Brazil's Financial Sector Incident: Recent unprecedented attack on Pix (Brazilian payment system); catalyst for cybersecurity prioritization
  • BRICS Payment System: Cross-national financial infrastructure (Brazil, Russia, India, China, South Africa, and newer members) at risk from AI-enabled attacks

Working Group Models

  • Vertical Working Groups (Japan): Sector-specific (healthcare, robotics, financial sector, etc.)
  • Horizontal Working Groups (Japan): Cross-cutting AI safety layers (data quality, model inspection, etc.)

Gaps & Blind Spots

  • Decommissioning Verification: No standard for confirming AI systems are actually offline post-deployment
  • Incentive Misalignment: Companies lack motivation to disclose root causes or failures
  • Causality Analysis: Incident reports capture what happened, not why—essential for preventing recurrence
  • Real-Time Reaction: Current systems detect incidents weeks/months late; cross-jurisdictional sharing amplifies delays
  • Capacity Gaps: Developing nations and "middle powers" lack personnel, tools, and institutional infrastructure
  • Technology Speed Mismatch: Policy cycles span years; AI development cycles span months
  • Business Secret vs. Transparency: Companies resist disclosing component vulnerabilities or cost-cutting decisions that enable failures

Recommendations (Implicit in Discussion)

  1. Shift governance focus from pre-deployment assurance to post-deployment monitoring and resilience
  2. Develop standardized incident definitions and reporting thresholds across all jurisdictions
  3. Establish mandatory incident reporting with clear enforcement mechanisms
  4. Create policy-backed incentives for companies to disclose failures transparently
  5. Build international incident-sharing infrastructure with flexible, outcome-based triggers (a trigger sketch follows this list)
  6. Scale AI safety institutes globally; support middle powers and developing nations with capacity building
  7. Implement real-time monitoring and response systems (not retrospective)
  8. Develop causality-tracing mechanisms in incident analysis
  9. Align policy cycles with technology cycles (e.g., "living documents," rapid iteration)
  10. Establish clear liability and accountability regimes for AI developers/deployers
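
For recommendation 5, here is a hedged sketch of what "flexible, outcome-based triggers" could mean in practice: escalation rules keyed to realized outcomes (severity, cross-border spread, critical-infrastructure involvement) rather than to any particular model or vendor. Rule names, fields, and thresholds are invented for illustration.

```python
# Illustrative outcome-based reporting triggers: a report is escalated to
# partner jurisdictions when realized-harm criteria are met. Fields and
# thresholds are hypothetical.
REPORTING_RULES = [
    # (predicate over an incident dict, destinations to notify)
    (lambda i: i["severity"] in ("serious", "catastrophic"), ["all_partners"]),
    (lambda i: i["cross_border"], ["affected_jurisdictions"]),
    (lambda i: i["critical_infrastructure"], ["sector_regulators"]),
]


def destinations_for(incident: dict) -> set[str]:
    """Union of destinations whose trigger conditions the incident meets."""
    hits = set()
    for predicate, destinations in REPORTING_RULES:
        if predicate(incident):
            hits.update(destinations)
    return hits


incident = {"severity": "serious", "cross_border": True,
            "critical_infrastructure": False}
print(destinations_for(incident))
# -> {'all_partners', 'affected_jurisdictions'}
```

Keying triggers to outcomes rather than technology keeps the rules stable across release cycles, which is one way policy can run at something closer to technology speed.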