All sessions

Open-Source Tools for Safe and Secure AI

Contents

Executive Summary

This panel discussion at an AI summit explored how open-source tools and approaches can democratize trust in AI systems, particularly in underrepresented regions and the Global South. The panelists—representing government agencies, nonprofit organizations, model builders, and academia—emphasized that open-source tooling (rather than just open models) is critical for enabling safety evaluation, trustworthiness assessment, and equitable AI development globally. Success requires not only technical infrastructure but sustained investment in community ecosystems, inclusive governance, and culturally tailored solutions.

Key Takeaways

  1. Open-source tools are the immediate priority: Focus on benchmarking, evaluation, and safety assessment tooling accessible to developers everywhere—not just on model open-sourcing. This is more practical and democratizing in the near term.

  2. The Global South needs targeted investment and co-creation: Generic open-source initiatives won't address the specific gaps in low-resource languages, regional datasets, and culturally grounded AI evaluation. Dedicated effort, funding, and community participation from India, Africa, and other regions is essential.

  3. Engineers respond to practical tools, not policy documents: Regulatory frameworks are ineffective without actionable technical solutions. Pairing policy intent with developer-friendly tooling (like Inspect) achieves compliance and adoption simultaneously.

  4. Sustainability requires systemic ecosystem investment: Open-source AI safety cannot succeed with project-based funding alone. Governments and enterprises must fund community infrastructure, conferences, representative organizations, and cross-border collaboration mechanisms.

  5. Trust-first design changes everything: If the AI community shifts to embedding trustworthiness, safety, and fairness into system design rather than auditing after development, the entire field becomes more robust and the open-source tooling becomes more meaningful.

Key Topics Covered

  • Open-source AI tooling landscape: Current state, gaps, and the difference between open-weight models and truly open-source approaches
  • Democratizing trust through tools: How open-source tools enable broader participation in AI safety and evaluation
  • Global South representation: Critical gaps in benchmarking, testing, and dataset diversity for non-English, low-resource languages and regions
  • Policy vs. technical solutions: The importance of giving engineers actionable tools rather than regulatory documents
  • State capacity building: The role of government safety institutes and international networks in supporting open-source adoption
  • Open-source ecosystem sustainability: Investment, funding mechanisms, and community infrastructure needed for long-term viability
  • Trust as design principle: Shifting from post-hoc safety verification to trust-first design approaches
  • Multilingual and cultural diversity: Ensuring AI models and benchmarks reflect linguistic and cultural contexts beyond English and Western markets

Key Points & Insights

  1. Open-source tools ≠ open models: The panel distinguished between open-weight models and truly open-source tooling. While model access matters, the most immediate need is open-source evaluation, benchmarking, safety, and testing tools that developers can freely access and modify.

  2. The OECD Catalog initiative: Over 900 tools now catalogued at ocd.ai/catalog to help AI actors identify resources for fair, transparent, explainable, robust, secure, and safe AI development—though maintaining currency is a major challenge.

  3. Inspect as a model: The UK's Inspect platform (developed by the AI Security Institute) exemplifies effective open-source tooling—it allows developers to self-certify AI safety without requiring regulatory compliance documents, making safety achievable and practical for developers.

  4. Global South is systematically underserved: Significant gaps exist in benchmarks, datasets, and testing for Indian languages, low-resource languages, and regional contexts. Example: Delhi Police discontinued facial recognition because accuracy was inadequate for Indian faces. Agricultural AI bots are tested without open, community-verified benchmarks.

  5. Hardware accessibility remains a bottleneck: Even with open-source tools and models, frontier model development requires enormous computational investment. True democratization requires cheaper hardware, distributed computing stacks, and fine-tuning approaches—not just open weights or code.

  6. Trust must be a design-first principle, not post-hoc verification: Current practice treats safety and trustworthiness as evaluation concerns added after development. The shift needed is embedding trust into the entire AI design chain from inception.

  7. Open-source success requires infrastructure investment: Conference funding, representative organizations, international collaboration mechanisms, and public-private funding are essential. China's coordinated 5-year and 10-year open-source strategies demonstrate this requires sustained state and enterprise commitment.

  8. Engagement and community co-design matter more than top-down deployment: The "submarine under the digital economy" metaphor—open-source communities are often invisible to their own governments. Success requires shifting communication channels to reach developers where they work, not imposing tools from above.

  9. Content moderation and safety tooling already exist at scale: Roost's open-source libraries enable small startups (e.g., Bluesky) to perform content moderation, CSAM detection, and policy enforcement on par with large platforms (45 million events/day) without massive proprietary infrastructure.

  10. Transparency and provenance of training data is critical: Beyond open weights, the field needs visibility into what data went into model pre-training, how it was sourced, and how models can be audited—essential for trustworthiness in regulated domains.


Notable Quotes or Statements

"When you have a risk or a legal challenge around technology, the answer isn't regulation and it isn't law—it's usually a software or a technical solution." — Amanda Brookke, Open UK

"If I said 'here's a 2,000-page PDF of the EU AI Act, go build some AI,' you're not going to do it. But if I say 'here's the UK's Inspect platform that lets you self-certify the safety of your LLM outputs,' you might well do it." — Amanda Brookke

"We have an opportunity to see much more diversity in the production, economics, and power structures of how AI works." — Mark Surman, Mozilla Foundation

"If such a transformative technology is just in the hands of a few largely American companies, it ultimately becomes a problem for government autonomy, citizen data rights, and enterprise trust." — Odre Erblan Stoop, Mistral AI

"Delhi Police stopped using facial recognition because the recognition rate is low for Indian faces, and they don't want to rely on it as an investigative tool." — Bala Raman Ravindran, IIT Madras (example of Global South gap)

"A lot of this open-source community isn't recognized in our home geographies—that's why we're in small rooms. We are an internationally collaborative community." — Amanda Brookke

"Success involves recognition that there will be remaining gaps, and there is room for discussion at national and international levels about resilience measures and policy responses." — Oliver Jones, UK Department for Science, Innovation and Technology

"We can't just pick the upbeat narrative—we can't pick one over the other. We need to consider both opportunities and remaining risks." — Oliver Jones


Speakers & Organizations Mentioned

SpeakerTitle/RoleOrganization
Amanda BrookkeCEOOpen UK
Mark SurmanPresidentMozilla Foundation; also represents Roost (open-source online safety tooling)
Odre Erblan StoopSenior Vice President, Global Public Affairs & CommunicationsMistral AI
Oliver JonesDeputy Director, International AI PolicyUK Department for Science, Innovation and Technology
Bala Raman RavindranProfessor & Head, Department of Data Science and AI; Chair, Working Group on Safe and Trusted AIIIT Madras (Indian Institute of Technology, Madras)
Moderator (Karina)OECD / Global Partnership on AI

Additional organizations/initiatives referenced:

  • OECD (Organization for Economic Co-operation and Development)
  • Global Partnership on AI
  • UK AI Security Institute (AISI)
  • Roost (open-source online safety tooling)
  • Center for Responsible AI (IIT Madras)
  • Working with ML Commons
  • Blue Sky (uses Roost's Osprey libraries)
  • Meta, formerly Twitter (referenced for trust/safety infrastructure investment)

Technical Concepts & Resources

Tools & Platforms

  • Inspect: UK AI Security Institute's open-source tool for self-certifying LLM safety; flagship example of actionable open-source tooling
  • Roost Osprey: Open-source libraries for content moderation, CSAM detection, and policy enforcement (used by Blue Sky and others)
  • Guardrails: Open-source tooling for developers to deploy and compare safety/evaluation mechanisms
  • OECD Catalog of Tools and Metrics for Trustworthy AI: 900+ tools indexed at ocd.ai/catalog

Datasets & Benchmarks

  • India-specific benchmarks for AI deployment (under development by IIT Madras and Center for Responsible AI)
  • Audio deepfake detection benchmarks (mentioned as misaligned with real-world deployment outcomes)
  • Multilingual model testing across low-resource languages (identified as inadequate)
  • Agricultural AI bot testing (noted as lacking open, community-verified benchmarks)

Methodologies & Concepts

  • Trust-first design: Embedding safety and trustworthiness into design phase rather than post-hoc evaluation
  • AI control and steering vectors: Mentioned as safety methodologies (referenced in audience Q&A)
  • Frontier AI evaluation: Methodologies for assessing cutting-edge model behavior
  • Bug bounties and zero-day reporting: Security practices proposed as models for broader trust/safety verification in open-source AI
  • Provenance tracking: Understanding data sources and training processes in model pre-training

Models/Languages Mentioned

  • Mistral models in German, Italian, and Spanish (launched 2023; Mistral expanding to multiple languages)
  • Issues with multilingual models in low-resource languages being easy to jailbreak

Policy Frameworks Referenced

  • EU AI Act (cited as an example of regulatory approach without technical tooling)
  • OECD AI Principles (adopted 2019; emphasized transparency, accountability, robustness, inclusive governance)
  • China's open-source strategy (5-year plan 2020, 10-year plan announced late 2023)

Note: The transcript contains some audio artifacts, repetitions, and unclear passages typical of live conference recordings. This summary prioritizes verified information and direct quotes while excluding speculative reconstructions of corrupted audio segments.