Digital Public Goods for Global AI Equity
Contents
Executive Summary
This panel discussion at the AI Impact Summit examines how digital public goods (DPGs), open-source AI, and community-led initiatives can create an alternative AI ecosystem that serves the global majority rather than concentrating power among a few corporations. The speakers argue that while openness and data accessibility are essential, they must be paired with governance structures, contextual evaluation, and deliberate power-shifting to prevent existing corporate players from capturing these initiatives.
Key Takeaways
- Openness is a necessary but insufficient condition. Digital public goods and open-source AI must be paired with governance structures, contextual evaluation, and intentional power-shifting strategies to prevent corporate capture and ensure genuine equity.
- Community involvement must be co-creation, not extraction. Effective solutions emerge when developers, researchers, policymakers, and end-users (farmers, women, language speakers) collaborate from the start on defining problems and priorities—and when contributors' time and needs are honored contextually.
- Deployment ease and local control are as important as model capability. Open-source AI will not scale unless it is easier to use and deploy than proprietary alternatives. Supporting local, privacy-preserving models on personal devices is critical to individual and community sovereignty.
- Benchmarks and evaluations must be contextual, not universal. One-size-fits-all benchmarks (MMLU, HELM) cannot capture the cultural nuance, linguistic diversity, and functional requirements of non-English, non-Western contexts. Evaluation frameworks must be co-created with communities.
- Current AI and DPGs represent a coalition strategy, not a silver bullet. Scaling public-interest AI requires simultaneous action on funding (grants), building (infrastructure and tools), investing (in community backbone), and policy leverage—while deliberately forming cross-sector alliances that include unlikely partners (e.g., frontier labs committing to open-source the previous generation of models).
Key Topics Covered
- Data Deserts & Infrastructure Gaps: Challenges of accessing, organizing, and utilizing quality data in Global South contexts (India, Africa)
- Language Models & Benchmarking: The absence of contextual benchmarks for African and underrepresented languages limits AI system evaluation and usefulness
- Market Concentration & Surveillance: How tech monopolies (Amazon, Meta, Ring) control AI deployment and extract behavioral data at scale
- Digital Sovereignty: National, community, and individual sovereignty in AI—from hardware to application layers
- Community-Led Development: Masakhane's approach to co-creating language resources with African researchers and practitioners
- Open-Source as Methodology: Using open-source ideology for cross-border collaboration and knowledge-sharing
- Contextual Safety & Evaluation: Why lab-based AI safety testing is insufficient; real-world deployment risks differ in low-resource contexts
- Deployment Challenges: Usability gaps between closed proprietary models (easy to use) and open-source alternatives
- DPGs as Alternative Pathway: Defining digital public goods standards and scaling community-driven tech solutions
- Financing & Compensation Models: How to honor contributor time and support community needs (beyond cash payments)
Key Points & Insights
- India's "Data Desert" Paradox: Despite being data-rich, India lacks high-quality, well-organized, contextually appropriate datasets. Public data sits in silos with inconsistent formatting, missing metadata, and unclear provenance; private data is locked behind platforms and inaccessible.
- Benchmarking as a Power Structure: Existing AI benchmarks (HELM, MMLU) do not reflect the contextual nuance of African languages or the 2,000+ languages on the continent. Without culturally appropriate benchmarks, AI systems cannot be properly evaluated for usefulness in these contexts—perpetuating exclusion.
- Openness Alone Is Not Enough: Open systems are "prone to capture." Companies with more resources, compute, data across domains, and existing market dominance can easily absorb and outcompete open initiatives. Guard rails, governance, and political economy analysis must accompany data openness.
- Surveillance & Control as Business Model: Tech giants unilaterally deploy features (Amazon Alexa always-on, Meta Glasses facial recognition, Ring surveillance ads) affecting millions without consent. Market concentration enables this behavioral extraction at population scale.
- Software-Layer Sovereignty Is Achievable; Hardware Is Not: While full sovereignty from hardware to application layer is unrealistic for most countries, software-layer sovereignty is possible through coalitions (e.g., India can compete at frontier level; the UK must partner with others). Individual sovereignty (local models, personal devices, data privacy) is equally important.
- Deployment Usability Is Critical: Open-source AI loses adoption because it is harder to use than closed models (ChatGPT, Perplexity). If open-source AI is to compete, it must prioritize user experience, ease of deployment, and local model accessibility—not just model parity.
- Community-Led Means Consulting on Priorities: Top-down approaches fail. Masakhane's success stems from asking communities what they want built, not imposing predetermined solutions. This includes addressing marginalized groups intentionally (e.g., partnering with women's sanitary-wear organizations to ensure female representation in voice datasets).
- Contextual Safety ≠ Lab Safety: Real-world safety risks emerge from everyday interaction in low-resource contexts with vulnerable populations—not from malicious attack. Functional safety depends on local rules, institutions, and the alternatives available. Contextual evaluation in deployment is non-negotiable.
- Civil Society as Connective Tissue: NGOs and civil society organizations provide unique vantage points between tech companies, governance, and communities. They understand on-the-ground impact, broader development arcs, and structural issues—making them essential for evaluating AI usefulness and safety.
- Power Redistribution Requires Structural Action: Representation without power-shifting risks being "more harmful and more exploitative." Frontier labs must increase transparency on business models, vertical integration of the AI marketplace must be addressed, and new alliances (south-to-south collaboration, nonprofits-to-government-to-for-profit) must deliberately form to prevent corporate capture.
Notable Quotes or Statements
"Open systems are prone to capture and actors with more resources, more compute capacity, more talent...are in a much better position. Unless we think about the kind of political economy of that ecosystem while we're thinking about how to make data open, I think there is actually a risk that we undermine the very objectives for which we're trying to make data more publicly available."
— Uvashi (on data openness without governance)
"The entire R&D budget for the UK is like half of what Amazon spends on R&D in a year...the UK, a country like that is going to have to partner with other countries if they want to be able to compete."
— John (Mozilla AI) on national AI capacity and sovereignty
"My humanity is dependent on you, or I am because you are—that's an African practice, African philosophy...when we're thinking about openness and how community is involved, it's seeing the shared value of trying to build up on a resource that's missing."
— Chennai (Masakhane) on Ubuntu philosophy and community-led development
"The biggest risks actually come from deploying these systems in low resource contexts among vulnerable populations where there isn't alternatives...safety risks come from use, not necessarily from malicious attack."
— Uvashi on contextual safety in AI deployment
"We as a society are still paying the price of that opaque product [iPhone] that we are all consuming and is consuming us. We get a chance not to participate in the iPhoneification of AI by just not purchasing it when it comes out."
— Io (Current AI) on consumer choice and avoiding centralized AI products
"If everyone is getting along we're not going to put effort where there are gaps and where there's marginalization. Tension actually surfaces what needs to be done."
— Chennai on productive tension in inclusive tech development
"Representation without shifting the balance of power can actually be more harmful and more exploitative...we need to address the vertical integration of the AI marketplace."
— Uvashi on structural power dynamics in AI
Speakers & Organizations Mentioned
- Uvashi — Data governance and AI policy (contextual evaluation and safety)
- Chennai Elango — Masakhane (African NLP, language benchmarking, community-led research)
- Io — Current AI (CEO; formerly open-source hardware, Lebanon-based; three-pillar approach: fund, build, invest)
- John — Mozilla AI (open-source community development, deployment challenges, sovereign AI stack)
- Leah — Moderator; Digital Public Goods Alliance
- Mozilla — Open-source advocacy, Common Voice dataset, healthy internet
- Masakhane — African NLP hub; 3,000-person community platform (researchers, sociologists, policymakers, non-Africans)
- Current AI — New entity launched at the February 2025 Paris AI Action Summit; fund, build, and invest in public-interest AI
- Digital Public Goods Alliance — 50+ member alliance (includes Mozilla, French government, etc.)
- Global South Network for Trust Toward AI — Launch announced for Friday of summit
- OpenAI, Amazon, Meta, Ring — Examples of proprietary/surveillance AI deployments
- Frontier labs — Generic reference to leading AI companies (Sam Altman/OpenAI mentioned as attendee)
- ASML, TSMC, Nvidia — Hardware manufacturers (context for why countries cannot be fully sovereign)
Technical Concepts & Resources
- AI Kosh — India's national datasets platform (noted as underutilized; most-downloaded dataset has ~400 downloads)
- Benchmarks: HELM, MMLU, Sahara (for African languages)
- Mozilla Common Voice — Open-source multilingual voice dataset (DPG since 2022); includes east African language work
- Masakhane RFP (2024) — Request for proposals to benchmark 40 African languages (speech and text perspectives)
- Contextual Evaluation — Real-world, community-involved testing of AI systems (vs. lab-based evaluation)
- Functional Safety — Safety that depends on local context, rules, and institutions (contrasted with engineering-only safety)
- Open-Source Hardware — Hardware as part of sovereign AI stack (Io's 15-year background)
- N-minus-one Model Release — John's proposal: frontier labs commit to open-sourcing the previous-generation model when releasing a new one (e.g., open-sourcing GPT-(n−1) when GPT-n launches)
- Small Language Models — John's thesis that smaller, community-buildable models will matter more in coming year
- Local/On-Premises Deployment — Running models on personal hardware without cloud services (sovereignty, privacy)
- Digital Public Goods (DPG) Standard — Requires open licensing, adherence to privacy laws, designed to do no harm, with built-in safeguards
- Data Provenance & Metadata — Lacking in most public datasets in Global South
- South-to-South Collaboration — Masakhane's emphasis on learning and knowledge-sharing between Global South countries (not just North-South)
Document Metadata
- Event: AI Impact Summit
- Topic: Digital Public Goods, Open-Source AI, Global Equity
- Panel Format: Multi-speaker discussion with rapid-fire closing
- Duration: ~60 minutes
- Language: English
- Transcript Completeness: Full panel discussion (minor formatting standardization applied)
