Inclusive AI for Citizen Services: Language and Last-Mile Impact

Executive Summary

This panel discussion addresses the challenge of democratizing AI across India's 1.4 billion citizens, who speak 200+ languages, by treating language as digital infrastructure rather than a feature. Panelists emphasize that inclusive AI requires collaboration across government, the private sector, and academia to build sovereign yet globally connected systems that serve the last-mile population, particularly in healthcare, agriculture, governance, and education.

Key Takeaways

Treat Language as Critical Infrastructure
- Language access is not a feature—it is foundational infrastructure for digital inclusion in multilingual nations. Design systems with data standards, model governance, and platform SLAs from day one.
Collaboration Over Competition
- No single entity (government, company, or nation) should build everything alone. Shared digital infrastructure (roads), open-source models, and open datasets are more efficient and equitable than proprietary full-stack solutions.
Start with the Hardest-to-Serve Population
- Inclusion by design means solving for edge cases first (people who are homeless, disabled, illiterate, or geographically remote). When systems serve them, the majority benefits automatically.
Government Accountability Requires Different Confidence Thresholds
- Public sector AI deployment demands 90-99.9%+ accuracy, clear liability frameworks, and human accountability. Consumer-grade tolerance (80% correctness) is unacceptable for citizen services.
Sovereign + Global: A False Binary
- Build local capacity and data governance (sovereignty) while leveraging global open models and participating in global supply chains (global integration). A hybrid, pragmatic approach beats ideological purity.

Key Topics Covered

Language as Digital Infrastructure: Treating multilingual capability as foundational infrastructure (data, models, applications, platform) rather than an add-on feature
Digital Public Infrastructure (DPI) for AI: How India's approach (similar to Aadhaar) can scale AI inclusively through shared, open-source capabilities
Sovereignty vs. Global Integration: Balancing "sovereign AI" (national resilience and local control) with participation in global AI ecosystems
Data as Public Good: The role of government data collection, curation, and open-sourcing in enabling equitable AI development
Last-Mile Inclusion: Designing for edge cases and marginalized populations from inception, not as afterthoughts
Trust & Accountability in Government AI: Government's custodial role in responsible AI deployment for citizen services
Capacity Building & Change Management: Upskilling bureaucrats and citizens to work with AI systems
Fine-tuning vs. Pre-training Strategy: Pragmatic approach of leveraging existing open models rather than building from scratch
Public-Private Collaboration: Multi-stakeholder ecosystems (government, startups, tech giants, academia, NGOs)
Policy-to-Implementation Gap: Converting AI strategies into scaled, tangible outcomes at the grassroots level

Key Points & Insights

Diversity Must Be a Standard, Not an Exception
- Kalista Redmond (Nvidia): "Diversity is the new standard." Rather than retrofitting inclusion, systems must be designed for linguistic, economic, geographic, and ability diversity from inception.
Language Infrastructure Requires Multi-Layer Architecture
- Amitab (Digital India Bhashini): Language as infrastructure spans data curation, standards for annotation/labeling, model building (addressing bias, sovereignty, context), glossaries, and application layers with defined SLAs—not a simple translation feature.
Minimal Viable Infrastructure Model
- Shankar Marada (EkStep Foundation): Government should build "roads" (shared infrastructure), not "cars" (full-stack solutions). Bhashini, India AI Mission, and data platforms are public goods; startups and private sector innovate on top. This prevents duplication and enables scale.
Edge Cases Must Drive Design
- Design for the hardest problems first: homeless people without addresses, individuals with biometric disabilities, people who do not know their birth dates. When these cases are solved, the majority benefits automatically.
Representative Data is Non-Negotiable for Equitable AI
- Ashwini Kumar (Assam Government): "AI is basically a mirror of society. We want representative AI." Without high-quality, locally-collected data, government AI will embed and perpetuate existing inequities.
The "Pilotitis" Problem in Government
- Stefano (KPMG EU): Governments worldwide are caught in endless pilots rather than scaling. The core obstacle is fragmented, siloed data across ministries and levels of government; a "big cleanup" is a prerequisite for intelligent systems.
Accountability Must Precede Deployment at Scale
- Government officials require 90-99.9%+ confidence thresholds (vs. consumer tech's 80/20 tolerance) because lives and rights are at stake. Someone must legally own responsibility for AI-driven decisions.
Open Models + Local Fine-tuning is the Pragmatic Path
- Harsh (Google Research): Building foundational models from scratch for every country/language is economically wasteful. Instead: use open-source models (e.g., Gemma), fine-tune on domain/language-specific data (millions of tokens, not billions), and keep sovereign sectors (healthcare, finance) air-gapped.
Trust Between Public and Private Sectors is Foundational
- Shankar: "Private sector should earn the trust of the public sector." Collaboration only works when both sides reliably share data and co-create solutions—not when either sector attempts to monopolize capability.
Cultural & Semantic Diversity Beyond Language Tokens
- Stefano: Language is not just words; it includes literacy levels, educational backgrounds, cultural context, and voice-first accessibility needs. A street-level illiterate user must be addressed differently than a university graduate, yet served equally.

Notable Quotes or Statements

Amitab (Bhashini): "Diversity has to become a standard. Digital systems are known to be standard-prone; here, diversity has to become a standard."
Kalista Redmond (Nvidia): "Diversity is the new standard… No one pursues an AI strategy because they're excited to build a data center. They pursue it because they want to increase GDP, opportunity for their nation."
Shankar Marada (EkStep): "The day we think we have all pieces of the puzzle is a bad day. Collaboration keeping the 1.501 billionth person in mind—that should be the target."
Shankar Marada: "If you want to solve transportation, build roads and let others build cars, fuel, and pit stops. If you try to do everything, you will not scale."
Ashwini Kumar (Assam Government): "AI is basically a mirror of the society. We want representative AI… We want AI to be like us and it can reflect the true society only when true data is collected."
Stefano (KPMG EU): "There's a disease called 'pilotitis'—you pilot use case after use case but never roll out at scale with accountability."
Harsh (Google Research): "In sovereign sectors like healthcare and finance, the model should come to the data, not the data go to the model. This is 100% possible with air-gapped environments."
Harsh (Google Research): "Can India afford a hundred companies each spending $100M to develop models from scratch when we're in the agentic era? Use your resources judiciously."

Speakers & Organizations Mentioned

Role	Name	Organization
Moderator	Bridge (full name not provided)	KPMG, Government Technology Practice
Panelist	Amitab	Digital India Bhashini Division (CEO)
Panelist	Kalista Redmond	Nvidia, Global AI Initiatives (VP)
Panelist	Shankar Marada	EkStep Foundation (CEO, Co-founder)
Panelist	Ashwini Kumar	Directorate of Information Technology, Government of Assam (Director)
Panelist	Stefano (surname not fully provided)	KPMG, EU Institutions & Government/Public Sector (Head)
Panelist	Harsh (full name not provided)	Google Research, GenAI & Health AI, APAC (Head)

Organizations/Initiatives Referenced:

Digital India Bhashini: Multilingual AI and real-time translation platform
India AI Mission: National AI infrastructure and capability building
EkStep Foundation: Technology for social impact, large-scale digital public infrastructure
AI4Bharat, IIT Madras: Language technology and datasets
Aadhaar: India's national digital identity system (referenced as DPI model)
Google Research (Project Vani): Open-source speech data collection for Indian languages
KPMG: Consulting support across government transformation and AI initiatives
Nvidia: Sovereign AI infrastructure and partnerships
Google, Microsoft: LLM providers and collaborators in Bhashini ecosystem
World Bank: Funding for Assam governance programs
Government of Assam: State-level digital transformation and AI implementation

Technical Concepts & Resources

Key Technical Approaches

Fine-tuning vs. Pre-training: Emphasis on fine-tuning existing open models (e.g., Gemma) on domain/language-specific data (millions of tokens) rather than pre-training from scratch (billions of tokens)
Air-gapped AI Environments: Models and data remain within national boundaries; output is owned by the data controller (used for sovereign sectors such as healthcare and finance)
Agentic Architecture: Modular systems that can swap models dynamically—allowing future upgrades without retraining from scratch
Voice-First, Indic-Language-First Design: India is targeting voice interfaces as the primary interaction mode for inclusive access by 2025
Multilingual NLP: Supporting 36+ Indic languages plus English, with attention to dialectal variation
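
The "swap models dynamically" idea under Agentic Architecture can be illustrated with a small sketch. This is not a system described by the panel; `ModelRouter`, `base_model`, and `tuned_model` are hypothetical names, and the backends are stand-in functions rather than real models. The point is only that callers address a task name, so the model behind it can be replaced without changing the calling code.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt to a response.
Model = Callable[[str], str]


class ModelRouter:
    """Keeps named model backends and routes each task to its assigned one."""

    def __init__(self) -> None:
        self._backends: Dict[str, Model] = {}
        self._active: Dict[str, str] = {}  # task name -> backend name

    def register(self, name: str, model: Model) -> None:
        self._backends[name] = model

    def assign(self, task: str, backend_name: str) -> None:
        # Refuse to point a task at a backend that was never registered.
        if backend_name not in self._backends:
            raise KeyError(f"unknown backend: {backend_name}")
        self._active[task] = backend_name

    def run(self, task: str, prompt: str) -> str:
        return self._backends[self._active[task]](prompt)


# Stand-in backends (illustrative only; no real model is called).
def base_model(prompt: str) -> str:
    return f"[base] {prompt}"


def tuned_model(prompt: str) -> str:
    return f"[tuned] {prompt}"


router = ModelRouter()
router.register("base", base_model)
router.register("tuned", tuned_model)

router.assign("translate", "base")
before = router.run("translate", "hello")

router.assign("translate", "tuned")  # swap the model; callers are unchanged
after = router.run("translate", "hello")
```

In this sketch, upgrading to a newly fine-tuned model is a single `assign` call; a production system would also need versioning, evaluation gates, and rollback, which are omitted here.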

Datasets & Platforms Mentioned

Project Vani (Google): Open-source speech and cultural data collection for Indian languages (hosted on AI4Bharat/AIKosh)
AIKosh: Central repository for Indian AI datasets and models
Data Corpus Standards: Emphasis on data annotation, labeling, and governance frameworks (no specific tool was named, but these were referenced as critical infrastructure)
Bhashini APIs: Real-time translation and multilingual services as public APIs
India State Data Platform: Government data sharing and policy frameworks

Models & Frameworks Referenced

Gemma (Google): Open-source LLM referenced as suitable for fine-tuning in resource-constrained settings
MedGemma, Edugemma, TranslateX: Domain-specific fine-tuned models mentioned (Google initiatives)
Large Language Models (LLMs): Referenced generally; emphasis on leveraging existing frontier models rather than building proprietary alternatives
Frontier Models: Large-scale, closed-source models (ChatGPT, Gemini, etc.) used as starting points

Governance & Standards

Service Level Agreements (SLAs): Platform governance concept—committing to defined performance, uptime, and accuracy standards
Liability Frameworks: Discussion of accountability for AI hallucinations; liability should start at the application layer and flow down
Data Privacy & Sovereignty: Air-gapping, data residency, and consent frameworks for handling citizen and health data
Bias & Fairness Audits: Ensuring models reflect and do not perpetuate societal inequities
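
The contrast the panel drew between consumer-grade tolerance (about 80%) and public-sector expectations (90-99.9%+) can be sketched as a simple routing rule. This is a minimal illustration under stated assumptions, not a framework the panel proposed: the tier names, the exact threshold values, and the `route` function are hypothetical, and real thresholds would be set per service by the accountable authority.

```python
from dataclasses import dataclass

# Illustrative thresholds drawn from the ranges quoted in the summary.
# The tier names are assumptions made for this sketch.
THRESHOLDS = {
    "consumer": 0.80,
    "citizen_service": 0.90,
    "rights_critical": 0.999,
}


@dataclass(frozen=True)
class Decision:
    action: str       # "auto" or "human_review"
    accountable: str  # the named person who owns the outcome


def route(confidence: float, tier: str, officer: str) -> Decision:
    """Act automatically only when confidence meets the tier's threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    threshold = THRESHOLDS[tier]
    if confidence >= threshold:
        return Decision("auto", officer)
    # Below threshold: hand the case to a human reviewer.
    return Decision("human_review", officer)
```

For example, `route(0.85, "consumer", "officer-1")` returns an automatic decision, while the same confidence under `"citizen_service"` is sent to human review. In both cases a named officer stays attached to the decision, reflecting the point that someone must own responsibility for AI-driven outcomes.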

Policy & Implementation Tools

Digital Public Infrastructure (DPI) Model: Creating shared, interoperable infrastructure (roads) on which multiple vendors/startups can build (cars)
Data-Driven Policy Making: Using collected government data to inform policy rather than using siloed, outdated information
Capacity Building Programs: Training government officials and citizens to use and oversee AI systems
Collaborative Governance: Multi-stakeholder ecosystems involving government, private sector, academia, and civil society

Conclusion

This panel articulates a coherent vision for inclusive AI: leverage global open models and public-good infrastructure while building local capacity, data governance, and accountability frameworks. Success hinges not on technology alone but on sustained collaboration, trust-building between sectors, and an unwavering commitment to serving the last-mile, hardest-to-reach populations from day one.