Day 3: Plenary Hall B | Sridhar Vembu, Jay Chaudhry, CP Gurnani & Global Leaders

Executive Summary

This comprehensive plenary session showcased India's sovereign AI capabilities across models, applications, and infrastructure, with a specific focus on building AI solutions at population scale through efficient, multilingual systems. The summit demonstrated that India is advancing from frontier model research to real-world deployment across voice, vision, document digitization, and enterprise applications, while global leaders emphasized the importance of responsible AI development, partnership frameworks, and measurable ROI in practical deployments.

Key Takeaways

  1. AI at 1 Billion Scale Requires Efficiency by Design: The most impactful Indian AI systems optimize for cost and latency from first principles, enabling deployment on feature phones and in offline settings. This is not a feature but a prerequisite for population-scale impact.

  2. Multilingual, Voice-First is Not Optional for India: Text-and-English-centric AI architectures exclude 80% of potential users. Models trained natively on Indian languages from the ground up, combined with voice interfaces, are table stakes for meaningful adoption.

  3. Measure Business Outcomes, Not Model Benchmarks: The divide between successful and failed AI projects correlates far more strongly with clear ROI metrics (cost reduction, time savings, lives improved) than with benchmark scores. Executives increasingly demand 5-10x returns within 12 months.

  4. Sovereignty Across Hardware & Software Matters: India's insistence on building models, inference infrastructure, and devices domestically reduces dependency on foreign supply chains and enables faster iteration when policies or partnerships change. This applies beyond India to all nations prioritizing strategic autonomy.

  5. Soft Skills, Agency, and Creativity Are the New Job Moat: Both business and academic leaders agree that jobs purely executing defined rules will be automated. Future-proofing requires: creativity, sense of agency (willingness to break rules and find new paths), fundamental math/science literacy, and empathy—skills AI currently cannot replicate.

Key Topics Covered

  • Sovereign AI Model Development: Building and scaling 3B, 30B, and 105B parameter models trained from scratch in India
  • Speech Recognition & Text-to-Speech: Sarvam Bulbul voice models for Indian languages and use cases
  • Document Intelligence & OCR: Vision models for digitizing documents in multiple scripts and languages
  • Multilingual AI Systems: Language-agnostic approaches to serve India's 22 scheduled languages
  • Voice Infrastructure (Samvaad Platform): Conversational AI at scale for customer service and citizen engagement
  • Enterprise AI Orchestration (Arya): Systems for reliable, declarative AI in production environments
  • Content Transformation: Document translation, dubbing, and AI-powered content localization
  • Edge AI & Device Deployment: Running AI models on feature phones, smartglasses, and IoT devices
  • Developer Ecosystem & API Platforms: ₹10 crore startup credit program and democratized access to models
  • Agentic Commerce: Conversational, voice-first commerce interfaces for India's informal economy
  • Humanitarian AI Applications: World Food Programme's use of AI for hunger prediction and crisis response
  • Quantum Computing & Hybrid AI: Integration of quantum and AI for specialized use cases
  • Global Partnerships & Regulation: US-India collaboration frameworks and responsible AI governance

Key Points & Insights

  1. Efficiency as Core Design Principle: Both large models (30B and 105B) use mixture-of-experts architectures that activate only a small fraction of their parameters per token (about 1B active in the 30B model, versus the 3B cited for a comparable Qwen model). Roughly 3x fewer active parameters translates directly into lower inference cost and makes deployment on constrained infrastructure feasible.

  2. Language-Native Intelligence, Not Translation: Models trained on 16 trillion tokens of Indian language text natively understand and respond in Hindi, Tamil, Marathi, etc., rather than translating from English. This is foundational for population-scale adoption where 70%+ of users are non-English speakers.

  3. Measurement-First Approach to ROI: Speakers repeatedly emphasized that successful AI deployments begin with clear, measurable business outcomes before technology selection. Healthcare, logistics, and customer service examples showed 10-50% efficiency gains when AI targets human tasks deemed low-value by workers themselves.

  4. Voice as Primary Interface for India: India generates 50% of global WhatsApp voice notes. Voice-first, multilingual interfaces bypass literacy barriers and align with cultural preferences. The Samvaad platform already handles 2 million call-minutes daily, with 100 million per day projected by year-end.

  5. Document Digitization at Scale: Specialized vision models combined with human-in-the-loop workflows aim to move OCR from roughly 90% accuracy (about one error per line) to about one error per 10 pages. This is critical for unlocking India's written heritage and for giving underserved populations access to government and legal documents.

  6. Sovereign Infrastructure Across Layers: India has built an end-to-end AI stack (models → inference engines → applications) without reliance on proprietary foreign systems. Partnerships with Qualcomm, Bosch, and Nokia enable device-level deployment while maintaining security and low latency.

  7. Agentic Orchestration Over Simple Chatbots: First-generation enterprise AI (LLM + RAG chatbots) showed poor ROI. Second-generation agents required extensive API mapping. Third generation (Edge Company's virtual humanoids) interact directly with existing UI/workflows, enabling rapid deployment without legacy system redesign.

  8. Population-Scale Adoption Requires Economics & Trust: Fewer than 200M of India's 650M smartphone users shop online; most conduct commerce through offline agents. AI agents that negotiate, compare prices, and answer questions in local languages can replicate those trusted human relationships at scale.

  9. Research-to-Deployment Balance: Leading AI companies allocate ~60% of research effort to proven business problems, 40% to exploratory work. Hack weeks and structured innovation labs prevent researcher attrition while maintaining disciplined product development.

  10. Global South Playbook: India's solutions (UPI for payments, agentic commerce, voice-first interfaces) directly address problems across 2+ billion people in regions with similar infrastructure constraints, language diversity, and informal economies. Potential to establish India as the template for AI deployment in developing nations.
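
The accuracy targets in point 5 are easier to compare as word-level error rates. The sketch below is illustrative arithmetic only: the figures of 10 words per line and 30 lines per page are assumptions chosen for the example, not numbers from the session.

```python
# Illustrative arithmetic for the OCR targets in point 5.
# WORDS_PER_LINE and LINES_PER_PAGE are assumptions, not figures from the session.
WORDS_PER_LINE = 10
LINES_PER_PAGE = 30


def word_accuracy(errors: int, words: int) -> float:
    """Fraction of words transcribed correctly."""
    return 1 - errors / words


# Baseline: one error per line.
baseline = word_accuracy(1, WORDS_PER_LINE)

# Target: one error per 10 pages.
words_in_ten_pages = 10 * LINES_PER_PAGE * WORDS_PER_LINE
target = word_accuracy(1, words_in_ten_pages)

# How many times rarer errors become under these assumptions.
error_reduction = words_in_ten_pages // WORDS_PER_LINE

print(f"baseline accuracy: {baseline:.2%}")
print(f"target accuracy:   {target:.4%}")
print(f"errors become {error_reduction}x rarer")
```

Under these assumptions the target corresponds to errors becoming a few hundred times rarer, which is why the summary pairs specialized models with human review rather than relying on either alone.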


Notable Quotes or Statements

  • Pratush Budhatoki (Sarvam AI): "We would like to make AI work at a population scale and being able to do it efficiently becomes a very very core thesis."

  • Pavan Belagatti (Sarvam AI, on smart glasses): "These products are designed for everyone from every part of the world... thinking about security, thinking about our defense forces, thinking about our farmers, but also thinking about all of us because we in India we don't necessarily work out of our offices."

  • Dr. Ajay Kumar Sood (Principal Scientific Adviser, Government of India): "The innovation is the key point and once the innovation comes I'm sure the private sector and academia and startups... they are working in unison. AI mission is actually the glue."

  • Dr. Sethuraman Panchanathan (Former NSF Director): "The AI of today that we are all celebrating is more than five to six decades of sustained investments by NSF... NSF continued to invest [even] in AI winters."

  • Gautier Glock (Edge Company): "We are making AI that is 100 times cheaper to run... we think it's very important because if you build bigger and bigger and bigger models, maybe they're going to be able to tell you about the history of the king of France in 1000 in Hindi. But what we need here in companies is just a tool that can do anything that a human can do in a company."

  • Lionel Regard (Cubit Soft): "Jobs will be replaced... people who don't use AI will be replaced by people who use AI... you have to dream big because this is what is going to change the thing."

  • Carl Skau (World Food Programme): "300 million people don't know where their next meal is going to come from... AI innovation could fill some of that gap... we could predict a crisis hotspot up to 60 days before it peaks."


Speakers & Organizations Mentioned

Sarvam AI Leadership:

  • Pratush Budhatoki (CEO/Co-founder)
  • Pavan Belagatti (Head of Product)
  • Himanshul (Demo)
  • Krishna (Document Intelligence)
  • Aditya (Mobile/Feature Phone Demo)
  • Ishan (105B Model Use Cases)
  • Vedant & Harshed (Arya Orchestration)
  • Arjit & Krishna (Content Platform)
  • Minakshi & Shoubam (Samvaad Conversational Platform)
  • Tashar (Edge AI)
  • Saheed (Developer Relations & APIs)

Government & Policy:

  • S. Krishnan (Secretary, Ministry of Electronics & Information Technology, Government of India)
  • Dr. Ajay Kumar Sood (Principal Scientific Adviser, Government of India)
  • World Food Programme (humanitarian deployment focus)
  • UNDP (fireside chat moderator)

International Tech Leaders & Partners:

  • Dr. Sethuraman Panchanathan (Former Director, US National Science Foundation)
  • Gautier Glock (CEO, Edge Company, France)
  • Lionel Regard (CEO, Cubit Soft — Quantum-as-a-Service)
  • Qualcomm (edge AI optimization partnerships)
  • Bosch (vehicle AI integration)
  • Nokia/HMD (AI-powered feature phones)
  • NVIDIA (compute partnership for token factory)

Enterprise & Startup Partners (Testimonials):

  • Zomato (e-commerce localization)
  • Blinkit/Flipkart (voice commerce)
  • Swiggy (agentic ordering)
  • Vodafone Idea (conversational telecom services)
  • Policybazaar (voice-first insurance)
  • DPS School & NCERT (education collaboration)
  • Ekatra Foundation (document digitization NGO)

Benchmark & Research Bodies:

  • Allen Institute for AI (olmOCR-Bench document benchmark)
  • World Food Programme (humanitarian AI applications)

Technical Concepts & Resources

Models & Architectures

  • Sarvam 3B Vision Model: 3-billion parameter hybrid (state-space + attention layers), trained on 400M images + 16T tokens
  • Sarvam 30B: 30-billion parameter mixture-of-experts (1B activated), 32K context, 16T token pre-training
  • Sarvam 105B: 105-billion parameter mixture-of-experts (9B activated), 128K context, state-of-the-art reasoning
  • Sarvam Bulbul V3: Multilingual text-to-speech with Indian voice variants, production-grade stability
  • Speech Recognition Model: 1B+ parameters, real-time, multi-speaker, language-agnostic across Indian languages
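
To make the "activated" figures above concrete, the following sketch computes what share of each mixture-of-experts model is used per token, taking the parameter counts as listed in this summary. Per-token compute scales roughly with the active parameters, although all parameters still need to be held in memory; the helper name `active_fraction` is just for illustration.

```python
# Parameter counts as listed in this summary: (total, activated per token).
MODELS = {
    "Sarvam 30B": (30e9, 1e9),
    "Sarvam 105B": (105e9, 9e9),
}


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


fractions = {
    name: active_fraction(total, active)
    for name, (total, active) in MODELS.items()
}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of parameters active per token")
```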

Applications & Platforms

  • Samvaad: Full-stack conversational AI platform for customer service at scale (2M call-minutes/day, targeting 100M/day)
  • Arya: Declarative, composable agent orchestration system for enterprise workflows; claimed 10x efficiency vs. manual processes
  • Sarvam for Conversations: Voice AI as infrastructure (HTTPS-like), not product
  • Sarvam for Work: Internal enterprise productivity; proof-reading, automation, multi-turn agentic workflows
  • Sarvam for Content: Document OCR, translation (layout-preserving), video dubbing at scale
  • Scout: WFP supply chain optimization tool ($6M+ savings, projected $25M annually)
  • HungerMap LIVE: Real-time crisis prediction platform (60-day advance notice for famines/droughts)
  • PRAA (Compute Orchestration): Token factory enabling multi-datacenter GPU aggregation for scaled inference

Benchmarks & Evaluation

  • olmOCR-Bench (Allen Institute for AI): Document intelligence benchmark; Sarvam models reported as state-of-the-art on table/handwriting/math detection
  • MMLU Pro: Knowledge benchmark
  • GPQA Diamond, AIME-25, Beyond AIME, HMMT: Reasoning & math benchmarks
  • LiveCodeBench: Programming/code generation
  • BrowseComp: Web-browsing benchmark for search-grounded answers
  • Agentic Tool-Calling Benchmarks: SWE-bench, τ²-bench (tool-calling competency)
  • Thinking Budget Metrics: Measurement of efficient reasoning under latency constraints

Infrastructure & Optimization

  • Mixture-of-Experts (MoE): Sparse activation, so only about 9B of the 105B model's parameters are used per token
  • Multi-head Latent Attention (MLA): Memory-efficient attention mechanism used in the 105B model
  • Custom Tokenizer: Optimized for Indian languages, reducing token bloat vs. English tokenizers
  • Edge Deployment: Models running on Qualcomm Snapdragon chipsets with offline capability and no data leaving the device
  • Smart Glass Integration: Sarvam Kaze (India-first AI-powered smartglasses designed & manufactured domestically)
  • Feature Phone Support: Sarvam 30B on Nokia phones; voice-only interface bypassing visual interaction
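
The sparse activation behind the mixture-of-experts entry above can be sketched with generic top-k routing. This is a toy illustration of a common MoE design, not a description of Sarvam's router, which the session summary does not detail; the experts, logits, and function names here are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy setup (hypothetical): 8 experts, each scaling its input by a constant.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]
logits = [0.0, 5.0, 1.0, 5.0, 0.0, 0.0, 0.0, 0.0]

selected = route_top_k(logits, k=2)
output = moe_layer(2.0, experts, logits, k=2)
print(selected, output)
```

Only 2 of the 8 toy experts run for this input, which is the sense in which a sparse model's per-token compute tracks its active rather than total parameter count.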

Data & Training

  • Pre-training Scale: 16 trillion tokens (mix of English, Indian languages, code)
  • Vision Pre-training: 400 million images for visual grounding
  • Language Coverage: 22 scheduled Indian languages + English
  • Document Dataset: Ancient manuscripts, government records, newspapers, textbooks (layout understanding via specialized fine-tuning)

Governance & Responsible AI

  • India AI Governance Framework (November 2025): "Techno-legal" approach embedding policy requirements in technology
  • Safe AI Sutra: Component of governance framework addressing algorithmic bias, trust, and safety
  • Incentivization Model: Government consideration of incentives for adopting safety frameworks beyond voluntary compliance

Developer Ecosystem

  • API Platform: SDKs in JavaScript, Python; integrations with LiveKit, Pipecat for voice
  • Startup Credit Program: ₹10 crore in startup credits (up to ₹3–10 lakh per eligible startup)
  • Free Tier: 33 hours ASR, 7 hours TTS, playgrounds, MCP support
  • User Base: 3+ lakh developers on model APIs; 100+ million interactions shipped; 10x ROI for customers
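
As a rough sense of the startup credit program's reach, the sketch below converts the figures listed above into a count of startups. It assumes the full ₹10 crore pool is distributed and that each grant falls between ₹3 lakh and ₹10 lakh; the summary's phrasing ("up to ₹3–10 lakh") is ambiguous, so this range is an interpretation rather than a stated rule.

```python
# Indian numbering units.
LAKH = 100_000
CRORE = 100 * LAKH

# Figures from the summary; the per-startup range is an interpretation.
POOL = 10 * CRORE
MIN_GRANT = 3 * LAKH
MAX_GRANT = 10 * LAKH

# If every startup received the largest grant, the pool covers the fewest startups.
fewest_startups = POOL // MAX_GRANT
# If every startup received the smallest grant, it covers the most.
most_startups = POOL // MIN_GRANT

print(f"Pool of Rs {POOL:,} covers between {fewest_startups} and {most_startups} startups")
```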

Global Deployment Patterns

  • UPI Analogy: Just as India's UPI became a global standard for real-time payments, agentic voice commerce could become a template for the Global South
  • Humanitarian Use Cases: WFP deploying AI for hunger prediction, damage assessment (48hr vs. 3 weeks), supply chain optimization in 90+ countries

Omitted / Truncated Content

The transcript contains significant repetition and technical audio artifacts (repeated words, transcription errors like "speed to tracking fraulan or safal utapan LVM3 M4 rocket"). The summary above captures the substantive content while filtering noise. Some panel discussions were cut short due to time constraints, and certain speaker interjections were incomplete in the original transcript.