
Implementing AI in Public Healthcare: Data, Ethics & System Readiness | India AI Impact Summit 2026

Executive Summary

This panel discussion examines how AI tools are being implemented in India's public healthcare systems, with a focus on maternal health, TB screening, and community health worker support. The core insight is that algorithmic sophistication delivers little unless data representation gaps, infrastructure constraints, and the broader healthcare ecosystem are addressed first. Effective AI deployment therefore has to be grounded in local context, existing workflows, and measurable health outcomes rather than in technological capability alone.

Key Takeaways

  1. Technology is Not the Bottleneck—Context Is: Successful AI in healthcare requires deep knowledge of local infrastructure, supply chains, protocols, staffing, and cultural norms before any model is deployed. Without this groundwork, even sophisticated AI fails to translate to health outcomes.

  2. Representativeness Requires Active Intent: Underrepresented populations will remain underrepresented unless intentional, non-extractive efforts are made to include them in data collection. Data solidarity and legislative protections against misuse are as important as data gathering itself.

  3. Human Oversight Is Non-Optional: LLM-based tools in healthcare must be bounded (RAG-based, protocol-grounded) and monitored by humans. "Tech plus touch" is not a nice-to-have—it's a requirement for safety.

  4. Continuous Local Validation Beats Centralized Rollout: Rather than scaling a single solution nationally, design platforms that allow local teams to test, evaluate, and adapt solutions to their specific populations and constraints (PMAN framework / massively online A/B testing).

  5. Voice & Language Accessibility Are Equity Issues: Solutions that assume smartphone ownership, app literacy, and English/Hindi proficiency will exclude the poorest and most vulnerable. Voice-based interfaces and multilingual support are technical requirements, not luxuries.

Key Topics Covered

  • Data Representation & Bias: Mismatches between training data (US/China) and deployment contexts (India); differential performance across populations (23% higher false negative rates for pneumonia detection in certain populations; melanoma detection errors in darker skin tones)
  • Infrastructure Gaps: Computational resource limitations, human capacity building, offline-capable tools, multilingual accessibility, and digital literacy barriers
  • Translation Gap: The disconnect between predictive models and actionable health outcomes—diagnostic capability without therapeutic power
  • Community Health Worker (CHW) Integration: LLM-based chatbots (WhatsApp, voice-based) designed to support ASHAs, ANMs, and other frontline workers
  • Protocol-Driven AI: Using retrieval-augmented generation (RAG) to ground chatbots in validated government protocols rather than free-form LLM outputs
  • Heterogeneity in India: Geographic, socioeconomic, cultural, and linguistic diversity requiring locally tailored solutions rather than one-size-fits-all deployments
  • Data Ecosystems & Privacy: Data solidarity, DPDP Act implications, health data exchanges, and shifting incentives around data collection
  • Evidence Generation: Moving beyond RCTs toward "massively online A/B testing" (PMAN framework) for continuous validation in real-world settings
  • Last-Mile Populations: Focus on underrepresented groups—migrant mothers, tribal populations, religious minorities, adolescent mothers—whose data is often absent from datasets
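The differential-performance concern in the list above can be checked with a per-group audit. The following is a minimal sketch, assuming labelled evaluation records that carry a population-group tag; the function name, group names, and counts are made up for illustration and are not figures from the panel.

```python
from collections import defaultdict


def false_negative_rates(records):
    """Return {group: FNR}, where FNR = missed positives / all true positives.

    Each record is (group, has_condition, model_flagged).
    """
    positives = defaultdict(int)
    missed = defaultdict(int)
    for group, has_condition, model_flagged in records:
        if has_condition:
            positives[group] += 1
            if not model_flagged:
                missed[group] += 1
    # Groups with no true positives are skipped: their FNR is undefined.
    return {g: missed[g] / positives[g] for g in positives}


# Illustrative, invented evaluation set (not data from the session).
records = (
    [("group_a", True, True)] * 90
    + [("group_a", True, False)] * 10
    + [("group_b", True, True)] * 80
    + [("group_b", True, False)] * 20
    + [("group_b", False, False)] * 50
)

rates = false_negative_rates(records)
# Relative gap: how much higher the FNR is in group_b than in group_a.
relative_gap = rates["group_b"] / rates["group_a"] - 1
print(rates, relative_gap)
```

Reporting the rate per group, rather than one pooled accuracy number, is what makes a training-data mismatch visible before deployment.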

Key Points & Insights

  1. Data Bias is Foundational: Over half of clinical AI models use US or China training data, creating systematic failures when deployed in India. Example: 23% higher false negative rates for pneumonia detection in certain populations due to training data mismatch.

  2. Grounding Prevents Hallucination: Successful implementations (ARMMAN's ANM chatbot, Khushi Baby's ASHA chatbot) use a RAG architecture that limits responses to validated protocols and government guidelines rather than free-form LLM outputs. This reduces the risk of unsafe or inaccurate answers.

  3. Context Must Precede Technology: Before deploying a chatbot or AI tool, ground-truthing is essential—understanding medication supply chains, follow-up infrastructure, clinic availability, trained staff, and state-specific protocols. Generic tools fail without this foundation.

  4. Voice-Based Interaction Solves the Digital Divide: Non-smartphone voice interfaces (accessible via phone calls, with auto-language recognition) bypass the assumption of app-based digital literacy and smartphone ownership—critical for reaching true last-mile users.

  5. Human-in-the-Loop is Non-Negotiable: Even well-received chatbots (97–98% user satisfaction) require human oversight for edge cases, complex queries, and safety-critical decisions. "Tech plus touch" is essential.

  6. Heterogeneity Demands Local Validation, Not Global Models: India's internal diversity (geographical, linguistic, cultural, socioeconomic) is so pronounced that a single national AI solution will fail in some regions. Platforms for local evaluation and deployment are needed.

  7. Data Gaps Are Structural and Gendered: Minorities and underrepresented groups (migrant mothers, tribal populations, Muslim women, adolescent mothers) are often absent from datasets entirely, and women's health data is itself historically male-biased. Active, intentional work to digitize and include these populations is a prerequisite.

  8. Procurement & Benchmarking Are Unsolved: No standard exists for comparing AI health solutions across implementations. Without benchmarking mechanisms, procurement becomes paralyzed and decision-makers cannot identify which solution works for their context.

  9. Incentives Drive Data Quality: Reframing data reporting as opportunity rather than punishment (e.g., high TB mortality rates trigger investigation, not blame) encourages honest reporting and identifies where communities are being missed.

  10. Language and Colloquial Terminology Matter Enormously: Medical terms translated directly (e.g., "diabetes") may not be understood by end users, who use colloquial terms instead ("sugar"). Likewise, anemia is better conveyed through the colloquial Hindi term ("kami") than through the clinical word.
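Point 10 above suggests a normalization step between what a health worker types or says and the vocabulary a protocol uses. A minimal sketch follows, assuming a hand-curated lookup table. The mapping is illustrative only: "sugar" comes from the discussion, while "khoon ki kami" is an assumed fuller form of the "kami" term mentioned for anemia, and `normalize_query` is a hypothetical helper name.

```python
import re

# Hypothetical, hand-curated mapping from colloquial terms to protocol terms.
# A real table would be built with frontline workers, per language and region.
COLLOQUIAL_TO_PROTOCOL = {
    "sugar": "diabetes",
    "khoon ki kami": "anemia",  # assumed expansion of "kami"
}


def normalize_query(query: str) -> str:
    """Lower-case the query and swap colloquial terms for protocol terms."""
    text = query.lower()
    # Replace longer phrases first so multi-word terms are handled as a unit.
    for phrase in sorted(COLLOQUIAL_TO_PROTOCOL, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, COLLOQUIAL_TO_PROTOCOL[phrase], text)
    return text


print(normalize_query("Patient has sugar and khoon ki kami"))
```

The word-boundary match keeps unrelated words such as "sugarcane" untouched. The mapping should also run in the opposite direction when the answer is shown, so the user reads the term they actually use.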


Notable Quotes or Statements

"If there's no medication supply or no follow-up infrastructure or no accessible clinic or no trained doctors, we still can't act on it. And what that results in is a diagnostic gap without any therapeutic power." — Opening remarks on the translation gap

"The chatbot doesn't draw from the worldwide web. It draws answers only from the universe of the protocol that we created, which has been vetted so thoroughly at every level." — Dr. Aparna Hegde (ARMMAN), on protocol-grounded AI design

"Tech plus touch." — Dr. Aparna Hegde, summarizing the essential combination of AI and human oversight

"If we wait for models to be built until we have all the data, we will never build the models. What we need instead is a way to test these models continually, look at where they are going wrong, and use that to retune and retrain them." — Dr. Anurag Agrawal (Ashoka University), on iterative local validation vs. waiting for perfect datasets

"India itself is so heterogeneous that any one central program around the country is not going to work well. What you need are platforms to locally evaluate your problems." — Dr. Anurag Agrawal, on the heterogeneity problem

"Data is not something to be afraid of but an opportunity that we can build on." — Dr. Mithun Kumar Mitra (IIT Bombay), reframing the data reporting mindset

"Even the way that the ASHA is able to ask questions from the chatbot—they will not be the same as what you and I might end up doing. That's another element of literacy." — Urva Shivatal (Khushi Baby), on designing for actual user behavior

"The problem is not in creating the plan. The problem is in getting people to manage the plan." — Dr. Anurag Agrawal, on palliative care and implementation vs. diagnosis


Speakers & Organizations Mentioned

Panelists

  • Dr. Aparna Hegde — Founder, ARMMAN; Gynecologist & social entrepreneur (maternal and child health focus)
  • Urva Shivatal — Country Lead, Evidence & Impact, Khushi Baby; formerly Jhpiego
  • Dr. Anurag Agrawal — Dean of Biosciences & Health Research, Ashoka University; Physician-scientist, genomics leader
  • Dr. Mithun Kumar Mitra — Professor of Physics, IIT Bombay; Theoretical physicist working on living systems and TB analytics

Organizations & Initiatives

  • ARMMAN — NGO providing mHealth solutions and training to community health workers across 7 Indian states; 500,000+ healthcare workers reached
  • Khushi Baby — Not-for-profit strengthening public health systems through data; works with ASHA workers in Rajasthan and Maharashtra
  • Microsoft Research — Technology partner for ASHA chatbot; provides CSR/philanthropic support
  • Hopkins (Johns Hopkins University) — Collaborator on TB post-treatment outcomes modeling
  • National Disease Modeling Consortium — Set up by Ministry of Health & Family Welfare; brings together programmatic expertise, ICMR, epidemiologists
  • Pallium India — Partner in palliative care knowledge repository work
  • Bhashini — Multilingual language support platform referenced
  • Gemini — Language model platform referenced
  • Gates Foundation — Supporting voice data benchmarking initiative
  • Government of India (MoHFW) — Created ANM protocols (2019 partial care guidelines); involved in health data exchange planning
  • Indian states: Telangana, Uttar Pradesh, Maharashtra, Rajasthan, Bihar, Odisha (mentioned as geographies of deployment or disease burden)

Technical Concepts & Resources

AI/ML Approaches

  • Retrieval-Augmented Generation (RAG): Core architecture used in both the ARMMAN and Khushi Baby chatbots to ground responses in validated protocols/guidelines
  • Large Language Models (LLMs): GPT-based and other models tuned on local data
  • Multimodal Input: Voice and text-based queries; voice-to-text conversion
  • Multi-Agentic Workflows: AI systems that cross-check and validate before responding
  • Massively Online A/B Testing (PMAN): Proposed framework for continuous, real-world validation of AI solutions in diverse user populations rather than centralized RCTs
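The protocol-grounding idea in the list above (answer only from vetted text, otherwise say "I don't know") can be sketched without any model call. This is a minimal illustration, assuming a tiny in-memory store and keyword overlap as a stand-in for a real retriever. The snippet IDs, placeholder texts, threshold, and function names are all hypothetical, and it is not the architecture of either chatbot discussed.

```python
import re

# Hypothetical vetted store; the texts are placeholders, not clinical guidance.
PROTOCOL_SNIPPETS = {
    "P-01": "placeholder guidance on anemia screening during pregnancy",
    "P-02": "placeholder guidance on vaccination schedule counselling",
}
FALLBACK = "I don't know. Escalating to a human supervisor."


def tokens(text):
    """Split text into a set of lower-case alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, min_overlap=2):
    """Return the ID of the best-matching snippet, or None if nothing is close.

    Keyword overlap is a crude stand-in for embedding similarity.
    """
    q = tokens(query)
    best_id, best_score = None, 0
    for snippet_id, text in PROTOCOL_SNIPPETS.items():
        score = len(q & tokens(text))
        if score > best_score:
            best_id, best_score = snippet_id, score
    return best_id if best_score >= min_overlap else None


def answer(query):
    """Answer only from the vetted store; abstain when retrieval finds nothing."""
    snippet_id = retrieve(query)
    if snippet_id is None:
        return FALLBACK
    # A real system would pass the snippet to an LLM as the only allowed context.
    return f"[{snippet_id}] {PROTOCOL_SNIPPETS[snippet_id]}"


print(answer("anemia screening in pregnancy"))
print(answer("what is the weather today"))
```

The design choice worth noting is the explicit abstention path: when retrieval finds nothing relevant, the system returns a fixed fallback rather than letting a model improvise.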

Data & Evidence Methods

  • Epidemiological Modeling: Traditional statistical + machine learning models to account for heterogeneity
  • RCT (Randomized Controlled Trials): Acknowledged as insufficient for heterogeneous Indian contexts
  • Voice-to-Digital Data Conversion: Technology to capture interactions and build data repositories from voice-based interactions (Bhashini, others)
  • Digital Health Data Exchange: Proposed federated architecture allowing data use without centralized storage (DPDP Act–compliant)
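Continuous local validation, as opposed to a single centralized trial, amounts to comparing two variants separately at each site. The sketch below assumes a standard two-proportion z-test applied per site; the site names, counts, and function names are invented for illustration, and nothing here is claimed to be the PMAN framework itself.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """z-statistic for the difference in success rates (B minus A)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # all successes or all failures: no measurable difference
    return (p_b - p_a) / se


def compare_by_site(results, z_threshold=1.96):
    """Judge each site on its own data instead of pooling all sites."""
    verdicts = {}
    for site, (sa, na, sb, nb) in results.items():
        z = two_proportion_z(sa, na, sb, nb)
        if z > z_threshold:
            verdicts[site] = "B better"
        elif z < -z_threshold:
            verdicts[site] = "A better"
        else:
            verdicts[site] = "no clear difference"
    return verdicts


# Invented counts: (successes_A, users_A, successes_B, users_B) per site.
results = {
    "site_1": (400, 1000, 480, 1000),
    "site_2": (450, 1000, 440, 1000),
}
verdicts = compare_by_site(results)
print(verdicts)
```

Checking live data repeatedly inflates false positives, so a real deployment would need a sequential-testing correction, which this sketch omits.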

Implementation Platforms & Tools

  • WhatsApp-based Chatbots: Leveraging existing user behavior; used by both ARMMAN (ANMs) and Khushi Baby (ASHAs)
  • Voice-based Phone Interfaces: Call-in systems with auto-language recognition (referenced as in-test mode)
  • Learning Management Systems (LMS): App-agnostic platforms delivering video, simulations, interactive quizzes on protocols
  • RCH (Reproductive and Child Health) Database: Government database in states like Telangana; used to track outcomes and layer additional questions
  • Government Protocols: ANM protocols (35 high-risk conditions), color-coded, algorithmic, state-specific modifications

Key Datasets & Data Sources

  • Programmatic Data: Government TB, LF, and other disease surveillance databases
  • Local Surveys & Studies: Community-level data to supplement national datasets
  • Entomological Data: Largely absent for vector-borne diseases (malaria, dengue, lymphatic filariasis, encephalitis)
  • Socioeconomic & Cultural Data: Missing or underrepresented; critical for understanding heterogeneity

Health Domains Discussed

  • Maternal & Child Health: High-risk pregnancy detection, ANM decision support
  • Tuberculosis: Post-treatment outcomes, recurrent TB detection, screening in high-risk populations
  • Lymphatic Filariasis ("Elephantiasis"): Neglected tropical disease modeling; vector biology; mass drug administration compliance
  • Anemia Detection: Community-level screening via digital tools
  • Palliative Care: End-of-life planning communication and adherence

Terminology & Concepts

  • Last-Mile Populations: Poorest, most remote, hardest-to-reach communities
  • Health Heterogeneity: Geographic, socioeconomic, linguistic, cultural variation requiring localized solutions
  • Data Solidarity: Legislative + educational approach to protecting data subjects while enabling research use
  • Data Representation: Ensuring underrepresented populations are included in training datasets
  • Data as Performance Metric vs. Opportunity: Reframing honest reporting of gaps/failures as diagnostic opportunity rather than accountability threat
  • Human-in-the-Loop: Hybrid AI-human decision-making, especially for safety-critical scenarios
  • Offline-Capable Tools: Solutions functional without reliable internet
  • Digital Literacy & Readiness: Ability to use smartphone/feature phone; language proficiency; familiarity with app/voice interfaces
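The human-in-the-loop concept listed above can be expressed as a routing rule that decides when a chatbot reply goes straight to the user and when a person reviews it first. The following is a minimal sketch under stated assumptions: the confidence score, the threshold, and the keyword list are hypothetical placeholders, not values used by any deployment mentioned in the session.

```python
from dataclasses import dataclass

# Hypothetical trigger words; a real list would come from clinical protocols.
SAFETY_KEYWORDS = {"bleeding", "unconscious", "seizure"}


@dataclass
class BotReply:
    query: str
    answer: str
    confidence: float  # assumed 0..1 score from the retrieval step


def route(reply: BotReply, min_confidence: float = 0.8) -> str:
    """Return 'human_review' or 'auto_send' for a drafted reply."""
    words = set(reply.query.lower().split())
    # Safety-critical queries always go to a person, whatever the confidence.
    if words & SAFETY_KEYWORDS:
        return "human_review"
    # Low-confidence answers are treated as edge cases.
    if reply.confidence < min_confidence:
        return "human_review"
    return "auto_send"


print(route(BotReply("heavy bleeding after delivery", "draft", 0.95)))
print(route(BotReply("vaccination schedule", "draft", 0.90)))
```

Placing the safety check before the confidence check means a confident answer to a dangerous query is still reviewed.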

Summary Table: Key Implementations

| Initiative | Organization | Target Users | Technology | Coverage | Key Innovation |
| --- | --- | --- | --- | --- | --- |
| Integrated High-Risk Pregnancy Management | ARMMAN | ANMs, medical officers, specialists | LLM chatbot (WhatsApp) + RCH database | 7 states, 14,000 ANMs trained (scale to 12,000 by March) | Protocol-grounded RAG; state-specific modifications; 10-second latency; voice + text input |
| ASHA Sahili | Khushi Baby | ASHA workers (frontline community health) | LLM chatbot (WhatsApp) + government guidelines | Rajasthan, Maharashtra; 11,000 ASHAs | RAG-based safety guardrails; addressed counseling, nutrition, vaccination queries; confidence booster |
| TB Post-Treatment Outcomes | IIT Bombay + Hopkins | District TB officers, state programs | Epidemiological + ML models | India-wide (programmatic data) | Local heterogeneity modeling; identifies lost-to-follow-up, migrant workers; tailored by geography |
| Lymphatic Filariasis Elimination | National Disease Modeling Consortium | State health programs, district officers | Statistical + ML modeling | Endemic districts (Bihar, Odisha, Maharashtra) | Integrates human behavioral data + entomological gaps; identifies "never treated" populations |
| Palliative Care Planning | Ashoka University (with Pallium India) | End-of-life patients & families | Multi-agentic GenAI workflow; voice interface | In development/test mode | Addresses implementation gaps rather than diagnosis; builds knowledge repositories; structured "I don't know" responses |
| Voice-Based Health Interface | Ashoka University (with Bhashini, Gemini) | Low-digital-literacy users | Phone call interface; auto-language recognition; voice-to-text | In test mode | Solves digital divide; no smartphone/app literacy required; multilingual (Hindi, English, Marathi pilot) |

Critical Gaps & Future Work

  1. Data Representation: Systematic inclusion of underrepresented populations (migrant mothers, tribal, religious minorities, adolescents)
  2. Procurement Standards: Benchmarking framework for comparing AI health solutions across contexts
  3. Health Data Exchange: Federated data architecture (DPDP-compliant) allowing research use without centralization
  4. Entomological Data: Large-scale vector surveillance for malaria, dengue, and LF
  5. Voice-to-Digital Infrastructure: Scaling technology to reliably convert voice interactions into structured data
  6. Evidence for Scale: Understanding how solutions proven at 500–11,000 users perform at state or national scale
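The federated exchange named in item 3 rests on one idea: patient-level rows stay at the site, and only summaries travel. The sketch below is a minimal illustration of that idea, assuming each site shares simple counts. The record layout and function names are hypothetical, and this is neither a description of any planned exchange nor a privacy guarantee, since small counts can still identify individuals.

```python
def local_summary(records):
    """Runs at each site: reduce patient-level rows to counts only."""
    return {
        "n": len(records),
        "positive": sum(1 for r in records if r["positive"]),
    }


def combine(summaries):
    """Runs centrally: sees only the per-site counts, never the rows."""
    n = sum(s["n"] for s in summaries)
    positive = sum(s["positive"] for s in summaries)
    return {"n": n, "positive": positive, "rate": positive / n if n else None}


# Invented rows held at two sites.
site_a = [{"positive": True}, {"positive": False}, {"positive": False}]
site_b = [{"positive": True}, {"positive": True}, {"positive": False}, {"positive": False}, {"positive": False}]

pooled = combine([local_summary(site_a), local_summary(site_b)])
print(pooled)
```

Only the output of `local_summary` would cross a site boundary in this sketch; the central step has no access to the underlying records.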

This summary preserves the transcript's emphasis on implementation realism, context-driven design, and the centrality of human infrastructure and data justice to any successful AI health deployment in India.