Quality Control in Healthcare AI

Executive Summary

This summit panel discussion emphasizes that unregulated AI in healthcare is fundamentally unsafe and requires rigorous quality control mechanisms spanning the entire AI lifecycle. While AI tools can significantly improve diagnosis, efficiency, and accessibility, particularly in resource-limited settings, they must be deployed incrementally, locally validated, and continuously monitored with clear regulatory frameworks and human oversight to protect patient safety and build clinician trust.

Key Takeaways

  1. Quality control is not optional—it is foundational. Without systematic QC spanning design through post-deployment monitoring, AI in healthcare is unsafe and unethical. The corollary: "No AI without QA" must be the standard.

  2. Local validation is mandatory, not optional. A tool validated in Boston or Berlin will not automatically work in Bangalore without rigorous local re-validation against local patient populations, diseases, and clinical workflows.

  3. AI is a decision-support tool, not a replacement for clinicians. The goal is to reduce clinician workload, improve efficiency, and extend expertise to under-resourced regions—not to automate human judgment away. Humans must remain in the decision-making loop.

  4. Regulatory clarity and institutional governance must precede deployment. India's regulatory framework (draft released October 2023), the EU AI Act's high-risk classification, and government support mechanisms (the ICMR clinical trial network, the MedTech Mitra portal, and the NIRMAN monitoring framework) are essential infrastructure.

  5. Trust is built through transparency, testing, and incrementalism. Patient and clinician trust in AI cannot be rushed. Incremental, step-by-step adoption with visible regulatory oversight, open communication about limitations, and demonstrated local effectiveness builds sustainable trust far better than aggressive roll-outs.

Key Topics Covered

  • Quality Control in AI Healthcare – The necessity of comprehensive QC frameworks beyond traditional performance metrics
  • Regulatory Landscape – India's evolving regulatory frameworks, EU AI Act classification of medical AI as high-risk, and the role of CDSCO
  • Clinical Validation & Real-World Deployment – The "cliff" between lab performance and clinical effectiveness; importance of local validation
  • Data Governance & Bias – Managing biased, incomplete, or non-representative datasets; fairness and equity concerns
  • Model Monitoring & Drift Detection – Post-deployment continuous monitoring to detect performance degradation and data distribution shifts
  • Accountability & Governance – Institutional responsibility, version control systems, and the "kill switch" for AI products
  • Human-AI Collaboration – Role separation frameworks, clinician-in-the-loop design, and AI as a decision support tool, not replacement
  • Ethical & Trust Concerns – The tension between AI adoption and patient trust; cognitive debt in clinical learning; natural vs. artificial intelligence
  • Workforce Development & Training – Building skills for local AI validation; preventing degradation of clinical expertise through over-reliance on AI
  • National AI Health Strategy – India's National AI Strategy for Health (released during session) and government support mechanisms for innovators

Key Points & Insights

  1. "AI without quality control is modern-day sorcery" – Dr. Vive Kandesh emphasized that uncontrolled AI deployment poses existential risks to medicine; poor quality AI can harm patients, worsen inequities, and expose organizations to regulatory and reputational damage.

  2. The Lab-to-Clinic Performance Cliff – Dr. Karthik highlighted a critical gap: AI models achieve high accuracy in controlled lab settings but experience dramatic performance degradation in real clinical environments with noisy, variable data. Accuracy scores that meet publication standards may drop substantially when deployed at population scale.

  3. Performance Metrics Are Insufficient – Moving beyond F1 scores, ROC curves, and accuracy, healthcare AI evaluation must focus on outcome-based metrics: Does the tool improve actual health outcomes? Is it cost-effective? What is the ROI?

  4. The False Negative Invisibility Problem – In decision-support workflows, false negatives (missed cases) often remain invisible because clinicians don't re-review cases the AI marked as "normal." This creates systemic blind spots requiring continuous control sample testing and inter-institutional data sharing.

  5. Product Definition & Version Control – Per regulatory definition, an approved AI "product" must be static and fixed; it cannot continuously learn post-deployment. New versions require re-validation and re-approval, creating a clear version control mechanism and "kill switch."

  6. Quality Control Must Span the Entire AI Lifecycle – From design, development, and validation through deployment, post-deployment monitoring, and feedback loops—each stage requires distinct controls. Continuous evaluation is more critical than one-time validation.

  7. Institutional, Not Algorithmic, Accountability – Regardless of the tools in the chain, the institution on the prescription letterhead bears legal and ethical accountability. This clarifies responsibility but requires institutions to implement robust governance policies and AI inventories.

  8. Local Validation & Contextualization are Non-Negotiable – Tools developed and validated in different geographic, demographic, or clinical contexts require local re-validation before deployment. Global "copy-paste" deployment of AI systems fails to serve local clinical priorities and populations.

  9. Role Separation Frameworks Over Replacement – AI excels at handling high-volume, repetitive tasks (e.g., triaging normal X-rays, flagging outliers for expert review, nudging clinicians to review prior decisions). Expert clinicians should remain in decision-making roles while AI reduces administrative burden and increases efficiency.

  10. Human Learning & Cognitive Debt Risks – Over-reliance on AI-assisted diagnosis may degrade clinical training and expertise in the next generation of physicians (e.g., spell-check effects on spelling, reduced ability to read normal imaging). Deliberate safeguards must preserve foundational clinical learning.


Notable Quotes or Statements

  • Dr. Vive Kandesh: "AI without quality control is modern-day sorcery... AI without quality control is absolute hogwash."

  • Dr. Vive Kandesh: "If quality control is not applied to AI, machine intelligence could be the last invention that humanity will ever need to make."

  • Dr. Vive Kandesh: "The core of medical practice is primum non nocere—do no harm. Patient safety must be at the center of AI healthcare tools."

  • Dr. Karthik Arappa: "Eventually, what you want is outcome-based evaluation. Are these tools truly improving health outcomes? It has to be outcome-based, not performance-metrics-based."

  • Dr. Karthik Arappa: "Tools that get developed locally and validated locally tend to serve better the clinical outcomes we want to accomplish."

  • Dr. Karthik Arappa: "[The science of AI is] about how do you deploy, where do you deploy, and when you deploy—ensuring that you follow the science of AI development."

  • Taruna (ICMR): "The regulator calls an AI tool a product only when it has stopped learning. If it continues to learn, it is in development; the product is fixed."

  • Professor Nishit (IIT Kanpur): "Any trust that the patient places will be on an institution... whichever institution is on the letterhead that signs the prescription has to have accountability, no matter what tools are in the chain."

  • Closing Statement (Panel): "The patient has to be at the center of everything. That has to be the bottom line... We practice with caution and incremental steps."


Speakers & Organizations Mentioned

Key Speakers:

  • Dr. Vive Kandesh – Surgeon Vice Admiral; clinician and uniformed officer; lead on clinical effectiveness and patient safety
  • Dr. Karthik Arappa – WHO Digital Health Division; medical professional; lead on national AI health strategy and outcomes-based evaluation
  • Taruna – ICMR (Indian Council of Medical Research); head of development research; regulator perspective on data integrity and clinical trials
  • Professor Nishit (IIT Kanpur) – Computer scientist; focus on accountability, open-source vs. regulated models, and control sample testing
  • Dr. Karthik (second speaker with this name) – Medical technologist; radiation oncology expert; focus on role separation frameworks and real-world use cases
  • Professor Tapan Gandhi – Joined the panel late; reinforced the human-in-the-loop philosophy and emphasized purpose-driven AI development

Organizations & Institutions:

  • WHO (World Health Organization) – Digital Health Division
  • CDSCO – Central Drugs Standard Control Organization (Indian regulator); drafting regulatory frameworks for AI (October 2023 draft; final framework expected in "a few months")
  • ICMR – Indian Council of Medical Research; operates 84 clinical trial sites across India, 170 medical colleges with research units, and rural health research units
  • IIT Kanpur – Indian Institute of Technology; research on AI governance and control mechanisms
  • IIT Bombay – Collaborating on curated, annotated gold-standard datasets for glioblastoma, breast cancer, and oral cancer (AI Kosha platform)
  • NITI Aayog – National policy think tank; guidance on the MedTech Mitra portal
  • National Health Authority – Developing crowdsourced data platform for regulatory oversight
  • AFMC – Armed Forces Medical College (Pune)
  • Department of Health Research – Government body managing clinical trial infrastructure and health technology assessment
  • DJFMS – Directorate of Medical Education (referenced in thanks)
  • FDA (U.S.) – Has authorized ~600 AI-enabled medical devices; cited as a contrast with unregulated AI tools

Technical Concepts & Resources

Quality Control & Validation Frameworks:

  • Control Sample Testing – Continuously circulating known positive/negative cases through deployed models to detect drift (addresses the false-negative invisibility problem)
  • Distribution Drift Detection – Monitoring whether model performance degrades as real-world data distributions shift from training distributions
  • Model Drift vs. Data Drift – Distinguishing changes in model weights/behavior from changes in input data characteristics
  • Performance Beyond Standard Metrics – Moving beyond accuracy, precision, F1 score, ROC curves to outcome-based measures (health impact, cost-effectiveness, ROI)
  • Post-Deployment Monitoring (Continuous Evaluation) – Real-time, ongoing assessment of model performance in clinical settings, not just one-time validation
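
The panel described these monitoring ideas only at a conceptual level; the sketch below is an illustrative combination of control sample testing and distribution drift detection, where the threshold "model", the simulated score streams, and the hand-built KS statistic are all assumptions for demonstration, not anything presented at the session.

```python
import bisect
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    def ecdf(sorted_sample, x):
        # Fraction of the sample at or below x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

def control_sample_sensitivity(model, control_cases):
    """Share of known-positive control cases the model still flags;
    a falling value surfaces otherwise-invisible false negatives."""
    positives = [c for c in control_cases if c["label"] == 1]
    return sum(model(c["score"]) for c in positives) / len(positives)

# Hypothetical stand-ins: a fixed threshold "model" and simulated score streams.
model = lambda score: 1 if score > 0.5 else 0
random.seed(0)
training_scores = [random.gauss(0.50, 0.10) for _ in range(1000)]
live_scores = [random.gauss(0.65, 0.10) for _ in range(1000)]  # shifted: drift

drift = ks_statistic(training_scores, live_scores)
controls = [{"score": 0.9, "label": 1}, {"score": 0.8, "label": 1},
            {"score": 0.2, "label": 0}]
sens = control_sample_sensitivity(model, controls)
print(f"KS drift statistic: {drift:.2f}, control-sample sensitivity: {sens:.2f}")
```

In a real deployment the drift statistic would be computed over incoming clinical data against the training distribution, and a sustained rise (or a drop in control-sample sensitivity) would trigger the review processes the panel called for.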

Regulatory & Governance Tools:

  • EU AI Act – Classifies most medical AI as high-risk; requires strict quality and monitoring
  • India's Draft AI Regulatory Framework (CDSCO, October 2023) – Establishes compliance requirements; final version expected within months
  • National AI Health Strategy (Released during this summit) – India's comprehensive strategy for AI in healthcare
  • Version Control for AI Products – Static, fixed products with regulatory re-approval for new versions (not continuous learning post-deployment)
  • AI Governance Policy – Institutional-level governance required for AI implementation
  • AI Inventory Management – Tracking embedded tools, devices, and cloud solutions across healthcare systems
  • Standardized Procurement – Systematic, controlled processes for acquiring AI tools
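
The "static, fixed product" rule described by the regulator can be enforced mechanically. The following is a minimal sketch of one possible enforcement, assuming a JSON-serializable weight store and a SHA-256 fingerprint taken at approval time; the class and field names are illustrative, not any regulator's prescribed mechanism.

```python
import hashlib
import json

def fingerprint(weights) -> str:
    """Deterministic hash of model parameters, fixed at approval time."""
    return hashlib.sha256(json.dumps(weights, sort_keys=True).encode()).hexdigest()

class ApprovedModel:
    """Serves predictions only while the weights match the approved fingerprint."""
    def __init__(self, weights, approved_hash):
        self.weights = weights
        self.approved_hash = approved_hash

    def predict(self, features):
        if fingerprint(self.weights) != self.approved_hash:
            # The "kill switch": any change makes this an unapproved new version.
            raise RuntimeError("weights differ from approved version; re-validate")
        return sum(w * x for w, x in zip(self.weights["coef"], features))

weights = {"coef": [0.4, 0.6]}
model = ApprovedModel(weights, fingerprint(weights))
pred = model.predict([1.0, 2.0])  # served: weights match the approved hash

model.weights["coef"][0] = 0.9  # simulated post-deployment "learning"
blocked = False
try:
    model.predict([1.0, 2.0])
except RuntimeError:
    blocked = True  # the changed model is refused until re-approved
```

Any continued learning changes the fingerprint, so the system degrades to a refusal rather than silently serving an unvalidated version, which is exactly the version-control-plus-kill-switch behaviour described above.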

Clinical Validation & Research Infrastructure:

  • ICMR Clinical Trial Network – 84 sites across India with endemic disease expertise (e.g., malaria sites, TB sites); curated for diverse populations
  • 170 Medical Colleges with Multidisciplinary Research Units – Distributed research capacity for localized validation
  • Rural Health Research Units – Field feasibility testing in low-resource settings
  • Health Technology Assessment (HTA) – Department of Health Research unit evaluating licensed AI tools against standard-of-care baselines using randomized clinical trials (RCTs)
  • Randomized Clinical Trials (RCTs) – Robust framework for comparative evaluation (AI-assisted diagnosis vs. clinician-alone); identified as gap in India's current AI validation landscape

Data & Datasets:

  • AI Kosha Platform – Repository of curated, annotated, metadata-rich gold-standard datasets:
    • Glioblastoma (CNS cancer)
    • Breast cancer imaging
    • Oral cancer imaging
    • Available to entrepreneurs and researchers for model development
  • Data Labeling & Curation – Ensuring high-quality, representative training data with expert annotation
  • Metadata Standards – Preserving clinical context and demographic information alongside imaging/clinical data

Support & Funding Mechanisms:

  • MedTech Mitra Portal – "Medtech Highway" launched December 25, 2023; one-stop shop for AI innovators offering:
    • Financial support
    • Technical support
    • Regulatory guidance (testing batches, manufacturing licenses, clinical trial protocol development)
    • Direct engagement with ICMR, CDSCO, NITI Aayog (18+ stakeholder board)
    • Supported 900+ innovators to date
  • Challenge Grants (First-in-the-World) – ICMR grants covering full TRL (Technology Readiness Level) 1–8 pipeline:
    • Phase 1: Proof of concept
    • Phase 2: Prototype
    • Phase 3: Product commercialization
  • NIRMAN Framework – A collaboration across six IITs to develop continuous performance-monitoring frameworks for deployed AI models

Role Separation & Workflow Design:

  • Role Separation Frameworks – Strategic division of labor: AI handles high-volume, low-complexity tasks (e.g., flagging abnormals); expert clinicians handle complex decisions, exceptions, and nuanced cases
  • Decision Support vs. Automation – AI designed to nudge, assist, or reduce workload rather than autonomously decide (e.g., peer review nudging in radiation oncology contouring)
  • Human-in-the-Loop Workflows – Clinician remains in decision-making loop; AI provides input, not directives
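
As a hedged sketch of the role-separation idea (the thresholds, queue names, and scores below are illustrative assumptions, not panel specifications): the AI's score only sorts cases into review queues, and every case still reaches a clinician.

```python
from dataclasses import dataclass, field

@dataclass
class TriageQueues:
    expert_review: list = field(default_factory=list)   # clinician decides first
    routine_batch: list = field(default_factory=list)   # still clinician-reviewed, lower priority

def route(case_id, ai_score, queues, low=0.2, high=0.8):
    """The AI sorts workload; it never issues a final decision."""
    if ai_score >= high:
        queues.expert_review.append((case_id, "flagged abnormal"))
    elif ai_score <= low:
        queues.routine_batch.append((case_id, "likely normal"))
    else:
        queues.expert_review.append((case_id, "uncertain"))  # doubt goes to the expert

queues = TriageQueues()
for case_id, score in [("xray-1", 0.95), ("xray-2", 0.05), ("xray-3", 0.50)]:
    route(case_id, score, queues)
```

Routing uncertainty to the expert queue by default keeps the human in the decision-making loop, while the routine batch captures the efficiency gains the panel attributed to AI handling high-volume, low-complexity work.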

Emerging Challenges Identified:

  • Cognitive Debt – Risk of clinician skill atrophy when AI reduces need for expertise (e.g., spell-check reducing spelling ability; AI reducing diagnostic skill in next generation)
  • Data Leakage & Privacy Risk – Hospitals reluctant to share data for crowdsourced control samples due to regulatory liability and privacy concerns
  • Rare Condition Control Samples – Single hospitals may see only 5 rare cases/year; insufficient for reliable model testing; requires inter-institutional collaboration
  • Usability & Transparency Issues – Ensuring AI tools integrate into clinical workflows and clinicians can understand/audit tool decisions
  • Cyber Security & Data Integrity – Managing security risks and data provenance in increasingly distributed healthcare systems

Additional Context

Summit Timing & Government Action:

  • The national AI health strategy document and the MedTech Mitra portal were released during this summit
  • This indicates government-level commitment to structured AI governance in India's healthcare system
  • The CDSCO regulatory framework (draft October 2023) will finalize "in a few months"

Implicit Tensions Highlighted:

  • Efficiency vs. Safety: Pressure to adopt AI for cost/speed must be balanced against patient safety
  • Automation vs. Expertise: Risk that over-automation degrades clinical learning and expert judgment
  • Global vs. Local: Tools validated globally may not transfer without local re-validation
  • Speed vs. Caution: Industry push for rapid deployment conflicts with clinician call for incremental, cautious adoption

Actionable Momentum: The discussion strongly suggests India is positioning itself as a regulated, outcome-focused leader in healthcare AI, distinct from less-regulated global adoption patterns. The infrastructure (clinical trial network, medtech portal, regulatory framework) and funding mechanisms are in place to support both innovators and clinicians in safe, locally-validated AI adoption.