FHIBE: Advancing Ethical AI Through Fair and Human-Centric Data

Executive Summary

This panel discussion at an AI summit explores how responsible AI can move from principle to practice, with emphasis on ethical data collection, governance frameworks, and multi-stakeholder accountability. Panelists from Sony Research, Mastercard, legal technology, and academia argue that responsibility must be embedded at the design stage—not retrofitted—and requires coordinated effort across technologists, legal experts, regulators, and educators.

Key Takeaways

  1. Responsibility is not a feature; it's a process. It must be embedded in problem design, data collection, model development, and deployment—not added at the end. Treat it as "continuous" and integral, not episodic.

  2. Accountability must be explicit and codified. As AI systems become more distributed (vendors, sub-vendors, deployers), legal and organizational clarity on who is responsible for failures is essential. Such clarity is currently missing in India.

  3. Data is the foundation of responsible AI. Fair compensation, informed consent, right to erasure, and transparency about training data matter more than downstream fairness metrics. Start with ethical data collection.

  4. Multi-stakeholder coordination is non-negotiable. Technologists, lawyers, ethicists, regulators, and educators must work together from the design phase. Siloed teams produce siloed solutions.

  5. India can shape global standards, not just adopt them. By combining voluntary industry standards with targeted regulation in high-risk domains, and by educating future technologists in responsible practices, India can avoid overly prescriptive frameworks (like the EU AI Act) while building trustworthy AI infrastructure.

Conference Talk Summary


Key Topics Covered

  • FHIBE Dataset – Sony Research's consensually collected, GDPR-compliant, globally diverse benchmark for fairness evaluation in human-centric computer vision tasks
  • Defining Responsible AI – Multiple perspectives: trust in financial services, legal compliance, ethical problem formulation, and design-by-principle approaches
  • Data Governance & Consent – Emphasis on consent, consent revocation, fair compensation, and legal compliance (DPDP Act in India)
  • Accountability & Liability – Gaps in Indian law regarding remedies for AI-related harm and unclear accountability chains in distributed AI ecosystems
  • Fairness vs. Performance Trade-offs – How organizations (Mastercard) navigate pressure to deploy quickly while maintaining responsible AI standards
  • Foundation Models & Transparency – Challenges of opaque training data in large language models and generative AI
  • Model Unlearning & Right to Erasure – Technical and legal requirements to implement DPDP Act compliance through machine unlearning
  • Educational Reform – Integrating responsible AI into academic curricula from K-12 through graduate levels
  • Regulatory Positioning – India's approach to balancing innovation, consumer protection, and global standard-setting
  • Structural Change for Responsibility – Long-term systemic interventions needed beyond incremental improvements

Key Points & Insights

  1. Responsibility Must Start at Problem Definition, Not Model Deployment

    • Dr. Mayang emphasizes that responsibility should be embedded at the earliest stage—defining the problem statement itself—creating a "rainfall effect" across all downstream components rather than being bolted on at the end.
  2. Fair Compensation for Data Contributors Is Critical But Overlooked

    • Shri from Sony identifies fair compensation as "very important but very less talked about," underscoring that ethical data collection includes paying annotators, data contributors, and intermediaries fairly.
  3. GDPR Compliance and Data Erasure Are Difficult Technical Problems

    • Dr. Mayang notes that implementing the DPDP Act's right to erasure in trained generative AI models requires machine unlearning techniques that are not yet mature or standardized—creating a gap between legal obligation and technical capability.
  4. Accountability Structures Are Missing in India

    • Aprajata highlights a critical legal gap: India lacks case law and clear remedies for AI-related harm. When distributed AI systems cause damage, it's unclear which entity (developer, deployer, or intermediary) bears responsibility.
  5. Security-by-Design as a Practical Operational Model

    • Ankur demonstrates that Mastercard operationalizes responsible AI through a "studio process" with gates at each development stage, integrating data strategy, governance, and product teams from the outset rather than siloing them.
  6. Voluntary Standards vs. Regulation Requires Nuanced Approach

    • Aprajata argues for a mixed model: voluntary frameworks for technical innovation (auditing, red-teaming, explainability) but mandatory regulation for high-risk domains (healthcare, finance, critical infrastructure) with clear accountability.
  7. Model Cards and Data Cards Are Underutilized Despite Existing Standards

    • Dr. Mayang notes that while model card and data card frameworks exist, the AI community rarely implements them, making it difficult to trace training data provenance and model properties.
  8. Consumer Awareness and Data Hygiene Are Preconditions for Responsible AI

    • Ankur observes that many Indian consumers share sensitive data without understanding consequences, requiring educational campaigns alongside industry and regulatory frameworks.
  9. Agentic AI Raises Unprecedented Accountability Questions

    • When autonomous agents make decisions (e.g., purchases, transactions), current governance structures cannot answer "who is responsible?"—requiring new regulatory definition before deployment at scale.
  10. Educational Intervention at K-12 Level Is Structural Change

    • Dr. Mayang's initiative to introduce AI ethics into NCERT textbooks for classes 11–12 (reaching ~2 crore students annually) aims to build a generation trained in responsible AI principles from the foundation.

Notable Quotes or Statements

"Responsible AI is scaled intelligence with no bias, it is fair, accountable and compliant to the regulations." — Ankur (Mastercard)

"Responsibility cannot be retrofitted; it must be embedded in design." — Dr. Mayang (IIT Ropar/IIT Jaipur)

"We need to start asking the question: 'How do we make it better before we even start building it?'" — Shri (Sony Research)

"The problem is not that a technology makes it challenging to erase data. The problem is that there's a very active legal obligation that exists [under DPDP Act]." — Aprajata (AZB & Partners)

"No matter how responsibly perfect your product is, when you're debating between shareholder interests, shareholder interest will always take primacy. So you need a more practical mindset." — Aprajata

"Agentic AI: The first question visitors ask is 'who will handle it if I raise a dispute?'" — Ankur (on unresolved accountability in autonomous systems)

"If you have to implement DPDP Act in true sense, you need mechanisms to unlearn generative AI models. We are not there yet." — Dr. Mayang


Speakers & Organizations Mentioned

| Speaker | Organization | Role / Focus |
|---|---|---|
| Shri | Sony Research India | Researcher; lead on FHIBE dataset |
| Ankur | Mastercard | VP, AI Garage & Center for Excellence (EMA regions); 20+ years AI experience |
| Aprajata | AZB & Partners | Partner specializing in AI, technology law, data governance; qualified in India, England, Wales, New York |
| Dr. Mayang | IIT Jaipur (formerly IIT Ropar) | Academic; lead on BTech AI/Data Science programs; NCERT AI textbook author |
| (Unnamed third panelist) | — | Delayed due to traffic; final introduction incomplete in transcript |

Related Entities:

  • Sony Research India
  • Mastercard
  • NCERT (National Council of Educational Research and Training)
  • IIT Jaipur, IIT Ropar, IIT Delhi
  • EU (EU AI Act referenced as cautionary example)
  • India's DPDP Act (Digital Personal Data Protection Act)

Technical Concepts & Resources

Datasets & Standards

  • FHIBE (Fair Human-centric Image Benchmark)
    • First globally diverse, consensually collected, GDPR-compliant fairness evaluation dataset
    • Supports 9 different computer vision tasks from a single dataset
    • 40+ dense, self-reported, QA-checked annotations per image
    • Includes subject-level attributes (skin tone, facial hair, hair color), environmental conditions (indoor/outdoor, light source), device metadata (camera, focal length)
    • Designed to diagnose bias across diverse demographics
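
The value of dense per-image annotations like these is that fairness evaluation reduces to slicing a model's error rate by subgroup. A minimal illustrative sketch (the field names and records below are hypothetical, not FHIBE's actual schema or API):

```python
from collections import defaultdict

def accuracy_by_group(records, group_key):
    """Group per-image evaluation records by a self-reported
    attribute and compute accuracy within each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        g = r[group_key]
        total[g] += 1
        correct[g] += int(r["prediction"] == r["label"])
    return {g: correct[g] / total[g] for g in total}

# Hypothetical evaluation records; attribute names are illustrative.
records = [
    {"skin_tone": "type_2", "lighting": "indoor",  "label": 1, "prediction": 1},
    {"skin_tone": "type_2", "lighting": "outdoor", "label": 0, "prediction": 0},
    {"skin_tone": "type_5", "lighting": "indoor",  "label": 1, "prediction": 0},
    {"skin_tone": "type_5", "lighting": "indoor",  "label": 1, "prediction": 1},
]

print(accuracy_by_group(records, "skin_tone"))
# {'type_2': 1.0, 'type_5': 0.5}
```

A gap between subgroup accuracies, as in the toy output above, is the kind of bias signal a benchmark like FHIBE is designed to surface.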

Frameworks & Processes

  • Model Cards – Standardized documentation of model properties, training data, performance, limitations
  • Data Cards – Standardized documentation of dataset composition, labeling, provenance, properties
  • Security-by-Design (Mastercard's approach) – Gated development process ensuring fairness, transparency, accountability, security, and regulatory compliance at each stage
  • Studio Process – Multi-stage governance with gates for due diligence, data strategy, bias assessment
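
To make the documentation idea concrete, a model card can be as simple as a structured record kept alongside the model. A minimal sketch (the fields here are illustrative; published model-card templates define a much richer structure):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelCard:
    # Illustrative subset of typical model-card fields.
    model_name: str
    intended_use: str
    training_data: str
    evaluation_metrics: dict = field(default_factory=dict)
    known_limitations: list = field(default_factory=list)

# Hypothetical example entry.
card = ModelCard(
    model_name="face-detector-v2",
    intended_use="Face detection on consented benchmark images",
    training_data="Internal dataset; provenance documented in a data card",
    evaluation_metrics={"AP@0.5": 0.91},
    known_limitations=["Lower recall in low-light indoor scenes"],
)

# asdict() gives a serializable form suitable for publishing with the model.
record = asdict(card)
```

Even this lightweight form addresses the traceability gap the panel raises: training-data provenance and known limitations travel with the model rather than living in someone's head.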

Technical Challenges

  • Machine Unlearning / Model Forgetting – Emerging capability needed to comply with right-to-erasure requirements in trained generative AI models
  • Right to Erasure (DPDP Act) – Legal requirement to remove personal data and its influence from trained models (not yet practically implemented at scale)
  • DPDP Act (India's Digital Personal Data Protection Act) – Requires consent, consent revocation, proportionality, purpose limitation, data erasure within 2 years
  • EU AI Act – Cautionary example of overly prescriptive regulation that may throttle innovation
  • GDPR Compliance – Referenced as a standard for ethical data handling
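
One published direction for exact unlearning (not mentioned by the panel, and not a production technique) is SISA-style sharded training: partition the data, train one sub-model per shard, aggregate predictions, and on an erasure request retrain only the affected shard instead of the whole model. A toy sketch with trivially simple per-shard "models":

```python
# Toy SISA-style sharding: each shard's "model" is just the mean of
# its labels, standing in for a real trained sub-model.

def train_shard(points):
    return sum(p["label"] for p in points) / len(points) if points else 0.0

class ShardedModel:
    def __init__(self, data, n_shards=2):
        # Partition the data and train one sub-model per shard.
        self.shards = [data[i::n_shards] for i in range(n_shards)]
        self.models = [train_shard(s) for s in self.shards]

    def predict(self):
        # Aggregate shard models (here: average of shard means).
        return sum(self.models) / len(self.models)

    def unlearn(self, point_id):
        # Erasure request: retrain only the shard holding the point.
        for i, shard in enumerate(self.shards):
            if any(p["id"] == point_id for p in shard):
                self.shards[i] = [p for p in shard if p["id"] != point_id]
                self.models[i] = train_shard(self.shards[i])
                return

data = [{"id": k, "label": k % 2} for k in range(6)]
m = ShardedModel(data)
m.unlearn(3)  # one contributor's point erased; only one shard retrained
```

The design trade-off is exactly the one the panel gestures at: retraining a shard is far cheaper than retraining the whole model, but sharding can cost accuracy, and nothing comparable yet exists at scale for large generative models.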

Educational Initiatives

  • BTech in Artificial Intelligence (India's first undergraduate AI program, IIT Delhi & IIT Hyderabad)
  • BTech in AI & Data Science (IIT Jaipur)
  • MLDDL Ops Course (IIT Jaipur) – Teaching responsible AI at design and operations levels
  • NCERT AI Curriculum (Classes 11–12) – Foundational responsible AI education targeting ~2 crore students annually

Emerging Domains

  • Agentic AI – Autonomous agent systems (e.g., autonomous transaction agents) requiring new accountability frameworks
  • Super-App / UPI Ecosystem (India) – Digital infrastructure with no current guidelines on data use in downstream AI processes

End of Summary