Governing Autonomy: Trust in Agentic AI Systems
Executive Summary
This panel discussion addresses the critical challenge of governing autonomous AI agents as they move from experimental systems to production deployment across enterprises and public services. The speakers argue that governance must be embedded as a foundational layer from the start—not retrofitted—and emphasize that trust infrastructure, transparent standards, and collaborative frameworks are essential to enable safe, scalable agentic AI ecosystems while preserving innovation.
Key Takeaways
- Governance is not a bottleneck to innovation; it's the foundation for scaling. Companies that embed trust frameworks, auditability, and observability into product design from day one achieve faster enterprise adoption and market leadership. The tension between "moving fast" and "governing responsibly" is false; responsible deployment is how you scale.
- Open-source governance frameworks and standards are force multipliers. Projects like Cognizant's NeuroAI multi-agent accelerator (Apache 2.0 licensed, security-by-design) democratize access to properly governed agentic systems. Startups cannot afford proprietary governance stacks; standardized, open-source frameworks are essential for broad adoption.
- Trust infrastructure must evolve from centralized to federated. As agents from different organizations and using different LLMs interact, monolithic trust systems will fail. The solution is decentralized trust monitors, agent identity registries, and federated governance frameworks that work across organizational and model boundaries, similar to how the early internet solved identity and discovery.
- Real-time observability beats post-hoc investigation. Kill switches, token-level logging, agent communication trails, and live monitoring are non-negotiable because agents operate continuously across networks. Waiting until something goes wrong to investigate is too late; continuous runtime governance is the standard.
- Regulatory frameworks should come after maturity, but policy framing should come now. Early excessive regulation stifles innovation; premature certification requirements create barriers to entry. However, policy framing that outlines the bounds of acceptable behavior, without mandating solutions, allows safe experimentation while guiding collective development.
Key Topics Covered
- Autonomy as a spectrum: Understanding agents not as a binary category but as existing on a continuum from basic chatbots to fully autonomous systems
- Governance stack architecture: Five-layer model covering build time, deploy time, runtime, remediation, and accountability
- Trust and identity infrastructure: Agent identity registries, certifications, and "minimal viable trust" frameworks for startups
- Runtime governance vs. post-hoc solutions: The necessity of real-time monitoring and kill switches rather than after-the-fact remediation
- Multi-agent coordination and orchestration: Managing complex interactions between multiple agents while maintaining safety and auditability
- Regulatory lessons from other industries: Applying safety frameworks from aviation, drones, and nuclear power to agentic AI
- Standards and evaluation benchmarks: The need for transparent testing methodologies specific to multi-agent systems
- Human-in-command vs. human-in-the-loop: Evolution from constant human approval to supervised autonomy as systems mature
- Open-source and commons approaches: Making governance infrastructure accessible to startups and smaller organizations
- Decentralized/federated trust models: Building systems that work across multiple LLM providers and organizations
Key Points & Insights
- Autonomy requires foundational governance, not afterthought solutions: Governance must be treated as a first-principle concern built into system architecture from inception, not added later. This mirrors how internet standards (identity, discovery, trust) were designed as foundational primitives in a federated system.
- Agents operate across a continuum, not as a binary state: From Google's Deep Research (limited autonomy, no action) to autonomous vehicles (full end-to-end action), agents exist on a spectrum of autonomy across dimensions including memory, planning horizons, and decision-making authority. This requires differentiated governance approaches.
- The "minimal viable trust stack" for startups: Even resource-constrained startups need four core elements to launch agentic systems: (a) a clearly defined agent identity registry, (b) guardrails at the orchestration layer, (c) a real-time observability architecture, and (d) clear oversight mechanisms. These are not expensive but are non-negotiable; a minimal sketch follows this list.
- Governance is becoming a competitive advantage, not a compliance burden: Enterprises purchasing AI systems today spend millions on governance upfront because the costs of failure are so high. Teams embedding governance into product design and a weekly operating cadence ("security is governance") win on GTM; this has become table stakes for Series A funding conversations.
- Data plane innovation + control plane governance must evolve in parallel: Rather than choosing between open innovation and strict governance, the model should be: keep the data plane (innovation) open and experimental, but build governance controls (the control plane) in parallel as a foundational layer. These are not trade-offs.
- Real-time observability and incident tracking are critical: Unlike traditional software, agents operate continuously across networks. Granular token-level logging, clear communication trails between agents, energy/carbon metrics, and cost tracking are essential for auditability and post-incident learning.
- Human-in-command, not human-in-the-loop: Aviation evolved from requiring pilots to maintain visual line of sight (VLOS) to beyond-visual-line-of-sight (BVLOS) operations once detect-and-avoid systems proved safer than human judgment. Similarly, agentic AI should evolve from constant human approval to humans in supervisory command roles as systems mature and prove reliable.
- Multi-agent systems require multi-model evaluation frameworks: Current safety benchmarks assume single-model agents. Real-world systems will integrate multiple LLMs from different providers (open-source, proprietary, specialized). Evaluation frameworks must account for inter-agent communication, trust boundaries, and emergent behaviors across heterogeneous systems.
- Auditability and traceability infrastructure enable collective learning: A global, open-source platform for sharing anonymized audit trails and incident data across organizations would accelerate learning. Currently, startups and enterprises solve similar problems independently; shared infrastructure for incident reporting would prevent repeated failures.
- Demystification and commons approaches reduce unnecessary fear: Much anxiety about agents stems from jargon and inaccessibility. Treating agentic AI infrastructure as a public utility or commons (like Aadhaar and UPI in India's digital ecosystem) and making governance tooling open-source ensures that not only well-funded teams can build responsibly.
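To make the "minimal viable trust stack" concrete, here is a minimal sketch in Python wiring together the four elements named above: (a) an identity registry, (b) an orchestration-layer guardrail, (c) real-time observability, and (d) a kill switch for oversight. All names (`MinimalTrustStack`, `AgentRecord`, and so on) are illustrative assumptions, not an API from any framework mentioned in the session.

```python
import logging
import time
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("trust-stack")

@dataclass
class AgentRecord:
    """(a) Identity registry entry: who the agent is and what it may do."""
    agent_id: str
    creator: str
    allowed_actions: set[str]

class MinimalTrustStack:
    def __init__(self) -> None:
        self.registry: dict[str, AgentRecord] = {}  # (a) identity registry
        self.killed: set[str] = set()               # (d) kill-switch state

    def register(self, record: AgentRecord) -> None:
        self.registry[record.agent_id] = record

    def kill(self, agent_id: str) -> None:
        """(d) Real-time stop: block all further actions by this agent."""
        self.killed.add(agent_id)
        log.warning("kill switch engaged for %s", agent_id)

    def authorize(self, agent_id: str, action: str) -> bool:
        """(b) Guardrail at the orchestration layer; (c) observable by default."""
        record = self.registry.get(agent_id)
        allowed = (
            record is not None
            and agent_id not in self.killed
            and action in record.allowed_actions
        )
        # (c) Observability: every authorization decision is logged as it happens.
        log.info("t=%.3f agent=%s action=%s allowed=%s",
                 time.time(), agent_id, action, allowed)
        return allowed

stack = MinimalTrustStack()
stack.register(AgentRecord("support-bot-7", "acme-corp", {"read:kb", "draft:reply"}))
assert stack.authorize("support-bot-7", "draft:reply")
stack.kill("support-bot-7")
assert not stack.authorize("support-bot-7", "draft:reply")
```

An orchestrator would call `authorize` before dispatching any tool call, and a human supervisor or automated monitor can call `kill` at any moment, matching the "stop it on a real-time basis" requirement from the quoted remarks below.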
Notable Quotes or Statements
"Governance has to be the control plane, and innovation has to be the data plane. They have to be built in parallel. We cannot choose one versus the other."
— Mahes (Project Nanda)
"You need to have a minimal viable trust tag. You need to be able to tell what is this agent supposed to do, is it actually doing what it's supposed to do, and if not, can you actually have someone stop it on a real-time basis."
— Apu Goyel (Insight Partners)
"Governance inside" [is the new "internet inside"]. With so much uncertainty, enterprises are spending millions on governance upfront because the costs of things going wrong are so high.
— Apu Goyel
"We should be thinking of agents as living on a continuum from very basic autonomy all the way to autonomous vehicles that are fully end-to-end. We can't say agents are only the ones that take actions."
— Ellie Safi (Google)
"Similar to aviation, which moved from human always keeping visual line of sight to beyond visual line of sight, agentic AI will move from human-in-the-loop to human-in-command."
— Ellie Safi
"If governance is not an embedded conversation in your product, it will not be a competitive edge for you in the GTM."
— Apu Goyel
"The greatest power is people... a community that was heavily representative of lawyers, doctors, artists, and engineers, because they understood the contextual problem so well."
— Aresha (IEEE Standards Association) on multi-stakeholder governance
Speakers & Organizations Mentioned
| Speaker | Organization | Role |
|---|---|---|
| Ellie Safi | Google | Policy Manager, AI & Emerging Tech (Agentic AI & Robotics) |
| Mahes | Project Nanda | Pioneer of foundational infrastructure for internet of AI agents |
| Aresha | IEEE Standards Association | Managing Director; member of the management council at IEEE |
| Apu Goyel | Insight Partners | Principal; leads US-India investing efforts (~$90B AUM) |
| Pravin | Cognizant AI Lab (Bangalore) | Head of AI Lab; presenter of NeuroAI framework demo |
| Amir | AI Commons | Implied moderator; leads an organization focused on AI as a commons |
| — | Google Deep Research | Agentic feature cited as an example of limited autonomy |
| — | Cognizant | Developing multi-agent accelerator frameworks |
| — | OpenAI | Referenced for GPT-4 and agentic experiments |
| — | Government of India | Aadhaar, UPI, DigiLocker (digital infrastructure examples) |
| — | FAA (Federal Aviation Administration, US) | Referenced for drone/autonomy regulations (VLOS to BVLOS) |
Technical Concepts & Resources
Frameworks & Tools
- NeuroAI Multi-Agent Accelerator (Cognizant): Low-code/no-code framework for rapid multi-agent prototyping and production deployment. Open-source under Apache 2.0 license on GitHub. Features:
- LLM-agnostic (user control over which LLMs agents use)
- Cloud-agnostic deployment
- Multi-protocol support (custom protocol, A2A, MCP server)
- Security built in from the core (not an afterthought)
- Granular logging down to token-level and cost tracking
- Energy usage, carbon footprint, and cost metrics per prompt
- Real-time inter-agent communication visibility
- Development environment with audit trails and traceability
Governance Stack Model (5-Layer)
- Build Time: Data governance, model versioning
- Deploy Time: Policy, permissioning, secrets management
- Runtime: Real-time observability, kill switches
- Remediation: Audit trails, incident response architecture
- Accountability: Oversight structures, postmortem protocols, compliance mapping
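One hedged way to operationalize the five layers is a declarative policy object that tooling validates at each lifecycle stage. The field names, URIs, and references below are hypothetical placeholders, not a schema from any speaker's framework.

```python
from dataclasses import dataclass, field

@dataclass
class GovernanceStack:
    """Hypothetical declarative record covering all five layers for one agent."""
    # Build time: provenance of data and models
    data_lineage_uri: str
    model_version: str
    # Deploy time: policy, permissioning, secrets management
    permissions: dict[str, list[str]]
    secrets_backend: str
    # Runtime: real-time observability and emergency stop
    observability_sink: str
    kill_switch_enabled: bool = True
    # Remediation: where evidence lives when something goes wrong
    audit_log_uri: str = "s3://audit/agents/"
    # Accountability: who reviews incidents, against which obligations
    incident_owners: list[str] = field(default_factory=list)
    compliance_refs: list[str] = field(default_factory=list)

stack = GovernanceStack(
    data_lineage_uri="dvc://datasets/v3",
    model_version="planner-agent:2.1.0",
    permissions={"planner-agent": ["read:crm", "call:search"]},
    secrets_backend="vault",
    observability_sink="otlp://collector:4317",
    incident_owners=["governance-oncall@example.com"],
    compliance_refs=["internal-policy-7"],
)
```

Expressing the stack as data makes governance itself auditable: a deploy pipeline can refuse to ship any agent whose record leaves one of the five layers unfilled.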
Agent Identity & Certification Concepts
- Agent Identity Registry: Tracked metadata (training data, creator, origin, language, behavior specs)
- Agent Passports: Identity documents for agents (comparable to human passports)
- Agentic Identity Credentials: Who the agent is, where it comes from, what data was used, permissions/policies
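To ground the "agent passport" concept, the sketch below has a registry sign an agent's metadata so any relying party can detect tampering. The field names and the shared-key HMAC scheme are assumptions for illustration; a production design would more plausibly use public-key signatures and verifiable-credential formats.

```python
import hashlib
import hmac
import json

REGISTRY_KEY = b"shared-secret-held-by-the-registry"  # illustrative only

def issue_passport(metadata: dict) -> dict:
    """Registry signs the agent's metadata, producing a verifiable 'passport'."""
    payload = json.dumps(metadata, sort_keys=True).encode()
    signature = hmac.new(REGISTRY_KEY, payload, hashlib.sha256).hexdigest()
    return {"metadata": metadata, "signature": signature}

def verify_passport(passport: dict) -> bool:
    """Relying party checks the metadata is exactly what the registry signed."""
    payload = json.dumps(passport["metadata"], sort_keys=True).encode()
    expected = hmac.new(REGISTRY_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, passport["signature"])

passport = issue_passport({
    "agent_id": "support-bot-7",
    "creator": "acme-corp",
    "training_data": "support-tickets-2024",
    "permissions": ["read:kb", "draft:reply"],
})
assert verify_passport(passport)
```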
Safety & Evaluation Frameworks
- "Distributional AGI Safety" paper (referenced, likely by Ellie/Google/DeepMind): Discusses how AGI may emerge not as monolithic system but as network of specialized sub-AGI agents in multi-agent systems. Covers:
- Market design for agent interactions
- Baseline safety per individual agent
- Inter-agent communication safety
- Evaluation benchmarks for multi-agent systems: Transparent reporting on limitations, capabilities, and model safety by frontier labs (Google, DeepMind, etc.)
Regulatory & Design Concepts
- VLOS vs. BVLOS (aviation): Visual Line of Sight (human always monitors) vs. Beyond Visual Line of Sight (AI systems provide safety superior to human monitoring)
- Decentralized/Federated trust models: Avoiding centralized trust gatekeepers; distributing trust validation across network
- Human-in-the-loop vs. human-in-command: Evolution from constant approval to supervision
- Runtime governance: Continuous monitoring during operation, not post-hoc analysis
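A small sketch of the decentralized/federated trust idea: rather than one central gatekeeper, a relying party polls several independently operated registries and trusts an agent only if a quorum vouches for it. The registries and the 2-of-3 threshold are invented for illustration.

```python
from typing import Callable

# Each registry is modeled as a function: agent_id -> vouches? (True/False).
# In practice these would be independent services run by different organizations.
Registry = Callable[[str], bool]

def federated_trust(agent_id: str, registries: list[Registry], quorum: int) -> bool:
    """Accept the agent if at least `quorum` independent registries vouch for it."""
    vouches = sum(1 for check in registries if check(agent_id))
    return vouches >= quorum

# Illustrative registries maintained by three different organizations.
org_a = lambda aid: aid in {"planner-1", "support-bot-7"}
org_b = lambda aid: aid in {"support-bot-7"}
org_c = lambda aid: aid.startswith("support-")

# 2-of-3 quorum: no single registry acts as gatekeeper or single point of failure.
print(federated_trust("support-bot-7", [org_a, org_b, org_c], quorum=2))  # True
print(federated_trust("unknown-agent", [org_a, org_b, org_c], quorum=2))  # False
```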
Standards & Policies Referenced
- IEEE/ISO standards: Focused on data transparency, age-appropriate design, accountability, and data privacy
- Multi-stakeholder governance: Involvement of lawyers, doctors, artists, engineers, policymakers, younger generation (AI natives)
- India's digital infrastructure model (Aadhaar, UPI, DigiLocker, ONDC): Example of standardized interfaces enabling decentralized innovation
Metrics & Observability
- Token-level usage logging
- Cost-per-prompt tracking
- Energy consumption and carbon footprint measurement
- Inter-agent communication auditing
- Incident reporting and post-mortems
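As an illustration of how these metrics might be captured together, the snippet below emits one audit record per prompt covering tokens, estimated cost, estimated energy, and the peer agent involved. The per-token cost and energy constants are made-up placeholders; real figures depend on the model, provider, and hardware.

```python
import json
import time

# Placeholder rates; real values depend on the provider, model, and hardware.
USD_PER_1K_TOKENS = 0.002
WH_PER_1K_TOKENS = 0.3

def log_prompt(agent_id: str, prompt_tokens: int, completion_tokens: int,
               peer_agent: str | None = None) -> dict:
    """Emit one audit record per prompt: tokens, cost, energy, and peers."""
    total = prompt_tokens + completion_tokens
    record = {
        "ts": time.time(),
        "agent_id": agent_id,
        "peer_agent": peer_agent,          # inter-agent communication trail
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "est_cost_usd": round(total / 1000 * USD_PER_1K_TOKENS, 6),
        "est_energy_wh": round(total / 1000 * WH_PER_1K_TOKENS, 6),
    }
    print(json.dumps(record))              # in practice: ship to an audit sink
    return record

log_prompt("planner-1", prompt_tokens=512, completion_tokens=256,
           peer_agent="retriever-2")
```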
Contextual Notes
- The talk was held at an India AI Summit in New Delhi
- The event emphasized both technical solutions (open-source frameworks) and policy/governance approaches
- Multiple speakers emphasized India's success with digital infrastructure (Aadhaar, UPI) as a model for how to standardize interfaces while allowing decentralized innovation
- Strong emphasis on making governance accessible to startups and resource-constrained organizations, not just large tech companies
- Recurring theme: governance as competitive advantage, not compliance burden; innovation and responsibility are aligned, not in tension
