Waves of infrastructure: Open Systems, Open Source, Open Cloud

Executive Summary

This talk presents a vision for the next wave of AI infrastructure innovation, positioning on-premises and sovereign cloud computing as essential complements to hyperscaler infrastructure. The speaker argues that India faces unique opportunities to build indigenous AI infrastructure at population scale and lowest cost, drawing parallels to prior technology transitions (semiconductors, cloud computing) to justify investment in distributed, memory-centric systems optimized for inference workloads and localized data processing.

Key Takeaways

  1. The next wave of AI computing is decentralized, inference-focused, and locally sovereign: On-premises and regional cloud infrastructure will be as important as hyperscaler cloud for enterprise and government AI deployments, driven by data gravity, latency, cost, and sovereignty concerns.

  2. India's competitive advantage lies in execution at population scale and lowest cost: Rather than competing on chip design or training infrastructure with the US or China, India should focus on building end-to-end AI solutions (hardware + software + data infrastructure) optimized for cost and scale—potentially creating a new category of companies.

  3. Open systems and open models are structural enablers of competition and localization: The shift from proprietary (Sun, SPARC) to open standards (Linux, x86) and from closed models to open models creates room for regional players to build domain-specific, cost-optimized solutions without vendor dependence.

  4. Memory-centric, inference-optimized architectures require rethinking systems from first principles: The shift from training to inference, combined with the need for agentic AI and deep context, means memory hierarchies, latency, and data organization matter more than raw GPU flops, opening space for architectural innovation (a sizing sketch follows this list).

  5. Sustained, multi-stage funding and long-term commitment are necessary: Building indigenous AI infrastructure companies will require government support, venture capital, and corporate investment sustained over 10–20 year horizons (as with ISRO), not typical venture funding cycles. This is a nation-building effort, not a startup sprint.
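
To make takeaway 4 concrete, here is a back-of-the-envelope KV-cache sizing: a minimal sketch assuming a hypothetical 70B-class dense transformer serving long-context sessions. None of the figures below come from the talk.

  # Rough KV-cache sizing for LLM inference (all model and workload figures are assumptions).
  # KV cache bytes per token = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value.

  layers        = 80        # hypothetical 70B-class dense transformer
  kv_heads      = 8         # grouped-query attention
  head_dim      = 128
  bytes_per_val = 2         # fp16/bf16 values
  seq_len       = 32_000    # deep-context / agentic session length
  concurrent    = 64        # simultaneous sessions on one accelerator

  per_token   = 2 * layers * kv_heads * head_dim * bytes_per_val   # bytes per generated token
  per_session = per_token * seq_len                                # bytes per full-length session
  total_gb    = per_session * concurrent / 1e9

  print(f"KV cache per token:   {per_token / 1e3:.1f} KB")    # ~327.7 KB
  print(f"KV cache per session: {per_session / 1e9:.2f} GB")  # ~10.5 GB
  print(f"{concurrent} concurrent sessions: {total_gb:.0f} GB of HBM for KV cache alone")

Under these assumptions the cache alone outgrows the 256–512 GB HBM range cited later in this summary, which is the argument for multi-level memory hierarchies and logical memory layers rather than more flops.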

Key Topics Covered

  • Historical technology transitions: Patterns of innovation cycles (semiconductors → cloud → inference-driven AI computing)
  • Infrastructure economics: $2 trillion projected spend on AI infrastructure over 5–10 years; cost structures and power requirements
  • On-premises vs. cloud: The case for private cloud infrastructure given that 80% of enterprise data remains on-premises
  • Inference as a new computing driver: Shift from training-dominated workloads to inference-dominated systems requiring different architectural approaches
  • Open systems, open source, and open cloud: The role of open standards and open-source models in enabling competition and innovation (parallel to Linux's role in distributed computing)
  • Memory-centric architecture: The importance of memory hierarchy, KV caches, and logical memory layers for LLM inference and agentic AI
  • Ethernet and networking: 800 Gb/s and terabit Ethernet as enabling infrastructure for distributed inference (a transfer-time sketch follows this list)
  • Data infrastructure and sovereignty: Cataloging, indexing, and searching unstructured and structured data while maintaining data sovereignty
  • Agricultural, healthcare, and education use cases: Real-world applications driving local compute adoption in India
  • Model selection and optimization: Automating cost-effective model routing in production systems
  • Venture ecosystem and funding: Challenges and opportunities for building indigenous technology companies at scale
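
The networking item above can be made concrete with a simple transfer-time estimate: moving one session's cached state between nodes has to fit inside the latency budget. This is an illustrative calculation using the session size from the earlier sizing sketch and an assumed 800 Gb/s link; neither number comes from the talk.

  # Transfer time for one session's KV cache over 800 Gb/s Ethernet (illustrative figures only).
  kv_cache_gb = 10.5                              # one long-context session, per the earlier sketch
  link_gbps   = 800                               # 800 Gb/s Ethernet
  transfer_ms = kv_cache_gb * 8 / link_gbps * 1000
  print(f"Transfer time: {transfer_ms:.0f} ms")   # ~105 ms

At roughly 105 ms, a single such transfer consumes nearly the entire 120 ms latency target, which is why locality and memory placement matter as much as raw link bandwidth for distributed inference.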

Key Points & Insights

  1. Technology transition patterns recur: Every 7–15 years, major computing paradigm shifts occur (microprocessor era → cloud era → inference era). The shift from training to inference mirrors historical transitions from single-CPU to distributed systems, requiring new architectural thinking.

  2. Inference will drive computing demand more than training: Unlike training, inference workloads are geographically distributed, latency-sensitive, and cost-sensitive. This creates demand for localized, efficient inference infrastructure rather than centralized training clusters—especially in cost-constrained markets like India.

  3. Memory, not flops, is the limiting factor: The speaker emphasizes a shift from thinking about compute in terms of raw floating-point operations to rethinking system architecture around memory hierarchies—KV caches, multi-level memory types, and logical memory state for agents and personalization.

  4. 80% of enterprise data is on-premises and will remain so: This structural reality (reinforced by Michael Dell's quote) justifies the need for modern on-premises AI infrastructure. Not all data can or should move to public clouds due to privacy, latency, sovereignty, and cost constraints.

  5. Open models, open source, and open systems enable competition: Just as Linux democratized access to UNIX and enabled hyperscale infrastructure, open models and open standards in inference will prevent vendor lock-in and allow region-specific, domain-specific, and country-specific implementations.

  6. India has a unique opportunity in population-scale, cost-optimized AI: With 1.4 billion people and lower labor costs, India can achieve what no other region can: delivering AI services at sub-200 rupees/month with sub-120 millisecond latency. This problem space (population scale + cost constraint) uniquely positions India for innovation.

  7. Data infrastructure (indexing, cataloging, search) is as important as compute: The talk emphasizes that bringing compute to data requires solving the data organization problem first: not just storage, but searchable, indexed, semantically organized data layers that enable higher-order reasoning (see the indexing sketch after this list).

  8. 90% of AI pilots fail to reach production due to quality, cost, and reliability issues: Partners like DVM highlight that model selection, quality evaluation, and cost optimization are unsolved problems in production AI—not model capability or demo quality.

  9. Semiconductors and AI follow similar complexity patterns: Just as ~150 people of exceptional talent were needed to design world-class microprocessors in the 1990s, ~120–150 people are needed to build foundation models today. This suggests model-building and hardware-building may consolidate around fewer, well-resourced teams.

  10. Large indigenous technology companies will emerge from this transition: Just as Y2K created Infosys, TCS, and Wipro, and just as Nvidia, AMD, and others emerged from prior computing shifts, this AI infrastructure transition will spawn new Indian companies across hardware, middleware, applications, and domain-specific solutions.
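
Point 7 (bringing compute to data requires organized, searchable data) can be illustrated with a minimal indexing sketch: documents are embedded once, stored alongside catalog metadata, and retrieved by similarity so models see organized context rather than raw storage. The toy hashing embedding and sample records below are placeholders, not the stack described in the talk.

  # Minimal semantic catalog-and-search sketch (illustrative; not the talk's actual stack).
  import numpy as np

  def embed(text: str, dim: int = 256) -> np.ndarray:
      """Toy hashing embedding; a real deployment would use a trained embedding model."""
      vec = np.zeros(dim)
      for token in text.lower().split():
          vec[hash(token) % dim] += 1.0
      norm = np.linalg.norm(vec)
      return vec / norm if norm else vec

  class CatalogIndex:
      """Keeps vectors next to catalog metadata so results stay traceable to their source."""
      def __init__(self):
          self.vectors, self.records = [], []

      def add(self, doc_id: str, text: str, metadata: dict):
          self.vectors.append(embed(text))
          self.records.append({"id": doc_id, "text": text, **metadata})

      def search(self, query: str, k: int = 3):
          scores = np.array(self.vectors) @ embed(query)   # cosine similarity on unit vectors
          top = np.argsort(scores)[::-1][:k]
          return [(float(scores[i]), self.records[i]) for i in top]

  index = CatalogIndex()
  index.add("rec-1", "soil moisture sensor readings for wheat fields", {"domain": "agriculture"})
  index.add("rec-2", "MRI scan archive with radiology annotations", {"domain": "healthcare"})
  index.add("rec-3", "district-level exam results and curriculum notes", {"domain": "education"})

  for score, record in index.search("irrigation sensor data"):
      print(f"{score:.2f}  {record['id']}  ({record['domain']})")

One design implication consistent with the sovereignty theme above: the catalog and vectors can remain wherever the data must live, with only organized context moving to the model.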

Notable Quotes or Statements

  • "We overestimate what can be done in two years but underestimate what can be done in 10 years." — Sets the frame for long-term technology transitions and patience required for infrastructure buildout.

  • "People who are serious about software should make their own hardware; the corollary is people serious about hardware should also make their own software." — Emphasizes the need for vertical integration in AI infrastructure to compete at scale.

  • "Inference at scale is extreme computing." — Jensen Huang (via speaker), highlighting the distinct challenges of production inference vs. training.

  • "80% of the data is still on prem." — Michael Dell (via speaker), justifying the market for on-premises AI infrastructure.

  • "It's not just hardware. It's going to be algorithmic improvements, other improvements." — Suggests that 120ms latency across all queries is achievable through systems innovation, not just raw compute.

  • "If India can deliver at a cost point like 200 rupees a month at 120 milliseconds for any query to be handled... it serves a lot of people but it also will drive tremendous amount of innovation." — Defines a specific, ambitious target for India's AI infrastructure ambition.

  • "Build for India, build from India." — Encapsulates the vision of creating indigenous AI technology companies rooted in India but serving global markets.

  • "90% of geni pilots never make it to production... not because the demo was bad or the models were weak, but primarily because of three reasons: quality is undefined, costs are unpredictable, and model selection is a moving target." — Barat, DVM; identifies the real production bottleneck in enterprise AI.

Speakers & Organizations Mentioned

Speaker / Role | Organization | Key Topic
Renu (primary speaker) | Proximal Cloud | Infrastructure, systems innovation, India's AI opportunity
Lalit Barat | Farmax (agriculture) | Edge AI for farming, sensor data, autonomous systems
Barat (surname unclear) | DVM | Model selection, inference optimization, quality-first routing
Sundep Kumar | Instant Systems (venture builder, Silicon Valley) | Startup building, hallucination mitigation, financial AI applications
Abhishek Kajjan | Zeta Bolt | LLM acceleration, custom silicon, inference speed optimization
Arya Barachari | Infosys | Semiconductors, AI vision for India; domain-specific agentic AI; fab productivity gains
(Unnamed) | UC San Diego | Public-private partnership, data science institute, health sciences research
(Unnamed) | Cedak | Hardware partnerships, RDRA supercomputer platforms
(Unnamed) | Deep Lake | Data infrastructure, object storage, indexing at petabyte scale
Satya (Nadella, implied) | Microsoft | Copilot, graph-based data organization

Technical Concepts & Resources

  • Architectures & Technologies:

    • Memory hierarchies: HBM (high-bandwidth memory), KV caches for LLMs, multi-level memory types
    • Networking: 800 Gb/s and terabit Ethernet; impact on distributed systems design
    • Inference optimization: Model selection, cost-per-quality routing, continuous model evaluation
    • Data infrastructure: Distributed data catalogs, indexing, semantic search, knowledge graphs
    • Agentic AI: Multi-agent workflows, deep context retention, personalization layers
    • Edge computing: On-premises appliances, air-gapped deployments, sovereign data handling
  • Hardware Partnerships & Vendors:

    • AMD: x86 CPUs, Instinct GPU roadmap (likely the data-center CDNA line rather than RDNA), 256–512 GB HBM capacity
    • Nvidia: GPUs, inference acceleration, training infrastructure (implied competitor)
    • Cedak / C-DAC: Indian supercomputing organization; Rudra series server platforms ('RDRA' in the transcript)
    • Dell, HP, Super Micro: OEM hardware providers
    • Sunmina (likely Sanmina): Manufacturing partnerships in India (Chennai)
  • Platforms & Tools:

    • Kubernetes: Container orchestration (referenced as outcome of prior infrastructure wave)
    • Deep Lake: Vector database for data indexing and search
    • DVM: Inference layer with eval-first, routing-second model selection
    • Proximal Cloud stack: Four-layer stack (hardware, data infrastructure, AI ops, agentic layer)
    • Archive.org papers database: 25 million papers, 100 TB, used for education use case
  • Data Types & Use Cases:

    • Structured data: SQL, relational databases, financial records
    • Unstructured data: Images, video, audio, documents, OCR data
    • Time series: Sensor data from agriculture, IoT
    • Multimodal: MRI images, satellite imagery, drone imagery
    • Knowledge graphs: Causal relationships, entity relationships, domain ontologies
  • Key Metrics & Targets:

    • 120 milliseconds: Target latency for any query response (inspired by Google's 20ms principle)
    • 200 rupees/month: Target price point for AI services for the Indian population (see the back-of-the-envelope budget after this list)
    • 10 gigawatts (GW): Projected AI data center power capacity for India by 2030–2035
    • $250 billion: Hardware spend correlated with 10 GW of capacity
    • $2 trillion: Global infrastructure spend projection over 5–10 years
    • 50–512 GB HBM: Memory capacity trajectory for inference-optimized GPUs
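
To make the 200 rupees/month and 120 millisecond targets above concrete, here is a back-of-the-envelope budget under invented usage assumptions (queries per day, tokens per query, network share of latency); none of these numbers come from the talk.

  # Back-of-the-envelope on the 200 INR/month and 120 ms targets (usage figures are assumptions).
  price_per_month_inr = 200
  queries_per_day     = 50        # assumed usage per subscriber
  days_per_month      = 30
  tokens_per_query    = 1_500     # assumed prompt + response tokens

  queries_per_month = queries_per_day * days_per_month
  cost_per_query    = price_per_month_inr / queries_per_month
  cost_per_1k_tok   = cost_per_query / tokens_per_query * 1_000

  print(f"Revenue budget per query:     {cost_per_query:.3f} INR")    # ~0.133 INR
  print(f"Revenue budget per 1k tokens: {cost_per_1k_tok:.3f} INR")   # ~0.089 INR

  # Latency budget: whatever the network consumes is unavailable to inference,
  # which is the argument for placing compute close to the data and the user.
  network_rtt_ms = 40             # assumed round trip to a nearby facility
  inference_ms   = 120 - network_rtt_ms
  print(f"Compute budget per query:     {inference_ms} ms")

Under these assumptions, serving cost has to land below roughly a tenth of a rupee per thousand tokens, which is why the summary ties the price target to systems innovation rather than hardware alone.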

Note on Transcript Quality: The transcript contains occasional transcription errors and incomplete thoughts, likely due to real-time speech recognition. Key ideas remain interpretable, but some technical details and speaker names may be imprecise.