
AI at Scale: Building High-Performance Data Centres


Executive Summary

This session addressed the critical infrastructure and policy challenges of scaling AI data centres globally, with particular focus on energy efficiency, water consumption, grid integration, and resource optimization. Speakers from the National Laboratory of the Rockies (formerly the National Renewable Energy Laboratory), IIT Delhi, and Indian policy think tanks presented the integrated technical solutions, research capabilities, and governance frameworks needed to ensure AI growth aligns with sustainable resource management and grid stability.

Key Takeaways

  1. Data Centre Deployment is No Longer a Decade-Long Process: Growth from 10,000 ft² to square-mile-scale facilities occurred in 3 years, not 10+ years. This urgency demands accelerated innovation and collaborative governance—traditional planning timelines are obsolete.

  2. Hybrid Power Systems Are Non-Negotiable for Grid Stability: Base-load generation alone cannot serve AI data centres; combinations of batteries, fuel cells, and energy storage must manage pulsating loads to prevent equipment wear and ensure grid reliability. This is not optional optimization—it is foundational infrastructure.

  3. Location Matters More Than Technology Alone: Geographic siting considering water availability, fiber connectivity, renewable generation proximity, and climate resilience can reduce operational carbon footprint and resource pressure more effectively than incremental efficiency improvements. "Site with foresight" requires data-driven resource adequacy mapping.

  4. Policy Alignment is Essential for Technology Adoption: Advanced cooling and energy storage technologies remain expensive without policy incentives (tax holidays, infrastructure status, performance-based subsidies). Absence of coordinated policy creates a race-to-the-bottom dynamic that undermines sustainability.

  5. Unknown Unknowns Require Continuous Testing & Adaptation: National laboratories with living testbeds (HPC facilities, grid simulation platforms, hardware-in-the-loop experiments) are critical for validating emerging technologies before deployment at scale. The chip-to-grid approach enables integrated optimization that isolated R&D cannot achieve.

Summit Talk Summary


Key Topics Covered

  • Energy Challenges & Grid Integration: Rapid power demand fluctuations from AI workloads (up to 80% variance in seconds), grid reliability standards, and transmission/distribution planning
  • Data Centre Cooling Technologies: Liquid cooling systems, direct-to-chip cooling, immersion cooling (single-phase and two-phase), heat recovery for building heating
  • Power Electronics & Infrastructure: UPS systems, battery storage, fuel cells, power conversion losses, medium-voltage (MVDC) and high-voltage (HVDC) architectures
  • Resource Constraints: Water consumption (~20 lakh, i.e. 2 million, liters/day per 100 MW data centre), carbon footprint, rare earth mining impacts, and competing grid demands
  • Spatial Planning & Siting: Geographic optimization considering fiber connectivity, transmission lines, renewable sources, climate resilience, and local zoning constraints
  • Policy & Governance Frameworks: Sustainability provisions, mandatory environmental disclosure, performance standards (PUE, WUE, carbon metrics), and state-level industrial policies
  • AI Applications for Energy Systems: AI-driven load forecasting, demand response, thermal-aware server placement, non-intrusive load monitoring (NILM), and distribution system optimization
  • Grid Stability & Flexibility: Demand response programs, battery/fuel cell integration, flexible load management, and real-time distribution management

Key Points & Insights

  1. Scale & Urgency of Growth: Data centre electricity demand is projected to double by 2030; India alone expects 5-8 GW (or potentially 16 GW) of additional data centre capacity. Global data centres currently consume ~1-1.5% of electricity; AI workloads could represent 50% of all data centre capacity by 2030.

  2. Energy Distribution Inefficiencies: Data centres waste 30-50% of their energy on idle or under-utilized servers, and cooling accounts for ~40% of total energy consumption—creating significant optimization opportunities through workload consolidation and thermal-aware placement algorithms.

  3. Pulsating Load Problem: AI data centres exhibit highly dynamic power demand that can fluctuate 80% in seconds due to batch processing. Traditional generation (spinning generators) cannot handle this without mechanical stress; hybrid systems combining base-load generation with batteries and fuel cells are essential to prevent rapid wear.
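
The two-part load split described above can be sketched numerically: a trailing moving average stands in for the slow base-load generator setpoint, while a battery absorbs the fast residual. This is an illustrative toy model, not the control scheme used at the lab; the load profile and averaging window are hypothetical.

```python
# Toy model: split a pulsating AI data-centre load into a slow component
# (served by base-load generation) and a fast residual (absorbed by a
# battery). All numbers are hypothetical, chosen to mimic ~80% swings.

def split_load(load_mw, window=5):
    """Smooth the load with a trailing moving average: the generator
    follows the smoothed profile, the battery covers the residual."""
    base, battery = [], []
    for t in range(len(load_mw)):
        recent = load_mw[max(0, t - window + 1): t + 1]
        avg = sum(recent) / len(recent)
        base.append(avg)                    # generator setpoint (MW)
        battery.append(load_mw[t] - avg)    # battery power: + discharge, - charge
    return base, battery

# A facility whose training batches pulse the load between 100 MW and 20 MW
load = [100, 20, 100, 20, 100, 20, 100, 20]
base, battery = split_load(load)
```

In this toy run the generator's worst second-to-second ramp drops well below the raw 80 MW swing, which is the wear-reduction argument made in the session.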

  4. Water-Energy Tradeoff: Air-cooled systems reduce water consumption but increase energy (and carbon) footprint; water-cooled systems are more efficient thermally but consume massive quantities (~0.5L per 50 ChatGPT prompts). Optimal cooling choice depends on local resource availability and constraints.
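
The quoted water figures can be cross-checked with simple arithmetic; all inputs below are the session's own numbers (not independently verified), and the derived rates are just their ratios.

```python
# Back-of-envelope check of the water figures quoted in the session.
litres_per_day = 20 * 100_000        # 20 lakh = 2,000,000 L/day for 100 MW
facility_mw = 100
litres_per_mw_day = litres_per_day / facility_mw   # 20,000 L per MW per day

litres_per_prompt = 0.5 / 50                       # 0.01 L per ChatGPT prompt
prompts_per_day_equiv = litres_per_day / litres_per_prompt  # ~200 million prompts
```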

  5. Policy-Infrastructure Gap: Only 5 out of 15 Indian states with data centre policies include sustainability provisions. Absence of binding national frameworks creates inconsistency; competitive federalism has led to fragmented standards rather than harmonized environmental or performance metrics.

  6. Grid Operator Uncertainty: Grid operators face an unprecedented paradigm shift—data centres demand N+1 or N+2 reliability (versus the traditional N-1 standard), and sudden load changes of 200 MW to 1 GW within seconds challenge conventional planning and operational tools. Current national electricity policies do not adequately address this.

  7. Advanced Cooling Costs: Promising technologies (direct-to-chip cooling, dielectric plate cooling, immersion systems) face high upfront capital costs and limited vendor supply chains. Policy design must incentivize efficient technologies across their operational lifetime rather than letting high upfront costs disqualify them.

  8. Circular AI-Energy Dependency: Research opportunities require multidisciplinary collaboration (power, thermal, materials, policy, regulation) because "energy for AI" and "AI for energy" are interdependent. Good resource stewardship of AI infrastructure enables AI to optimize broader energy systems, and vice versa.

  9. Emerging Geo-Strategic Considerations: Satellite-based data centres (e.g., Starlink's orbital H100 demonstration and announced million-satellite plans) could fundamentally shift the paradigm—trillions of dollars in ground-based grid infrastructure investments may become obsolete if computation migrates to orbit.

  10. Mandatory Transparency & Measurement: International and Indian models show that without mandatory environmental disclosure (water use, carbon footprint, power usage effectiveness), investments gravitate toward lowest-cost sites regardless of sustainability. AI Energy Star ratings and water star ratings would increase market confidence and attract responsible investment.


Notable Quotes or Statements

  • Jacqueline Cochran (National Lab of the Rockies): "The question is not whether AI will scale. It already is. The question is how we deploy data centers that enhances resiliency, flexibility, reliability and performance."

  • Murali Baggu (Grid Integration, NLR): "Data center loads really pulse a lot depending upon the batch processing that it does... we're trying to see if we can use our regular base load resources along with energy storage and other dynamic power electronic resources to really segregate the load in two parts."

  • Prof. Abhijit Abhyankar (IIT Delhi): "Energy for AI, and AI for energy... If you take good care of the AI, so far as energy requirements are concerned, AI is likely to take good care of the energy part as well."

  • Dr. Arunava Mandal (Council on Energy, Environment and Water): "AI runs on water, energy and planning. It doesn't run just on code."

  • Reggie Chakraborty (India Smart Grid Forum): "This 10,000 ft² to 30,000 ft² data centre to square mile data centre is something which happened in 3 years. It didn't happen over a decade or a century. So this is something new, unknown unknowns we are dealing with, and everybody has to work together."

  • Chakraborty (on satellite-based data centres): "A week later Starlink has taken an H100 GPU to the orbit... Elon has announced a million satellites where data centers will be hosted on the orbit in the coming years. So if trillions of dollars investment on the grid... what will happen to that if all the data centers are going to be on the orbit?"


Speakers & Organizations Mentioned

Primary Speakers:

  • Jacqueline Cochran – Associate Laboratory Director, National Lab of the Rockies (formerly National Renewable Energy Lab, NREL), Colorado
  • Murali Baggu – Collaborative Program Manager for Grid Integration, National Lab of the Rockies; works with Department of Energy's Office of Electricity
  • Prof. Abhijit Abhyankar – Chair Professor of Electrical Engineering, IIT Delhi
  • Dr. Arunava Mandal – Council on Energy, Environment and Water (CEEW), leading climate think tank; Chair, AI and Climate Global Expert Group at this AI Impact Summit
  • Reggie Chakraborty – India Smart Grid Forum

Organizations & Institutions:

  • National Lab of the Rockies (NLR) / National Renewable Energy Lab (NREL)
  • U.S. Department of Energy (DOE), Office of Electricity
  • IIT Delhi (Indian Institute of Technology Delhi)
  • Council on Energy, Environment and Water (CEEW)
  • India Smart Grid Forum
  • IIT Consortium (developing BharatGPT / BharatGen LLM)
  • Anusandhan National Research Foundation (ANRF) and Department of Science and Technology (DST) – Indian research funders, referenced as analogues of the U.S. National Science Foundation (NSF)
  • Government of India – Ministry of Power, AI Mission
  • Government of India – National Education Policy 2020
  • 20 State Governments in India (CEEW report partnerships)
  • Companies/integrators mentioned: Virus (data centre manufacturer/aggregator), Starlink, Amazon, PJM (U.S. grid operator region)

Technical Concepts & Resources

Simulation & Modeling Platforms

  • ARIES Platform (Advanced Research on Integrated Energy Systems) – DOE/NREL real-time grid simulation for testing data centre power scenarios at high fidelity (used in hybrid generation + pulsating load experiments)
  • REopt – Tool for optimizing distributed energy resources and site energy planning
  • Sienna – Open-source power system simulation and scheduling framework
  • ReEDS (Regional Energy Deployment System) – Power-sector capacity expansion model for resource evaluation and analysis
  • reV (Renewable Energy Potential model) – Renewable resource and siting evaluation tool

Cooling Technologies

  • Component-level liquid cooling with water heat recovery for building heating
  • Direct-to-chip cooling – liquid coolant flowing through cold plates on racks
  • Single-phase immersion cooling – IT infrastructure embedded in dielectric liquid
  • Two-phase immersion cooling – evaporation and condensation of coolant fluid
  • Air-cooled systems – reduced water consumption, higher energy/carbon footprint

Power Architecture & Components

  • UPS (Uninterruptible Power Supply) – Manages brief power losses (seconds to minutes)
  • Battery Energy Storage Systems (BESS) – Fast response to load swings
  • Fuel cells – Intermediate-term backup power (minutes to hours)
  • Diesel Generators (DGs) – Extended backup for edge/tier 2-3 data centres
  • MVDC (Medium-Voltage DC) – Architecture for mid-size data centres
  • HVDC (High-Voltage DC) – Architecture for hyperscale facilities
  • Power electronic interfaces – Bidirectional power devices connecting UPS and grid

Performance & Sustainability Metrics

  • PUE (Power Usage Effectiveness) – Ratio of total facility power to IT equipment power; target is approaching 1.0
  • WUE (Water Usage Effectiveness) – Water consumption per unit computation
  • Carbon effectiveness – Carbon footprint per computation unit
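
The metrics above are simple ratios; here is a minimal sketch using hypothetical facility numbers (the formulas follow the standard definitions, the figures are invented for illustration):

```python
# Standard data-centre efficiency ratios; sample inputs are hypothetical.

def pue(total_facility_kwh, it_kwh):
    """Power Usage Effectiveness: total facility energy / IT energy (ideal -> 1.0)."""
    return total_facility_kwh / it_kwh

def wue(water_litres, it_kwh):
    """Water Usage Effectiveness: litres of water consumed per kWh of IT energy."""
    return water_litres / it_kwh

# Hypothetical year: 1.4 GWh total draw, 1.0 GWh to IT, 1.8 ML of water
facility_pue = pue(1_400_000, 1_000_000)   # 1.4
facility_wue = wue(1_800_000, 1_000_000)   # 1.8 L/kWh
```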

Energy Forecasting & Optimization

  • NILM (Non-Intrusive Load Monitoring) – Disaggregating distribution loads to identify consumption patterns without sub-metering
  • Thermal-aware server placement – Workload allocation optimizing thermal distribution
  • Energy-aware workload consolidation – Algorithms to minimize idle energy consumption (currently 30-50% of data centre draw)
  • Demand response (DR) programs – Dynamic load management to support grid stability
  • DER (Distributed Energy Resource) aggregation – Extracting flexibility from solar, wind, and storage for grid services
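
The consolidation entry above can be illustrated with a first-fit-decreasing bin-packing sketch: pack jobs onto as few servers as possible so the remainder can idle down or power off. The loads and capacity are arbitrary units; real schedulers also weigh thermal and latency constraints, so this is a minimal sketch, not a production placement algorithm.

```python
# Energy-aware consolidation sketch: first-fit decreasing bin packing.
# Loads and capacity are in arbitrary integer units (hypothetical).

def consolidate(job_loads, server_capacity):
    """Assign jobs to servers greedily, largest first; returns the
    per-server job lists, using as few servers as the heuristic finds."""
    servers = []  # each entry: [remaining_capacity, [job_loads]]
    for load in sorted(job_loads, reverse=True):
        for srv in servers:
            if srv[0] >= load:          # first server with room
                srv[0] -= load
                srv[1].append(load)
                break
        else:                           # no server fits: power one on
            servers.append([server_capacity - load, [load]])
    return [jobs for _, jobs in servers]

# Eight jobs that would otherwise keep eight servers partly idle fit on three
placement = consolidate([50, 70, 20, 40, 30, 10, 60, 20], server_capacity=100)
```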

LLM/AI Systems Mentioned

  • ChatGPT – Example application for water consumption metrics
  • BharatGPT / BharatGen – New LLM initiative by IIT Consortium, funded by DST/ANRF (India-specific large language model)
  • H100 GPU – High-performance accelerator; Starlink satellite example mentions deploying H100 to orbit

Hardware Testbeds & Facilities

  • National Lab of the Rockies HPC Data Centre – 10,000 ft² facility with up to 10 MW computing power; used for cooling, load, and control validation
  • Energy System Integration Facility – Grid-side equipment including virtual utility operations rooms, smart grid interoperability testing, megawatt-level hardware-in-the-loop testing, distribution management simulation
  • Advanced Research on Integrated Energy Systems (ARIES) at the Flatirons Campus – Up to 10 MW testing facility; includes power-electronic grid interface testbed, Advanced Power Electronic Testbed (APTB) for MVDC/HVDC architectures, storage pads

Policy & Governance Frameworks

  • Data Centre Infrastructure Status (India) – Granted in recent budget; enables long-term tax holidays
  • India AI Mission – Supporting 500 PhD, 5,000 M.Tech, 8,000 UG students over 5 years
  • National Education Policy 2020 (India) – Mandates AI curriculum from grade 9 onwards
  • Data Localization Policy (India) – Government mandate influencing edge/tier 2-3 data centre expansion
  • 15 Indian State Data Centre Policies – Only 5 include sustainability provisions
  • National Electricity Policy 2026 (India) – Acknowledged as not yet adequately addressing data centre grid challenges

Research Methodologies

  • Chip-to-Grid Approach – Integrated systems optimization from semiconductor cooling through grid-level dispatch
  • Co-optimization of integrated energy systems – Simultaneous consideration of power electronics, thermal, facility-level, and grid-level variables
  • Hardware-in-the-loop experimentation – Validating controls and technologies in real-time grid scenarios before deployment

Strategic Issues & Unknowns

  • Satellite-Based Data Centres: Emerging capability of orbital computing infrastructure could fundamentally disrupt multi-trillion-dollar ground-based grid investments.
  • Grid Connection Queue Backlogs: PJM (U.S.) has 30 GW of pending data centre connection requests by 2028; Texas has 240 GW through 2030. Regulatory frameworks for prioritization are underdeveloped.
  • Household Bill Impacts: Connection of all requested data centres could raise average household electricity bills by ~$70/month in some U.S. regions.
  • Visibility Gaps on Demand: Different projections (8 GW, 5-8 GW, 16 GW) from credible sources suggest incomplete demand forecasting and planning coordination.
  • Institutional Architecture for Oversight: Human review and due process in automated grid/AI operations remain inadequately specified in policy.