From Human Potential to Global Impact: Qualcomm’s AI for All Workshop

Contents

Executive Summary

Qualcomm executives and startup founders discussed the emerging ecosystem of on-device and edge AI, emphasizing how model compression, hardware advances, and distributed processing across edge-to-cloud architectures are enabling practical AI applications at scale. The central narrative argues that AI is transitioning from cloud-dependent, centralized systems to a hybrid model where intelligence operates across smartphones, IoT devices, enterprise servers, and data centers—fundamentally changing how humans interact with technology and how enterprises manage data.

Key Takeaways

  1. Edge AI is Production-Ready: Not theoretical. Premium smartphones run 10B models, AR glasses run 1–2B models, PCs run 30B+ models. The question is no longer "can we?" but "how do we architect optimally?"

  2. Privacy and Data Sovereignty Are Becoming Competitive Advantages: Organizations that keep sensitive data on-device or on-premises avoid breach risk, regulatory liability, and data leakage—while shadow AI shows that enterprise users are already voting with their feet for local solutions.

  3. AI Agents Will Replace App-Based Interaction Within 3–5 Years: Voice + context + personal knowledge graphs enable natural task delegation. This fundamentally changes how users think about technology—expect the shift to accelerate post-2025 as multimodal models mature.

  4. Hardware Innovation is Shifting from Compute to Memory: The inference bottleneck is no longer raw compute but memory bandwidth (decode stage). Qualcomm's focus on memory architecture in AI 250/300 reflects this; expect silicon design to diverge sharply from training-optimized chips.

  5. 6G Connectivity is Essential Infrastructure for Ubiquitous AI: 2028–2029 is the near-term horizon. Without reliable, low-latency connectivity, truly distributed AI systems (robotics, autonomous vehicles, remote monitoring) cannot reach full potential. Cellular networks are no longer separate from AI architecture—they are part of it.

Key Topics Covered

  • Model Compression & Efficiency: Shift from 175B-parameter models (GPT original) to 7–8B models with equal or superior performance; emergence of small language models (SLMs) and small multimodal models (SMMs)
  • Edge AI Infrastructure: Practical deployment of billion-parameter models on consumer devices (smartphones, AR glasses, PCs, wearables)
  • AI as User Interface: Voice, multimodal, and agentic interfaces replacing traditional app-based interaction; AI agents as orchestrators of multiple backend services
  • Real-World Implementation: ByteDance's AI-first phone (China) and Humane PC as working examples of agentic computing
  • Cloud-to-Edge Optimization: Training in cloud, inference distributed across edge and on-premises; memory architecture innovation for decode vs. prefill stages
  • Data Center Solutions: Qualcomm's AI 250 and upcoming AI 300 chips designed for energy-efficient inference; importance of memory bandwidth in inference workloads
  • 6G and Cellular Integration: Next-generation wireless as critical enabler for ubiquitous AI connectivity; 2028 Olympics and 2029 deployments as milestones
  • Developer Enablement: Qualcomm AI Hub platform for model optimization, cloud-native testing, and deployment without requiring physical device access
  • Enterprise AI Adoption: Shadow AI problem (unauthorized cloud AI tool usage); on-premises and sovereign AI solutions; context-aware autonomous systems
  • Sector-Specific Applications: Robotics (autonomous navigation, fleet orchestration), legal tech (contract review/drafting with grounded data), fraud detection, education, agriculture

Key Points & Insights

  1. Model Scaling Law Reversal: Smaller models (7–8B parameters) now outperform the original 175B GPT model, making edge deployment economically and technically viable. This is "an AI law" that undercuts the assumption that bigger models are always better.

  2. Connectivity Independence: On-device AI inference removes dependency on network quality, enabling consistent user experience regardless of connectivity—critical for enterprise and consumer use cases where latency and reliability matter.

  3. Data Sovereignty & Privacy: 78% of enterprises unknowingly use unauthorized cloud AI tools (shadow AI), exposing sensitive data. On-device and on-premises solutions address regulatory, privacy, and data residency concerns without sacrificing capability.

  4. Agentic Interfaces Over Apps: Single-agent interface that understands voice, context, and personal knowledge graphs is replacing fragmented app-based interaction. Demonstrated working on ByteDance phone and Humane PC; enables natural task delegation ("check my bank, if I have enough, buy X, notify me").

  5. Inference Architecture Differs from Training: Prefill stage is compute-bound; decode stage is memory-bandwidth-bound. This requires different hardware optimization than training—key insight driving Qualcomm AI 250/300 design focus on memory architecture innovation.

  6. Distributed AI Processing (Hybrid Model): Optimal architecture isn't entirely edge or entirely cloud, but mix-and-match based on latency, compute, and data sensitivity. Smartphones (10B models), PCs (30B+), edge servers (100–300B), data centers (training & largest inference).

  7. Robot Autonomy + Satellite Connectivity: Autonomous robots in remote areas (mining in Australia) require on-device AI because ground connectivity is absent; satellite uplinks provide orchestration layer. Connectivity gaps drive edge AI necessity, not just preference.

  8. Grounded AI via Continuous Data Capture: Spotdraft's success came from capturing lawyer behavior in real-time (Word plugin) rather than one-time data labeling. Continuous, contextual data capture enables models to stay current and reflect actual organizational policy without explicit retraining.

  9. Emergent Behavior in Agentic Systems: OpenAI's Canvas feature demonstrates early emergent behavior—models creating their own files and learning loops. This autonomy is beginning to move from cloud to edge; managing autonomous agents on personal data raises critical control questions.

  10. India-Specific AI Adoption Barriers: Success requires managing expectations (AI augments, not replaces), combating job displacement fears, and building platforms that learn from users rather than requiring user training—especially important in low-digital-literacy contexts.
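The edge-to-cloud tiering in point 6 can be sketched as a simple routing policy. The sketch below is an illustrative assumption, not a Qualcomm API: tier capacities mirror the figures cited in the session (smartphones ~10B parameters, PCs ~30B+, edge servers ~100–300B), and sensitive data is kept on-device or on-premises per the data-sovereignty discussion. All names and thresholds are hypothetical.

```python
# Hypothetical sketch of hybrid edge-to-cloud inference routing.
# Capacities follow the session's figures; names are illustrative.
from dataclasses import dataclass

TIER_CAPACITY_B = {          # max model size (billions of params) per tier
    "smartphone": 10,
    "pc": 30,
    "edge_server": 300,
    "data_center": float("inf"),
}

@dataclass
class Request:
    model_size_b: float      # parameters required, in billions
    sensitive: bool          # must the data stay on-device/on-prem?

def route(req: Request) -> str:
    """Pick the smallest tier that fits the model, honoring data sensitivity."""
    # Sensitive data never leaves the device or the on-prem edge server.
    tiers = ["smartphone", "pc", "edge_server"]
    if not req.sensitive:
        tiers.append("data_center")
    for tier in tiers:
        if req.model_size_b <= TIER_CAPACITY_B[tier]:
            return tier
    return "edge_server"     # fallback: keep sensitive work on-prem

print(route(Request(model_size_b=7, sensitive=True)))     # smartphone
print(route(Request(model_size_b=150, sensitive=False)))  # edge_server
```

The "smallest tier that fits" heuristic reflects the session's framing: run locally whenever the model fits, and escalate outward only when capacity or capability demands it.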


Notable Quotes or Statements

  • Dura Maladi (Qualcomm): "Model sizes are coming down quite dramatically, while the model quality continues to increase. This is the equivalent of an AI law that seems to be emerging as far as models themselves are concerned."

  • Dura Maladi: "The quality of the AI experience is invariant to the quality of connectivity that those devices had to have toward the back end of the network. That's as an attribute. I don't want to keep going back and forth between a regular experience and an AI experience just because I don't have internet connectivity."

  • Pravir Kocher (Kogo AI): "Shadow AI is a lot of people who work in companies sharing critical enterprise data on the cloud while using unauthorized AI tools like OpenAI or Claude. So 78% of enterprise users use shadow AI, and that's a big concern."

  • Ritukur Vijay (Autonomy): "You know, just throw a bunch of compute and a problem statement is not how AI is adopted in enterprise settings because it's very important to break down the big problem into smaller chunks and decide what you want to use AI for and what you don't."

  • Madhav (Spotdraft): "The wow moment was when the lawyer who doesn't trust AI suddenly said, 'No, I need to see this—this is useful.'"

  • Shini Shinwa (Tech Mandra): "Understanding the limitations of AI [is critical]. It's very easy to understand advantages, but if we can set expectations right that AI will augment work to a certain extent, that will be key."

  • Pravir Kocher: "We wanted to build an AI-native enterprise of the future... that one person should be able to roll out an entire company using a single stack. That's what Kogo OS is aiming for."


Speakers & Organizations Mentioned

Speakers/Panelists:

  • Dura Maladi – Executive Vice President & General Manager, Technology Planning, Edge Solutions & Data Center, Qualcomm Technologies
  • Siddhika Nerikur – Senior Director and Head, Qualcomm AI Hub
  • Ritukur Vijay – Co-founder & CEO, Autonomy (autonomous robotics and orchestration)
  • Shini Shinwa – Innovation Track Lead, Tech Mandra (AI, blockchain, metaverse innovation)
  • Madhav – Co-founder & CTO, Spotdraft (AI for legal contracts)
  • Pravir Kocher – Co-founder, Kogo AI (private agentic operating system)

Companies/Organizations:

  • Qualcomm Technologies
  • ByteDance (AI-first phone demonstration)
  • Humane (Humane PC)
  • Spotdraft
  • Autonomy
  • Tech Mandra
  • Kogo AI
  • OpenAI (GPT, Canvas)
  • Rio Tinto (mining operations using autonomous robots)
  • Public sector banks in India
  • State governments (India)
  • Waymo (autonomous vehicles in San Francisco)

Technical Concepts & Resources

AI Models & Architecture:

  • GPT-3 (the model behind the original ChatGPT) – 175B parameters, publicly launched November 2022; baseline for the model-compression discussion
  • Small Language Models (SLMs) – 7–8B parameter models with performance parity or superiority to 175B models
  • Small Multimodal Models (SMMs) – Extended SLMs with larger context lengths, on-device learning, personalization, and reasoning capabilities
  • Vision Language Models (VLMs) – Running on-device for context understanding in robotics (1.5+ years in Autonomy's navigation systems)
  • Agentic Systems – Models with autonomous decision-making, file creation, and self-learning loops (e.g., OpenAI Canvas)

Hardware & Platforms:

  • Qualcomm AI Hub – Cloud-native platform for model optimization, testing, and deployment; free device access via IP address; supports any model provider
  • Qualcomm AI 250 – Data center inference accelerator focused on memory architecture innovation; rolling out in Middle East
  • Qualcomm AI 300 – Upcoming second-generation solution with continued memory architecture innovation (not yet announced)
  • Premium Smartphones – Run 10B parameter models efficiently
  • AR Glasses – Run 1–2B parameter models
  • PCs – Run 30B+ parameter models
  • ByteDance AI-First Phone – Production phone with voice-only agentic interface; all apps run in background orchestrated by central AI agent
  • Humane PC – Real-time decision between on-device and cloud inference based on query complexity

Technical Concepts:

  • Prefill vs. Decode Stages – Prefill is compute-bound (benefits from more horsepower); decode is memory-bandwidth-bound (extra compute yields little benefit). The two stages require different architecture optimizations.
  • Shadow AI – Unauthorized use of cloud AI tools (OpenAI, Claude) by enterprise employees, exposing sensitive data; 78% prevalence
  • Transfer Learning (Device → Data Center) – Applying hardware optimization lessons from low-power devices to energy-efficient data center design
  • Hybrid AI Architecture – Distributed processing across devices, edge, on-premises, and cloud based on use case requirements
  • Personal Knowledge Graph – Contextual data about a user's preferences, policies, and history; used by AI agents for personalized decision-making
  • Orchestration Platform – Fleet management for heterogeneous robots; runs on cloud while on-device AI handles navigation and perception
  • Grounded Answers – AI responses anchored in customer-specific data (e.g., Spotdraft's contract analysis tied to customer's actual legal policies)
  • Emergent Behavior – Autonomous model actions (e.g., file creation, self-learning loops) that go beyond predefined task execution
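Why decode is memory-bandwidth-bound can be shown with a back-of-envelope calculation: each generated token must stream essentially every weight from memory once, so peak decode speed is roughly bandwidth divided by model size. The numbers below are illustrative assumptions, not Qualcomm hardware specifications.

```python
# Back-of-envelope sketch: decode throughput ceiling is set by memory
# bandwidth, not compute, because every weight is read once per token.
def decode_tokens_per_sec(params_billion: float, bytes_per_param: float,
                          mem_bandwidth_gb_s: float) -> float:
    model_gb = params_billion * bytes_per_param  # weight bytes read per token
    return mem_bandwidth_gb_s / model_gb

# An 8B model quantized to 4-bit (0.5 bytes/param) on hypothetical
# hardware with ~50 GB/s of memory bandwidth:
print(decode_tokens_per_sec(8, 0.5, 50))   # 12.5 tokens/sec ceiling
```

Adding compute does not raise this ceiling; only more bandwidth (or a smaller/more compressed model) does, which is the rationale given for memory-architecture innovation in inference silicon.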

Methodologies & Approaches:

  • Continuous Data Capture – Capturing user behavior in real-time (e.g., Word plugin tracking lawyer actions) rather than one-time labeling
  • Synthetic Data Generation – Creating training data without large real-world datasets; addresses privacy/data minimization concerns
  • On-Premises Model Deployment – Running 100–300B parameter models on customer-owned servers via air-cooled AI accelerator cards
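The continuous-data-capture pattern above can be sketched as a structured event log: each user edit is recorded as it happens, so later fine-tuning reflects actual behavior rather than one-time labels. This is a minimal hypothetical sketch of the pattern behind Spotdraft's Word plugin; all names and the file format are illustrative assumptions, not Spotdraft's API.

```python
# Hypothetical sketch of continuous data capture: append each edit
# event to a JSON-lines log for later model grounding/fine-tuning.
import json
import time

def capture_edit(log_path: str, clause: str, before: str, after: str) -> None:
    """Append one edit event to a JSON-lines training log."""
    event = {
        "ts": time.time(),
        "clause": clause,
        "before": before,   # text the model (or template) proposed
        "after": after,     # text the lawyer actually kept
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

capture_edit("edits.jsonl", "indemnity",
             "Supplier shall indemnify...",
             "Supplier shall fully indemnify...")
```

Because capture is contextual and continuous, the log tracks current organizational policy as it evolves, which is what lets models stay grounded without explicit retraining cycles.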

6G & Connectivity:

  • Next-Generation Cellular (6G) – Planned deployment timeline: 2028 Olympics (trials), 2029 (first commercial deployments)
  • Satellite Connectivity – Autonomous robots equipped with satellite uplinks for remote area operation (mining, rural deployment)

Industry Domains Referenced:

  • Legal tech (contract review, negotiation, drafting)
  • Robotics (autonomous navigation, fleet orchestration, mining)
  • Fraud detection (edge-deployed LLMs for call screening)
  • Education (LLMs for India-specific curriculum)
  • Agriculture (India-specific panchang calendars codified in software)
  • Finance & Banking (PSU banks adopting AI at scale)
  • Manufacturing (factory automation with robotics and metaverse simulation)

Context & Methodological Notes

  • Event: AI for All Workshop at a major tech summit (location not explicitly stated but references San Diego, Olympics 2028 context, and Middle East deployments)
  • Audience: Developers, enterprise customers, startup founders, technologists interested in edge AI and agentic systems
  • Format: Keynote followed by panel discussion with rapid-fire Q&A and pitch segment
  • Tone: Balanced between technical depth and practical applications; forward-looking (2028–2030 timeline emphasis) with real-world working examples prioritized over theoretical projections