ARM at COMPUTEX 2025: A Strategic Inflection Point for AI Everywhere

Executive Summary

Chris Bergey, Senior Vice President and General Manager of the Client Line of Business at ARM, delivered a COMPUTEX 2025 keynote on May 20 that framed the current era as a historic inflection point in computing—one where AI is no longer an idea but a force, reshaping everything from cloud infrastructure to edge devices. The presentation outlined ARM’s strategic positioning in this new landscape, emphasizing three core pillars: ubiquitous platform reach, world-leading performance-per-watt, and a powerful developer ecosystem.

Bergey argued that the exponential growth in AI workloads—both in scale and diversity—demands a fundamental rethinking of compute architecture. He positioned ARM not just as a CPU IP provider but as a full-stack platform company delivering optimized, scalable solutions from data centers to wearables. Key themes included the shift from training to inference, the rise of on-device AI, and the growing importance of power efficiency across all form factors.

The talk also featured panel discussions with Kevin Deierling (NVIDIA) and Adam King (MediaTek), offering perspectives on technical constraints, innovation vectors, and the role of partnerships in accelerating AI adoption.


Three Critical Takeaways

1. AI Inference Is Now the Economic Engine—Not Training

Technical Explanation

Bergey contrasted the computational cost of model training with that of inference: training a frontier model demands an enormous one-time budget (~10^25–10^26 FLOPs), while a single inference query is far cheaper (~10^14–10^15 FLOPs) but scales with usage volume. For example, if every web search invoked a large language model, roughly ten days of global inference could consume as much compute as an entire training run.
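A back-of-envelope check makes the claim concrete. The sketch below uses the low ends of the FLOP ranges quoted above; the query volume is an assumption in the ballpark of commonly cited global web-search traffic, not a keynote figure.

# Back-of-envelope check of the training-vs-inference claim above.
TRAINING_FLOPS = 1e25      # one-time training budget (low end of quoted range)
FLOPS_PER_QUERY = 1e14     # per-query inference cost (low end of quoted range)
QUERIES_PER_DAY = 8.5e9    # assumed global web-search volume, not from the keynote

daily_inference = FLOPS_PER_QUERY * QUERIES_PER_DAY      # ~8.5e23 FLOPs/day
days_to_match = TRAINING_FLOPS / daily_inference

print(f"Inference per day: {daily_inference:.1e} FLOPs")
print(f"Days to match a full training run: {days_to_match:.0f}")  # ~12 days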

This implies a shift in focus: monetization stems not from model creation, but from scalable deployment of efficient inference engines across mobile, wearable, and embedded platforms.

Critical Assessment

This framing aligns with current trends. While companies like NVIDIA continue optimizing training clusters, the greater opportunity lies in edge inference, where latency, power, and throughput are paramount. However, the keynote underplays the complexity of model compression, quantization, and hardware/software co-design, which are critical for deployment at scale.
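To make one of those glossed-over steps concrete, here is a minimal sketch of post-training int8 weight quantization, assuming NumPy. Production toolchains add calibration data, per-channel scales, and operator fusion; this shows only the core float-to-int8 mapping.

import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization: float32 -> int8 plus one scale."""
    scale = np.abs(weights).max() / 127.0               # map max |w| onto int8 range
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)        # stand-in weight matrix
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"4x smaller weights, mean abs error: {err:.5f}")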

ARM’s Armv9 architecture and the Scalable Matrix Extension (SME) are promising for accelerating AI workloads directly in the CPU pipeline, potentially reducing reliance on NPUs or GPUs—a differentiator in cost- and thermally constrained environments.
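SME's core primitive is an outer-product-accumulate over register tiles (the FMOPA instruction). The sketch below is a NumPy stand-in showing the computation pattern the hardware executes natively; real kernels come from compilers or libraries such as ARM's KleidiAI, not hand-written Python.

import numpy as np

def matmul_outer_product(A, B):
    """C = A @ B expressed as a sum of rank-1 outer products, SME-style."""
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n), dtype=np.float32)   # the accumulator "tile"
    for i in range(k):                       # one rank-1 update per step
        C += np.outer(A[:, i], B[i, :])      # in hardware: a single FMOPA
    return C

A = np.random.randn(8, 16).astype(np.float32)
B = np.random.randn(16, 8).astype(np.float32)
assert np.allclose(matmul_outer_product(A, B), A @ B, atol=1e-4)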

Competitive/Strategic Context

  • x86 Incumbents: Intel and AMD dominate traditional markets but trail ARM in performance-per-watt; Apple’s ARM-based M-series SoCs demonstrate the efficiency gap clearly.
  • Custom Silicon: Hyperscalers like AWS (Graviton), Google (Axion), and Microsoft (Cobalt) increasingly favor ARM-based silicon, citing efficiency improvements of up to 40%.
  • Edge NPU Trade-offs: RISC-V vendors and Qualcomm’s Hexagon push AI logic off-core into dedicated accelerators, whereas ARM integrates it into the CPU, improving software portability at the cost of peak throughput.

Quantitative Support

  • Over 50% of new AWS CPU capacity since 2023 is ARM-based (Graviton).
  • ARM-based platforms account for over 40% of 2025 PC/tablet shipments.
  • SME and NEON extensions yield up to 4x ML kernel acceleration without dedicated accelerators.

2. On-Device AI Is Now Table Stakes

Technical Explanation

Bergey emphasized that on-device AI is becoming the norm, driven by privacy, latency, and offline capability needs. Use cases include coding assistants, chatbots, and real-time inference in industrial systems.
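As a minimal sketch of what such an on-device chatbot can look like, the snippet below assumes the llama-cpp-python bindings and a locally stored, quantized GGUF model (the file name is hypothetical). Everything runs on the device's CPU, so no request leaves the machine.

from llama_cpp import Llama

llm = Llama(
    model_path="./phi-3-mini-q4.gguf",  # hypothetical quantized local model
    n_ctx=2048,                         # context window
    n_threads=8,                        # run on the device's CPU cores
)

out = llm(
    "Q: Why run inference on-device instead of in the cloud?\nA:",
    max_tokens=96,
    stop=["Q:"],
)
print(out["choices"][0]["text"].strip())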

ARM showcased its client roadmap, including:

  • Travis CPU: Next-generation core with IPC improvements and enhanced SME.
  • Drage GPU: Advanced ray tracing and sustained mobile graphics performance.
  • ARM Accuracy Super Resolution (ASR): AI upscaling previously limited to consoles, now on mobile.

Critical Assessment

On-device AI is architecturally sound for privacy-sensitive or latency-critical apps. Yet, memory and thermal constraints remain obstacles for large model execution on mobile SoCs. ARM’s strategy of enhancing general-purpose cores aids flexibility, though specialized NPUs still offer superior throughput for vision or speech applications.
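The memory constraint is easy to quantify: weights alone dominate a model's footprint. A quick sketch of the RAM needed just for the weights of a 7B-parameter model shows why aggressive quantization is a prerequisite for mobile deployment.

# Approximate weight memory for a 7B-parameter model at common precisions.
PARAMS = 7e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = PARAMS * bytes_per_param / 2**30
    print(f"{name}: {gb:.1f} GB")
# fp16: ~13.0 GB -> exceeds the total RAM of most phones
# int8:  ~6.5 GB -> feasible only on flagship devices
# int4:  ~3.3 GB -> workable, which is why quantization is table stakes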

While ARM’s developer base (22 million) is substantial, toolchain fragmentation and driver inconsistencies complicate cross-platform integration.

Competitive/Strategic Context

  • Apple ANE: Tightly integrated and efficient, but closed to third parties.
  • Qualcomm Hexagon: Strong in multimedia pipelines but held back by a fragmented software stack.
  • Google Edge TPU: Power-efficient but narrow in supported model scope.

ARM’s open licensing and platform breadth support broad AI enablement, from Chromebooks to premium devices.

Quantitative Support

  • MediaTek’s Kompanio Ultra delivers 50 TOPS of AI performance on Armv9.
  • Travis + Drage enable 1080p upscaling from 540p, achieving console-level mobile graphics (pixel math sketched below).
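The arithmetic behind the upscaling claim is simple: doubling each axis quadruples the pixel count, so rendering internally at 540p and upscaling to 1080p cuts per-frame shading work by roughly 4x. That saved headroom is what upscalers spend on ray tracing and sustained frame rates.

# Pixel math behind the 540p -> 1080p upscaling claim.
render = (960, 540)      # internal render resolution (540p)
output = (1920, 1080)    # displayed resolution after upscaling (1080p)

rendered_px = render[0] * render[1]   # 518,400 pixels shaded per frame
output_px = output[0] * output[1]     # 2,073,600 pixels displayed
print(f"{output_px / rendered_px:.0f}x fewer pixels rendered")  # -> 4x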

3. Taiwan as the Nexus of AI Hardware Innovation

Technical Explanation

Bergey emphasized Taiwan’s pivotal role in AI hardware: board design, SoC packaging, and advanced fab technologies. ARM collaborates with MediaTek, ASUS, and TSMC—all crucial for AI scalability.

He highlighted the DGX Spark platform, which pairs a 20-core Armv9 CPU with an NVIDIA Blackwell GPU on the GB10 superchip, delivering petaflop-class AI compute in compact systems.

Critical Assessment

Taiwan excels in advanced packaging (e.g., CoWoS) and silicon scaling. But geopolitical risks could impact production continuity. ARM’s integration with Taiwanese partners is a strategic strength, yet resilience planning remains essential.

DGX Spark is a compelling proof-of-concept, though mainstream adoption may be constrained by power and cost considerations, especially outside research or high-end enterprise.

Competitive/Strategic Context

  • U.S. Foundries: Lag in packaging tech; TSMC leads sub-5nm.
  • China: Investing heavily but remains tool-dependent.
  • Europe: Focused on sustainable compute but lacks vertical integration.

ARM’s neutral IP model facilitates global partnerships despite geopolitical tensions.

Quantitative Support

  • Taiwan anticipates roughly 8x growth in data center power demand, moving from megawatt-scale toward gigawatt-scale facilities.
  • DGX Spark packs 1 petaflop of AI compute (at FP4 precision) into a desktop form factor.

Conclusion

ARM’s COMPUTEX 2025 keynote presented a strategic vision for a future where AI is ubiquitous and ARM is foundational. From hyperscale to wearable, ARM aims to lead through performance-per-watt, platform coverage, and ecosystem scale.

Challenges persist: model optimization, power efficiency, and geopolitical risk. Still, ARM’s trajectory suggests it could define the next computing era—not just through CPUs, but as a full-stack enabler of AI.

For CTOs and architects planning future compute stacks, ARM’s approach offers compelling value, especially where scalability, energy efficiency, and developer reach take precedence over peak raw performance.