Imec Unveils Breakthrough in 3D HBM-on-GPU Thermal Optimization for Next-Gen AI

At the 2025 IEEE International Electron Devices Meeting (IEDM), imec, a global leader in advanced semiconductor research, presented the first comprehensive thermal system-technology co-optimization (STCO) study of 3D HBM-on-GPU (high-bandwidth memory on graphics processing unit) architectures. The study addresses one of the most pressing challenges in high-performance computing for artificial intelligence: managing heat in densely integrated systems.

3D HBM-on-GPU: Unlocking New Levels of Compute Density

Stacking high-bandwidth memory (HBM) directly on top of GPUs is a promising route for data-intensive AI workloads. The 3D HBM-on-GPU architecture raises compute density, supporting up to four GPUs per package, and provides more memory per GPU and higher GPU-memory bandwidth. This is a notable improvement over current 2.5D integration, in which HBM stacks are positioned around one or two GPUs on a silicon interposer.

However, this aggressive 3D integration introduces new thermal challenges. The increased local power density and vertical thermal resistance can lead to excessive heat buildup, threatening the reliability and performance of both GPUs and memory.
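To see why vertical thermal resistance matters, a rough one-dimensional series-resistance model can help. The sketch below is not imec's simulation: the function name, coolant temperature, GPU power, and per-layer resistances are all hypothetical placeholders chosen only to show the trend, and the model ignores DRAM self-heating and lateral heat spreading.

```python
# Illustrative 1D series thermal-resistance model.
# All numbers are hypothetical placeholders, NOT values from the imec study.

def source_temperature(coolant_temp_c, power_w, layer_resistances_k_per_w):
    """Steady-state temperature of a heat source whose heat must cross
    every listed layer in series before reaching the coolant."""
    total_resistance = sum(layer_resistances_k_per_w)
    return coolant_temp_c + power_w * total_resistance

COOLANT_C = 40.0      # hypothetical coolant temperature, in degrees C
GPU_POWER_W = 400.0   # hypothetical GPU power, in W
R_COOLER = 0.05       # hypothetical cooler resistance, in K/W
R_DRAM_DIE = 0.01     # hypothetical resistance of one DRAM die, in K/W

# 2.5D-like case: the GPU sits directly under the cooler.
t_2p5d = source_temperature(COOLANT_C, GPU_POWER_W, [R_COOLER])

# 3D-like case: twelve DRAM dies sit between the GPU and the cooler.
t_3d = source_temperature(COOLANT_C, GPU_POWER_W, [R_COOLER] + [R_DRAM_DIE] * 12)

print(f"2.5D-like GPU temperature: {t_2p5d:.1f} C")
print(f"3D-like GPU temperature:   {t_3d:.1f} C")
```

Even in this simplified picture, every die placed between the GPU and the cooler adds to the temperature rise for the same GPU power, which is the effect a full simulation quantifies with realistic power maps.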

Thermal Simulation and Mitigation Strategies

Imec’s study presents the first detailed thermal simulation of 3D HBM-on-GPU integration, identifying critical thermal bottlenecks and proposing effective mitigation strategies. The research team modeled a system with four HBM stacks—each comprising twelve hybrid-bonded DRAM dies—mounted directly atop a GPU using microbumps, with cooling applied above the HBM stacks. Power maps based on industry-relevant workloads were used to pinpoint local hotspots and benchmark the 3D design against a 2.5D baseline.

Without any thermal mitigation, the 3D configuration reached a peak GPU temperature of 141.7°C, far exceeding safe operational limits. In contrast, the 2.5D baseline peaked at a manageable 69.1°C under identical cooling conditions. These findings underscored the need for a holistic approach to thermal management in advanced 3D architectures.

Imec’s researchers evaluated a combination of technology-level and system-level mitigation techniques. Technology-level solutions included HBM stack merging and thermal silicon optimization, while system-level strategies involved double-sided cooling and GPU frequency scaling.

Results: Achieving Safe Operating Temperatures

Among the measures evaluated, halving the GPU core frequency reduced the peak temperature from 120°C to below 100°C, meeting a critical requirement for memory operation. Although this adjustment slowed AI training steps by 28%, the overall throughput density of the 3D package still surpassed the 2.5D baseline, thanks to the increased integration and bandwidth.
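A back-of-the-envelope calculation shows how a per-GPU slowdown can still leave the package ahead. The sketch below is an illustration, not imec's throughput model. It assumes four GPUs in the 3D package and two in the 2.5D baseline (the upper figures quoted in this article), equal package footprint, and perfectly linear scaling across GPUs. Because the article does not say how the 28% slowdown is measured, both plausible readings are computed; the function names are hypothetical.

```python
# Rough package-throughput comparison under stated assumptions.
# Not imec's model: GPU counts, equal footprint, and linear scaling are assumed.

def throughput_if_rate_drops(gpus_per_package, slowdown):
    """Reading 1: each GPU completes (1 - slowdown) times as many steps per second."""
    return gpus_per_package * (1.0 - slowdown)

def throughput_if_step_time_grows(gpus_per_package, slowdown):
    """Reading 2: each training step takes (1 + slowdown) times as long."""
    return gpus_per_package / (1.0 + slowdown)

SLOWDOWN = 0.28   # from the article
GPUS_3D = 4       # "up to four GPUs per package"
GPUS_2P5D = 2     # upper end of "one or two GPUs"

baseline = throughput_if_rate_drops(GPUS_2P5D, 0.0)  # 2.5D at full speed

ratio_rate = throughput_if_rate_drops(GPUS_3D, SLOWDOWN) / baseline
ratio_time = throughput_if_step_time_grows(GPUS_3D, SLOWDOWN) / baseline

print(f"3D vs 2.5D, reading 1: {ratio_rate:.2f}x")
print(f"3D vs 2.5D, reading 2: {ratio_time:.2f}x")
```

Under either reading the throttled 3D package stays ahead of the two-GPU baseline in this simplified comparison. The actual margin depends on package area, workload, and bandwidth effects that this sketch does not capture.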

Imec’s Cross-Technology Co-Optimization (XTCO) Program

This research also highlights the capabilities of imec’s cross-technology co-optimization (XTCO) program, launched in 2025. XTCO is designed to align imec’s technology roadmaps with key industry challenges in system scaling, focusing on compute density, power delivery, thermal management, and memory bandwidth. By integrating system-technology and design-technology co-optimization (STCO/DTCO) with imec’s extensive expertise, the program addresses the growing complexity and demands of next-generation compute systems.

Imec’s pioneering work in 3D HBM-on-GPU thermal optimization sets a new benchmark for the industry, demonstrating that cross-layer, multidisciplinary approaches are essential for enabling the next wave of AI hardware innovation.