In a recent EE Times article, Gary Hilson notes that high bandwidth memory (HBM) deployments are becoming more mainstream due to the massive growth and diversity in artificial intelligence (AI) applications.
“HBM is [now] less than niche. It’s even become less expensive, but it’s still a premium memory and requires expertise to implement,” writes Hilson. “As a memory interface for 3D-stacked DRAM, HBM achieves higher bandwidth while using less power in a form factor that’s significantly smaller than DDR4 or GDDR5 by stacking as many as eight DRAM dies with an optional base die which can include buffer circuitry and test logic.”
According to Jim Handy, principal analyst with Objective Analysis, GPUs and AI accelerators have an “unbelievable hunger” for bandwidth and HBM gets them where they want to go.
“The applications where HBM is being used need so much computing power that HBM is really the only way to do it,” Handy tells the publication. “If you tried doing it with DDR, you’d end up having to have multiple processors instead of just one to do the same job, and the processor cost would end up more than offsetting what you saved in the DRAM.”
Early HBM3 hardware will reportedly be capable of ~1.4x more bandwidth than HBM2E. As the standard evolves, this number is expected to increase to ~1.075TB/s of memory bandwidth per stack, with maximum I/O transfer rates of up to 8.4Gbps per pin. For comparison, a four-stack HBM3 solution running at 665GB/s per stack would provide ~2.7TB/s of total bandwidth.
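The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the 1024-bit interface width per stack is an assumption that the article does not state, and the function names are hypothetical.

```python
# Assumption (not stated in the article): each HBM stack exposes a
# 1024-bit data interface. Units are decimal (1 TB/s = 1000 GB/s).
HBM_INTERFACE_WIDTH_BITS = 1024


def stack_bandwidth_gb_per_s(pin_rate_gbps, width_bits=HBM_INTERFACE_WIDTH_BITS):
    """Peak bandwidth of one stack in GB/s for a per-pin rate in Gb/s."""
    return pin_rate_gbps * width_bits / 8  # 8 bits per byte


def total_bandwidth_tb_per_s(per_stack_gb_per_s, stacks):
    """Aggregate bandwidth in TB/s across several identical stacks."""
    return per_stack_gb_per_s * stacks / 1000


# 8.4 Gb/s per pin works out to about 1075 GB/s (~1.075 TB/s) per stack.
peak_per_stack = stack_bandwidth_gb_per_s(8.4)

# Four stacks at 665 GB/s each give about 2.66 TB/s (~2.7 TB/s) in total.
four_stack_total = total_bandwidth_tb_per_s(665, 4)

print(f"{peak_per_stack:.1f} GB/s per stack, {four_stack_total:.2f} TB/s total")
```

Under that assumption, the 8.4Gbps transfer rate and the ~1.075TB/s per-stack figure are two views of the same number, and the ~2.7TB/s total is simply four times 665GB/s.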
As Hilson emphasizes, moving to HBM3 requires careful planning and expertise, which is why Avery Design Systems is creating a streamlined ecosystem for design and verification to make HBM3 adoption as easy as possible. In late 2021, Avery announced that Rambus would use Avery’s HBM3 memory model to verify its HBM3 PHY and controller subsystem.
The Rambus HBM3-ready memory interface consists of a fully integrated physical layer (PHY) and digital memory controller, the latter drawing on technology from the company’s recent acquisition of Northwest Logic. The subsystem supports data rates of up to 8.4 Gbps and delivers as much as 1 terabyte per second of bandwidth, thereby doubling the performance of high-end HBM2E memory subsystems.
“People are starting to move from architecting for HBM3 to starting chip implementation,” states Chris Browy, VP of Sales and Marketing at Avery Design Systems. “Now that there are more AI chips coming online and the competition is fierce, everybody’s looking to take advantage of the latest memory architectures.”
According to Frank Ferro, Rambus Senior Director of Product Marketing for IP Cores, the neural networks in AI applications require a significant amount of data for both processing and training, with training sets alone increasing 10x per year.
For AI training and high-performance applications, says Ferro, HBM3 can deliver more than one terabyte per second with two DRAM stacks. With four DRAM stacks, this number increases to 3.2 terabytes per second, offering significant bandwidth for AI and high-performance computing applications. In addition, HBM3 delivers better power and area efficiency, as the DRAM stack and SoC are placed on a single package substrate.
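Ferro's figures can be read in the other direction as well: given a total bandwidth and a stack count, what per-stack and per-pin rates do they imply? The sketch below does that, again assuming a 1024-bit interface per stack (not stated in the article) and using hypothetical helper names.

```python
import math

# Assumption (not stated in the article): 1024 data bits per HBM stack.
HBM_INTERFACE_WIDTH_BITS = 1024


def implied_pin_rate_gbps(total_tb_per_s, stacks, width_bits=HBM_INTERFACE_WIDTH_BITS):
    """Per-pin rate in Gb/s implied by a total bandwidth and stack count."""
    per_stack_gb_per_s = total_tb_per_s * 1000 / stacks
    return per_stack_gb_per_s * 8 / width_bits


def stacks_needed(target_tb_per_s, per_stack_gb_per_s):
    """Smallest number of stacks that meets a target total bandwidth."""
    return math.ceil(target_tb_per_s * 1000 / per_stack_gb_per_s)


# 3.2 TB/s over four stacks is 800 GB/s per stack, or about 6.25 Gb/s per pin.
pin_rate = implied_pin_rate_gbps(3.2, 4)

# At 800 GB/s per stack, two stacks are needed to exceed 1 TB/s.
min_stacks = stacks_needed(1.0, 800)

print(f"{pin_rate:.2f} Gb/s per pin, {min_stacks} stacks for 1 TB/s")
```

On these assumptions, the four-stack figure corresponds to roughly 800GB/s per stack, which is consistent with two stacks delivering more than one terabyte per second.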