Ed Sperling of Semiconductor Engineering notes that power has always been a “global concern” in the design process because it affects every part of a chip. Nevertheless, partitioning for power, rather than for functionality or performance, has historically received little serious consideration, although that is beginning to change.
For example, says Sperling, the increasing use of system partitioning into multiple chips connected by high-speed buses, rather than putting everything on a single chip, offers some interesting possibilities for managing power.
According to Kelvin Low, senior director of foundry marketing at Samsung, system architects are now looking at power management in a different way rather than simply relying on silicon technology.
“You can partition a system to achieve system-level performance scaling,” Low told SemiEngineering. “So if you use a 2.5D approach with HBM2 (second-generation High-Bandwidth Memory), the system-level performance increases. It becomes a partition problem, but the distributed processing approach is an important enabler.”
As Sperling points out, this approach has a bearing on power as well, because it takes less power to drive signals through an interposer than through increasingly narrow wires on a single die at advanced nodes. As a result, there are significant power savings in addition to performance increases.
Frank Ferro, a senior director of product management for memory and interface IP at Rambus, expressed similar sentiments.
“One of the advantages of HBM2 is that you can move it closer to the processing, and you have 2 gig (gigatransfers/second per pin) rates,” Ferro told the publication. “The power of HBM2 is lower, too, and you can re-use quite a bit of technology. But it does require a new PHY design.”
As Ferro explained in a Semiconductor Engineering article earlier this year, HBM bolsters locally available memory by placing low-latency DRAM closer to the CPU. In addition, HBM DRAM increases memory bandwidth by providing a very wide, 1024-bit interface to the SoC. At HBM2’s maximum per-pin rate of 2Gbits/s, that works out to a total bandwidth of 256Gbytes/s per stack. Although the per-pin rate is similar to DDR3 at 2.1Gbps, HBM’s eight 128-bit channels provide about 15X more bandwidth.
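For readers who want to sanity-check those figures, here is a minimal back-of-the-envelope sketch; the DDR3 comparison assumes a standard 64-bit channel, which the article does not state explicitly.

```python
# Back-of-the-envelope check of the HBM2 bandwidth figures quoted above.
# Assumption (not stated in the article): DDR3 is modeled as a 64-bit channel.

HBM2_WIDTH_BITS = 1024   # eight 128-bit channels per stack
HBM2_RATE_GBPS = 2.0     # Gbits/s per pin (HBM2 maximum)
DDR3_WIDTH_BITS = 64     # typical DIMM channel width (assumed)
DDR3_RATE_GBPS = 2.1     # Gbits/s per pin, as cited

hbm2_bw_gbytes = HBM2_WIDTH_BITS * HBM2_RATE_GBPS / 8   # 256 GB/s per stack
ddr3_bw_gbytes = DDR3_WIDTH_BITS * DDR3_RATE_GBPS / 8   # ~16.8 GB/s per channel

print(f"HBM2 stack bandwidth  : {hbm2_bw_gbytes:.0f} GB/s")
print(f"DDR3 channel bandwidth: {ddr3_bw_gbytes:.1f} GB/s")
print(f"Ratio                 : {hbm2_bw_gbytes / ddr3_bw_gbytes:.1f}x")  # ~15x
```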
Perhaps not surprisingly, mass-market deployment of HBM will present the industry with a number of challenges. This is because 2.5D packaging technology, along with a silicon interposer, increases manufacturing complexity and cost. In addition, each HBM device routes thousands of signals (data, control, and power/ground) through the interposer to the SoC. Clearly, maximum yields will be critical to making HBM cost effective, especially since a number of expensive components are mounted on the interposer, including the SoC and multiple HBM die stacks.
Nevertheless, even with the above-mentioned challenges, having, for example, four HBM memory stacks in close proximity to the CPU, each delivering 256Gbytes/s, provides a significant increase in both memory density (up to 8GB per HBM stack) and bandwidth compared with existing architectures.
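Scaling the per-stack figures to that four-stack example is simple arithmetic; the short sketch below assumes the 256Gbytes/s and 8GB per-stack numbers quoted above.

```python
# Aggregate figures for the four-stack configuration described in the text.
STACKS = 4
BW_PER_STACK_GBYTES = 256     # GB/s per HBM2 stack, as cited
CAPACITY_PER_STACK_GB = 8     # up to 8 GB per HBM2 stack, as cited

print(f"Aggregate bandwidth: {STACKS * BW_PER_STACK_GBYTES} GB/s")  # 1024 GB/s (~1 TB/s)
print(f"Aggregate capacity : {STACKS * CAPACITY_PER_STACK_GB} GB")  # 32 GB
```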
Interested in learning more? The full text of “Partitioning for Power” by Ed Sperling is available on Semiconductor Engineering here.