In 1945, mathematician and physicist John von Neumann described an architecture for an electronic digital computer in the First Draft of a Report on the EDVAC. Also known as the Princeton architecture, the design included a processing unit with an arithmetic logic unit and processor registers; a control unit containing an instruction register and program counter; memory to store both data and instructions; external mass storage; and input and output mechanisms.
Although modern systems have benefited from decades of Moore’s Law and Dennard scaling, the basic computer architecture has remained fundamentally unchanged since von Neumann’s day. While a plethora of alternative architectures have been proposed over the years, none has gained the lasting traction of the von Neumann design. But as a 2016 Bernstein research report observes, the industry’s continued reliance on this architecture has given rise to multiple bottlenecks.
“The first major limitation of the Von Neumann architecture is the ‘Von Neumann Bottleneck’; the speed of the architecture is limited to the speed at which the CPU can retrieve instructions and data from memory,” Bernstein analysts Pierre Ferragu, Stacy Rasgon, Mark Li, Mark Newman and Matthew Morrison explained. “The throughput of a computer system is limited due to the relative ability of processors compared to top rates of data transfer. Therefore, the processor is idle for a certain amount of time while the memory is accessed.”
According to the analysts, the von Neumann bottleneck has only worsened over time, as the disparity between processor speed and memory throughput continues to widen.
“Whilst a number of solutions have been proposed and implemented in modern day computers (including cache memory and branch predictor algorithms), these solutions have not been able to solve the root of the problem – the actual underlying design architecture,” the analysts stated. “Secondly, the step-by-step serial nature of a Von Neumann processor means that analyzing a very large, complex data set requires a large amount of processing power which is both timely and very expensive. Therefore, when it comes to certain applications, traditional processor architecture can simply not be utilized in a fast and cost efficient manner.”
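The effect the analysts describe is easy to observe on commodity hardware. The following sketch (my illustration, not from the Bernstein report; the array size and constants are arbitrary) walks a chain of dependent loads through a 128 MB array twice: once in sequential order, where caches and prefetchers hide most of the DRAM latency, and once in a randomly shuffled order, where nearly every step stalls the processor while main memory is accessed.

```c
/*
 * Illustrative sketch of the memory bottleneck (hypothetical example, not
 * from the Bernstein report). The same number of dependent loads is timed
 * twice over one 128 MB array: once in sequential order and once in a
 * random order. Sizes and timings are machine dependent.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1u << 25)   /* 32M 4-byte indices = 128 MB, far larger than cache */

static double seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Follow the chain i -> next[i] for N steps. Every load depends on the
 * previous one, so memory latency cannot be overlapped with other work. */
static unsigned chase(const unsigned *next) {
    unsigned i = 0;
    for (unsigned step = 0; step < N; step++)
        i = next[i];
    return i;
}

int main(void) {
    unsigned *next = malloc(N * sizeof *next);
    if (!next) return 1;

    /* Sequential chain: i -> i+1. Cache lines are fully used and hardware
     * prefetchers can stay ahead of the loads. */
    for (unsigned i = 0; i < N; i++)
        next[i] = (i + 1u) % N;
    double t0 = seconds();
    unsigned r1 = chase(next);
    double sequential = seconds() - t0;

    /* Random single-cycle chain (Sattolo's algorithm): nearly every step
     * jumps to a different cache line and misses all the way to DRAM. */
    for (unsigned i = 0; i < N; i++)
        next[i] = i;
    for (unsigned i = N - 1; i > 0; i--) {
        unsigned j = (unsigned)(rand() % i);   /* j < i keeps one big cycle */
        unsigned tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }
    t0 = seconds();
    unsigned r2 = chase(next);
    double shuffled = seconds() - t0;

    printf("sequential chain: %.3f s   shuffled chain: %.3f s   (%u %u)\n",
           sequential, shuffled, r1, r2);
    free(next);
    return 0;
}
```

Both runs perform exactly the same number of loads, yet on typical hardware the shuffled walk is commonly an order of magnitude or more slower. That gap is the bottleneck in miniature: the processor sits idle waiting on memory, and caching only helps when the access pattern cooperates.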
Steven Woo, VP of Systems and Solutions at Rambus, expressed similar sentiments during a recent interview with Rambus Press.
“Bottlenecks have arisen in traditional architectures that are driving the industry to re-think how systems should be designed moving forward,” he explained.
“Several techniques for bringing architectures back into balance are being pursued by the industry, including Near Data Processing to minimize data movement and energy consumption, and hardware acceleration to improve performance and power efficiency.”
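To make the near data processing argument concrete, here is a toy model (my illustration, not a Rambus design; the sample count, threshold and function names are invented) that contrasts two ways for a host to answer a simple question about a large dataset: pull every record across the interconnect and filter it locally, or run the filter where the data resides and ship back only the answer. The computation is identical in both cases; what changes is how many bytes have to move.

```c
/*
 * Toy model of the data-movement argument for near data processing
 * (hypothetical example; sizes and names are invented). A "storage node"
 * holds 16M samples and the host only needs a count of readings above a
 * threshold, so the two approaches differ only in how many bytes move.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define SAMPLES (1u << 24)   /* 16M float samples = 64 MB held "near" the data */

/* Runs at the host after the full dataset has been transferred. */
static uint64_t host_side_count(const float *data, size_t n, float thresh) {
    uint64_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > thresh) hits++;
    return hits;
}

/* Stands in for a filter executed by a storage- or memory-side processor;
 * only the eight-byte result would cross the interconnect. */
static uint64_t near_data_count(const float *data, size_t n, float thresh) {
    uint64_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > thresh) hits++;
    return hits;
}

int main(void) {
    float *samples = malloc(SAMPLES * sizeof *samples);
    if (!samples) return 1;
    for (size_t i = 0; i < SAMPLES; i++)
        samples[i] = (float)(rand() % 1000) / 10.0f;   /* 0.0 .. 99.9 */

    /* Approach 1: move the data, then compute at the host. */
    size_t moved_bulk = SAMPLES * sizeof *samples;
    uint64_t hits1 = host_side_count(samples, SAMPLES, 90.0f);

    /* Approach 2: compute near the data, move only the answer. */
    size_t moved_ndp = sizeof(uint64_t);
    uint64_t hits2 = near_data_count(samples, SAMPLES, 90.0f);

    printf("host-side filter: %llu hits, %zu bytes moved\n",
           (unsigned long long)hits1, moved_bulk);
    printf("near-data filter: %llu hits, %zu bytes moved\n",
           (unsigned long long)hits2, moved_ndp);
    free(samples);
    return 0;
}
```

In this toy example the near-data approach moves eight bytes instead of 64 MB. Real near data processing pushes the filter into memory or storage hardware rather than into another C function, but the accounting is the same: the less data that crosses the bus, the less time and energy is spent moving it.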
More specifically, says Woo, acceleration can now be implemented across a wide range of silicon, including field-programmable gate arrays (FPGAs). As Microsoft’s Project Catapult illustrates, FPGAs are playing a critical role in the evolution of computing platforms. Indeed, FPGAs already accelerate Microsoft’s Bing search engine and will soon power new search algorithms based on deep neural networks.
“Project Catapult signals a change in how global systems will operate in the future. From Amazon in the US to Baidu in China, all the Internet giants are supplementing their standard server chips—central processing units, or CPUs—with alternative silicon that can keep pace with the rapid changes in AI,” Wired’s Cade Metz recently reported.
“FPGAs also drive Azure, the company’s cloud computing service. And in the coming years, almost every new Microsoft server will include an FPGA. [In addition], Office 365 is moving toward using FPGAs for encryption and compression as well as machine learning—for all of its 23.1 million users. Eventually, these chips will power all Microsoft services.”
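For readers curious what the host side of this kind of offload looks like, below is a minimal sketch of the pattern using the vendor-neutral OpenCL API that FPGA toolchains expose to host programs. It is only an illustration and is not Project Catapult’s interface or Microsoft’s code: error handling is omitted, the device type is left generic so the example runs on any OpenCL device, and the kernel is compiled from source here, whereas real FPGA flows typically load a precompiled bitstream through clCreateProgramWithBinary.

```c
/*
 * Illustrative host-side offload sketch (hypothetical example, not Project
 * Catapult or Microsoft code). A real FPGA flow would load a precompiled
 * bitstream with clCreateProgramWithBinary instead of building from source,
 * and would check the error codes that are ignored here for brevity.
 */
#define CL_TARGET_OPENCL_VERSION 120
#include <stdio.h>
#include <CL/cl.h>

static const char *kSrc =
    "__kernel void vadd(__global const float *a, __global const float *b,\n"
    "                   __global float *c) {\n"
    "    size_t i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

int main(void) {
    enum { N = 1024 };
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* Discover a platform and device, then set up a context and queue. */
    cl_platform_id platform;
    clGetPlatformIDs(1, &platform, NULL);
    cl_device_id device;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, NULL);
    cl_int err;
    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, &err);

    /* Build the kernel (an FPGA flow would load a bitstream instead). */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, NULL, &err);
    clBuildProgram(prog, 1, &device, "", NULL, NULL);
    cl_kernel vadd = clCreateKernel(prog, "vadd", &err);

    /* Move inputs to the accelerator, run, and copy the result back. */
    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof a, a, &err);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof b, b, &err);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, &err);
    clSetKernelArg(vadd, 0, sizeof da, &da);
    clSetKernelArg(vadd, 1, sizeof db, &db);
    clSetKernelArg(vadd, 2, sizeof dc, &dc);
    size_t global = N;
    clEnqueueNDRangeKernel(q, vadd, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);

    printf("c[10] = %.1f (expected 30.0)\n", c[10]);

    clReleaseMemObject(da); clReleaseMemObject(db); clReleaseMemObject(dc);
    clReleaseKernel(vadd); clReleaseProgram(prog);
    clReleaseCommandQueue(q); clReleaseContext(ctx);
    return 0;
}
```

On a Linux system with an OpenCL runtime installed, this builds with something like cc offload.c -lOpenCL. The pattern is the same whether the silicon behind the command queue is a GPU or an FPGA: copy inputs to the device, launch a kernel, and copy the results back. What changes is how the kernel itself is produced.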
According to Woo, the industry has realized that it can no longer rely on Moore’s Law alone to optimize CPU performance and power efficiency.
“As the Bernstein analysts note, traditional architectures are not the best choice for some applications because they don’t address key bottlenecks that exist in these workloads,” he added. “Traditional processors coupled with FPGAs, and technologies to minimize data movement, offer new approaches to improving performance and power efficiency in modern systems. We believe FPGAs will continue to play an important role in helping to evolve computing platforms by enabling flexible acceleration and near data processing.”