Dodging Amdahl's Law with Message Passing, FPGA-Based Parallel Processing  
Publication: EE Times Programmable Logic Designline
Contributor: Impulse Accelerated Technologies, Inc.

February 24, 2010 -- In configuring next-generation, large-scale parallel processing arrays, some teams are relying on "heterogeneous processing." That is basically a fifty-cent phrase for a microprocessor paired with one or more on-board co-processors for high-speed on-node processing, most typically a GPU, FPGA, Cell, and/or DSP. While the debate continues about the right ratio of microprocessors to co-processors, most teams agree that the basic plumbing of memory management can be the real bottleneck. Today the only real solution is to have the microprocessor and co-processors share memory on the node and to interconnect many nodes with GigE, InfiniBand, or a custom interconnect, configuring the nodes in a distributed memory layout.

Enter the unintended consequence of scaling. Amdahl's law says that the speedup from adding processors is limited by the part of the work that cannot be spread across them, and in practice each added processor also brings more overhead. Basically, the Nth guy you add to build a brick wall begins to slow things down because all the bricklayers are reaching for bricks off the same pile and getting in each other's way. Add another N bricklayers and it just gets worse. So the idea is to complement the original process (the first bricklayer) with a co-processor that makes that bricklayer more efficient (faster), independent of any other bricklayer. Imagine a machine that hands the bricklayer a pre-cemented brick, so all they need to do is place it. Or, there is always the old analogy:

"I know how to make 4 horses pull a cart - I don't know how to make 1024 chickens do it." -- Enrico Clementi

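As a rough illustration of the scaling limit described above, the following minimal sketch evaluates the standard form of Amdahl's law, speedup = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of processors. The function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for illustration and do not come from the article. Note that the textbook formula captures only the serial fraction; it does not model the contention overhead of the brick-pile analogy, which would make matters worse.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    # The serial part never gets faster; only the parallel part is divided.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed workload: 95% parallelizable (illustrative value only).
    for n in (1, 4, 16, 64, 1024):
        print(f"{n:5d} processors -> speedup {amdahl_speedup(0.95, n):.2f}")
```

With p = 0.95 the speedup can never exceed 1 / (1 - p) = 20, no matter how many processors are added, which is why making each node faster can pay off more than adding nodes.
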
Using co-processors dodges Amdahl's law by making each node more powerful, so fewer nodes are needed to reach the same level of performance. While this approach is successful, it puts more burden on the programmer to devise a heterogeneous programming model and implement it successfully on a given node and across multiple nodes. How does the programmer deploy the algorithm in this new environment? Can it be emulated in one simulation? How does the programmer debug a multi-node program in which every node uses co-processors?

This article will discuss these basics within the tool flow and then focus primarily on memory mapping issues at the low end of FPGA-enabled co-processing and at the high end of thousand-processor arrays.

By Dave Strenski and Brian Durwood. (Durwood is the co-founder of Impulse Accelerated Technologies, Inc. and Strenski is an application analyst at Cray, Inc.)

This brief introduction has been excerpted from the original copyrighted article.


View the entire article on the EE Times Programmable Logic Designline website.


Keywords: FPGAs, field programmable gate arrays, FPGA design, EDA, EDA tools, electronic design automation, coprocessors, co-processors, parallel processing, Impulse Accelerated Technologies, EE Times Programmable Logic Designline