 Category: SOCcentral Feature Articles & Columns: Feature Articles: Friday, September 03, 2010
Eliminating the "Long Loop" in FPGA Design  
Contributor: GateRocket, Inc.

July 12, 2010 -- The number of FPGA design starts continues to grow at an ever-increasing rate. In some cases, teams which previously focused on ASIC designs are migrating to FPGA implementations. This is because modern, high-end FPGAs have the capacity and performance capabilities required by many applications, but without the expense, risks, and time-to-market delays associated with ASIC technologies. In other cases, research and development teams working on projects like robotic vision systems may need some way to accelerate their algorithms, and FPGAs offer an ideal solution.

One thing that is common to newcomers to the FPGA domain is an underlying belief that working with these components is relatively easy, fast, and painless. Most folks involved in any form of electronics design have at least a rudimentary understanding of how FPGAs work. In particular, they know that SRAM-based FPGAs can be reprogrammed with a new configuration as required. Unfortunately, there is also an unstated impression that the process of capturing a design, translating it into a configuration file, and loading that configuration into the FPGA consumes relatively little time and effort.

This may have been true 20 years or so ago when FPGAs contained the equivalent of only a few thousand logic gates. But today's state-of-the-art FPGAs can contain the equivalent of millions of logic gates, thousands of DSP functions, megabits of RAM, and a multitude of other hard IP core functions. This causes major problems with regard to verifying the functionality of the design because a software simulation run of the full-chip RTL that once completed in hours can now take days or weeks.

The solution is to migrate as much of the design into a physical FPGA as soon as possible, because this will allow those portions to be run at-speed, and it will also dramatically reduce the loading on the software simulator. Unfortunately, many elements of the design process are being stressed to the breaking point. For example, full-chip logic synthesis and place-and-route (PAR) runs that used to complete during lunch can now exceed 24 hours. This means that whenever a bug slips through to the system test lab and requires a change to the FPGA design, it can take more than a day to get the device re-programmed with a fix ready for testing.

The result is a "Long Loop" with regard to detecting, isolating, debugging, and fixing a bug. In many cases, actually identifying the source of a bug can be problematic, because bugs can be introduced at any stage of the design process. Furthermore, since one bug may mask several others, it is not uncommon to re-spin the FPGA and re-test it in the system, only to discover that additional changes are required. It’s easy to see how this slow, iterative "Long Loop" process can become unwieldy and lead to weeks or months of project delays. So, is there any way in which we can eliminate the "Long Loop"? Read on...
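To get a feel for how quickly these iterations add up, here is a rough back-of-the-envelope sketch in Python. The 24-hour build time echoes the figure quoted above; the number of bugs and the hours of lab re-testing per re-spin are purely illustrative assumptions, not measurements.

```python
def long_loop_days(masked_bugs, build_hours=24.0, lab_hours=4.0):
    """Estimate calendar days spent when each bug costs one full re-spin.

    masked_bugs: bugs found one after another (each one hides the next).
    build_hours: synthesis plus place-and-route time per re-spin (assumed).
    lab_hours:   time to re-test in the system after each re-spin (assumed).
    """
    total_hours = masked_bugs * (build_hours + lab_hours)
    return total_hours / 24.0


if __name__ == "__main__":
    for bugs in (1, 5, 10):
        print(f"{bugs:2d} bugs -> {long_loop_days(bugs):.1f} days")
```

The model is deliberately naive: it assumes round-the-clock work and exactly one re-spin per bug, so real schedules will look different.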

I see bugs everywhere...

When it comes to FPGA design, bugs can be introduced anywhere in the design flow. Consider the (very high-level) view of the design flow illustrated in Figure 1. Purely for the sake of these discussions, we'll restrict ourselves to considering only a few of these design flow elements: IP selection and integration, RTL design, synthesis, and place-and-route.

Figure 1. Bugs can be introduced at any point in the design flow.


Let's start with the IP. In the case of ASIC designs, any third-party IP is typically presented in the form of RTL (it may be encrypted, but it is still RTL). This means that the RTL representations that are used for initial software simulations are subsequently synthesized, placed, and routed along with the rest of the design. This provides a reasonably high level of confidence that the RTL and gate-level representations of the design are functionally equivalent.

In the FPGA domain, by comparison, it’s common to be presented with two models: a high-level representation containing behavioral constructs for use in simulation, and a gate-level representation to be incorporated into the FPGA. The problem is that there may be subtle differences between the behavioral and gate-level representations, and these differences only manifest themselves when the FPGA design is deployed in its target system.

Or consider the RTL that you capture yourself. Following software simulation, you typically have a high level of confidence that your RTL is functionally correct, so when you synthesize the design and load it into the target FPGA, you may not initially consider these functions as being the source of any errors. Eventually, you realize that it's your RTL that's at fault. You re-check your simulation results but these still appear to be correct. So next you add some debugging logic around what you think may be the problem and then re-run synthesis and place-and-route, all of which may take hours or days.

And still the design in the FPGA doesn’t work? What is going on? What you don't see is that your simulation runs are ignoring the pragmas you added into the RTL for use by the synthesis engine. Perhaps one of these pragmas told the synthesis tool that it could make arbitrary decisions about unspecified choices; maybe this results in a register being overwritten when an unspecified address is written to inadvertently; and maybe this is contrary to what happens in the software simulator.
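The kind of mismatch described above can be mimicked with a toy Python model. This is only a sketch under stated assumptions: the three-register block, the address map, and the particular decoder simplification are all hypothetical, chosen to show how a "don't care" granted to the synthesis tool can turn into a real overwrite in hardware while the simulator does nothing.

```python
# Toy model of a simulation/synthesis mismatch (all names are hypothetical).
# The block has three registers at addresses 0, 1, and 2.
# Address 3 is unspecified.

def write_simulated(regs, addr, value):
    """RTL-simulation view: a write to an unspecified address does nothing."""
    if addr in (0, 1, 2):
        regs[addr] = value
    return regs


def write_synthesized(regs, addr, value):
    """Hardware view after an assumed 'don't care' optimization.

    Told that address 3 never occurs, the tool is free to simplify the
    decoder. Here we assume it selects register 2 from the top address
    bit alone, so address 3 (0b11) aliases onto register 2.
    """
    if addr & 0b10:              # matches 2 (0b10) and, unintentionally, 3
        regs[2] = value
    else:
        regs[addr & 0b01] = value
    return regs


if __name__ == "__main__":
    sim, hw = [0, 0, 0], [0, 0, 0]
    for addr, value in [(2, 0xAA), (3, 0x55)]:   # second write is the stray one
        write_simulated(sim, addr, value)
        write_synthesized(hw, addr, value)
    print("simulator:", sim)
    print("hardware: ", hw)
```

Both models agree on every specified address; they diverge only on the stray write to address 3, which is exactly the case a testbench written from the specification is unlikely to exercise.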

Or consider the tools themselves. Generally speaking, we tend to believe that synthesis tools are much more robust than they actually are. In reality, even though some synthesis tools have been around for years and years, users are still logging bugs against them. One problem is that today's designs are extremely large and their corresponding synthesis runs can take a long time, so the developers of the synthesis engine start to perform aggressive optimizations. But every time a corner is cut it's necessary to account for an enormous set of conditions, which sets the scene for errors to be introduced.

And problems aren’t limited to differences between simulation and synthesis. In many cases these two tools may perform their roles as expected, and then the place-and-route engines make their own decisions and optimizations that introduce unexpected functionality (read "bugs") into the design. For example, the place-and-route engines may decide that a register has to be initialized to some state, so they make arbitrary choices that can cause the silicon to do something odd and unexpected.

All of these bugs are insidious, because you don't know what is happening and you can't identify the problem because every step in the process appears to produce the results you expect ... until you reach the programmed FPGA. Simulation was fine, synthesis was fine, place-and-route completed without any warnings or errors, and the netlist loads without issues, but the FPGA doesn't work in the system, and resolving the issue is going to require you to cycle many times around the "Long Loop."

Eliminating the "Long Loop"

One solution is to combine actual FPGA hardware and RTL simulation models in the same verification run. This can be achieved by means of a RocketDrive and associated RocketVision software from GateRocket. The RocketDrive is presented in the form of a removable "caddy" that plugs into a standard drive bay on a desk-side workstation. RocketDrives come in a variety of models, each targeted toward a different family of FPGAs from Altera or Xilinx. In each case, the RocketDrive contains the largest member of the family with which you are working.

Let's consider a typical scenario involving a new project. In some cases this new project will be based on a previous generation of the product and/or platform, in which case you will have access to a number of previously proven functional blocks. Using RocketVision, you can direct the system to place all of the previously proven blocks in the RocketDrive, and to keep any unverified third-party IP blocks and any new blocks you've developed in the software simulator. This immediately lets you benefit from the acceleration of much of the design yielding dramatically faster simulation iterations.
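The partitioning rule in this scenario is simple enough to write down. The Python sketch below is not RocketVision's interface, which is not described here; the `Block` type, the `partition` and `promote` helpers, and the block names are hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    proven: bool  # True if verified in a previous product generation


def partition(blocks):
    """Split blocks into (hardware, simulator) name lists.

    Proven blocks go to the physical FPGA; everything else (new RTL,
    unverified third-party IP) stays in the software simulator.
    """
    hardware = [b.name for b in blocks if b.proven]
    simulator = [b.name for b in blocks if not b.proven]
    return hardware, simulator


def promote(block):
    """Mark a block as proven once it passes RTL-level verification."""
    return Block(block.name, True)


if __name__ == "__main__":
    design = [
        Block("uart", proven=True),
        Block("ddr_ctrl", proven=True),
        Block("vendor_fft", proven=False),   # unverified third-party IP
        Block("new_filter", proven=False),   # freshly written RTL
    ]
    hw, sim = partition(design)
    print("hardware: ", hw)
    print("simulator:", sim)
```

As blocks are verified, calling `promote` and re-running `partition` shrinks the simulator's share of the design.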

As each new block is verified at the RTL (or behavioral) level in the context of your full-chip design, its synthesized/gate-level equivalent can be moved over into the physical FPGA in the RocketDrive. As soon as a problem manifests itself, the verification run can be repeated with the RTL version of the suspect block resident in the simulation world running in parallel with the gate-level version realized in the physical FPGA. By means of RocketVision, the signals from the peripheries of these blocks (along with any designated signals internal to the blocks) can be compared "on-the-fly."
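The idea of running two versions of a block side by side and comparing their signals can be illustrated in a few lines of Python. This is a conceptual sketch only, not how RocketVision is implemented: both "models" here are plain Python callables, and the accumulator with a dropped bit is an invented stand-in for a gate-level block that differs from its RTL.

```python
def first_mismatch(reference, device, stimulus, signals):
    """Step two models with the same stimulus and compare named signals.

    reference, device: callables taking one stimulus item and returning a
                       dict of signal name -> value for that cycle.
    Returns (cycle, signal, expected, actual) for the first difference,
    or None if the two agree on every watched signal.
    """
    for cycle, item in enumerate(stimulus):
        expected = reference(item)
        actual = device(item)
        for name in signals:
            if expected[name] != actual[name]:
                return cycle, name, expected[name], actual[name]
    return None


def make_accumulator(width_bits):
    """Return a stateful model that adds each input to a running total."""
    state = {"total": 0}
    mask = (1 << width_bits) - 1

    def step(value):
        state["total"] = (state["total"] + value) & mask
        return {"total": state["total"], "nonzero": int(state["total"] != 0)}

    return step


if __name__ == "__main__":
    rtl = make_accumulator(8)    # intended 8-bit behavior
    gates = make_accumulator(7)  # hypothetical bug: top bit lost
    print(first_mismatch(rtl, gates, [50, 50, 50, 50], ["total", "nonzero"]))
```

Reporting the first cycle and signal at which the two versions diverge is what narrows a system-level failure down to a single block.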

Using this technology – combining conventional simulation with physical hardware and an appropriate debugging environment – it's possible to very quickly detect, isolate, and identify bugs, irrespective of where they originated in the FPGA design flow. Once a bug has been isolated to one block of the design, a change can be made to the RTL representation of that block, which can then be re-run along with the hardware representation of the other blocks. In this way, a fix can be immediately tested and verified without re-running synthesis and place-and-route, and with only the suspect block running in the software simulator.

This technique provides the ability to make multiple design-change-debug iterations in a single day. This approach can also reduce the number of RTL-to-bitstream iterations by 50%.

The end result of RocketDrive and RocketVision is that design and verification engineers now have the ability to see how the design behaves in the physical chip, running as it will in-system, while still having access to all the capabilities and flexibility of a software simulator. This new technique lets engineers quickly detect, identify, and correct differences between the original RTL and the physical chip. In addition to accelerating verification runs by a factor of 10X or more, this new approach reduces the in-silicon debugging process by a factor of 30X, thereby dramatically speeding the debugging of the FPGA design.

Using this technique to eliminate the "Long Loop" in the FPGA design flow can save weeks or months of valuable engineering time and resources, speeding time-to-market and time-to-profit.

By Dave Orecchio.

Dave Orecchio is President and CEO of GateRocket, Inc. and has 24 years of semiconductor industry experience at four venture-backed companies with a focus on semiconductors, ASIC, and FPGA design and development. His leadership brought three of the four companies to successful exits for the investors. Prior to GateRocket, he held executive positions in marketing, sales, and general management at LTX, Viewlogic Systems, Synopsys, Innoveda, Parametric Technologies, and DAFCA.

Go to the GateRocket, Inc. website to learn more.

Keywords: ASICs, ASIC design, FPGAs, field programmable gate arrays, FPGA design, EDA, EDA tools, electronic design automation, simulation, simulators, verification, GateRocket