Platform ASICs vs. FPGAs
Contributor: LSI Corp.

September 1, 2005 -- Design teams are continually seeking ways of maintaining their competitive edge and improving profitability. The search for solutions that will provide faster time to market, lower cost and higher performance for successive generations of products is never-ending. While most technologies represent incremental advances over previous generations, occasionally a disruptive technology is developed, providing a quantum leap over its predecessors. Adopting such technologies early can often help companies gain a commanding market lead over competitors who are slower to realize the advantages offered.

FPGAs have been popular for designs where fast turnaround time and low NRE are desirable. There are significant costs associated with using FPGAs for large designs, however. FPGAs suffer from high per-unit cost, low performance, low levels of logic integration and high power consumption. Low performance also causes other problems. For instance, significant additional time may be required to hand-optimize RTL code to achieve the same level of performance as faster technologies such as Platform ASICs.

In contrast, Platform ASICs have considerably lower unit costs, offer higher performance and logic integration, and use less power. Plus, as the need for hand optimization of RTL is avoided, they provide a critical time to market improvement.



Cost

There are several fundamental technology limitations that affect the minimum per-unit cost of FPGAs, as Figure 1 shows. The programmable logic and routing found in an FPGA result in very large die sizes, which reduce yield and significantly increase manufacturing cost. Additional package costs, external non-volatile programming devices, and extra PCB routing also add to the overall FPGA per-unit cost.

Figure 1. Cost comparison



Compared to FPGAs, Platform ASICs have a low per-unit cost. They are typically based on a fine-grained architecture that implements logic with ASIC-like efficiency, but without the high NRE costs of an ASIC. As a production technology, Platform ASICs address the key issues and provide designers with an excellent capability for medium-volume production devices, typically in the range of 1,000 to 100,000 devices.
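The volume range above reflects a trade-off between one-time NRE and per-unit cost. The following is a minimal sketch of that break-even arithmetic. All dollar figures are hypothetical placeholders chosen for illustration (the article quotes none), and the function names are not from any vendor tool.

```python
def total_cost(nre, unit_cost, volume):
    """Total program cost: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume at which option A and option B cost the same.

    Returns None when the unit costs are equal (the cost lines never cross).
    """
    if unit_a == unit_b:
        return None
    return (nre_b - nre_a) / (unit_a - unit_b)


# Hypothetical numbers for illustration only; not taken from the article.
FPGA = {"nre": 0.0, "unit": 200.0}
PLATFORM_ASIC = {"nre": 150_000.0, "unit": 50.0}

crossover = break_even_volume(FPGA["nre"], FPGA["unit"],
                              PLATFORM_ASIC["nre"], PLATFORM_ASIC["unit"])

for volume in (100, 1_000, 10_000, 100_000):
    fpga = total_cost(FPGA["nre"], FPGA["unit"], volume)
    pasic = total_cost(PLATFORM_ASIC["nre"], PLATFORM_ASIC["unit"], volume)
    print(f"{volume:>7} units: FPGA ${fpga:,.0f}  Platform ASIC ${pasic:,.0f}")
```

Below the crossover volume the option with zero NRE is cheaper; above it, the lower unit cost dominates. Where the crossover actually falls depends entirely on real NRE and unit-cost quotes.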

Performance

Platform ASICs are typically at least three times faster than an FPGA based on an equivalent process technology. Their advantage is so significant that they still outperform FPGAs implemented in a process technology with a smaller transistor size. For instance, as shown in Figure 2, the RapidChip® Foundation™ family at 0.18 microns outperforms the industry's most advanced 90-nm FPGAs.

Figure 2. Performance comparison.



The Platform ASIC vs. FPGA performance gap is due to the limitations of an FPGA's logic cell and interconnect architecture. In contrast to the slow performance of an FPGA, empirical evidence shows that Platform ASICs achieve around 80 percent of cell-based ASIC performance.

FPGA vendors claim their devices run at high frequencies. This may be true, but it is a misleading indicator of real-world performance. The reality is that FPGAs deliver very low performance in comparison to many Platform ASICs. For example, the 0.11-micron RapidChip Xtreme Platform ASIC family can achieve frequencies over 250 MHz with more than 25 levels of logic. In an equivalent FPGA, only around 5-8 levels of logic can reasonably be expected at the same frequency.
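The levels-of-logic comparison can be restated as an average delay budget per level. The sketch below is a simplification: it assigns the entire clock period to combinational logic and routing, ignoring register setup time, clock-to-output delay and clock skew. The function name is illustrative only.

```python
def delay_budget_per_level_ps(freq_mhz, logic_levels):
    """Average delay available per logic level, in picoseconds.

    Simplifying assumption: the whole clock period is available to
    combinational logic and routing (no setup, clock-to-Q or skew margin).
    """
    period_ps = 1e6 / freq_mhz  # period of an f-MHz clock in picoseconds
    return period_ps / logic_levels


# Figures quoted in the text: 250 MHz, 25 levels vs. 5-8 levels.
for levels in (25, 8, 5):
    budget = delay_budget_per_level_ps(250, levels)
    print(f"{levels:>2} levels at 250 MHz -> {budget:.0f} ps per level")
```

Under this simplification, 25 levels at 250 MHz corresponds to 160 ps per level (logic plus routing), while 5-8 levels corresponds to 500-800 ps per level.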

Interconnect delay

The main reason for the performance gap between Platform ASICs and FPGAs is interconnect delay. Unlike the length-optimized metal routing of some Platform ASICs, FPGA routing is a combination of short, medium, and long fixed wire lengths connected via pass transistors. These active routing elements add significant delay to signal paths. This problem is exacerbated as design size increases. Large FPGA designs can have routing delays comprising up to 80 percent of the total path delay.

The complexity of FPGA routing makes it difficult to accurately predict timing until after routing is completed. The distance between two logic elements in an FPGA is not a good predictor of the timing delay between them. There are many possible routes between two points on an FPGA, and the timing variance between different paths can be considerable. In addition to the routing topology, the timing depends on the type of wire used and the number of buffers and pass transistors encountered. This explains the large discrepancy experienced between synthesis and post-layout/place-and-route results.

Unlike FPGA routing, Platform ASICs often use an optimally buffered interconnect scheme, so delays increase linearly and predictably with increasing wire length. For example, LSI's RapidChip technology has four to five routing layers and can route over logic. Therefore, it is possible to achieve highly optimal timing delays.
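Why buffering makes delay grow linearly with wire length can be seen with a first-order RC estimate. This is a generic textbook-style sketch, not LSI's timing model: the resistance, capacitance, segment length and buffer delay are hypothetical values, and buffer drive resistance and input capacitance are ignored.

```python
import math


def unbuffered_delay_ps(length_mm, r_ohm_per_mm, c_pf_per_mm):
    """Distributed-RC wire delay estimate: 0.5 * R_total * C_total.

    With R in ohms and C in picofarads, the product is in picoseconds.
    Both R_total and C_total scale with length, so delay grows as length^2.
    """
    return 0.5 * (r_ohm_per_mm * length_mm) * (c_pf_per_mm * length_mm)


def buffered_delay_ps(length_mm, r_ohm_per_mm, c_pf_per_mm,
                      segment_mm, t_buf_ps):
    """Delay of the same wire split into equal segments, one buffer each."""
    n_segments = max(1, math.ceil(length_mm / segment_mm))
    seg_len = length_mm / n_segments
    seg_delay = unbuffered_delay_ps(seg_len, r_ohm_per_mm, c_pf_per_mm)
    return n_segments * (seg_delay + t_buf_ps)


# Hypothetical wire and buffer parameters, for illustration only.
R, C, SEGMENT_MM, T_BUF_PS = 100.0, 0.2, 1.0, 20.0

for length in (1.0, 2.0, 4.0, 8.0):
    raw = unbuffered_delay_ps(length, R, C)
    buf = buffered_delay_ps(length, R, C, SEGMENT_MM, T_BUF_PS)
    print(f"{length:.0f} mm: unbuffered {raw:.0f} ps, buffered {buf:.0f} ps")
```

In this model, doubling the length quadruples the unbuffered delay but only doubles the buffered delay, which is what makes timing predictable from distance alone.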

These desirable routing characteristics allow physical synthesis tools, such as Synplicity's Amplify, to accurately predict the timing during placement. There are no post-route timing surprises, and no need to iterate in the synthesis - place-and-route timing loop that is commonly faced by FPGA designers.

Fabric granularity

Another factor contributing to the Platform ASIC's 3x performance advantage is logic granularity. SRAM-based FPGA architectures have a fundamental limitation: a LUT-based architecture, or some variation of one, is most commonly used to implement FPGA logic functions.

Most RTL design code does not map efficiently into a coarse-grained LUT-based structure. This reduces both performance and density and ultimately increases unit cost.

A coarse-grained logic fabric degrades performance for a number of reasons. Timing delays suffer from large incremental steps. Even when adding a single extra input to a logic term, additional LUTs and interconnections may be required, causing a significant jump in signal delay. Timing optimization is difficult because of this abrupt, step-function behavior. It is not possible to tune FPGA paths in small timing increments using different combinations of drive strengths and logic.
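The step-function behavior can be illustrated by counting the LUTs and logic levels needed for a simple wide gate. The sketch below assumes 4-input LUTs (the article does not state a LUT size) and an associative function such as a wide AND or OR built as a balanced tree; real mapping tools and dedicated carry or mux resources will give different numbers.

```python
import math


def luts_for_wide_gate(n_inputs, k=4):
    """LUT count for an n-input AND/OR-style gate built from k-input LUTs.

    Each LUT merges up to k signals into one, removing up to (k - 1)
    signals, and the gate is done when a single signal remains.
    """
    if n_inputs <= 1:
        return 0
    return math.ceil((n_inputs - 1) / (k - 1))


def lut_levels_for_wide_gate(n_inputs, k=4):
    """Logic levels for the same gate, built as a balanced tree of LUTs."""
    levels = 0
    signals = n_inputs
    while signals > 1:
        signals = math.ceil(signals / k)  # one tree level of k-input LUTs
        levels += 1
    return levels


for n in (4, 5, 16, 17):
    print(f"{n:>2} inputs: {luts_for_wide_gate(n)} LUTs, "
          f"{lut_levels_for_wide_gate(n)} levels")
```

Going from 4 to 5 inputs, or from 16 to 17, adds a whole LUT level and its interconnect in this model; there is no intermediate step.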

The logic structure used for Platform ASICs is often constructed from very small base-units consisting of several transistor pairs. One or more base-units may be used to construct a single library cell, which is often similar to a standard ASIC cell. There may be up to 500 unique types of logic cells of varying functionality and drive strength. This ensures optimal resource usage for the required level of performance.

During synthesis, RTL is mapped into the logic cells. These cells are physically implemented by adding metal routing layers to a pre-diffused base die. The metal routing connects the base-units necessary to form library cells. It also connects the library cells together to efficiently implement user logic.

Figure 3. RapidChip Platform ASIC vs. FPGA path optimization.



Ease of timing closure

In a design with demanding timing (performance) requirements, it is much easier to close timing with Platform ASICs than FPGAs. The granular, flexible nature of most Platform ASIC architectures enables design tools to perform timing optimization automatically and effectively. A fine-grained fabric has many timing benefits, as shown in Figure 3:
  • A broad range of available cells ensures optimum cell delay for each function mapped
  • Optimal path tuning is possible, including:
    • drive strength adjustment
    • path buffering
    • logic flattening and restructuring

In contrast, an FPGA's coarse-grained architecture limits these types of optimizations. In an FPGA, timing can change dramatically between synthesis and place and route, further complicating timing closure iterations. In some cases, meeting the system timing requirements is not possible without hand-optimizing the RTL. Techniques such as bus widening and pipelining may be required to meet performance goals.

Architectural modifications such as these can be very time-consuming, and can add months to a project schedule where significant optimization is required. Even with these modifications, there is no guarantee that timing requirements will be met. In contrast, Platform ASIC architectures and tools allow maximum performance to be obtained with minimum effort. The higher-performance routing and logic fabric minimize the need for RTL optimization, reducing implementation time and complexity.

Another issue arises with frequent changes to the RTL such as bug fixes, feature enhancements, etc. Even small changes can have significant effects on the FPGA implementation, changing the critical paths and in some cases reducing the effectiveness of any previous optimization work.

Logic integration

Large designs often won't fit into a single FPGA. Due to the significant overhead of programmable routing, even the largest FPGA provides relatively low logic integration. For example, over 50 percent of an FPGA's die area may be occupied by fixed routing structures. Since Platform ASIC technology routes signals over the top of logic cells, it does not require dedicated routing channels that waste large portions of the die.

To work around the logic integration limitation, time-consuming and complex partitioning tasks are required to fit a design into multiple FPGAs. In contrast, all but the largest designs will fit on a single Platform ASIC. Compared to FPGAs, Platform ASICs provide extremely high logic densities. As a result, the total number of available gates on a single Platform ASIC is many times higher than that of the largest FPGA.

Figure 4. FPGAs waste internal LUT resources.



FPGAs also waste a lot of internal LUT resources, as can be seen in Figure 4. RTL design code maps inefficiently into coarse-grained architectures, so only a proportion of the internal LUT resources will ever be utilized. Even if all the LUTs within an FPGA are used, significant amounts of the total logic available can remain unused. By contrast, the logic and routing in most Platform ASICs is not fixed. Resources are only used when needed, increasing logic integration levels and decreasing power consumption.

Power consumption

A further benefit of the Platform ASIC's fine-grained fabric and point-to-point routing is highly efficient power consumption. Compared to Platform ASICs, FPGAs typically dissipate many times more power. There are several reasons for this discrepancy:
  • The routing capacitance of an FPGA is typically many times larger than that of a Platform ASIC. Compared to most Platform ASICs, FPGAs contain much longer routing tracks with significant parasitic capacitance, and the switching activity on these long routing tracks causes significant power dissipation.
  • Unlike most Platform ASICs, FPGAs have fixed clock routing structures: all registers within a clock domain are connected to a clock tree, even if they are not used. When the clock is toggling, power is wasted in clock segments that connect unused registers.
  • FPGAs require a significant number of transistors to configure LUTs and programmable routing. In a typical FPGA design, large portions of these transistors are not actually used; however, they still draw leakage current.
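The first two points follow from the standard first-order expression for switching power, P = alpha * C * Vdd^2 * f. The sketch below uses hypothetical capacitance, voltage and activity values purely for illustration; it does not model leakage, and none of the numbers come from the article.

```python
def dynamic_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power of a net: alpha * C * Vdd^2 * f.

    activity is the average number of charging transitions per clock cycle
    (1.0 for a free-running clock net, lower for typical data nets).
    """
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point, for illustration only.
ACTIVITY = 0.2
VDD_V = 1.2
FREQ_HZ = 250e6

# A data net on a short route vs. the same net on a route with 5x the
# parasitic capacitance.
short_net_w = dynamic_power_w(ACTIVITY, 1e-12, VDD_V, FREQ_HZ)
long_net_w = dynamic_power_w(ACTIVITY, 5e-12, VDD_V, FREQ_HZ)

# A clock segment feeding only unused registers still toggles every cycle.
idle_clock_w = dynamic_power_w(1.0, 1e-12, VDD_V, FREQ_HZ)

print(f"short net:          {short_net_w * 1e6:.1f} uW")
print(f"long net:           {long_net_w * 1e6:.1f} uW")
print(f"idle clock segment: {idle_clock_w * 1e6:.1f} uW")
```

Switching power scales linearly with capacitance, so longer routing tracks cost proportionally more power, and a toggling clock segment burns power whether or not the registers it feeds do useful work.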


Figure 5. Power consumption (left to right: ASIC, RapidChip, FPGA).



Summary

Platform ASICs offer significant benefits in most areas when compared with FPGAs. FPGAs play an important role for prototyping and very low volume production. For medium volume production, however, Platform ASICs such as the RapidChip Platform ASIC from LSI Logic provide the best-in-class implementation vehicle.

High performance, cost-optimized and production ready, Platform ASICs ensure project goals are rapidly achieved, even when pushing the performance envelope. Superior performance alleviates the need for hand-optimized RTL, which can often significantly reduce front-end design complexity and effort. Platform ASICs enable design teams to reduce their overall time to market and therefore achieve rapid success in the marketplace.

By Greg Martin, RapidChip Technical Marketing, LSI Logic Corp.

Go to the LSI Corp. website to learn more.

Keywords: SOCcentral, LSI Logic, structured ASICs, Platform FPGAs
