 Category: SOCcentral Feature Articles & Columns: Feature Articles: Wednesday, October 22, 2014
Five Emerging DRAM Interfaces You Should Know About for Your Next Design
Contributor: Cadence Design Systems, Inc.

December 30, 2013 -- Because dynamic random-access memory (DRAM) has become a commodity product, suppliers are challenged to continue producing these chips in increasingly high volumes while meeting extreme price sensitivities. It's no easy feat, considering the ongoing demands for increased bandwidth, low power consumption, and small footprint from a variety of applications. This article takes a look at five next-generation DRAM technologies that address these challenges.

Mobile ramping up DRAM demands

Notebook and desktop PCs continue to be significant consumers of DRAM; however, the sheer volume of smartphones and tablets is driving rapid DRAM innovation for mobile platforms. The combined pressures of the wired and wireless worlds have led to the development of new memory standards optimized for differing applications. For example, rendering the graphics in a typical smartphone calls for a desired bandwidth of 15GBps, a rate that a two-die low-power double-data-rate 4 (LPDDR4) x32 memory subsystem meets efficiently. At the other end of the spectrum, a next-generation networking router can require as much as 300GBps of bandwidth, a rate for which a two-die Hybrid Memory Cube (HMC) subsystem is best suited.

LPDDR4 and HMC are just two of the industry's emerging memory technologies. Also available (or scheduled for mass production in the next couple of years) are LPDDR3, Wide I/O 2, and High Bandwidth Memory (HBM). But why deal with all of these different technologies? Why not just increase the speed of the DRAM you are already using as your application requirements change?

Unfortunately, core DRAM access speed has remained largely unchanged over the last 20 years and is limited by the RC time constant of a row line. For many applications, core throughput (defined as row size × core frequency) is adequate, and the problem is then reduced to a trade-off between the number of output bits and the output frequency (LPDDR3, LPDDR4, Wide I/O 2, and HBM are among the memory subsystems that address these concerns). However, if an application requires more bandwidth than a single core can provide, then multiple cores must be used to increase throughput (HMC subsystems can be used in these scenarios).
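As a rough illustration of this reasoning, the sketch below computes core throughput as row size × core frequency and estimates how many cores a given bandwidth target would need. The row size and core frequency used here are illustrative assumptions, not figures from any datasheet or from this article; only the 15GBps and 300GBps targets come from the examples above.

```python
import math

def core_throughput_gbytes(row_size_bits, core_freq_mhz):
    """Core throughput in GB/s, defined as row size x core frequency."""
    bits_per_second = row_size_bits * core_freq_mhz * 1e6
    return bits_per_second / 8 / 1e9

def cores_needed(required_gbytes, row_size_bits, core_freq_mhz):
    """Smallest number of cores whose combined throughput meets the target."""
    per_core = core_throughput_gbytes(row_size_bits, core_freq_mhz)
    return math.ceil(required_gbytes / per_core)

# Hypothetical core: 8,192-bit row, 20 MHz row-access rate (assumed values).
ROW_BITS = 8192
CORE_MHZ = 20

per_core = core_throughput_gbytes(ROW_BITS, CORE_MHZ)
smartphone_cores = cores_needed(15, ROW_BITS, CORE_MHZ)   # 15GBps graphics example
router_cores = cores_needed(300, ROW_BITS, CORE_MHZ)      # 300GBps router example

print(f"per-core throughput: {per_core:.2f} GB/s")
print(f"cores for 15 GB/s: {smartphone_cores}, cores for 300 GB/s: {router_cores}")
```

Under these assumed numbers a single core covers the smartphone case, so the design question becomes how to get the bits off the die, while the router case needs many cores working in parallel.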

Increasing DRAM bandwidth always involves trade-offs. Bandwidth is primarily limited by I/O speed, and raising I/O throughput, whether by adding more bits in parallel or by running each pin faster, comes with a power, cost, and area penalty. Power remains a growing concern, especially for mobile devices, where the user impact is great when battery life is short or the devices literally become too hot to handle. In addition, increasing the package ball count results in increased cost and board area.

The emerging DRAM technologies represent different approaches to address the bandwidth, power, and area challenges. In this article, we'll take a closer look at the advantages and disadvantages of five memory technologies that are sure to play integral roles in next-generation designs.

LPDDR3: Addressing the mobile market

Published by the JEDEC standards organization in May 2012, the LPDDR3 standard (see Figure 1) was designed to meet the performance and memory-density requirements of mobile devices, including those running on 4G networks. Compared to its predecessor, LPDDR3 provides a higher data rate (1,600Mbps), more bandwidth (12.8GBps), higher memory densities, and lower power. [1]
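The relationship between the quoted per-pin data rate and the quoted bandwidth can be checked with simple arithmetic. The sketch below assumes a 64-bit-wide data interface for the LPDDR3 case; that width is an assumption chosen because it reproduces the 12.8GBps figure, and it is not stated in this article.

```python
def peak_bandwidth_gbytes(data_rate_mbps, bus_width_bits):
    """Peak bandwidth in GB/s from per-pin data rate (Mbps) and bus width (bits)."""
    megabytes_per_second = data_rate_mbps * bus_width_bits / 8
    return megabytes_per_second / 1000

# LPDDR3: 1,600 Mbps per pin; 64-bit total width is an assumed configuration.
lpddr3 = peak_bandwidth_gbytes(1600, 64)
print(f"LPDDR3 peak bandwidth: {lpddr3} GB/s")
```

The same function shows the width-versus-frequency trade-off directly: halving the per-pin rate while doubling the width leaves the peak bandwidth unchanged.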

Figure 1. LPDDR3 architecture.

To achieve the goal of higher performance at lower power, three key changes were introduced in LPDDR3: lower I/O capacitance, on-die termination (ODT), and new interface-training modes. The interface-training modes include write-leveling and command/address training. These features help improve timing margins and timing closure, and also ensure reliable communication between the device and the system-on-chip (SOC). The lower I/O capacitance helps meet the increased bandwidth requirement by allowing a higher operating frequency at lower power. [2]

LPDDR4: Optimized for next-generation mobile devices

LPDDR4 (see Figure 2) is the latest standard from JEDEC and is expected to be in mass production in 2014. The standard is optimized to meet increased DRAM bandwidth requirements for advanced mobile devices. LPDDR4 offers twice the bandwidth of LPDDR3 at similar power and cost points. To maintain power neutrality, a low-swing GND-terminated interface (LVSTL) with data bus inversion has been proposed. A smaller page size and multiple channels are other innovations used to limit power. For cost reduction, the standard LPDDRx core architecture and packaging technologies have been reused with selected changes, such as a reduction of the command/address bus pin count. [3]
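Data bus inversion can be sketched in a few lines. On a ground-terminated interface, driving a 1 draws current, so the idea is to limit how many data lines are high at once. The sketch below assumes a per-byte scheme in which a byte with more than four 1 bits is sent inverted, with a one-bit flag telling the receiver to undo the inversion. The function names and the exact threshold are illustrative assumptions, not text from the LPDDR4 standard.

```python
def dbi_encode(byte):
    """Return (wire_byte, flag). Invert the byte when it has more than four 1 bits."""
    ones = bin(byte & 0xFF).count("1")
    if ones > 4:
        return (~byte) & 0xFF, 1   # send inverted data, flag set
    return byte & 0xFF, 0          # send data unchanged, flag clear

def dbi_decode(wire_byte, flag):
    """Recover the original byte from the wire value and the inversion flag."""
    return (~wire_byte) & 0xFF if flag else wire_byte & 0xFF

# 0xFF would drive all eight lines high; after inversion none are high.
wire, flag = dbi_encode(0xFF)
assert (wire, flag) == (0x00, 1)
assert dbi_decode(wire, flag) == 0xFF
```

With this scheme no more than four of the eight data lines in a byte are ever driven high, at the cost of one extra signal per byte.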

Figure 2. LPDDR4 architecture.

Wide I/O 2: Supporting 3D-IC packaging for high-end mobile applications

The Wide I/O 2 standard (see Figure 3), also from JEDEC and expected to reach mass production in 2015, covers high-bandwidth 2.5-D silicon interposer and 3-D stacked-die packaging for memory devices. Wide I/O 2 is designed for high-end mobile applications that require high bandwidth at the lowest possible power. This standard uses a significantly larger I/O pin count at a lower frequency. A larger pin count would normally raise I/O power; however, stacking reduces interconnect length and capacitance and eliminates the need for ODT. This results in the lowest I/O power for higher bandwidth.

Figure 3. Wide I/O 2 architecture.

In 2.5-D stacking, two dies are flipped over and placed on top of an interposer. All of the wiring is on the interposer, making the approach less costly than 3-D stacking but requiring more area. Heat dissipation is not much of a concern, since cooling mechanisms can be placed on top of the two dies. This approach is also lower cost and more flexible than 3-D stacking because faulty connections can be reworked.

There are electronic design automation (EDA) tools on the market that help designers take advantage of redundancy at the logic level to minimize device failures. For example, Cadence® Encounter® Digital Implementation allows designers to route multiple redistribution (RDL) layers into a microbump, or to use combination bumps. In this scenario, if one bump fails, the remaining bumps can carry on normal operations.

With 3-D stacking, heat dissipation can be an issue — there isn't yet an easy way to cool the die in the middle of the stack, and that die can heat up the top and bottom dies. Poor thermal designs can limit the data rate of the IC. In addition, a connection problem — especially one occurring at the middle die — renders the entire stack useless.

HMC: Breaking barriers to reach 400G

HMC (see Figure 4) is being developed by the Hybrid Memory Cube Consortium and backed by several major technology companies, including Samsung, Micron, ARM, HP, Microsoft, Altera, and Xilinx. HMC is a 3-D stack that places DRAMs on top of logic. This architecture, expected to be in mass production in 2014, essentially combines high-speed logic process technology with a stack of through-silicon-via (TSV) bonded memory die. [4] In an example configuration, each DRAM die is divided into 16 "cores" and then stacked. The logic base is at the bottom, with 16 different logic segments, each segment controlling the 4 or 8 DRAMs that sit on top. This type of memory architecture supports more "DRAM I/O pins" and, therefore, more bandwidth (as high as 400G). According to the Hybrid Memory Cube Consortium, a single HMC can deliver more than 15X the performance of a DDR3 module and consume 70% less energy per bit than DDR3.


Figure 4. HMC architecture.

HMC uses a packetized protocol on a low-power SerDes interconnect for I/O. Each cube can support up to four links with up to 16 lanes. With HMC, design engineers will encounter some challenges in serialized packet responses. When commands are issued, the memory cube may not process these commands in the order requested. Instead, the cube reorders commands to maximize DRAM performance. Host memory controllers thus need to account for command reordering. HMC provides the highest bandwidth of all the technologies considered in this article, but this performance does come at a higher price than other memory technologies.
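One way a host controller can cope with reordered responses is to attach a tag to every outstanding request and match responses by tag rather than by arrival order. The sketch below is a simplified illustration of that idea. The class name, method names, and tag count are hypothetical, and the actual packet fields are defined by the HMC specification, not by this example.

```python
class TagTracker:
    """Tracks outstanding requests so responses can arrive in any order."""

    def __init__(self, num_tags):
        self.free_tags = list(range(num_tags))
        self.pending = {}  # tag -> address of the outstanding request

    def issue(self, address):
        """Reserve a tag for a new request; fail if all tags are in flight."""
        if not self.free_tags:
            raise RuntimeError("no free tags; host must wait for a response")
        tag = self.free_tags.pop(0)
        self.pending[tag] = address
        return tag

    def complete(self, tag, data):
        """Match a response to its request by tag and release the tag."""
        address = self.pending.pop(tag)
        self.free_tags.append(tag)
        return address, data

tracker = TagTracker(num_tags=4)
tag_a = tracker.issue(0x1000)
tag_b = tracker.issue(0x2000)

# The cube answers the second request first; the tag still identifies it.
first = tracker.complete(tag_b, "data-b")
second = tracker.complete(tag_a, "data-a")
assert first == (0x2000, "data-b")
assert second == (0x1000, "data-a")
```

The number of tags bounds how many requests can be in flight, so in a design like this it is one of the parameters that trades controller area against achievable bandwidth.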

HBM: Emerging standard for graphics

HBM (see Figure 5) is another emerging memory standard defined by the JEDEC organization. HBM was developed as a revolutionary upgrade for graphics applications. GDDR5 was defined to support 28GBps (7Gbps x32). Extending the GDDRx architecture to achieve a higher throughput while also improving performance per watt was considered unlikely to succeed. Expected to be in mass production in 2015, the HBM standard applies to stacked DRAM die, and is built using TSV technologies to support bandwidth from 128GBps to 256GBps. JEDEC's HBM task force is now part of the JC-42.3 Subcommittee, which continues to work to define support for up to 8-high TSV stacks of memory on a 1,024-bit wide data interface. [5] According to JEDEC, the interface would be partitioned into 8 independently addressable channels supporting a 32-byte-minimum access granularity per channel. There is no command reordering, which allows the graphics controller to optimize access to memory. The subcommittee expects to publish the standard in late 2013.
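The HBM figures quoted above fit together arithmetically. The short sketch below derives the width of each channel, the number of data beats in a minimum access, and the per-pin data rate implied by the 128GBps to 256GBps range. The derived values are inferences from the numbers in this article, assuming the full 1,024-bit interface carries data, and are not quotations from the standard.

```python
INTERFACE_BITS = 1024    # data interface width, from the article
CHANNELS = 8             # independently addressable channels, from the article
MIN_ACCESS_BYTES = 32    # minimum access granularity per channel, from the article

# Each channel gets an equal slice of the interface.
bits_per_channel = INTERFACE_BITS // CHANNELS

# Number of transfers on one channel needed to move a minimum-size access.
beats_per_min_access = MIN_ACCESS_BYTES * 8 // bits_per_channel

def per_pin_gbps(total_gbytes_per_s):
    """Per-pin data rate (Gbps) implied by a total bandwidth in GB/s."""
    return total_gbytes_per_s * 8 / INTERFACE_BITS

print(f"bits per channel: {bits_per_channel}")
print(f"beats per minimum access: {beats_per_min_access}")
print(f"per-pin rate at 128 GB/s: {per_pin_gbps(128)} Gbps")
print(f"per-pin rate at 256 GB/s: {per_pin_gbps(256)} Gbps")
```

Compared with the 7Gbps per pin quoted for GDDR5, the implied per-pin rates are much lower, which is the wide-and-slow end of the width-versus-frequency trade-off.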

Figure 5. HBM architecture.

Which memory standard is best for your next design?

As this article has discussed, each emerging memory standard tackles the power, performance, and area challenges in a different way. There are trade-offs from one to another, with each optimized for a particular application or purpose. How do you select the right memory standard for your design?

The basis for your decision will be your application's requirements. When considering a smartphone, for example, you may decide between Wide I/O 2 and LPDDR4. Because thermal characteristics are critical in smartphones, the industry consensus has turned to Wide I/O 2 as the best choice. Wide I/O 2 meets heat-dissipation, power, bandwidth, and area requirements. However, it is more costly than LPDDR4. LPDDR4, on the other hand, provides advantages in bandwidth, TSV readiness, and software support. Given its lower silicon cost, LPDDR4 may be the better fit for cost-sensitive mobile markets.

At the other end of the application spectrum, consider high-end computer graphics processing, where chip complexity is a given and high-resolution results are expected. Here, you might look to the higher-bandwidth HBM technology. Computer graphics applications are less constrained by cost than, say, mobile devices, so the higher expense of HBM memory may be less of an issue. Table 1 compares the features of the five standards discussed here.

Table 1. Where does each memory standard stand?

To help integrate your design at the register-transfer level, EDA companies like Cadence offer IP portfolios for memory subsystems. Cadence has controller and PHY IP for a broad array of standards, including many of those discussed here. Cadence also provides memory model verification IP (VIP) to verify memory interfaces and ensure design correctness. In addition, tools such as Cadence Interconnect Workbench let the SOC designer optimize the performance of memory subsystems through a choice of memory controller parameters such as interleaving, command-queue depths, and number of ports. These tools can help speed up SOC development and ensure first-pass success.

In summary

Choosing the right DRAM technology for your design requires careful consideration. There are a variety of architectures that are either available now or will soon hit the market. Each has its strengths and weaknesses in terms of meeting bandwidth, power, cost, and other key specifications; your specific application and market requirements should guide you in making the right choice for your next design.

By Gopal Raghavan

Gopal Raghavan, a Cadence fellow, is responsible for areas including IP roadmap development, SoC integration, and performance evaluation. He has electrical engineering degrees from the Indian Institute of Technology (bachelor's) and from Stanford University (M.S. and Ph.D.).

References

[1] "Mobile DDR," Wikipedia.
[2] "Industry view: JEDEC on LPDDR3," Kristin Lewotsky, EE Times.
[3] "LPDDR4 Moves Mobile," Daniel Skinner, JEDEC Mobile Forum 2013.
[4] "About Hybrid Memory Cube," Hybrid Memory Cube Consortium.
[5] "3D-ICs," JEDEC.

Cadence Design Systems enables global electronic design innovation and plays an essential role in the creation of today's electronics. Customers use Cadence software, hardware, IP, and expertise to design and verify today's mobile, cloud and connectivity applications.

This article is based on a whitepaper available as a PDF on the Cadence Design Systems, Inc. website.



Keywords: ASICs, ASIC design, FPGAs, field programmable gate arrays, FPGA design, embedded system design, embedded systems, computer system design, general-purpose computers, special-purpose computers, embedded memory, DRAM, LPDDR3, LPDDR4, Wide I/O, HMC, HBM, Cadence Design Systems, SOCcentral