August 24, 2012 -- SCE-MI stands for Standard Co-Emulation Modeling Interface and is the Accellera standard for bridging two realms: un-timed (HLV, on a host) and timed (HDL, in an emulator). The main goal was to eliminate communication bottlenecks that could compromise the performance of hardware emulation systems, such as Aldec's HES, that can run in the 10-MHz range. This is why communication channels are transaction-oriented, not event-oriented as in simulation acceleration. The idea is that a single message from software can trigger hundreds of clock cycles in hardware and, similarly, hundreds of hardware clock cycles are needed to form a message for software. This is achieved using synthesizable transactors (see Figure 1): bus functional models that reside in hardware and translate function calls into sequences of bits, reducing the bandwidth required for software/hardware communication and allowing an emulator to run closer to its full speed.
Figure 1. Synthesizable transactors.
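To get a feel for why transaction-oriented channels relieve the link, consider the toy Python model below. It is only a sketch and not part of SCE-MI: the `ToyTransactor` class, the UART-style framing and the figure of 16 clocks per bit are assumptions chosen for illustration. One host call crosses the link as a single message, and the transactor expands it into many hardware clock cycles.

```python
"""Toy model: one transaction-level message versus per-cycle event messages."""


def uart_frame_bits(byte):
    """Expand one byte into a UART-style frame: start bit, 8 data bits (LSB first), stop bit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1]


class ToyTransactor:
    """Hypothetical stand-in for a synthesizable transactor on the hardware side."""

    def __init__(self, clocks_per_bit=16):
        self.clocks_per_bit = clocks_per_bit  # assumed oversampling ratio
        self.link_messages = 0   # messages crossing the host/emulator link
        self.clock_cycles = 0    # hardware clock cycles consumed
        self.pin_trace = []      # value driven on the serial pin, one entry per bit

    def send_byte(self, byte):
        """One host call = one message over the link = many hardware cycles."""
        self.link_messages += 1
        for bit in uart_frame_bits(byte):
            self.pin_trace.append(bit)
            self.clock_cycles += self.clocks_per_bit


def event_level_messages(num_bytes, clocks_per_bit=16):
    """Messages needed if the host had to drive the pin on every clock cycle."""
    bits_per_frame = 10  # start + 8 data + stop
    return num_bytes * bits_per_frame * clocks_per_bit


if __name__ == "__main__":
    transactor = ToyTransactor()
    for value in b"SCE-MI":
        transactor.send_byte(value)
    print("transaction-level messages:", transactor.link_messages)
    print("hardware clock cycles:", transactor.clock_cycles)
    print("event-level messages:", event_level_messages(6))
```

In this model the number of link messages grows with the number of bytes sent, while the event-level count grows with the number of clock cycles, which is the gap that transactors are meant to close.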
SCE-MI has been available for years, with version 1.0 making its debut in 2003. The next major version added two new interfaces to the macro-based model: function-based and pipes-based. The function-based model became very popular, because it is basically the widespread SystemVerilog DPI (Direct Programming Interface). On the other hand, the pipes-based model hasn't gained much attention from users because it's perceived as complicated and doesn't offer clear advantages over other models. For the sake of brevity, we'll omit a discussion of the pipes-based model in this article.
Figure 2. The time-line for SCE-MI versions.
This situation has led to the common misconception that SCE-MI 1 is macro-based and SCE-MI 2 is function-based. In fact, the macro-based model is also part of SCE-MI 2, and interoperability of the models is useful. The most important differences between these two models of SCE-MI 2 are shown in Table 1.
Table 1. Comparison of macro-based and function-based SCE-MI models.
The macro-based model is easy to use. Two modules are instantiated in the RTL code to pass messages to and from the software part, using a simple dual-ready handshake protocol that moves a message only when both the hardware and software parts are ready. Controlling clocks is not mandatory, so it doesn't add to the complexity.
Figure 3. The macro-based model.
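The dual-ready rule can be sketched in a few lines of Python. This is a cycle-by-cycle toy model, not the SCE-MI macros themselves; the function name and the `tx_ready` and `rx_ready` signal names are illustrative assumptions, not the port names defined by the standard.

```python
def dual_ready_transfer(messages, tx_ready, rx_ready):
    """Simulate a dual-ready channel cycle by cycle.

    messages : items the sender wants to deliver, in order
    tx_ready : per-cycle booleans, True when the sender has a message ready
    rx_ready : per-cycle booleans, True when the receiver can accept one
    Returns a list of (cycle, message) pairs, one per completed transfer.
    """
    pending = list(messages)
    delivered = []
    for cycle, (tx, rx) in enumerate(zip(tx_ready, rx_ready)):
        if not pending:
            break
        if tx and rx:  # a message moves only when both sides are ready
            delivered.append((cycle, pending.pop(0)))
    return delivered


if __name__ == "__main__":
    tx = [True, True, False, True, True]
    rx = [False, True, True, True, False]
    for cycle, message in dual_ready_transfer(["A", "B", "C"], tx, rx):
        print(f"cycle {cycle}: transferred {message}")
```

With these sample waveforms, transfers happen only in the cycles where both flags are high; in every other cycle the message simply waits, and neither side needs to know why the other is stalled.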
The function-based model is even simpler; the function call is the transaction. Thanks to DPI (Direct Programming Interface), the function defined in one language can be called in another.
Figure 4. The function-based model.
Ad-hoc and reusable transactors
For the transactors required for a specific problem or task, which must be written ad-hoc, time is often the most important consideration. In this case, even adding a few modules seems like a lot and operating on a higher level saves work, so DPI is often chosen. Function-based SCE-MI is preferred by verification engineers.
One example is the resetting transactor, when a specific reset sequence is required for the device under test (DUT). Monitors are also often easier to write using a function-based model. The simplest case is adding $display tasks to RTL code.
In the case of more complex transactors that are meant to be reusable (e.g., those for popular interfaces such as AXI, AHB or UART), the best choice is to use off-the-shelf transactors from Aldec's Transactor Library.
If for some reason a decision is made to develop transactors in-house, a macro-based model is often the better choice. It has a lower abstraction level and so is preferred by design engineers who are used to RTL code. Also, explicit clock control can be leveraged to stop the DUT clock as required.
Some example use cases
There is no simple answer to the question "Which SCE-MI is better?" It's a matter of choosing the right tool for the job. In what follows, various use cases of SCE-MI are presented, with a recommended model whenever one has important advantages. Keep in mind that use cases and use models (macro- and function-based) can be mixed and matched to perfectly suit design and verification demands.
Connecting with OVM/UVM
There is no doubt that verification methodologies based on SystemVerilog, such as Accellera's Universal Verification Methodology (UVM) and its predecessor, the Open Verification Methodology (OVM), greatly improve verification productivity. Writing testbenches is easy thanks to the higher level of abstraction. Constrained random stimulus reduces the number of hand-written test scenarios required, while reusable checkers, scoreboards and coverage collectors complete the environment.
One minor inconvenience is that both the DPI and SCE-MI standards assume that the testbench and the design are in different languages. Both focus on defining function calls from C to HDL and from HDL to C. However, macro-based and function-based SCE-MI transactors can be easily connected to a SystemVerilog testbench using a thin, intermediate C layer, as shown in Figure 5.
Figure 5. Testbench and DUT in SystemVerilog.
Testing against executable specifications
Frequently, executable specifications are available while implementing an algorithm at the RTL level; this is common for codecs, ciphers, etc. After implementation, this high-level model can be reused to test the low-level implementation. In this case, the same stimuli are applied to both models, and the results are automatically compared. Also, coverage can be measured to control constrained random generation of transactions and to stop testing when the required level is reached. SCE-MI is the ideal choice to bridge two implementations at different abstraction levels.
Figure 6. Comparing with golden model.
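The comparison loop can be sketched as follows. This is a minimal Python model under stated assumptions: a population count stands in for the algorithm, `serial_popcount` is a hypothetical software stand-in for the RTL that would actually run in the emulator, and the three coverage bins are arbitrary. In a real flow the call to the device under test would go through an SCE-MI transactor instead of a local function.

```python
import random


def golden_popcount(value):
    """High-level reference model (the executable specification)."""
    return bin(value).count("1")


def serial_popcount(value, width=8):
    """Bit-serial version, standing in for the RTL implementation in the emulator."""
    count = 0
    for _ in range(width):  # one loop iteration models one clock cycle
        count += value & 1
        value >>= 1
    return count


def coverage_bin(result):
    """Map a result to a coarse coverage bin (assumed bin boundaries)."""
    if result <= 2:
        return "low"
    if result <= 5:
        return "mid"
    return "high"


def run_comparison(dut, reference, seed=1, max_tests=10_000):
    """Apply the same random stimuli to both models until every bin is covered."""
    rng = random.Random(seed)
    covered = set()
    mismatches = []
    tests = 0
    while len(covered) < 3 and tests < max_tests:
        stimulus = rng.randrange(256)
        expected = reference(stimulus)
        actual = dut(stimulus)
        if actual != expected:
            mismatches.append((stimulus, expected, actual))
        covered.add(coverage_bin(expected))
        tests += 1
    return tests, covered, mismatches


if __name__ == "__main__":
    tests, covered, mismatches = run_comparison(serial_popcount, golden_popcount)
    print("tests run:", tests)
    print("bins covered:", sorted(covered))
    print("mismatches:", mismatches)
```

The loop stops as soon as the coverage goal is met rather than after a fixed number of tests, which mirrors the idea of using coverage to decide when testing is complete.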
Connecting modules still in the modeling stage with modules already implemented in RTL
Similarly, SCE-MI can be used to bring up a system in hardware, to benefit from a much higher speed than in simulation, even when part of the system is not yet synthesizable and exists as behavioral models only.
Starting the SOC integration phase before the implementation phase is completed can save a lot of time (see Figure 7).
Figure 7. Productivity gain with SCE-MI emulation compared to the traditional approach.
Prototyping can become really slow when external devices need to be connected. SOCs often have a lot of interfaces, so connecting the prototype to the outside world (LCD, image sensor, keypad, USB, SD card, HDMI, etc.) can be a challenge.
Figure 8. Traditional prototyping.
With an emulator and SCE-MI virtual components, no additional hardware is needed: no external wires and no hassle (compare Figures 8 and 9). LCD and HDMI outputs, for example, can be displayed in windows on a host workstation.
Popular interfaces are available off-the-shelf, and specific applications such as buttons and jog-dials can be quickly created using DPI function-based SCE-MI.
Figure 9. Virtual prototyping with SCE-MI.
An interesting example is testing how the firmware and applications on a smartphone react to varying inputs from an accelerometer, gyroscope and GPS. If you were thinking about tilting and shaking a rapid prototyping board, or traveling with it, consider instead controlling everything from a workstation, with a GUI for changing the position of the smartphone in space by simply dragging a 3D model along three axes, or altering the location by clicking on a map using the Google Maps API.
SOCs are not only hardware but also embedded software. Integrating the two can be a nightmare, but with the following tools and methods it can be trouble-free and efficient.
VSTREAM is a fast and flexible virtual debug interface that connects software debuggers to an emulator. This interface is much faster than JTAG (a serial interface, slow by design) and doesn't require an additional debug cable for connection. Unlike JTAG, VSTREAM is an SCE-MI transactor that uses a high-bandwidth link (usually PCIe) between the emulator and the host. This allows faster software download and better debugger responsiveness.
When controlling software execution in an embedded processor with breakpoints, stopping and stepping, having full visibility into registers and memory greatly accelerates SOC integration. Also, a powerful cross-core breakpoint facility allows complex, multicore systems to be stopped quickly to preserve critical state information.
Figure 10. ARM VSTREAM transactor.
The ARM VSTREAM transactor uses the function-based model to achieve easy migration from simulation to emulation on any platform.
Processor models: ISS, TLM, OVP
With wide adoption of the TLM2 standard and initiatives such as OVP (Open Virtual Platform), a broad range of processor models is available. Currently, there are over 100 CPU models, including ARM7, ARM9, ARM11, Cortex, MIPS, ARC, Xilinx MicroBlaze and PowerPC. A library of other TLM2 components enables the quick creation of high-level system prototypes. Such loosely timed transaction-level systems simulate several orders of magnitude faster than RTL, but when RTL modules are added, speed drops significantly. With the Aldec emulation platform and TLM2 SCE-MI adapter, a user can have the best of both worlds: high-speed simulation of TLM2 models in software and high-speed emulation of RTL modules in hardware.
QEMU (Quick Emulator) is a free processor emulator with efficient binary translation and a set of device models (peripherals). It can boot many guest operating systems (Android, Linux, Windows CE, etc.), emulating several hardware platforms (ARM, x86, MicroBlaze, PowerPC, SPARC, and so on). It can even run a Zilog Z80, emulating the Sinclair ZX Spectrum, if you long for the good old days!
Figure 11. SCE-MI with QEMU.
Combining QEMU with SCE-MI offers a two-fold benefit: 1) software developers can develop drivers and test them with actual RTL in hardware, using the Aldec debug solutions to get full hardware visibility when needed; and 2) the hardware module can be tested using emulation with real stimuli from real applications.
As can be seen from the above examples, SCE-MI is versatile and can adapt to most design and verification challenges. Table 2 summarizes the recommended model (macro- or function-based) for each use case.
Table 2. SCE-MI model recommendations.
So the answer to the question "Which SCE-MI is better?" is simple: you can choose whichever suits you best!
By Jacek Majkowski
Jacek Majkowski is a senior hardware engineer from Aldec, Inc. where he is a specialist in SCE-MI co-emulation. Prior to this position, he spent 7 years in the field of hardware-assisted verification. Jacek received his Master of Science in Electrical Engineering from AGH University of Science and Technology in Krakow, Poland.
Editor's note: For information about Aldec's HES Hardware Emulation Solutions, please visit www.aldec.com/products/HES.