
Saturday, November 30, 2019

Adding Task-Based Bus Functional Models to Cocotb



Getting a project started -- even to a certain level of completeness -- is often pretty simple. A couple weekends of hacking often yields pretty good progress and results. Finishing things up, in contrast, is often a slow process. That has certainly been the case with some work I did back in May and June this year to prototype a task-based interface between Python and an HDL simulator. The proof-of-concept work I did at the time seemed quite promising, but the clear next step was to make that work more accessible to others. That "last little bit of work" has certainly turned out to take more time than I had originally assumed!

Just as a reminder, the motivation for interacting with an HDL environment at the task level is quite simple: performance. Communicating across language (especially interpreted language) boundaries tends to be expensive, so minimizing the number of cross-language communications is critical to achieving high throughput. Using a task-based interface between Python and an HDL environment boosts performance in three ways. First, a task-based interface groups data, so fewer language-boundary crossings are required to communicate a given amount of data. Second, and more importantly, a task-based interface enables the HDL environment to filter events and only interact with the Python environment when absolutely necessary. Finally, a task-based interface enables integration with high(er)-speed environments, such as emulation or the current release of Verilator, where signal-level integration isn't practical.

I started looking at Python as a testbench language for a reason that might initially seem strange: the Python ecosystem (primarily PyPI), which makes it easy to publish bits of library and utility code in a way that is easily accessible to others. Often, the ability to take advantage of the work of others is gated by the effort required to gather all the required software dependencies. The Python ecosystem promises to alleviate that challenge, and I was excited to explore the possibilities.

Where Does it Fit?
I'm aware of a few projects that use Python for verification, but Cocotb currently appears to be the most visible Python-based framework for doing hardware verification with Python testbench code. Consequently, it made very good sense to see whether the task-based integration I had prototyped could be integrated with the existing Cocotb library.

Cocotb is a Python library that supports light-weight concurrency via co-routines, and provides primitives for coordinating these co-routines with each other and with activity in an HDL simulation environment. In addition to the Python library, Cocotb provides native-compiled C/C++ libraries that integrate with the simulation environment via APIs implemented by the simulator (VPI, DPI, FLI, or VHPI, depending on the simulator). Currently, Cocotb interacts with simulation environments at the signal level.
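
To make the contrast concrete, here's a sketch of what a typical signal-level interaction looks like in Cocotb today. This is a minimal, hypothetical driver for a simple ready/valid interface (the signal names are assumptions for illustration):

import cocotb
from cocotb.triggers import RisingEdge

@cocotb.coroutine
def write_word(dut, data):
    # Drive the data and assert valid
    dut.data <= data
    dut.data_valid <= 1

    # Sample ready at each clock edge until the transfer completes
    while True:
        yield RisingEdge(dut.clock)
        if dut.data_ready.value:
            break
    dut.data_valid <= 0

Each yield here crosses the Python/HDL boundary once per clock cycle -- which is exactly the overhead a task-based interface is designed to avoid.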

In considering how to add task-based interactions to Cocotb, there were several requirements that I thought were quite important. First, the user should not be forced to choose between signal-level and task-based interactions between Python and HDL. It should be possible to introduce task-based interactions to a testbench currently interacting at the signal level, or to add a few signal-based interactions to a testbench that primarily interacts at the task level. Second, a task-based integration must support a range of simulator APIs. I had prototyped a DPI-based integration, which is supported by SystemVerilog simulators, but supporting Verilog and VHDL simulators as well was clearly important. Finally, achieving good performance was a key requirement, since performance is the primary motivation for using a task-based interface in the first place.

Task-Based BFM Cocotb Architecture


From a system perspective, the diagram above captures how task-based BFMs integrate with Cocotb. Each BFM instance is represented in the HDL environment by an instance of an HDL module. This module is special, in that it knows how to accept and make task calls and convert between signal-level information and those task calls.

Each BFM also knows how to register itself with a BFM Manager within Cocotb. When the HDL portion of a BFM registers with Cocotb, the BFM Manager creates an instance of a Python class that represents the BFM within the Python environment.

The BFM Manager provides methods to allow the user's test code to query the available BFMs and obtain a handle to the BFM instances required by the test. From there, the user's test simply calls methods on the Python class object and/or receives callbacks.
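
As a sketch of what this looks like from the test's perspective, obtaining a handle might look like the following. Note that the BfmMgr name and find_bfm method shown here are illustrative assumptions, not confirmed API names; see the pull request for the actual interface:

# Illustrative sketch: look up a registered BFM instance by matching
# its HDL instance path (lookup method name assumed, not confirmed)
u_bfm = cocotb.BfmMgr.find_bfm(".*u_dataout_bfm")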

Task-Based BFM Architecture

Let's take a quick look at the work needed to support a task-based BFM. First off, the BFM author needs to create a Python class to implement the Python side of the BFM. That class is decorated with a @cocotb.bfm decorator that associates HDL template files with the BFM class. Below is a BFM for a simple ready/valid protocol.

@cocotb.bfm(hdl={
    cocotb.bfm_vlog : cocotb.bfm_hdl_path(__file__, "hdl/rv_data_out_bfm.v"),
    cocotb.bfm_sv   : cocotb.bfm_hdl_path(__file__, "hdl/rv_data_out_bfm.v")
})
class ReadyValidDataOutBFM():
    # ...
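
The elided class body also needs to define the synchronization objects used by the methods shown below. A minimal sketch, assuming Cocotb's built-in Lock and Event triggers (from cocotb.triggers import Event, Lock):

    def __init__(self):
        # Serializes calls to the blocking convenience API
        self.busy = Lock()
        # Set by write_ack() when the HDL BFM acknowledges a transfer
        self.ack_ev = Event()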


Next, the BFM author must specify the low-level API used to interact with the HDL BFM. All calls must be non-blocking, so most interactions with the HDL environment are implemented as a request/acknowledge pair of API calls.

    @cocotb.bfm_import(cocotb.bfm_uint32_t)
    def write_req(self, d):
        # Non-blocking: results in a call to the write_req task in the HDL BFM
        pass

    @cocotb.bfm_export()
    def write_ack(self):
        # Called from the HDL BFM when the transfer has completed
        self.ack_ev.set()

Calling a class method decorated with @cocotb.bfm_import results in a task call in the HDL BFM. Class methods decorated with @cocotb.bfm_export can be called from the HDL BFM.

Finally, on the Python side, the BFM author will likely provide a convenience API to simplify the testwriter's life:

    @cocotb.coroutine
    def write_c(self, data):
        '''
        Writes the specified data word to the interface
        '''
        
        yield self.busy.acquire()
        self.write_req(data)

        # Wait for acknowledge of the transfer
        yield self.ack_ev.wait()
        self.ack_ev.clear()

        self.busy.release()
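
Putting the pieces together, a test using this BFM might look something like the sketch below (again, the BfmMgr/find_bfm lookup is an illustrative assumption):

@cocotb.test()
def smoke_test(dut):
    # Locate the BFM registered from the HDL environment
    # (lookup method name assumed for this sketch)
    u_bfm = cocotb.BfmMgr.find_bfm(".*u_dataout_bfm")

    # Each iteration results in a single write_req/write_ack
    # round trip with the HDL BFM
    for i in range(16):
        yield u_bfm.write_c(i)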


There's one piece left, and that's the HDL BFM. This is specified as a template:

module rv_data_out_bfm #(
        parameter DATA_WIDTH = 8
    ) (
        input                       clock,
        input                       reset,
        output reg[DATA_WIDTH-1:0]  data,
        output reg                  data_valid,
        input                       data_ready
    );

    // Shadow variables set by the write_req task and
    // synchronized to the clock by the always block below
    reg[DATA_WIDTH-1:0] data_v = 0;
    reg                 data_valid_v = 0;

    always @(posedge clock) begin
        if (reset) begin
            data_valid <= 0;
            data <= 0;
        end else begin
            if (data_valid_v) begin
                data_valid <= 1;
                data <= data_v;
                data_valid_v = 0;
            end
            if (data_valid && data_ready) begin
                // Notify the Python BFM that the transfer completed
                write_ack();

                if (!data_valid_v) begin
                    data_valid <= 0;
                end
            end
        end
    end

    // Called from the Python BFM via the generated API below
    task write_req(reg[63:0] d);
    begin
        data_v = d;
        data_valid_v = 1;
    end
    endtask

    // Auto-generated code to implement the BFM API
    ${cocotb_bfm_api_impl}

endmodule

The BFM author must implement the tasks that will be called from the Python class. Task proxies that invoke Python methods are generated by the Cocotb automation and substituted into the template where the ${cocotb_bfm_api_impl} tag is specified.

Current Integrations
Currently, task-based BFM integrations are implemented for Verilog via the VPI interface and for SystemVerilog via the DPI interface. A VHDL integration isn't currently supported, but that's on the roadmap. One complication with VHDL is that there are actually several interfaces that may need to be supported depending on the simulator -- VHPI, VHPI via VPI, Modelsim FLI. Here I could use some input from the community on priorities, so I'd definitely like to hear from you if you're using Cocotb with VHDL...

Results
As I mentioned at the beginning of this post, performance is the primary reason for using task-based interaction between Python and an HDL environment. So, how much improvement can you expect? Well, that entirely depends on how frequently your tests interact with the HDL environment and, to a certain extent, on how long your tests are. If your testbench needs to interact with the design every clock cycle, then you're unlikely to see much benefit. If, however, your testbench spends quite a few cycles waiting for the design to respond, then you're likely to see pretty significant benefits.

I'll use my FWRISC (Featherweight RISC) RISC-V core as an example. In this environment, the bulk of the test is actually compiled code that executes on the processor. The Python testbench is primarily responsible for checking results and providing debug information when needed.

A diagram of the simulation-based testbench is shown above. The Tracer BFM is responsible for monitoring execution of the FWRISC core and sending events up to the high-level testbench as needed. These events include the following (a sketch of the Python side appears after the list):
  • Instruction executed
  • Register written
  • Memory written
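
Here's a sketch of what the Python side of the Tracer BFM might look like, using the same export mechanism shown earlier. The class, method names, and signatures here are illustrative assumptions, not the actual FWRISC code, and the @cocotb.bfm decorator and HDL template are omitted for brevity:

class TracerBFM():

    @cocotb.bfm_export(cocotb.bfm_uint32_t, cocotb.bfm_uint32_t)
    def instr_exec(self, pc, instr):
        # Called by the HDL BFM when an instruction retires
        pass

    @cocotb.bfm_export(cocotb.bfm_uint32_t, cocotb.bfm_uint32_t)
    def reg_write(self, raddr, rdata):
        # Called by the HDL BFM when the core writes a register
        pass

    @cocotb.bfm_export(cocotb.bfm_uint32_t, cocotb.bfm_uint32_t)
    def mem_write(self, maddr, mdata):
        # Called by the HDL BFM when the core writes memory
        pass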
I've created two Cocotb implementations of this BFM: one that interacts at the signal level, and one that interacts at the task level. To compare the performance, I'm running a Zephyr test with Icarus Verilog for 10ms of simulation time.

Let's start with as close to a direct comparison as possible. Both the signal-level and task-based BFM will capture the same information and propagate it to the Python testbench.
  • Signal-Level BFM: 85s (wallclock)
  • Task-Level BFM: 33s (wallclock)
Okay, so already we're looking pretty good. This performance increase comes simply from the fact that the task-based BFM doesn't need to call into the Python environment every clock cycle.

Another way we can benefit is to use a higher-performance simulator. Icarus Verilog is interpreted, and supports a full event-driven simulation environment. Verilator has a much more restricted set of features (synthesizable Verilog only, limited signal-level access, etc.), but is also much faster. Its signal-level access is currently too limited to allow a direct comparison between a task-based BFM and a signal-level BFM. So, how do we look here? I actually had to increase the simulation time to 100ms (10x longer) to get a meaningful reading.
  • Task-Level BFM: 18s (wallclock)
So, coupling a fast execution platform with an efficient integration mechanism definitely brings benefits!

Next Steps
So, where do we go from here? Well, please stay tuned for my next blog post to get more details on how to create task-based BFMs using these features. I also have an active pull request (#1217) to get this support merged into Cocotb directly. Until then, you can always access the code here.



Disclaimer
The views and opinions expressed above are solely those of the author and do not represent those of my employer or any other party.

Saturday, April 12, 2014

System Level Verification: What, Another Framework?


Frameworks have been enormously helpful in making testing and verification productive and reusable. In the RTL verification space, AVM, VMM, OVM, and UVM have all contributed to standardizing a pattern for RTL verification that encompasses encapsulation and reuse, automation for stimulus generation, and mechanisms for results checking. In the software space, frameworks such as JUnit and CPPUnit have, likewise, provided structure around standardizing how test results are checked and how test suites are composed and executed in an automated fashion.

As more focus is given to verification issues at the SoC and system level, it makes sense to ask: are the requirements for a verification framework in a system-level environment met by existing (primarily RTL-centric) verification frameworks, or is there something fundamentally different at system level? As it turns out, there are some rather unique aspects of system-level verification that make existing verification frameworks unsuitable for application in this space.

Perhaps the most visible difference in a system-level design is the presence of embedded processors that are simultaneously part of the design and part of the verification environment. The verification framework must enable automation for test generation, as well as facilitate managing the processors as design and verification resources in configurations ranging from 'bare metal', to OS and test, to an OS running application software and a test.

Another difference with a system-level environment is that verification must look both forwards and backwards in the verification process. The same test scenario must be able to run in an SoC-level simulation/emulation context that includes a SystemVerilog/UVM testbench, in a prototype context (FPGA or first silicon), and potentially on the end product. Now, that certainly doesn't mean that the same verification will be done at each step: the purpose and goals of verification in each context are quite different. However, being able to re-run the same scenario in two verification contexts provides some continuity between the contexts and avoids having to start from scratch when changing contexts. For example, consider an SoC-level environment in simulation. Much of the test stimulus is still hardware-centric, but early embedded-software scenarios are being executed. When moving to emulation, it is enormously valuable to be able to run the same scenario that was proven to run in simulation and not have to start developing a test from scratch. Continuity is even more helpful when moving from emulation (which still largely maintains the appearance of simulation) to an FPGA prototype environment that is radically different.

As mentioned in a previous post, system-level environments tend to look more like cooperating islands of test functionality rather than the monolithic testbench used for block and subsystem RTL verification. A system-level verification framework must enable reuse of verification infrastructure across these islands, as well as facilitating the cooperation of these islands in carrying out the broader verification task.

Just because the requirements are different for a system-level verification framework doesn't mean that the design of a system-level verification framework must start from first principles. Hardware- and software-centric test frameworks have been in development for over a decade (some would argue much, much longer), and there is much to be gained from the features that, over time, were found to be useful for verifying hardware and software.

I've recently started work on a lightweight C++-based verification framework that targets the requirements of system-level verification. The framework borrows features and patterns from existing verification frameworks, and adds features and patterns to address some of the unique requirements of driving embedded software tests and coordinating the multiple 'islands' of a system-level verification environment. The framework is designed to be encapsulated in environments as diverse as UVM and special-purpose verification hardware accelerators. The next few blog posts will explore some of the specifics of this developing framework.

~Everything is a system -- some are just smaller than others

Saturday, March 29, 2014

System-level verification: Islands vs Continents

As the saying goes, "There is nothing permanent except change". However, even with constant change, similarities with what came before abound and true and complete discontinuities are relatively rare. This is certainly true in the functional verification space as we begin looking beyond the methodologies that served us well for unit-level and subsystem-level verification and ask "what's next?". There are many similarities between the requirements for a system-level verification framework and requirements for frameworks targeting unit-level and SoC-level environments. In both cases, encapsulation, reuse, abstraction, and test automation are important. However, there are also fairly significant differences in requirements as well. The biggest differences involve testbench structure, time correlation expectations and requirements, and modeling languages.

In unit-level and SoC-level environments, testbenches tend to be monolithic. The very fact that it is common to refer to 'the testbench' highlights the all-encompassing nature of the testbench environment in unit- and subsystem-level verification. By contrast, system-level (and to a certain extent SoC-level) verification tends to be more distributed -- more like loosely-connected islands than the all-encompassing continent of a unit-level testbench environment.

In a SoC-level verification environment, the primary 'island' is the verification-centric embedded software running on the processor or processors that is effectively verifying the system from the inside out. This software may run autonomously relative to the testbench that is testing the design from the outside in, or it may be coordinated -- either loosely or quite tightly -- with the activity of the testbench surrounding the design.

In a system-level verification environment, the 'island' effect is even more pronounced. System-level verification is typically carried out in the lab with either an FPGA prototype or first silicon. In this case, the testbench will be divided into islands such as test equipment connected to the design interfaces, test code running on the embedded processors, and specialized test hardware built into the chip.

A key requirement in unit- and SoC-level testbench environments has historically been tight time synchronization and correlation of activity. Given the simulation- and emulation-centric nature of unit, subsystem, and SoC-level verification, this makes perfect sense: since the execution engine maintains a coherent view of events and time, the testbench environment can maximize modeling ease and the repeatability and predictability of results. However, this global view of time comes at the cost of execution speed. Simulation-based testbench environments are by-and-large single-threaded, and remain largely unable to take advantage of the recent explosion in the availability of multi-core machines to accelerate simulation speed.

By contrast, a system-level verification environment cannot afford to sacrifice the much higher execution speed delivered by an FPGA or first-silicon prototype in order to maintain lock-step synchronization across the entire environment. Even if the higher execution speed could be sacrificed, maintaining full time synchronization would artificially constrain the system and make the results of system-level verification impossible to trust.

Finally, verification frameworks designed for use at the unit to SoC level have typically been written in a verification-centric language such as Vera, 'e', or SystemVerilog. This makes sense, of course, since these languages provide features specific to RTL verification. However, the fact that these languages are typically tightly tied to simulation environments makes reusing verification IP and tests created with them in a system-level environment nearly impossible. A system-level verification framework is essentially constrained to use a 'lowest common denominator' language in order to ensure maximum reuse.

A verification framework that seeks to provide value for system-level verification must be designed from the ground up with these requirements in mind. Over the next few posts, we'll have a look at how these requirements are being addressed in a new system level verification framework currently being developed.

~Everything is a system -- some are just smaller than others

Tuesday, March 18, 2014

Verification Frameworks and System-Level Verification

It seems the last decade or so has been the decade of the verification language and the verification framework. From SystemC, Vera, and 'e' to SystemVerilog, and from VMM and AVM to OVM and UVM, a lot of focus has been placed on making design verification engineers productive in creating verification environments and tests. By and large, these verification languages and frameworks focused on block-level and subsystem-level verification -- areas where automating the application of test stimulus at the signal level was the name of the game. The world changes, however. Today, there is growing interest in verification at the SoC and system level. At this level, where embedded software and high-level test stimulus are important, frameworks that are tied to RTL-centric language features are a bit of an impedance mismatch -- despite how relevant and valuable they are for unit- and subsystem-level verification.

Looking forward, the question must be raised: given the importance of verification frameworks in enabling productive, modular, and reusable verification in RTL-centric environments, might a verification framework focused on system-level verification bring the same benefits? As you might guess, I believe in the value of having a verification framework that addresses the somewhat-unique (at least as compared to those of RTL-centric environments) requirements of system-level environments.

Now, just because there might be value in a different verification framework focused on the needs of system-level verification doesn't mean that we have to start from scratch in designing this framework. The experiences from the past decade in terms of general requirements for and useful attributes of a verification framework are invaluable in informing the core elements of a new system-level verification framework.

So, the past informs the future even as the environment and requirements change. Over the course of the next few blog posts, I'll outline more details on key attributes of a system-level verification framework.

Are you doing verification at the system level? If so, what verification framework are you using?