Saturday, December 22, 2018

FWRISC: Turbo-charged Unit Tests with Formal Verification

I'll admit it. I've been a long-time Formal Verification skeptic. To be clear, I've always understood in theory the power of Formal to perform exhaustive checking, and I understood how Formal could be used to check design-internal properties. It's just that I couldn't seem to get my head around how to apply formal techniques to real verification. Most of the designs I've worked with in the past are highly sequential, with tests spanning milliseconds of simulation. I just couldn't see how I would replace any of my simulation tests with Formal for these designs.

As I was designing Featherweight RISC and its unit test suite, I had a couple of realizations -- questions, really:
  • Had I been looking at Formal incorrectly all along? Would it be more easily applied as a designer tool vs a tool for verifiers? As the designer of Featherweight RISC, I could already see properties that must always hold within the design that, as a verifier, I would not be aware of.
  • Could Formal be used as a turbo-charged unit test suite? I was certainly aware of holes in the directed-test test suite. Constrained-random stimulation might help to plug them, but could Formal be used instead?
I had a deadline looming while designing Featherweight RISC and writing the unit tests, so I resolved to come back to formal verification when I was under less of a time crunch to see if I could break my mental block around Formal.

Open Source Formal?
Before I had a chance to take a deeper look into Formal myself, I started to see other people in the open-source silicon community mentioning Formal. It seemed there were two reasons: Yosys and SymbiYosys, and RISC-V Formal -- all creations of a certain Clifford Wolf. 
As you might be aware, Formal analysis is based on a synthesized view of a design. Yosys is an open source synthesis engine, and SymbiYosys is a layer on top to perform formal analysis of the synthesized view of the design. RISC-V Formal is a formal app built specifically for testing RISC-V core implementations. As I understand it, the RISC-V Formal App should work for any RISC-V implementation, provided you're willing to create a harness to interface the core to the App.

I could have simply picked up the RISC-V Formal library and applied it to Featherweight RISC. However, my primary goal was to really get my head around formal, so I decided to start from the ground up. I did elect to work with SymbiYosys as my Formal engine since it was readily available. 

Designing a Formal Unit Test Suite
My goal was to create a Formal version of my unit test suite -- a sizable collection of focused tests instead of a few expansive tests. My initial thinking was not to find new bugs in my design, but to create a better test suite to catch regressions as I overhauled various portions of the design. After a little initial exploration, I settled on a tentative test structure:

  • One test file per unit test to specify the instruction under test
  • One checker per category of test (eg arithmetic, jump, branch, load/store) to check results
Luckily, I found that I could use the same tracer interface that I had used for my simulation unit tests to check results in my Formal unit tests. This interface (shown below) provides information about the instruction being executed, register reads and writes, and memory accesses. The tracer interface was designed to withstand changes to the implementation architecture of Featherweight RISC, and I expect to find out pretty soon whether that is the case or not.

Remember the simulation testbench diagram (shown below)? 

For comparison, here is my Formal testbench:

Most of the checking logic, with the exception of a few design-centric checks, are contained in the checker. The test itself is composed of an initialization statement for the instruction the core is executing. A test literally looks like this:

 So, what's happening here? Well, the jal macro causes the formal testbench to set the instruction data read by the FWRISC core to a jal instruction. The last two parameters to the macro are the offset and destination register, respectively. The $anyconst directive tells the Formal tool that it can set the offset and destination fields of the instruction to any value (there's that Formal comprehensiveness). That's it for test specification, which leaves most of the heavy lifting to the checker.

Result Checking
Results checking is where I had the most unanswered questions in applying Formal. I found the checking simple properties that must hold across the lifetime of the design fairly easy to understand. These assertions are typically placed within the design. For example, we want to ensure that the destination register write-enable will only be asserted when the destination register is not zero (RISC-V, like many other RISC cores, hard-wires R0 to zero). In the Featherweight RISC implementation, it is an error if the design attempts to write to R0.

But, how do we test the multi-cycle behavior of our design? By writing a synthesizable checker to track the state of the design and apply assertions at the appropriate time. This type of checker will contain logic whose sole purpose is verification, and must be synthesizable. However, that doesn't mean that it needs to be complicated.

Checking Jump Instructions
So, how might we check jump instructions? Well, we need to capture some information after execution of the first instruction:

  • Register values written by execution of the first instruction
  • The address of the first instruction
  • The opcode and fields of interest from the first instruction
  • Expected target of the first instruction

 Here is the logic that captures the register values:

Pretty simple, right? Now, we capture information about the first instruction to execute:

Notice that at the end of the first instruction, we move to 'state==1'. This is where we will check whether or not the expected address was taken. Here's the checking code:

Again, not so hard. We just check that the PC of the instruction that just executed is the same as the previously-computed jump target (pc_target). If the jump target is not word-aligned, we check that an exception is taken. 

Results, Surprising and Otherwise
I'll be honest, when I decided to create my Formal unit test suite, I really didn't expect to find bugs. I simply expected it to help in quickly identifying regressions as I made changes to the design. But, Formal lived up to its exhaustive-checking reputation and found a couple of corner case bugs that neither my test suite, the RISC-V compliance test suite, nor running the Zephyr RTOS encountered.

The first bug I found was in the immediate-operand logical instructions. I had tested ANDI/ORI/XORI instructions with small literals, and they worked just fine. My Formal testbench showed that I had incorrectly used a sign-extended literal. In other words, ORI rd, rs1, 0x800 had been implemented as ORI rd, rs1, 0xFFFFF800.  The best part? The waveform trace looked like this:

Just a single instruction, and enough information about the value of registers to understand what had gone wrong.

I also found an issue with support for shift-zero. Admittedly, both of these are corner cases. But, also potentially-exploitable corner cases from a security perspective. And issues that would be extremely difficult to track down in the context of a full application.

Lessons Learned

Start Small
Especially when getting started with Formal, I strongly recommend starting with a small design -- preferably something you wrote yourself. Formal requires a different thought process, and I found it helpful be able to learn while focusing on a small design that I fully understood. I'll certainly be interested to see if my thought process about Formal verification scales to larger designs and ones I didn't write myself.

You Do Need a Testbench
One of the mental blocks I had with Formal was my understanding that Formal didn't require a testbench. While I could absolutely see how this would be true for design-property checks, I couldn't see how more-involved checks could be done without some sort of test fixture. On this project, I started from the beginning with the assumption that I would have a Formal testbench, and found that this significantly simplified my thought process. I also found that having multiple testbenches was helpful in keeping my environment more modular, and the code more focused and specific.

Make Mistakes
No, really. One of the most-challenging parts about Formal, from my perspective, is that there's very little to look at when all assertions pass. At least with simulation, there's a waveform and a simulation transcript to use as a sanity check that something useful is actually happening. With Formal, there's a very real possibility that all assertions are passing because nothing is actually happening. Now, I'll certainly be the first to admit that this might be due to the tool I was using, or due to my inexperience using the tool. But, I found that deliberately inserting errors to force assertions to fire was a great strategy to verify that my checking logic was correct and would trap the cases that I expected.

Formal Checks Can be Time-Consuming
Due the comprehensive nature of formal, the checks can take a significant amount of time. This means that my simulation-based unit-test suite is still very useful as a first line of defense against design regressions. My simulation-based test suite takes less than a minute to run on my development server, while the similar formal-based test suite takes several minutes.
It's important to be aware that Formal tools can be configured to use different solvers for processing the design, and it seems that different designs respond better to different solvers. At least on Featherweight RISC with SymbiYosys, I found that Boolector is significantly faster than other options. So, be sure to check the options you have with respect to solvers and which one works best for your design.

Formal is Powerful
Finally, I learned that formal verification is a very powerful technique for comprehensively verifying designs. I found bugs in the Featherweight RISC design that I would never have found without significant constrained-random simulation, and, realistically, wouldn't have found until some application code stumbled over them. And, structuring my Formal test suite as a series of unit tests made identification and debug of regressions in the design significantly easier.
I'd encourage anyone designing or verifying designs to have a look (or a second look) at Formal. The results I've seen in my brief look have convinced me that Formal deserves a place in my verification toolbox!


The views and opinions expressed above are solely those of the author and do not represent those of my employer or any other party.

No comments:

Post a Comment