Axiomising RISC-V processors through formal verification

Ashish Darbari, 2019

Verification Challenges

As chip design is becoming democratised with the advent of open-source processor architectures such as RISC-V; more organisations are daring to get into processor design – ranging from embedded microcontrollers to high-end desktop and server-grade CPUs, AI and machine learning designs. Whereas designing a new microprocessor has become economically feasible due to open-source architectures; the test and verification continue to be the major hurdle with some estimating it to be around 70% of the overall cost of chip development. This cost doesn’t factor the cost of missed bugs and re-spins when the chip is found to have functional defects (bugs) that were not caught before fabrication. One must only look at the fallout of the security bugs – Spectre and Meltdown that affected virtually every processor on Earth regardless of who the vendor was and some of these are big names such as Arm®, Intel®, and AMD®.

How do we then expect future chip design to be secure against these malicious bugs that render the chip vulnerable to hacking? How do we know that the chip will continue to work as intended in the aerospace and defence sector and will not misbehave due to intentional and unintentional bugs not caught during test and verification?

The main challenge is that conventional verification techniques grounded in the simulation are not adequate to make guarantees needed to build secure and safe systems. Even faster and orders of magnitude more expensive verification methods such as emulation cannot catch all the bugs that can provide the much-needed assurance for security and safety. Formal verification is the only verification technique that can provide “proofs” of bug absence and also provide provable evidence of bugs in the design, and specifications.

Axiomising correctness

We developed a new formal proof kit using the industry standard, non-proprietary SystemVerilog assertions (SVA) for establishing ISA compliance based on the published RISC-V ISA and used this to determine if a given processor implementation conforms to the ISA semantics. Our solution is completely formal tool vendor agnostic, requires no modification of the design, and only requires the end user to be familiar with Verilog/VHDL. Moreover, our methodology is reusable for other RISC-V  processors. Our “axiomising” methodology is built using the open standard of SystemVerilog assertions (SVA) and is enhanced for performance with our abstractions and problem reduction tool kit.  We have deployed our methodology in practice to ensure that we can establish that all functional/safety/security/power requirements of a processor are turned into provably valid axioms (properties) of the design implementation.

Methodology

The general strategy for checking relies on observing the path of all instructions from the point of issue to the point of completion using observation models where non-determinism allows for other instruction inter-leavings to intersperse between the issue and completion of the specific instruction being checked. This ensures that we can catch all kinds of bugs that are control and concurrency dominated and can cause an incorrect result to occur. It is important to emphasise that checking has to be done end-to-end for every instruction with non-determinism as in practice most bugs are missed due to very directed nature of tests where they specifically explore a directed path from issue to completion without necessarily allowing for other instructions to interfere. For most instructions, we usually write one check built around observing cause/effect relationship. The check for LOADS and STORES are built with a full interface to the memory whereby we check that all variants of LOADS and STORES are checked – BYTE, WORD, HALF-WORD, aligned and misaligned. This allowed us to examine all possible dependencies between different flavours of LOADS and STORES, especially when you have a sequence of these directed to the same address.

In some cases, during the development, we also caught bugs introduced by us in the test development phase. Most of these were due to errors in our understanding of the RISC-V ISA. Thankfully with feedback from designers, we were able to align this to what the ISA mandated.

Performance

All our proof checks converge – meaning they either produce a Pass or a Fail outcome. When the outcome is a Pass, it means a given check is valid for the implementation under all possible combinations of inputs and states. In case of a failure, it indicates there is a mismatch between the design model and our checker model, which is then debugged to root-cause the issue. We were able to exhaustively prove conformance of the design implementation against our proof kit checks for RV32IC subset of RISC-V ISA for a family of RISC-V processors from the PULP platform group using any of the available off-the-shelf formal verification tools from Cadence® Design Systems, Synopsys® Inc, Mentor® – a Siemens Business and OneSpin Solutions®. Some tools were able to produce results within a few hours, and a few others took about 72 hours, but the result is the same with all the tools – what proves in one is proven in another, what fails in one remains failed in another. Within an hour, we can get proofs to converge on all the checks other than LOAD/STORES, with over 75% of the non-LOAD/STORE checks done in 7 minutes of wall clock time.

Bugs and Proofs

When we applied our methodology to the well-known 32-bit low-power processors zeroriscy and microriscy [13] available in the public domain that has been previously taped out in 40NM ASICs and FPGA, we found 66 bugs as well as exhaustively proved a range of end-to-end functional properties that establish the correctness of the processor. A lot of these bugs are due to a buggy debug unit which is planned for redesign from what we understand. We also found bugs related to malfunctioning interfaces, caught unintended logic interfering with the normal functioning of LOAD instructions, and some power issues related to redundant clocking that was not related to the faulty debug unit. In some cases, we analysed the design for livelocks and were able to establish that it didn’t have any. We caught bugs in the design where a LOAD instruction was implementing a non-standard version of LOAD, causing the standard ISA specified the behaviour to fail. We observed that standard interpretation of the STORE instruction was altered for specific micro-arch implementations without a specification saying so.

We recently analysed another core ibex [14] under development by lowRISC® for ISA compliance. Using our proof kit, so far, we have found six new bugs related to six branch instructions. It took us an hour to set it up, find missing design files, and within a couple of hours, we had all the results. On a closer look, they seem to be all related to the debug unit, which is still under development. Upon masking the debug logic in the design by introducing an assumption in our test environment, we can prove that these bugs disappear and checks that were previously failing now hold exhaustively. All ISA properties were exhaustively proven in this case as well.

Coverage and Completeness

We used a range of coverage techniques harnessing mechanisms in the tools that we used and complemented it with our techniques to ensure we built a complete solution. Being able to use different coverage solutions from different vendors offers the benefit of finding any discrepancies that you may miss if you were using only one specific tool. We introduced hand mutations (bugs) in the design to check if our properties were sensitive to them. This way, we established that our checks were indeed doing the right thing. It is a very effective, efficient and economical method in terms of run time performance to find problems with testbench. We exploited this technique heavily. We then reviewed whether we have mapped all the instructions to checks in our test environment, so we can determine if we are complete with respect to the ISA. Structural and functional coverage metrics indicated if there were any blind spots in our verification models with over-constraints.

Related Work

In the context of processor verification, the use of formal verification has been around for at least three decades. The earliest applications made use of theorem provers, and the emphasis has largely been on establishing functional correctness of processor models designed in the language of the underlying proof tool. Ken McMillan devised advanced problem reduction techniques and showed how out-of-order processor implementations could be verified with formal when the processor design was coded in the formal tool SMV [4]. However, widespread applications of formal verification at the commercial level have largely been limited with notable exceptions being the pioneering work done by Boyer Moore to verify the floating point unit of AMD processor using ACL2 theorem prover [3], Intel using their proprietary tool Forte [5], IBM using its proprietary tool Rulebase Sixth Sense [6], and more recently Arm [7] who reported developing a tool to verify ISA compliance of their implementations. Alglave et al.’s work [16] on relaxed memory models and our work [15] on using Event-B for formal modelling, testing and verification of HSA memory models are other important pieces of work in the context of formal methods and processor verification.

In the context of RISC processors, S. Tahar’s work on formal verification of the first generation of RISC processors [2] is important. He devised a framework of correctness and formalised it inside the HOL 4 theorem prover. For RISC-V, a lot of effort has kicked-off in applying formal verification. On the one hand, formal tools such as Coq and Bluespec are used to model the ISA of RISC-V [8] with the intent of finding consistency issues; and on the other hand property checking based solutions have been used for finding micro-architectural bugs as well as establish conformance to the ISA [9, 10]. Whereas the work done by Chilipala in [8] is indeed valuable, it is not clear how many users not familiar with Coq and BlueSpec can make use of it. A lot of hardware design is done using standard VHDL/Verilog, and it is not clear how the work done in [8] can be of direct use in verifying such designs without a substantial investment in enabling people in writing processor designs in BlueSpec, and at least having a basic understanding of Coq formalisms.

We believe that the work done by Hummenberger et al. in [9] and the OneSpin Solutions [10] appears to be directly useful  however both of these solutions requires the end user to be locked into the respective proprietary tools. Although Hummenberger et al. [9], do have an open source version of their tool in the public domain, this requires a somewhat intrusive setup requiring the instrumentation of the RISC-V design with testbench interface signals and also in some cases requires modifications to suit the tool limitations as evidenced by some of their users [11].

As regards to the solution from OneSpin Solutions, it is developed around a OneSpin proprietary assertion language (called operational assertions), and the vendor claims that by using this they can guarantee a gap-free formal verification solution. However, to exploit this gap-free aspect, the user has to use the formal tools (DV-Verify® and DV-Certify®) from OneSpin.  There is no support for operational assertions inside any other formal tool. OneSpin Solutions does mention the usage of SVA and that there is a translator for operational assertions into SVA; however it has been made clear through their publications [12] that to obtain gap-free solution one has to use operational assertions not standard SVA to model requirements.

Conclusion

We aimed to carry out an end-to-end formal verification with the limitations (a) that our work was carried out away from designers with limited access (b) we had no prior knowledge of design microarchitecture and (c) we had no comprehensive, detailed specification describing interfaces and internal state-machines.

In many ways, our work represents the reality of modern-day design verification where busy designers do not often document a detailed micro-architectural specification, and the first lot of verification is often carried out locally by the designers themselves using simulation or the knowledge about the specification is passed through word-of-mouth to the verification engineers who verify the design independently.

To the best of our knowledge, this is the first formal verification solution that is independent of a specific formal verification tool vendor, uses the open standard of SVA that all tools support and requires minimal setup guaranteeing greater than 99% proof convergence.

You can take our proof kit today and use the formal verification tool of your choice and start finding bugs in your RISC-V designs and build proofs of bug absence. Contact us to request a demo.

References

[1]     A Proof of Correctness of the VIPER Microprocessor, VLSI Specification, Verification, and Synthesis, A. Cohn, 1988.

[2]     A Practical Methodology for the Formal Verification of RISC Processors, S. Tahar et al., Formal Methods in System Design, 1998.

[3]     A Mechanically Checked Proof of the AMD5K86TM Floating-Point Division Program, J. Moore et al., IEEE Transaction on Computers,                 1998.

[4]     Verification of an Implementation of Tomasulo’s Algorithm by Compositional Model Checking, K. McMillan, Proceedings of CAV, 1998.

[5]     Practical Formal Verification in Microprocessor Design, R. Jones et al., IEEE Design and Test of Computers, 2001.

[6]     Integrating FV into Main-Stream Verification – The IBM Experience, J. Baumgartner, 2006.

[7]     End-to-End Verification of ARM Processors with ISA-Formal, A. Reid et al., Proceedings of the CAV, 2016.

[8]     Strong Formal Verification for RISC-V, A Chilipala, RISC-V Workshop, 2017.

[9]     Formal Verification of RISC-V Processor Implementations, Humenberger et al., RISC-V Summit, 2018.

[10]  Complete Formal Verification of RISC-V processor IP for Trojan-Free Trusted ICs, GOMAC, 2019.

[11]   Synthesising Ibex with Yosys, Github Note., https://github.com/lowRISC/ibex/issues/60

[12]  Complete Formal Verification of TriCore2 and other processors, Bormann et al., DVCon 2007.

[13]  Slow and steady wins the race? A Comparison of ultra-low power RISC-V cores for Internet-of-Things applications. P D. Schivaone et               al. International Symposium on Power and Timing Modelling, Optimization and Simulation (PATMOS), 2017.

[14]  lowRISC/ibex. https://github.com/lowRISC/ibex

[15]  Formal Modelling, Testing and Verification of HSA memory models using Event B, A. Darbari et al., ArXiV,  2016.

[16]  The Semantics of Power and ARM Multiprocessor Machine Code, Alglave et al., DAMP 2009.