When “Silicon Proven” Is Not Good Enough

Highlight

marketing@axiomise.com

+44 1442 345046

From Meltdown to Spectre, hidden vulnerabilities shook the foundations of modern computing—revealing that security isn’t just a software problem, but a critical hardware challenge we can no longer afford to ignore.

When “Silicon Proven” Is Not Good Enough

Ashish Darbari

Who would have thought that the New Year would start off on such a rocky note for … well, pretty much everyone on the planet who uses a computer? I must admit, I didn’t pick up on the signs that the trouble started brewing way back in June. Now, having done some research into what has been going on, it appears to have been kept deliberately under wraps, so that the hackers out there couldn’t find a way into it.

Yes, you guessed it: I am talking about the Meltdown!

As it happens, this has been caused by two spectacular vulnerabilities affecting all processors, and it hardly surprises me that one of these has been given such an appropriate moniker. Meltdown and Spectre are certainly two things that are keeping Intel, ARM, AMD, Apple, Microsoft, Google, Amazon and Facebook awake at night. Did I miss anyone? Oh! Yes! The millions (or possibly billions) of users of PC desktops, laptops, tablets, mobile phones who all are affected. I’m tossing and turning a bit, myself.

I’ve been thinking about the situation not just from a verification angle, but also from a validation perspective. Looking at the processor vendors, such as Intel, ARM and AMD, I find it difficult to believe that they are short on smart engineers or that they didn’t try hard enough to spot this problem. After all, the security attacks themselves can be side channel attacks, which have been around for a while. In this instance, side channel attacks can affect the CPUs predominantly due to the well-known speculative out-of-order execution optimization built into virtually all modern-day CPUs that run our tablets, phones, laptops, desktops, and servers. I doubt that people are lacking in intelligence and simply missed this. Assuming that that is the case, how could this have eluded so many for so long?

The reason, in my view, is that security is not seen as a hardware issue. Cybersecurity is seen mostly as an application software and an operating system problem; hence, all the investment—products, security companies, and research—around the world is in that space. After all, we all know how to install and update an anti-virus/anti-malware program.

Security is a software issue? Wrong. Very wrong, as it has become painfully obvious. Suddenly, even my mother-in-law is asking: “What’s this problem about passwords leaking? I read this in the Times of India.” It’s on everyone’s radar.

Security is a system issue, so security needs to be designed and architected up front. We cannot continue to use legacy optimizations from the 1990s and remain wilfully ignorant of the fact that they can be compromised for security. Speculative execution is needed for performance, and has been around for a long time; virtually all CPUs are designed that way. Side channel attacks are not new. The knowledge that there are hackers out there is not new either.

What is new is that the mindset for verification and validation needs to change. When Ken McMillan published his classic paper in 1999 on formally verifying Tomasulo’s out-of-order algorithm, security was not such a big deal. Now, with the mass impact and ever-growing volume of people using computers and connected devices, security is the biggest of big deals.

Verification alone is not enough. Thorough validation also is needed. Conventional techniques based on dynamic simulation cannot scale, and mindless emulation won’t either. What we need to employ is an array of more advanced verification techniques, such as formal, to help with verification and validation.

I’d like to back up my recommendations with a personal story to encourage you to believe that formal verification provides a game-changing way to address security.

It was not too long ago that I supported a young engineer in the verification of a design. It was far less complex than any of the CPUs we are talking about here, but the design had all the characteristics you’d see in an Intel or an ARM CPU. I stumbled upon something interesting. This design had a customisable CPU that was meant to support any firmware/software interface and had lots of interesting interfaces. The architects were confident that they had a decent architecture in place and that the microarchitecture (design) implementing it had been tested with basic C simulations. We picked this project because there were not enough resources, and I always looked for new ways of applying formal.

One of the first things I always check in a verification plan is whether the scope of the runtime features of the design is captured in the verification plan. As this design could be programmed through a software interface, and had regions of memory that had privileged access, as well as unprivileged access, it was clear that we needed to check that any unprivileged access to the privileged region was not allowed. Although the architects were aware of the concerns and had taken precautions in designing the instruction set architecture and the micro-architecture (of the design itself), there were plenty of instances when we found vulnerabilities through our formal testbench. In some cases, the vulnerabilities were plainly oversights on the part of the designer or were design bugs where the implementation didn’t conform to the known expectation (which, in some cases, was fully specified, but in other cases, it had only been specified partially).

However, the most important aspect of our work was that we exposed some combinations of instruction execution that were illegal from the security view point. These could have ended up providing illegal access to regions of memory, or, in the presence of an attack, create a livelock in the design.

These validation issues got the architects and designers really hooked on formal and on what we were doing. Not only did we end up doing the standard functional verification of that design, but also, we specifically targeted a range of security issues that we suspected could be triggered through software and firmware. We provided proofs in a formal tool that they were there, and that if the design was protected through exception recovery hardware, we could protect the hardware against explicit breaches. Having the ability to run a formal tool and model checks that reflect the actual suspected attacks was the most fun part of this work. We didn’t have to load the firmware or software to carry out this analysis, The formal tools were creating all of the interesting scenarios for us; we were only asking the relevant questions in form of formal properties. As my former doctorate supervisor, Tom Melham, said: “Ashish, it’s all in the target goal that you’re trying to prove. If you phrase the goal well, most of the work is done.” 

My conclusion at the time, and even now, is that we often fail to ask the right question at the right time.

Time and time again, I heard some version of, “We don’t need to worry about verification! Our design is an iteration of something we used before—so it is silicon-proven!” Well, as it turns out, “silicon-proven” is not enough. Modern-day formal verification technology can address serious security challenges facing our industry. All you need is a methodology to apply it and the expertise to ensure that you ask the right questions to reach your goal. I strongly encourage all of you to consider actively deploying formal for security in your design verification. As an industry, we can—we must—do better.

This article was first published on the Tech Design Forum. It is reproduced here in its original form for informational and archival purposes, with appropriate acknowledgment to the original publisher.