Verification: A Crisis of Confidence and How to Resolve It Part 1

Fabiana Muto

2 months ago


While human ingenuity has driven extraordinary advances in technology, from early architecture to modern semiconductor systems, the discipline of testing and verification has not evolved at the same pace, creating a critical gap in ensuring quality, safety, and security in an increasingly complex digital world.

Ashish Darbari

In my previous two blogs, I traced the ongoing evolution of formal verification. In this three-part series, I will examine some of the challenges related to verification in the semiconductor industry and explain why verification has become a crisis of confidence. I’ll explore some of the key reasons why, despite the astronomical growth of constrained random, emulation, and FPGA prototyping, we continuously grapple with poor-quality verification. This crisis of confidence has reached fever pitch: verification schedules routinely run late, bugs are often missed, silicon re-spins happen, or even worse, disgruntled customers just walk away from projects, leaving you hanging out to dry! When did this start? How did we get ourselves into such a mess? More importantly, how do we get ourselves out?

The Verification Meta Model

To better understand this situation, we need to appreciate that verification is never done in isolation: it is part of the bigger picture. Verification is one of the most expensive activities in any semiconductor design project. It is impacted by four important factors, all of which need to interplay with each other in a logical fashion: the process, the verification technology, the verification methodology, and the engineers who use these (the humans in the loop). These components represent the four pillars of what we call a verification meta model.

Just like models used in engineering, this meta model assumes the presence of some inputs and produces outputs, and the model’s behavior is an expected relationship between its inputs and outputs that this model is meant to preserve. For our meta model, to establish the expected relationship between inputs and outputs, the model uses a core—the process—and three primary inputs, which are verification technology, verification methodology, and a DV engineer.

To ensure that we’re all on the same page, let’s expand upon the idea of each of the four pillars of our verification meta model:

Process: A series of actions that are executed to fulfill an end goal. In this sense, you can think of a process as a description of concrete things one needs to execute in order to carry out a task. For our meta model, the task could be a complex set of activities involving not only verification but also other tasks needed to enable it, for example, training an engineer in a certain skill before they can become productive. A process in our meta model is a collection of heterogeneous tasks that need to be performed to obtain high-quality verification without exceeding project deadlines. “High quality” means finding as many bugs as possible early in the design cycle and ensuring that none leak through at the end of the project.

Verification Technology: An application of a specific verification technique to find bugs. In some cases, such as formal, the verification technology is used to build exhaustive proofs of correctness and/or compliance. It could also be a directed testing technique to establish that SoC bring up is being done successfully. Examples of well-known verification technologies include dynamic simulation, formal, emulation, or FPGA prototyping. Each of these technologies describes a useful and complex set of techniques that can be applied in practice, involving tools and methods to perform verification.

Verification Methodology: The overall ethos guiding all aspects of verification as it relates to, and interplays with, the design process. Whereas verification technology typically outlines the core principles of how a certain technology works in practice and provides tools (such as simulators, emulators, model checkers, or FPGA platforms), methodology is necessary to partner the right application of a verification technique with the appropriate time and people in the design process.

DV Engineer: A design verification engineer responsible for building the hardware designs correctly (i.e., per the outlined requirements) without introducing or leaving bugs in the design. This could be a designer who is bringing up their own design and should be careful not to introduce bugs into it, or it might be a conscientious verification engineer responsible for flushing out all the bugs in a design created by others.
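For readers who think in code, the four pillars can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the phase labels, and the `untrained_technologies` check are hypothetical and not part of any tool or of the meta model's formal definition. It shows the process acting as the core that ties together technology, methodology, and DV engineers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class Technology(Enum):
    """Verification technologies named in the text."""
    SIMULATION = "dynamic simulation"
    FORMAL = "formal"
    EMULATION = "emulation"
    FPGA_PROTOTYPING = "FPGA prototyping"


@dataclass
class Methodology:
    # Hypothetical representation: project phase -> technology to apply.
    technology_by_phase: Dict[str, Technology]


@dataclass
class DVEngineer:
    name: str
    skills: Set[Technology] = field(default_factory=set)


@dataclass
class Process:
    """The core of the meta model: it ties the three inputs together."""
    methodology: Methodology
    engineers: List[DVEngineer]

    def untrained_technologies(self) -> Set[Technology]:
        """Technologies the methodology calls for that nobody is trained in."""
        needed = set(self.methodology.technology_by_phase.values())
        covered: Set[Technology] = set()
        for engineer in self.engineers:
            covered |= engineer.skills
        return needed - covered


# Illustrative data only; phases and names are made up.
methodology = Methodology({
    "block level": Technology.FORMAL,
    "SoC bring-up": Technology.EMULATION,
})
team = [DVEngineer("engineer A", {Technology.SIMULATION, Technology.FORMAL})]
gaps = Process(methodology, team).untrained_technologies()
print(sorted(t.value for t in gaps))  # ['emulation']
```

In this toy example the methodology asks for emulation at SoC bring-up, but nobody on the team is trained in it, which is exactly the kind of mismatch a working process should surface early.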

Expected Behavior of the Verification Meta Model

A high-efficiency verification meta model that is working correctly can be expected to produce high-quality verification using a process that prescribes the right combination of verification technologies and methodologies to maximize ROI and minimize risks. The end goal, of course, is to apprehend as many bugs as possible early in the DV cycle, thereby accomplishing the shift-left verification paradigm, which economizes both time and resources. In other words, a good verification meta model does not leave bugs in silicon and certainly does not gift them to customers.

Seems straightforward enough, yes? Why, then, is it so hard to get a good verification meta model in place? Why is it the case that verification continues to be a bottleneck? Why does improper execution continue to lead to missed opportunities in the form of missed bugs, delayed schedules, frustrated management, unhappy customers, and stressed out DV engineers?

Getting to the Bottom of the Verification Breakdown

I see verification breakdown as the result of flaws in three main factors of a good process: planning, training/mentorship, and methodology.

Factor 1: Lack of Planning

One common response to the question, “Why is verification struggling?” is that there was a lack of planning. Planning itself is part of a well-oiled verification process; good planning falls out naturally from a good process. The thing to note here is that, whereas planning is essential to achieve high quality verification goals, it is not a pillar of the meta model in the way the process is; rather, a good process enables good planning, not the other way around.

Planning includes scoping out both the high-level and the low-level verification goals and describing a sequence of concrete actions required to be executed. The process, by contrast, pulls together various different plans in such a way that they work together and complement each other.

Factor 2: Lack of Sufficient Training and Mentorship

Ideally, organizations should have a plan not only for training DV engineers in required verification skills, but also for how a particular project would be designed and verified by the trained team of engineers. If we neglect to define and implement a plan for training new DV engineers on requisite verification skills, or if we fail to train experienced engineers on new verification techniques, then management will not be able to obtain good verification results. The lack of a verification training plan effectively means that bright, talented engineers must work on verification without the appropriate skills; as a result, they unwittingly cause massive delays to verification targets, the process yields poorly verified designs, or both. Actually, in my experience, most organizations do plan for training, but there doesn’t seem to be a process for identifying what qualifies as good and relevant training. In many cases the training itself can be good, but I’ve come across situations where expensive purchased training was not only bad, but also got things wrong.

Once engineers have been trained, where do they go from there? The path from just having acquired fresh skills to delivering production-quality work is often long and doesn’t have clear milestones defined. How do organizations ensure that trained engineers are able to apply their skills properly on the actual projects? The answer is good mentorship. When engineering teams fail to spot the value of a good mentor who can guide and support the freshly trained engineers, the whole team suffers, and so does the project. I suspect this happens because organizations don’t value the role of good mentors, who are often seen as a cost to the company rather than as adding value. While it is true that there is a cost to having mentors whose task is not necessarily to tape out the next generation chip, but instead to enable a team to do this, the investment in a good mentor pays off several-fold, as the project is much more likely to succeed. This creates a win-win for the engineering team, the mentors, and the customers.

Factor 3: Lack of Cogent Verification Methodology

Even when project teams devote time and resources to comprehensive training and mentorship, their results are limited by the verification technologies in use. The unfortunate state of affairs in many companies is that there remains a gap in establishing sensible methodologies around these verification technologies. The lack of good methodology can have a debilitating effect on projects, even in spite of a reasonable expenditure on providing good training in basic verification skills in UVM, formal, or emulation.

In a nutshell, the crisis emerges due to a lack of a clear plan describing which verification technology or combination of technologies should be applied, as well as how, by whom, when, and with what intent. Often, it is the verification methodology that is missing from the best verification plans, which go awry as a result.
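The five questions above (which technology, how, by whom, when, and with what intent) can be treated as a checklist that every plan entry must answer. The following is a minimal sketch under that assumption; the `PlanEntry` record and the `unanswered` helper are hypothetical names used for illustration, not an established format.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PlanEntry:
    """One line of a verification plan; field names are illustrative."""
    technology: Optional[str] = None
    how: Optional[str] = None
    by_whom: Optional[str] = None
    when: Optional[str] = None
    intent: Optional[str] = None


def unanswered(entry: PlanEntry) -> List[str]:
    """Return the questions this plan entry leaves open."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


# Illustrative entry: a technology and an owner, but no methodology around them.
entry = PlanEntry(technology="formal", by_whom="block owner")
print(unanswered(entry))  # ['how', 'when', 'intent']
```

An entry that names a technology and an owner but leaves the how, when, and intent open is a plan without a methodology, which is the gap described above.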

Verification, in a practical sense, is nothing more than mitigating risk. Why, then, is a plan for risk assessment often absent? An entire project team could be obsessed with meeting the coverage goals that metric-driven verification encourages, but the end goal of obtaining 100% coverage cannot be achieved without a verification meta model that optimally combines all the necessary components. Besides, 100% coverage alone is not the best sign-off criterion.

In the second installment of this series, I’ll give an overview of how to optimize the verification meta model and outline a design verification flow that can be applied to any team’s projects.

Is your team experiencing a verification breakdown as a result of any of the factors that I’ve outlined here? Are you facing challenges that I have not mentioned? Let me know in the comments or on Twitter (@AshishDarbari).

This article was first published on the Tech Design Forum. It is reproduced here in its original form for informational and archival purposes, with appropriate acknowledgment to the original publisher.
