Artificial intelligence (AI) and machine learning (ML) serve as the fuel source for new semiconductor architectures, and as a result, the two also leave a new string of verification and development challenges in their wake. With advanced algorithms and neural network systems, test architectures will have to find a way to keep up.
For some insight on how verification may evolve, ECN picked the brain of an expert in the field.
Q: What changes need to be made to verification to enable AI/ML chip designs?
By Frank Schirrmeister, senior group director for product management, Cadence Design Systems
Given the uncertainty of first-time physical design at advanced technology nodes, functional verification of the designs needs to be watertight. To achieve that, users look for the best verification throughput, measured as the most cycles executed per day for the investment in tools and engineering hours. Throughput must be combined with smart bug-hunting techniques that make the most efficient use of those cycles, finding the most bugs per day.
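The two metrics mentioned above, cycles executed per day and bugs found per day, can be related with a back-of-the-envelope model. The numbers below are hypothetical placeholders for illustration, not figures from Cadence or the interview:

```python
# Illustrative throughput model. Engine speed, utilization, and bug-hit
# rate are hypothetical assumptions, not vendor data.

def cycles_per_day(engine_mhz: float, utilization: float) -> float:
    """Verification cycles executed per day on one engine."""
    seconds_per_day = 24 * 60 * 60
    return engine_mhz * 1e6 * utilization * seconds_per_day

def bugs_per_day(cycles: float, bugs_per_billion_cycles: float) -> float:
    """Bugs found per day, given a bug-hit rate per billion cycles.
    Smarter bug hunting raises the hit rate without adding cycles."""
    return cycles / 1e9 * bugs_per_billion_cycles

daily_cycles = cycles_per_day(engine_mhz=1.0, utilization=0.8)
print(f"{daily_cycles:.3e} cycles/day")
print(f"{bugs_per_day(daily_cycles, bugs_per_billion_cycles=0.5):.1f} bugs/day")
```

The point of the model: raw throughput (the first function) and bug-hunting efficiency (the hit rate in the second) multiply, so improving either one alone leaves value on the table.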
To support these diverse requirements, verification engines must scale to billion-gate designs. With design teams running through the loop of compile, allocate, execute, and debug thousands of times, each individual step must be optimized, as it has been, for instance, in processor-based emulation. Compilation of the design must be predictable and fast, which can be achieved using parallelization. Given that AI/ML designs are often fairly regular, allocation of designs onto the verification engines must support different design sizes, from IP in the million-gate range to subsystems and SoCs in the billion-gate range, and execution performance must likewise scale to billion-gate capacities. Finally, given the enormous amount of data generated during debug, debug flexibility must span from streaming techniques, which allow specific known signals to be observed, to the collection of all data over a certain window of time for cases in which the area to debug is not yet known. This poses unique challenges for data-transfer rates and for the trigger mechanisms used to collect data.
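The allocation step described above, matching designs from million-gate IP up to billion-gate SoCs to engines of different capacities, can be sketched as a simple best-fit policy. Everything here, including the engine sizes and the policy itself, is a hypothetical illustration, not how any particular emulation platform schedules jobs:

```python
# Minimal sketch of the "allocate" step in the compile/allocate/execute/debug
# loop. Engine capacities and the best-fit policy are assumptions for
# illustration only.

from dataclasses import dataclass

@dataclass
class Design:
    name: str
    gates: int  # design size in gates

@dataclass
class Engine:
    capacity_gates: int

def allocate(design: Design, engines: list[Engine]) -> Engine:
    """Pick the smallest engine that still fits the design, so large
    engines stay free for billion-gate jobs."""
    fitting = [e for e in engines if e.capacity_gates >= design.gates]
    if not fitting:
        raise ValueError(f"{design.name} exceeds available engine capacity")
    return min(fitting, key=lambda e: e.capacity_gates)

engines = [Engine(10_000_000), Engine(1_000_000_000), Engine(4_000_000_000)]
ip_block = Design("accel_ip", gates=5_000_000)      # million-gate IP
full_soc = Design("ai_soc", gates=2_000_000_000)    # billion-gate SoC

print(allocate(ip_block, engines).capacity_gates)   # small engine suffices
print(allocate(full_soc, engines).capacity_gates)   # needs the largest engine
```

The regularity of AI/ML designs noted above helps here: repeated processing tiles make gate counts, and therefore fit, easier to predict at allocation time.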
Beyond the engines themselves, from a systemic perspective, the analysis of AI/ML-specific IP integration becomes part of the verification process, and modeling of system environments, such as HBM memories, must be supported.