Seminars & Colloquia
Keith Bowman
Intel
"Managing the Impact of Parameter Variations on Logic Design "
Monday October 05, 2009 04:00 PM
Location: 3211, EB2 NCSU Centennial Campus
This talk is part of the System Research Seminar series
Managing the impact of device and circuit parameter variations on performance and power is one of the primary challenges in microprocessor design. Parameter variations may be categorized by temporal scale (static and dynamic) and by spatial scale (die-to-die and within-die). For static variations, a statistically based analytical model of the maximum clock frequency (FMAX) of a microprocessor is presented to elucidate the adverse effects of parameter variations on the FMAX distribution. Validated with measured data from 0.25 µm and 0.13 µm microprocessors, the model reveals that within-die parameter variations directly impact the FMAX mean, while die-to-die parameter variations impact the FMAX variance. Adaptive supply voltage (Vcc) and body-bias compensation circuits, as well as time-borrowing multi-cycle interconnect circuits, are presented as examples of static variation-tolerant logic design.
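The mean-versus-variance result above can be illustrated with a toy Monte Carlo experiment. This is only a sketch and not the analytical model presented in the talk: it assumes normally distributed delay offsets, independent critical paths, and arbitrary values for the path count and the sigmas, and the function name `simulate_fmax` is hypothetical. Each simulated die has one shared die-to-die offset, and its FMAX is set by the slowest of its critical paths.

```python
import random
import statistics


def simulate_fmax(n_dies, n_paths, sigma_d2d, sigma_wid, seed=0):
    """Return normalized FMAX values, one per simulated die.

    Assumed toy model (not from the talk): each path delay is
    1.0 + die-to-die offset + within-die offset, and a die's FMAX
    is the reciprocal of its slowest path delay.
    """
    rng = random.Random(seed)
    fmax = []
    for _ in range(n_dies):
        d2d = rng.gauss(0.0, sigma_d2d)  # one offset shared by every path on the die
        worst = max(
            1.0 + d2d + rng.gauss(0.0, sigma_wid)  # independent offset per path
            for _ in range(n_paths)
        )
        fmax.append(1.0 / worst)
    return fmax


# Arbitrary illustrative settings: 2000 dies, 100 critical paths per die.
wid_only = simulate_fmax(2000, 100, sigma_d2d=0.0, sigma_wid=0.05)
d2d_only = simulate_fmax(2000, 100, sigma_d2d=0.05, sigma_wid=0.0)

for name, data in (("within-die only", wid_only), ("die-to-die only", d2d_only)):
    print(name, "mean:", statistics.mean(data), "stdev:", statistics.stdev(data))
```

Under these assumptions, the within-die case lowers the mean FMAX because the slowest of many paths sets the clock, while the die-to-die case leaves the mean close to nominal and mainly widens the distribution.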
Dynamic variations further degrade the FMAX of a conventional microprocessor by requiring a clock frequency (Fclk) guardband to ensure correct functionality in the presence of worst-case dynamic variations over the microprocessor lifetime. Consequently, these inflexible designs cannot exploit opportunities for higher performance by increasing Fclk or lower power by reducing Vcc during favorable operating conditions. Since most systems operate at nominal conditions most of the time and worst-case scenarios rarely occur, guardbanding for these infrequent dynamic variations severely limits the performance and energy efficiency of conventional designs. A resilient design is a system with error-detection and error-recovery capabilities that maintain correct overall system functionality in the presence of errors. Resilient circuits enable the microprocessor to operate at an Fclk determined by nominal operating conditions. When dynamic variations induce a timing error, the error is detected and corrected to maintain proper logic functionality, thus effectively eliminating the Fclk guardband for dynamic variations. A 65nm resilient circuit test-chip with timing-error detection and recovery circuits is described. Silicon measurements indicate that resilient circuits enable a 25-32% throughput gain at equal Vcc compared to conventional circuits, resulting in a 17-21% throughput gain at iso-power or a 31-37% power reduction at iso-throughput. Future research opportunities to further enhance performance and energy efficiency through resilient designs are discussed.
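The trade-off between a fixed guardband and error recovery can be sketched with a simple throughput model. This is an assumption-laden illustration rather than the method used on the test chip: the function names, the one-instruction-per-cycle baseline, the fixed recovery penalty per error, and all numeric values are hypothetical and are not taken from the silicon measurements.

```python
def conventional_throughput(f_nominal, guardband):
    """Throughput when Fclk is lowered by a fixed guardband fraction."""
    return f_nominal * (1.0 - guardband)


def resilient_throughput(f_nominal, error_rate, recovery_cycles):
    """Throughput at nominal Fclk, paying recovery cycles per timing error.

    error_rate is the assumed number of timing errors per instruction;
    recovery_cycles is the assumed penalty per error.
    """
    return f_nominal / (1.0 + error_rate * recovery_cycles)


def throughput_gain(guardband, error_rate, recovery_cycles):
    """Fractional gain of the resilient design over the conventional one."""
    conv = conventional_throughput(1.0, guardband)
    resil = resilient_throughput(1.0, error_rate, recovery_cycles)
    return resil / conv - 1.0


# Arbitrary example: 10% guardband, 10-cycle recovery, three error rates.
for rate in (0.0, 0.001, 0.05):
    print("error rate", rate, "gain", throughput_gain(0.10, rate, 10))
```

In this sketch the resilient design wins while errors are rare and loses once recovery overhead outweighs the removed guardband, which is why operating near nominal conditions matters.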
Keith A. Bowman received the B.S. degree in electrical engineering from North Carolina State University, Raleigh, NC, in 1994 and the M.S. and Ph.D. degrees in electrical engineering from the Georgia Institute of Technology, Atlanta, GA, in 1995 and 2001, respectively. He is currently a Staff Research Scientist in the Circuit Research Lab (CRL) at Intel Corporation in Hillsboro, OR. From 2001 to 2004, he worked as a Senior Computer-Aided Design (CAD) Engineer in the Technology-CAD Division at Intel in Hillsboro, OR, where he developed and supported statistically based models, methodologies, and software tools to predict microprocessor performance and power variability. Since joining CRL in 2004, his research has focused on developing circuit design solutions to mitigate the impact of parameter variations on microprocessor performance and power.
Host: Eric Rotenberg/Frank Mueller, ECE/CSC