Jul 26 2006
Semiconductor Research Corporation (SRC) has announced an unprecedented research effort to develop chips that refuse to fail. Joint research by SRC, the National Science Foundation (NSF) and the University of Michigan will focus on analyzing the future landscape of hard silicon failures and their impact on non-trivial designs, such as microprocessors and their switch components. Success in this collaboration among government, business and academia may provide the key to the future reliability of smaller semiconductor designs.
"In this project, we'll go much further than before by designing chips that can diagnose when components wear out and heal themselves on the fly," said Sankar Basu, program director at NSF. "The bolstering of scientific underpinnings of computing is extremely important to the NSF. This issue of ensuring reliability is critical to the future of high-performance computing for even the most aggressive of applications."
Current industry efforts to make chips more reliable, through redundancy and other traditional means, involve both higher costs and the sacrifice of the speed that consumers have come to expect in nearly all electronics, from servers to cell phones to transportation. In comparison, the collaborative research announced today is projected to yield defect-tolerant designs that increase product lifetime through components that take longer to fail. Without innovative approaches to address in-field silicon failures, product lifetime will become dangerously short.
"The aim is for chips that won't fail. That will be a first for the industry. The ramifications of increasing the reliability of the microprocessor in computing applications like planes, trains and automobiles is something we get very excited about," said William Joyner, SRC's director of Computer-Aided Design and Test for the Global Research Collaboration (GRC), a unit of the SRC that is responsible for narrowing the options for carrying CMOS to its ultimate limit. He is an IBM assignee to the consortium. "To continue the performance pace that billions of people have come to expect, we need more than technology advances. Sustained performance improvements require a critical coupling between technology and design."
Reliability of complex systems has become increasingly difficult to model because more factors must be considered, from defects in circuits and wires on silicon to failures in the software applications that the system runs. The research will entail the development of both straightforward, intuitive silicon-failure models and a fast, accurate reliability modeling infrastructure that together allow designers to better understand the reliable system design space and to evaluate the robustness of potential solutions.
"The solution is not to build flawless chips, but architectures that can survive defects," said Dr. Todd Austin, associate professor of electrical engineering at the University of Michigan and a former Intel design engineer. Dr. Valeria Bertacco, co-investigator and an assistant professor at the University of Michigan, adds, "We've not given up on making semiconductors always correct. Rather, we're facing up to the looming problem in the chip industry -- smaller switches and wires don't always work."
Benefits of the research will serve chipmakers and end-users for communications, computing, aeronautics and aerospace applications, medical devices, automotive and consumer electronics, and a wide range of other applications that are dependent on silicon's correct performance.
Today's announcement is the result of rigorous competition over many months under the SRC-GRC's Computer-Aided Design and Test Thrust. SRC and NSF selected the University of Michigan's team to receive three years of funding. SRC facilitates semiconductor research among its community of 23 companies and partners and 100 universities worldwide.
http://www.src.org