It’s the dead of summer: What better time to talk about thermal issues? This month we have a few articles covering the topic. There is one on 3-D thermal modeling and another from the iNEMI Roadmap that summarizes future requirements for global thermal management in electronic products. What you might not get from reading these articles is just how hot a topic this is becoming.
For those of you into gaming, or who have children who play, the latest from Microsoft regarding the seemingly doomed Xbox 360 may be firsthand news. For those who aren’t up to speed on the problems, allow me to summarize.
Most of the blogging on this topic has been relegated to the gaming sites, that is, until recently. With Microsoft’s announcement of a second warranty extension, now to three years from one (the original warranty was 90 days, extended to one year in 2006) and the writedown of over $1 billion to cover costs of repairs, hardware failures have become front-page news.
The failures have Xbox 360 customers seeing red – literally, as in three red lights. These lights indicate the dreaded Red Ring of Death – hardware malfunction. As of this writing Microsoft has not disclosed a specific cause for the failure, but has indicated the Xbox 360 hardware will be redesigned to better handle heat dissipation issues. For those of us with some PCB design, fabrication and assembly savvy, it’s pretty easy to put together the list of probable suspects.
Of course, suspect No. 1 is none other than the now-infamous Lead-Free Bandit. If it plays out that lead-free solder joints are a contributing factor in these failures, the RoHS Directive can add another $1.15 billion to the losses industry has incurred in the cost of compliance. Unfortunately, this is going to be tough to prove because it is unlikely that Microsoft will come back with a non-RoHS-compliant Xbox 360 version to test the theory.
It is easy, of course, to jump to the “lead-free is responsible” conclusion. We can agree that it may well be a contributing factor. But it is probably not the root cause. Microsoft is mum, but repaired units reportedly have a new GPU heatsink with a heatpipe to a secondary heatsink to remove heat from the PCB. As the DailyTech Web site (dailytech.com/article.aspx?newsid=7690) further points out, repaired units also have had epoxy underfill added to CPU and GPU packages. These actions certainly point to overheating, and the thermal cycling that results from it, as a potential root cause for the failures.
The big question is: Could this have been avoided? Would an improved thermal management approach, taken in the initial design phases of the product, have prevented failures – not to mention the subsequent financial hit? Good questions that will probably never get answered.
It should come as no surprise that a high-end gaming box would have a large thermal dissipation requirement. Gaming electronics are prime heat generators, coupling high clock speeds with dense circuitry all crammed into the smallest possible package at the lowest possible cost.
Dr. Robin Bornoff, on pg. 35, points out that something as simple as silver- or copper-plated filled vias sited under the BGA can act as mini heat directors, taking heat from the chip package and redirecting it to copper ground layers in the PCB, where heat is better dissipated. This is just one example of the tools that designers have to work with to prevent thermal disaster. Were they applied in this case?
Depending on the chip used, the package style and even the PCB design, more or less heat will be generated that requires dissipation. The iNEMI Roadmap explains that any amount of heat generated can be dissipated. The problem is: What will it cost? Heat dissipation strategies have a nonlinear cost of ownership. As Chuck Richardson notes (pg. 38), “At 65 watts, the cost (of heat dissipation) is only $0.18 per watt, but at 75 watts, it is more than $0.45.” And in the consumer electronics world, unit cost control is king.
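To put rough numbers on that nonlinearity, here is a back-of-the-envelope sketch using only the two figures quoted above. It assumes the per-watt rate applies to every watt dissipated, which is my reading of the quote and not something the Roadmap text spells out. The 75 W rate is quoted as "more than" $0.45, so the second result is a lower bound. The function and variable names are illustrative only.

```python
# Dollars per watt at each power level, as quoted from the iNEMI Roadmap article.
# The 75 W rate is "more than" $0.45, so treat it as a lower bound.
QUOTED_RATES = {65: 0.18, 75: 0.45}


def cooling_cost(watts, rate_per_watt):
    """Per-unit cooling cost, ASSUMING the rate applies to every watt dissipated."""
    return watts * rate_per_watt


# Total cooling cost at each quoted power level.
costs = {watts: cooling_cost(watts, rate) for watts, rate in QUOTED_RATES.items()}

# Compare how much the power grew with how much the cost grew.
power_ratio = 75 / 65
cost_ratio = costs[75] / costs[65]

for watts, cost in costs.items():
    print(f"{watts} W: ${cost:.2f} per unit")
print(f"Power rises {power_ratio:.2f}x; cost rises at least {cost_ratio:.2f}x")
```

Under that assumption, roughly 15% more heat nearly triples the cooling bill, from about $11.70 to at least $33.75 per unit. For a product that ships in the millions, that is the kind of gap that shapes design decisions.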
So, perhaps the Red Ring of Death is really a lesson in economics. The thermal dissipation solution with the best cost of ownership was selected over more effective solutions with higher price tags. But with field failure rates cited at 25 to 35%, depending on the source, the long-term economics of this decision come into question.
If you want to follow the redesign details, there is a teardown of the new Xbox 360 Elite featured at llamma.com/xbox360/news/images/xbox-360-elite/360_elite_mobo.jpg. I am sure some readers will be able to add volumes to the observations made by the Llamma folks. I look forward to hearing your conclusions from the before and after designs. (We’ll post observations on the Web site.) Good thermal management practices should always be on our minds, even after the heat from the Xbox has dissipated.