Guide to Reliability of Electrical/Electronic Equipment and Products--Thermal Management


1. INTRODUCTION

The objective of thermal management is the removal of unwanted heat from sources such as semiconductors without negatively affecting the performance or reliability of adjacent components. Thermal management addresses heat removal by considering the ambient temperature (and temperature gradients) throughout the entire product from an overall system perspective.

Thermal removal solutions cover a wide range of options. The simplest form of heat removal is the movement of ambient air over the device. In any enclosure, adding strategically placed vents will enhance air movement. The cooling of a critical device can be improved by placing it in the coolest location in the enclosure. When these simple thermal solutions cannot remove enough heat to maintain component reliability, the system designer must look to more sophisticated measures, such as heat sinks, fans, heat pipes, or even liquid-cooled plates.

Thermal modeling using computational fluid dynamics (CFD) helps demonstrate the effectiveness of a particular solution.

The thermal management process can be separated into three major phases:

1. Heat transfer within a semiconductor or module (such as a DC/DC converter) package

2. Heat transfer from the package to a heat dissipater

3. Heat transfer from the heat dissipater to the ambient environment

The first phase is generally beyond the control of the system level thermal engineer because the package type defines the internal heat transfer processes. In the second and third phases, the system engineer's goal is to design a reliable, efficient thermal connection from the package surface to the initial heat spreader and on to the ambient environment. Achieving this goal requires a thorough understanding of heat transfer fundamentals as well as knowledge of available interface and heat sinking materials and how their key physical properties affect the heat transfer process.

2. THERMAL ANALYSIS MODELS AND TOOLS

Thermal analysis consists of calculating or measuring the temperatures at each component within a circuit or an assembly. Thermal analysis, which is closely related to stress derating analysis, concentrates on assuring both freedom from hot spots within equipment and that the internal temperature is as uniform and low as feasible and is well within the capabilities of the individual components.

A widely used rule of thumb derived from the Arrhenius equation is that chemical reaction rates double with each 10°C increase in temperature. Conversely, reducing the temperature by 10°C cuts the chemical reaction rate in half. Many part failures are attributable to chemical activity or chemical contamination and degradation. Therefore, each 10°C reduction of the temperature within a unit can effectively double the reliability of the unit.
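The 10°C rule can be checked against the Arrhenius equation itself. The Python sketch below is illustrative only: the function names and the 0.64 eV activation energy are assumptions chosen for the example, not values taken from this guide.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def acceleration_factor(ea_ev, t_low_c, t_high_c):
    """Arrhenius ratio of reaction rates at t_high_c versus t_low_c."""
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_low_k - 1.0 / t_high_k))


def activation_energy_for_doubling(t_low_c, step_c=10.0):
    """Activation energy (eV) at which a step_c rise exactly doubles the rate."""
    t_low_k = t_low_c + 273.15
    t_high_k = t_low_k + step_c
    return math.log(2.0) * BOLTZMANN_EV_PER_K / (1.0 / t_low_k - 1.0 / t_high_k)


# Assumed example: a mechanism with Ea = 0.64 eV heated from 50°C to 60°C.
print(round(acceleration_factor(0.64, 50.0, 60.0), 2))
# Activation energy implied by the 10°C doubling rule near 50°C.
print(round(activation_energy_for_doubling(50.0), 2))
```

Mechanisms with lower activation energies accelerate by less than a factor of two per 10°C, so the doubling rule is best treated as an approximation rather than a law.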

As printed wire assemblies (PWAs) continue to increase in complexity, the risk of field failures due to unforeseen thermal problems also increases. By performing thermal analysis early in the design process, it becomes possible to ensure optimal placement of components to protect against thermal problems. This in turn minimizes or eliminates costly rework later.

The sweeping changes taking place in the electronics and software industries are resulting in dramatic improvements in the functionality, speed, and compatibility of the computer-aided design (CAD) tools that are available. As a result, thermal modeling software is gaining widespread use today and is now part of the standard design process at most major electronics manufacturers around the world.

Modern electronic systems incorporate multitudes of components and subassemblies, including circuit boards, fans, vents, baffles, porous plates [such as electromagnetic interference (EMI) shields], filters, cabling, power supplies, disk drives, and more. To help designers cope with this complexity, the most advanced thermal modeling solutions provide a comprehensive range of automated software tools and user-friendly menus that provide easier data handling, faster calculations, and more accurate results.

Thermal modeling has migrated from system to PWA, component, and environment levels. Modeling was first applied at the system level in applications such as computer cabinets, telecommunication racks, and laptop computers.

However, as the need for thermal modeling has become more pressing, these same techniques have migrated downward to board- and component-level analysis, which are now commonplace in the electronics design industry. The most advanced thermal modeling solutions allow designers to predict air movement and temperature distribution in the environment around electronic equipment as well as inside it and determine true airflow around the component. A three-dimensional package-level model can include the effects of air gaps, die size, lead frame, heat spreader, encapsulant material, heat sinks, and conduction to the PWA via leads or solder balls. In this way, a single calculation can consider the combined effects of conduction, convection, and radiation.

An example of the complexity of the airflow patterns from the system fan in a desktop PC is illustrated in Figure 1 using a particle tracking technique (see color insert).


FIGURE 1 Airflow through a PC chassis. (See color insert.)

To be most effective, thermal analysis tools must also be compatible and work with a wide range of mechanical computer aided design (MCAD) and electronic design automation (EDA) software. This is easier said than done given the wide range of formats and the different levels of data stored in each system.

Once modeling is complete and prototype products have been built, thermal imaging is used for analyzing potential problems in circuit board designs (see Fig. 24 of Section 3). It can measure complex temperature distributions to give a visual representation of the heat patterns across an application. As a result, designers may find subtle problems at the preproduction stage and avoid drastic changes during manufacturing. The payback is usually experienced with the first thermal problem uncovered and corrected during this stage. Several examples are presented to demonstrate the benefits of various modeling tools.

Low-Profile Personal Computer Chassis Design Evaluation

The thermal design of a PC chassis usually involves compromises and can be viewed as a process of evaluating a finite set of options to find the best balance of thermal performance and other project objectives (e.g., noise, cost, time to market, etc.). The designer need not be concerned about predicting temperatures to a fraction of a degree, but rather can evaluate trends to assess which option provides the lowest temperature or best balance of design objectives (2).

Use of thermal models can give a good indication of the temperature profile within the chassis due to changes in venting, fan placement, etc., even if the absolute temperature prediction for a component isn't highly accurate. Once a design approach is identified with thermal modeling, empirical measurements must follow to validate the predicted trends.

Some common sources of discrepancy between model predictions and measurements that must be resolved include:

1. Modeled and experimental component power do not match.

2. Fan performance is not well known.

3. Measurement error. Common errors include

a. Incorrectly placed or poorly attached thermocouple

b. Radiation when measuring air temperature. Nearby hot components can cause the thermocouple junction to heat above the ambient air temperature giving a false high reading.


FIGURE 2 Model of personal computer chassis. (From Ref. 2.)


FIGURE 3 Particle traces show that most fan airflow bypasses the microprocessor. (From Ref. 2.)


FIGURE 4 Vent modifications to Figure 3 provide strong airflow over microprocessor. (From Ref. 2.)


FIGURE 5 Color-coded component surface temperature analysis pinpointing hot spots within a power supply.


FIGURE 6 CFD plot of Pentium II processor and heat sink. (See color insert.)


FIGURE 7 Temperatures in a PC motherboard. (See color insert.)

Figure 2 shows a three-dimensional model of a personal computer desktop chassis. Vents were located in the front bezel, along the left side of the chassis cover, and in the floor of the chassis just in front of the motherboard. Air entering the chassis through the vents flows across the chassis to the power supply (PS), where it is exhausted by the PS fan. A second fan, located at the front left corner of the chassis, was intended to provide additional cooling to the processor and add-in cards. This arrangement is quite common in the PC industry.

Several models were run initially, including one with the front fan on and one with the fan deactivated to determine its effectiveness in cooling the microprocessor. Particle traces are used to show the heat flow across various components and within the chassis enclosure. The particle traces in Figure 3 clearly show that the flow from the fan is deflected by the flow entering from the side vent, diverting the fan flow away from the processor site and toward the PS. This effect virtually negates any benefit from the second fan as far as the CPU is concerned. The flow at the processor site comes up mainly from the side vents.

Therefore, any increase or decrease in the flow through the side vent will have a more significant impact on the processor temperature than a change in flow from the fan.

The ineffectiveness of the system fan was compounded by the fan grill and mount design. Because of the high impedance of the grill and the gap between the fan mount and chassis wall, only 20% of the flow through the fan was fresh outside air. The remaining 80% of the air flow was preheated chassis air recirculated around the fan mount.
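The effect of recirculation on the air delivered by the fan can be estimated with a simple flow-weighted mixing calculation. This is a minimal sketch that assumes equal specific heat for both air streams. The 25°C and 45°C temperatures are hypothetical, and only the 20%/80% split comes from the analysis above.

```python
def mixed_inlet_temperature(t_fresh_c, t_recirculated_c, fresh_fraction):
    """Flow-weighted average temperature of fresh and recirculated air."""
    if not 0.0 <= fresh_fraction <= 1.0:
        raise ValueError("fresh_fraction must be between 0 and 1")
    return fresh_fraction * t_fresh_c + (1.0 - fresh_fraction) * t_recirculated_c


# Hypothetical temperatures: 25°C outside air, 45°C preheated chassis air.
# The 20% fresh-air fraction is the figure reported for the fan in the text.
print(mixed_inlet_temperature(25.0, 45.0, 0.20))
```

With these assumed temperatures, the fan delivers air at about 41°C, far closer to the chassis air temperature than to the outside air temperature.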

The analysis helped explain why the second fan reduced the flow through the side vent of the chassis. It also showed that the processor temperature actually declined when the second fan was shut off and demonstrated that the second fan could be eliminated without a thermal performance penalty, resulting in a cost saving. The analysis pointed the way to improving the thermal performance of the chassis. Modifying the chassis vents and eliminating the second fan provided the greatest performance improvement. Particle traces of the modified vent configuration demonstrate improved flow at the processor site (Fig. 4).

Use of Computational Fluid Dynamics to Predict Component and Chassis Hot Spots

Computational fluid dynamics using commercially available thermal modeling software is helping many companies shorten their design cycle times and eliminate costly and time-consuming redesign steps by identifying hot spots within an electronic product. An example of such a plot is shown in Figure 5. Figures 6 and 7 show a CFD plot of an Intel Pentium II processor with a heat sink and a plot of temperature across a PC motherboard for typical operating conditions, respectively (see color insert).

(coming soon) TABLE 1 Power Dissipation Trends 1995-2012

3. IMPACT OF HIGH-PERFORMANCE INTEGRATED CIRCUITS

Increased integrated circuit (IC) functional densities, number of I/Os, and operating speed and higher system packing densities result in increased power dissipation, higher ambient operating temperature, and thus higher heat generation.

Today's ICs are generating double the power they were only several years ago.

Table 1 (coming soon) shows the projected power dissipation trend based on the 1997 Semiconductor Industry Association (SIA) National Technology Roadmap for Semiconductors (NTRS). The roadmap projected microprocessors with 100-W ratings (something completely unimaginable in the 1980s and early 1990s), and they are here now. This means that today's ICs are operating in an accelerated temperature mode previously reserved for burn-in.

The current attention being given to thermal management at the chip level stems mostly from the quest for higher microprocessor performance gained by shorter clock cycles (that is, higher frequencies) and denser circuits. In CMOS circuits, dissipated power increases proportionally with frequency and capacitance as well as with the square of the signal voltage. Capacitance, in turn, climbs with the number of integrated transistors and interconnections. To move heat out, cost conscious designers are combining innovative engineering with conventional means such as heat sinks, fans, heat pipes, and interface materials. Lower operating voltages are also going a long way toward keeping heat manageable. In addition to pushing voltages down, microprocessor designers are designing in various power-reduction techniques that include limited usage and sleep/quiet modes.
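The proportionality described above is commonly written as P = a C V^2 f. The sketch below is illustrative: the switching activity factor a and all numeric values are assumptions, not figures from this guide.

```python
def dynamic_power_w(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """CMOS switching power: proportional to C and f, and to V squared."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def voltage_scaling_ratio(v_old, v_new):
    """Fraction of the original power left after a supply voltage change,
    with capacitance and frequency held constant."""
    return (v_new / v_old) ** 2


# Hypothetical device: 1 nF of switched capacitance at 2.0 V and 100 MHz.
print(dynamic_power_w(1e-9, 2.0, 100e6))
# Hypothetical supply reduction from 3.3 V to 1.8 V.
print(round(voltage_scaling_ratio(3.3, 1.8), 3))
```

Because voltage enters as a square, the assumed drop from 3.3 V to 1.8 V cuts switching power to roughly 30% of its original value. This is why lower operating voltages do so much to keep heat manageable.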

Let's look at several examples of what has been happening in terms of power dissipation as the industry has progressed from one generation of microprocessors to the next. The 486 microprocessor-based personal computers drew 12 to 15 W, primarily concentrated in the processor itself. Typically, the power supply contained an embedded fan that cooled the system while a passive heat sink cooled the processor. However, as PCs moved into the first Pentium generation, which dissipated about 25 W, the traditional passive cooling methods for the processor became insufficient. Instead of needing only a heat sink, the processor now produced enough heat to also require a stream of cool air from a fan.

The Intel Pentium II microprocessor dissipates about 40 W; AMD's K6 microprocessor dissipates about 20 W. The high-performance Compaq Alpha series of microprocessors are both high-speed and high-power dissipation devices, as shown in Table 2. The latest Intel Itanium microprocessor (in 0.18-µm technology) dissipates 130 W (3).

Attention to increased heat is not limited to microprocessors. Power (and thus heat) dissipation problems for other components are looming larger than in the past. Product designers must look beyond the processor to memory, system chip sets, graphics controllers, and anything else that has a high clock rate, as well as conventional power components, capacitors, and disk drives in channeling heat away. Even small ICs in plastic packages, once adequately cooled by normal air movement, are getting denser, drawing more power, and getting hotter.

(coming soon) TABLE 2 Alpha Microprocessor Thermal Characteristics

In the operation of an IC, electrons flow among tens, if not hundreds, of millions of transistors, consuming power. This produces heat that radiates outward through the chip package from the surface of the die, increasing the IC's junction temperature.

Exceeding the specified maximum junction temperature causes the chip to make errors in its calculations or perhaps fail completely. When IC designers shrink a chip and reduce its operating voltage, they also reduce its power dissipation and thus heat. However, shrinking a chip also means that heat-generating transistors are packed closer together. Thus, while the chip itself might not be as hot, the "power density" (the amount of heat concentrated on particular spots of the chip's surface) may begin to climb.

Although the air immediately surrounding a chip will initially cool the chip's surface, that air eventually warms and rises to the top of the personal computer chassis, where it encounters other warm air. If not ventilated, this volume of air becomes warmer and warmer, offering no avenue of escape for the heat generated by the chips. Efficient cooling methods are required. If not properly removed or managed, this heat will shorten the IC's overall life, even destroying the IC.
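A first-order energy balance indicates how much ventilation is needed to carry this heat away: the volumetric airflow equals the dissipated power divided by the product of air density, specific heat, and the allowed air temperature rise. The sketch below assumes approximate sea-level air properties and a hypothetical 100-W load. None of these numbers come from this guide.

```python
AIR_DENSITY_KG_M3 = 1.2      # assumed: air near 20°C at sea level
AIR_CP_J_PER_KG_K = 1005.0   # assumed: specific heat of air
M3S_TO_CFM = 2118.88         # cubic meters per second to cubic feet per minute


def required_airflow_m3s(power_w, allowed_rise_c):
    """Airflow needed to remove power_w with the given air temperature rise."""
    return power_w / (AIR_DENSITY_KG_M3 * AIR_CP_J_PER_KG_K * allowed_rise_c)


# Hypothetical case: 100 W dissipated, 10°C rise from inlet to exhaust.
flow = required_airflow_m3s(100.0, 10.0)
print(round(flow * M3S_TO_CFM, 1), "CFM")
```

The estimate ignores recirculation, leakage, and fan back-pressure, so a real chassis generally needs more margin than this figure suggests.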

Heat buildup from ICs generally begins as the junction temperature rises until the heat finds a path to flow. Eventually, thermal equilibrium is reached at the steady-state operating temperature, which affects the device's mean time between failure (MTBF). As stated previously, a frequently used rule of thumb is that for each 10°C rise in junction temperature, there is a doubling in the failure rate for that component. Thus, lowering temperatures 10 to 15°C can approximately double the lifespan of the device. Accordingly, designers must consider a device's operating temperature as well as its safety margin.

=======

TABLE 3 Methods of Reducing Internal Package theta_jc

Increase the thermal conductivity of the plastic, ceramic, or metal package material, lead frame, and heat spreader.

Improve design of lead frame/heat spreader (area, thermal conductivity, separation from heat source).

Use heat spreaders.

Implement efficient wire bonding methods.

Use cavity-up or cavity-down package.

Ensure that no voids exist between the die and the package. (Voids act as stress concentrators, increasing Tj.)

=======

TABLE 4 Methods of Reducing Package theta_ca

Use high-conductivity thermal grease.

Use external cooling (forced air or liquid cooling).

Use high-performance heat sinks that do not significantly increase volumetric size, so that the size benefit of VLSI circuits can be utilized.

Use materials of matching coefficients of thermal expansion.

Use package-to-board (substrate) heat sinks.

========

(coming soon) TABLE 5 Electrical and Thermal Characteristics of Some Plastic Packages

Junction temperature is determined through the following relationship:

T_j = T_a + P_D (\theta_{jc} + \theta_{ca}) = T_a + P_D \theta_{ja}    (1)

Junction temperature (Tj) is a function of ambient temperature (Ta), the power dissipation (PD), the thermal resistance between the case and junction (theta_jc), and the thermal resistance between the case and ambient (theta_ca) [junction to ambient thermal resistance (theta_ja) = theta_jc + theta_ca]. Here the following is assumed: uniform power and temperature distribution at the chip and one-dimensional heat flow.
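Equation (1) is straightforward to apply numerically. The following sketch uses hypothetical values for ambient temperature, power, and thermal resistances, and the function names are illustrative. It computes the junction temperature and, conversely, the largest theta_ja that keeps a device under a given junction limit.

```python
def junction_temperature_c(t_ambient_c, power_w, theta_jc, theta_ca):
    """Equation (1): Tj = Ta + PD * (theta_jc + theta_ca), resistances in °C/W."""
    return t_ambient_c + power_w * (theta_jc + theta_ca)


def max_theta_ja(t_junction_max_c, t_ambient_c, power_w):
    """Largest junction-to-ambient resistance that keeps Tj at or below the limit."""
    return (t_junction_max_c - t_ambient_c) / power_w


# Hypothetical device: 10 W, 40°C ambient, theta_jc = 1.5 and theta_ca = 4.0 °C/W.
print(junction_temperature_c(40.0, 10.0, 1.5, 4.0))
# theta_ja budget for the same device under an assumed 90°C junction limit.
print(max_theta_ja(90.0, 40.0, 10.0))
```

In this assumed case the device runs at 95°C with a total resistance of 5.5 °C/W, so meeting a 90°C limit would require bringing theta_ja down to 5.0 °C/W or lowering the ambient temperature.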

Maximum junction temperature limits have been decreasing due to higher IC operating speed and thus increased power dissipation. The maximum allowable device operating temperature (Tj) has decreased from a range of 125-150°C to 90°C for reduced instruction set computing (RISC) microprocessors and to less than 70°C at the core for high-speed complex instruction set computing (CISC) microprocessors, for example. This has a significant impact on device and product reliability. As a result, the IC package thermal resistance (theta_ja) must decrease.

Since theta_ja consists of two components, theta_jc and theta_ca, both must be reduced to reduce theta_ja. Methods of reducing theta_jc and theta_ca are listed in Tables 3 and 4, respectively.

TABLE 5 (coming soon) lists the thermal resistance of various IC package types as a function of the lead frame material: Alloy 42 and copper. Notice that copper lead frames offer lower thermal resistance.

Many components and packaging techniques rely on the effects of conduction cooling for a major portion of their thermal management. Components will experience the thermal resistances of the PCB in addition to those of the semiconductor packages. Given a fixed ambient temperature, designers can lower junction temperature by reducing either power consumption or the overall thermal resistance. Board layout can clearly influence the temperatures of components and thus a product's reliability. Also, the thermal impact of all components on each other and the PWA layout needs to be considered. How do you separate the heat generating components on a PWA? If heat is confined to one part of the PWA, what is the overall impact on the PWA's performance? Should heat generating components be distributed across the PWA to even the temperature out?

(coming soon) TABLE 6 Thermal Conductivities of Various Materials

(coming soon) TABLE 7 Examples of Junction-to-Case Thermal Resistance

4. MATERIAL CONSIDERATIONS

For designers, a broad selection of materials is available to manage and control heat in a wide range of applications. The success of any particular design with regard to thermal management materials will depend on the thoroughness of the research, the quality of the material, and its proper installation.

Since surface mount circuit boards and components encounter heat stress when the board goes through the soldering process and again after it is operating in the end product, designers must consider the construction and layout of the board. Today's printed wiring assemblies are complex structures consisting of myriad materials and components with a wide range of thermal expansion coefficients.

Tables 6 and 7 (coming soon) list the thermal conductivities of various materials and specifically of the materials in an integrated circuit connected to a printed circuit board, respectively.

The demand for increased and improved thermal management tools has been instrumental in developing and supporting emerging technologies. Some of these include:

High thermal conductivity materials in critical high-volume commercial electronics components

Heat pipe solutions, broadly applied for high-volume cost-sensitive commercial electronics applications

Non-extruded high-performance thermal solutions incorporating a variety of design materials and manufacturing methods

Composite materials for high-performance commercial electronics materials

Combination EMI shielding/thermal management materials and components (see Section 3)

Adoption of high-performance interface materials to reduce overall thermal resistance

Adoption of phase-change thermal interface materials as the primary solution for all semiconductor device interfaces

Direct bonding to high-conductivity thermal substrates

At the first interface level, adhesives and phase-change materials are offering performance advantages over traditional greases and compressible pads.

Phase-change thermal interface materials have been an important thermal management link. They do not dissipate heat themselves; instead, they provide an efficient thermally conductive path for the heat to flow from the heat-generating source to a heat-dissipating device. These materials, when placed between the surface of the heat-generating component and a heat spreader, provide a path of minimum thermal resistance between these two surfaces. The ultimate goal of an interface material is to produce a minimum temperature differential between the component surface and the heat spreader surface.

(coming soon) TABLE 8 Characteristics Affected by Elevated Temperature Operation

5. EFFECT OF HEAT ON COMPONENTS, PRINTED CIRCUIT BOARDS, AND SOLDER

Integrated Circuits

Thermal analysis is important. It starts at the system level and works its way down to the individual IC die. System- and PWA-level analyses and measurements define local ambient package conditions. Heat must be reduced (managed) to lower the junction temperatures to acceptable values to relieve internal material stress conditions and manage device reliability. The reliability of an IC is directly affected by its operating junction temperature. The higher the temperature, the lower the reliability due to degradation occurring at interfaces. The objective is to ensure that the junction temperature of the IC is operating below its maximum allowable value. Another objective is to determine if the IC needs a heat sink or some external means of cooling.

Thermal issues are closely linked to electrical performance. For metal-oxide semiconductor (MOS) ICs, switching speed, threshold voltage, and noise immunity degrade as temperature increases. For bipolar ICs, leakage current increases and saturation voltage and latch-up current decrease as temperature increases. When exposed to elevated temperature, ICs exhibit parameter shifts, as listed in Table 8. One of the most fundamental limitations to using semiconductors at elevated temperatures is the increasing density of intrinsic, or thermally generated, carriers. This effect reduces the barrier height between n and p regions, causing an 8% per degree K increase in reverse-bias junction-leakage current.

The effects of elevated temperature on field effect devices include a 3- to 6-mV per degree K decrease in the threshold voltage (leading to decreased noise immunity) and increased drain-to-source leakage current (leading to an increased incidence of latch-up). Carrier mobility is also degraded at elevated temperatures by a factor of T^-1.5, which limits the maximum ambient-use temperature of junction-isolated silicon devices to 200°C.
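The temperature coefficients quoted in the preceding two paragraphs can be turned into quick estimates. This sketch makes several assumptions: the 8% per kelvin leakage increase compounds, the threshold shift is linear, and mobility scales as absolute temperature to the power of -1.5 relative to a 300 K reference. The function names and example temperatures are illustrative.

```python
import math


def leakage_multiplier(delta_t_k, growth_per_k=0.08):
    """Leakage growth for a temperature rise, assuming 8%/K compounds."""
    return (1.0 + growth_per_k) ** delta_t_k


def leakage_doubling_interval_k(growth_per_k=0.08):
    """Temperature rise (K) over which compounding leakage doubles."""
    return math.log(2.0) / math.log(1.0 + growth_per_k)


def threshold_shift_mv(delta_t_k, coeff_mv_per_k):
    """Linear threshold-voltage change; negative values mean a decrease."""
    return -coeff_mv_per_k * delta_t_k


def mobility_ratio(t_k, t_ref_k=300.0):
    """Carrier mobility relative to t_ref_k, assuming a T^-1.5 dependence."""
    return (t_k / t_ref_k) ** -1.5


print(round(leakage_doubling_interval_k(), 1), "K per leakage doubling")
# Hypothetical 50 K rise, evaluated at both ends of the 3 to 6 mV/K range.
print(threshold_shift_mv(50.0, 3.0), threshold_shift_mv(50.0, 6.0), "mV")
print(round(mobility_ratio(400.0), 2), "of the 300 K mobility at 400 K")
```

Under the compounding assumption, leakage doubles roughly every 9 K, which is in line with the 10°C rule of thumb used elsewhere in this guide.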

Devices must also be designed to address reliability concerns. Elevated temperatures accelerate the time-dependent dielectric breakdown of the gate oxide in a MOS field-effect transistor (FET), and can cause failure if the device is operated for several hours at 200°C and 8 MV/cm field strength, for example. However, these concerns can be eliminated by choosing an oxide thickness that decreases the electric field sufficiently. Similar tradeoffs must be addressed for electromigration. By designing for high temperatures (which includes increasing the cross-section of the metal lines and using lower current densities), electromigration concerns can be avoided in aluminum metallization at temperatures up to 250°C.

Integrated Circuit Wires and Wire Bonds

The stability of packaging materials and processes at high temperatures is an important concern as well. For example, elevated temperatures can result in excessive amounts of brittle intermetallic phases between gold wires and aluminum bond pads. At the same time, the asymmetric interdiffusion of gold and aluminum at elevated temperatures can cause Kirkendall voiding (or purple plague). These voids initiate cracks, which can quickly propagate through the brittle intermetallics, causing the wire bond to fracture. Though not usually observed until 125°C, this phenomenon is greatly accelerated at temperatures above 175°C, particularly in the presence of breakdown products from the flame retardants found in plastic molding compounds.

Voiding can be slowed by using other wire bond systems with slower interdiffusion rates. Copper-gold systems only show void-related failures at temperatures greater than 250°C, while bond strength is retained in aluminum wires bonded to nickel coatings at temperatures up to 300°C. Monometallic systems (such as Al-Al and Au-Au), which are immune to intermetallic formation and galvanic corrosion concerns, have the highest use temperatures, limited only by annealing of the wires.

Plastic Integrated Circuit Encapsulants

Plastic-encapsulated ICs are made exclusively with thermoset epoxies. As such, their ultimate use temperature is governed by the temperature at which the molding compound de-polymerizes (between 190 and 230°C for most epoxies). There are concerns at temperatures below this as well. At temperatures above the glass transition temperature (Tg) (160 to 180°C for most epoxy encapsulants), the coefficient of thermal expansion (CTE) of the encapsulant increases significantly and the elastic modulus decreases, severely compromising the reliability of plastic encapsulated ICs.

Capacitors

Of the discrete passive components, capacitors are the most sensitive to elevated temperatures. The lack of compact, thermally stable, and high-energy density capacitors has been one of the most significant barriers to the development of high-temperature systems. For traditional ceramic dielectric materials, there is a fundamental tradeoff between dielectric constant and temperature stability. The capacitance of devices made with low-dielectric constant titanates, such as C0G or NP0, remains practically constant with temperature and shows little change with aging. The capacitance of devices made with high-dielectric constant titanates, such as X7R, is larger but exhibits wide variations with increases in temperature. In addition, the leakage currents become unacceptably high at elevated temperatures, making it difficult for the capacitor to hold a charge.

There are few alternatives. For example, standard polymer film capacitors are made of polyester and cannot be used at temperatures above 150°C because both the mechanical integrity and the insulation resistance begin to break down. Polymer films, such as PTFE, are mechanically and electrically stable at higher operating temperatures, showing minimal changes in dielectric constant and insulation resistance even after 1000 hr of exposure to 250°C. However, these films also have the lowest dielectric constant and are the most difficult to manufacture in very thin layers, which severely reduces the energy density of the capacitors.

The best option is to use stacks of temperature-compensated ceramic capacitors. New ceramic dielectric materials continue to offer improved high-temperature stability via tailoring of the microstructure or the composition of barium titanate-based mixtures. One particularly promising composition is X8R, which exhibits the energy density of X7R, but has a minimal change in capacitance to 150°C.

Printed Circuit Boards and Substrates

Printed circuit boards (PCBs) and substrates provide mechanical support for components, dissipate heat, and electrically interconnect components. Above their glass transition temperature, however, organic boards have trouble performing these functions. They begin to lose mechanical strength due to resin softening and often exhibit large discontinuous changes in their out-of-plane coefficients of thermal expansion. These changes can cause delamination between the resin and glass fibers in the board or, more commonly, between the copper traces and the resin. Furthermore, the insulation resistance of organic boards decreases significantly above Tg.

Optimization from a heat dissipation perspective begins with the PCB. Printed circuit boards using standard FR-4 material are limited to temperatures of less than 135°C, although high-temperature versions (for use to 180°C) are available. Those manufactured using bismaleimide triazine (BT), cyanate ester (CE), or polyimide materials can be used to 200°C or more, with quartz-polyimide boards useful to 260°C. Boards made with PTFE resin have a Tg greater than 300°C, but are not recommended for use above 120°C due to weak adhesion of the copper layer. The use of copper improves the PCB's thermal characteristics since its thermal conductivity is more than 1000 times as good as the base FR-4 material.

Clever design of the PCB along with the thoughtful placement of the power dissipating packages can result in big improvements at virtually no cost. Using the copper of the PC board to spread the heat away from the package and innovative use of copper mounting pads, plated through-holes, and power planes can significantly reduce thermal resistance.
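The benefit of routing heat through copper can be illustrated with the one-dimensional conduction formula R = L / (k A). The conductivities below (about 390 W/m·K for copper and 0.3 W/m·K for FR-4) are typical handbook figures assumed for this sketch, and the geometry is hypothetical.

```python
COPPER_K_W_PER_M_K = 390.0  # assumed typical value for copper
FR4_K_W_PER_M_K = 0.3       # assumed typical value for FR-4 laminate


def conduction_resistance_k_per_w(length_m, conductivity_w_per_m_k, area_m2):
    """One-dimensional conduction resistance R = L / (k * A)."""
    return length_m / (conductivity_w_per_m_k * area_m2)


# Hypothetical path: 10 mm long, through a 10 mm wide, 35 µm thick layer.
area = 10e-3 * 35e-6
r_copper = conduction_resistance_k_per_w(10e-3, COPPER_K_W_PER_M_K, area)
r_fr4 = conduction_resistance_k_per_w(10e-3, FR4_K_W_PER_M_K, area)
print(round(r_copper, 1), "K/W through copper")
print(round(r_fr4 / r_copper), "times higher through the same section of FR-4")
```

For equal geometry the resistance ratio equals the conductivity ratio, which with these assumed values is consistent with the "more than 1000 times" figure quoted above.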

Since the PCB is, in effect, a conduit for heat from the package to the exterior, it is essential to obtain good thermal contact at both ends: the package attachment and the PCB mounting to the enclosure. At the package end, this can be achieved by soldering the package surface (in the case of a slug package) to the PCB or by pressure contact. At the other end, generous copper pads at the points of contact between the PCB and enclosure, along with secure mechanical attachment, complete this primary path for heat removal.

Not only are ICs getting faster and more powerful, but PC boards are shrinking in size. Today's smaller PC boards (such as those found in cell phones and PDAs) and product enclosures with their higher speeds demand more cooling than earlier devices. Their increased performance-to-package size ratios generate more heat, operate at higher temperatures, and thus have greater thermal management requirements. These include:

Increased use of any available metal surface to dissipate heat and move heat to an external surface

Increased use of heat pipe-based thermal solutions to move heat to more accessible locations for airflow

Increased demand for highly efficient thermal materials to reduce losses

More difficult manufacturing requirements for product assembly

Solders

Most engineering materials are used to support mechanical loads only in applications where the use temperature in Kelvin is less than half the melting point. However, since the advent of surface mount technology, solder has been expected to provide not only electrical contact but also mechanical support at temperatures well in excess of this guideline. In fact, at only 100°C, eutectic solder reaches a temperature over 80% of its melting point and is already exhibiting Navier-Stokes flow. Above this temperature, shear strength is decreased to an unacceptable level and excessive relaxation is observed. In addition, copper-tin intermetallics can form between tin-lead solder and copper leads at elevated temperatures, which can weaken the fatigue strength of the joints over time.
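The "half the melting point" guideline can be checked with a few lines of arithmetic. The sketch below assumes a melting point of about 183°C for eutectic tin-lead solder, a value not stated in the text, and the function name is illustrative only.

```python
KELVIN_OFFSET = 273.15

def homologous_temperature(use_temp_c: float, melting_point_c: float) -> float:
    """Ratio of use temperature to melting point, both converted to kelvin."""
    return (use_temp_c + KELVIN_OFFSET) / (melting_point_c + KELVIN_OFFSET)

# Assumed melting point of eutectic tin-lead solder (not given in the text).
EUTECTIC_SN_PB_MELT_C = 183.0

ratio = homologous_temperature(100.0, EUTECTIC_SN_PB_MELT_C)
exceeds_guideline = ratio > 0.5  # guideline: stay below half the melting point

print(f"Homologous temperature at 100 °C: {ratio:.2f}")
print(f"Exceeds the half-melting-point guideline: {exceeds_guideline}")
```

Under that assumption the ratio at 100°C comes out a little above 0.8, consistent with the "over 80%" figure quoted above.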

There are a number of solders that can be used at temperatures to 200°C. These are listed in Table 9. Thus, the temperature that a PWA can withstand is the lowest maximum temperature of any of the components used in the assembly of the PWA (connectors, plastic ICs, discrete components, modules, etc.), the PCB and its materials, and the solder system used.
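The "lowest maximum temperature" rule amounts to taking a minimum over every item in the assembly. In the minimal sketch below, the PCB and solder limits are taken from the text, while the plastic IC and connector limits are assumed purely for illustration.

```python
# Maximum allowable temperature in °C for each item in one hypothetical PWA.
max_temp_c = {
    "FR-4 PCB": 135,                 # standard FR-4 limit quoted in the text
    "high-temperature solder": 200,  # upper figure quoted in the text
    "plastic IC": 125,               # assumed for illustration
    "connector": 105,                # assumed for illustration
}

def pwa_temperature_limit(limits):
    """Return (limiting item, limit); the lowest maximum temperature governs."""
    item = min(limits, key=limits.get)
    return item, limits[item]

limiting_item, limit_c = pwa_temperature_limit(max_temp_c)
print(f"PWA limit: {limit_c} °C, set by the {limiting_item}")
```

With these example numbers the connector, not the board or the solder, sets the limit for the whole assembly.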

(coming soon) TABLE 9 Solidus Levels for High-Temperature Solders

The change to no-lead or lead-free solder, as a result of the environmental and health impact of lead, presents the electronics industry with reliability, manufacturability, availability, and price challenges. Generally speaking, most of the proposed materials and alloys have mechanical, thermal, electrical, and manufacturing properties that are inferior to lead-tin (Pb-Sn) solder, and they cost more. To date, the electronics industry has not settled on a Pb-Sn replacement, although pure tin is a serious contender. From a thermal viewpoint, lead-free solders require a higher reflow temperature (increasing from about 245°C for Pb-Sn to 260°C for lead-free solder compounds) and thus present a great potential for component and PWA degradation and damage, impacting reliability. Tables 10 through 12 present the advantages and disadvantages of various lead replacement materials; the melting points of possible alternative alloys; and a comparison of the properties of pure tin with several material classifications of lead-free alloys, respectively.

(coming soon) TABLE 10 Lead Alternatives

(coming soon) TABLE 11 Lead-Free Solders

6. COOLING SOLUTIONS

As a result of the previously mentioned advances in integrated circuits, printed circuit boards, and materials, and due to the drive for product miniaturization, it is no longer adequate to simply clamp on a heat sink selected out of a catalog after a PWA or module is designed. It is very important that thermal aspects be considered early in the design phase. All microprocessor and ASIC manufacturers offer thermal design assistance with their ICs. Two examples of this from Intel are Pentium III Processor Thermal Management and Willamette Thermal Design Guidelines, which are available on the Internet to aid the circuit designer.

Up-front thermal management material consideration may actually enhance end-product design and lower manufacturing costs. For example, the realization that a desktop PC can be cooled without a system fan could result in a quieter end product. If the design engineer strategizes with the electrical layout designers, a more efficient and compact design will result. Up-front planning results in thermal design optimization from three different design perspectives:

Thermal: concentrating on the performance of the thermal material

Dynamic: designing or blending the material to operate within the actual conditions

Economic: using the most effective material manufacturing technology

(coming soon) TABLE 12 Grouping Pure Tin with Various Lead-Free Solder Alloys


FIGURE 8 Thermally enhanced flip chip BGA.


FIGURE 9 Folded fin heat sinks offer triple the amount of finned area for cooling.

Thermal management design has a significant impact on product package volume and shape. Heat generated inside the package must be moved to the surface of the package and/or evacuated from the inside of the package by air movement. The package surface area must be sufficient to maintain a specified temperature, and the package shape must accommodate the airflow requirements.

Early decisions concerning component placement and airflow can help prevent serious heat problems that call for extreme and costly measures. Typically, larger original equipment manufacturers (OEMs) are most likely to make that investment in design. Those that don't tend to rely more heavily, often after the fact, on the thermal-component supplier's products and expertise. A component-level solution will be less than optimal: if one works solely at the component level, the tradeoffs and assumptions made between performance and cost often do not hold at the system level.

In light of the growing need to handle the heat produced by today's high speed and densely packed microprocessors and other components, five approaches to thermal management have been developed: venting, heat spreaders, heat sinks (both passive and active), package enclosure fans and blowers, and heat pipes.

6.1 Venting

Natural air currents flow within any enclosure. Taking advantage of these currents saves on long-term component cost. Using a computer modeling package, a designer can experiment with component placement and the addition of enclosure venting to determine an optimal solution. When these solutions fail to cool the device sufficiently, the addition of a fan is often the next step.

6.2 Heat Spreaders

All-plastic packages insulate the top of the device, making heat dissipation through top-mounted heat sinks difficult and more expensive. Heat spreaders, which are typically made of a tungsten-copper alloy and are placed directly over the chip, have the effect of increasing the chip's surface area, allowing more heat to be vented upward. For some lower-power devices, flexible copper spreaders attach with preapplied double-sided tape, offering a quick fix for borderline applications. Figure 8 shows a thermally enhanced flip chip ball grid array (BGA) with an internal heat spreader.

Heat spreaders frequently are designed with a specific chip in mind. For example, the LB-B1 from International Electronic Research Corp. (Burbank, CA) measures 1.12 in. × 1.40 in. × 0.5 in. high and has a thermal resistance of 16.5°C/W.
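Reading the 16.5°C/W figure as a thermal resistance, the temperature rise across the spreader is simply the dissipated power multiplied by that resistance. The following is a rough sketch only; the device powers are hypothetical and the function name is illustrative.

```python
def temperature_rise_c(power_w: float, theta_c_per_w: float) -> float:
    """Temperature rise across a thermal resistance: delta_T = P * theta."""
    return power_w * theta_c_per_w

THETA_SPREADER_C_PER_W = 16.5  # figure quoted for the LB-B1

# Hypothetical device powers, chosen only to show the scaling.
for power_w in (0.5, 1.0, 2.0):
    rise_c = temperature_rise_c(power_w, THETA_SPREADER_C_PER_W)
    print(f"{power_w:.1f} W -> {rise_c:.1f} °C rise")
```

The linear scaling suggests why a spreader of this kind suits lower-power devices: every additional watt adds another 16.5°C to the rise.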

6.3 Passive Heat Sinks

Passive heat sinks use a mass of thermally conductive material (normally aluminum) to move heat away from the device into the airstream, where it can be carried away. Heat sinks spread the heat upward through fins and folds, which are vertical ridges or columns that allow heat to be conducted in three dimensions, as opposed to the two-dimensional length and width of heat spreaders.

Folding (Fig. 9) and segmenting the fins further increases the surface area to get more heat removal in the same physical envelope (size), although often at the expense of a large pressure drop across the heat sink. Pin fin and cross cut fin heat sinks are examples of this solution. Passive heat sinks optimize both cost and long-term reliability.

Heat Sinking New Packages

While BGA-packaged devices transfer more heat to the PWA than leaded devices, this type of package can affect the ability to dissipate sufficient heat to maintain high device reliability. Heat sink attachment is a major problem with BGAs because of the reduced package size. As the need to dissipate more power increases, the optimal heat sink becomes heavier. Attaching a massive heat sink to a BGA and relying on the chip-to-board solder connection to withstand mechanical stresses can result in damage to both the BGA and PWA, adversely impacting both quality and reliability.

Care is needed to make sure that the heat sink's clamping pressure does not distort the BGA and thereby compromise the solder connections. To prevent premature failure caused by ball shear, well-designed, off-the-shelf heat sinks include spring-loaded pins or clips that allow the weight of the heat sink to be borne by the PC board instead of the BGA (Fig. 10).


FIGURE 10 Cooling hierarchy conducts heat from the die through the package to the heat sink base and cooling fins and into the ambient.


FIGURE 11 Example of active heat sink used for cooling high-performance microprocessors.

Active Heat Sinks

When a passive heat sink cannot remove heat fast enough, a small fan may be added directly to the heat sink itself, making the heat sink an active component.

These active heat sinks, often used to cool microprocessors, provide a dedicated airstream for a critical device (Fig. 11). Active heat sinks are often a good choice when an enclosure fan is impractical. As with enclosure fans, active heat sinks carry the drawbacks of reduced reliability, higher system cost, and higher system operating power.

Fans can be attached to heat sinks in several ways, including clip, thermal adhesive, thermal tape, or gels. A clip is usually designed with a specific chip in mind, including its physical as well as its thermal characteristics. For example, Intel's Celeron dissipates a reasonable amount of heat, about 12 W. But because the Celeron is not much more than a printed circuit board mounted vertically in a slot connector, the weight of the horizontally mounted heat sink may cause the board to warp. In that case, secondary support structures are needed.

It is believed that aluminum heat sinks and fans have almost reached their performance limitations and will have no place in future electronic products.

Similarly, because extruded heat sinks or heat spreaders may have small irregularities, thermal grease or epoxies may be added to provide a conductive sealant. The sealant is a thermal interface between the component and the heat sink. Because air is a poor thermal conductor, a semi-viscous, preformed grease may be used to fill air gaps less than 0.1 in. thick.

But thermal grease does not always fill the air gaps well, largely because of the way it is applied. Too much grease may "leak" onto other components, creating an environment conducive to dendritic growth and contamination and resulting in a reliability problem. This has led designers to use gels. The conformable nature of the gel enables it to fill all gaps between components and heat sinks with minimal pressure, avoiding damage to delicate components.

The Interpack and Semi-Therm conferences and Electronics Cooling Magazine deal with thermal grease and heat management issues of electronic systems/products in detail.


FIGURE 12 Diagram of a typical computer workstation showing airflow paths.


FIGURE 13 Diagram showing thermal resistance improvement with and without both heat spreaders and forced air cooling for a 16-pin plastic DIP. The results are similar for larger package types.

6.4 Enclosure Fans

The increased cooling provided by adding a fan to a system makes it a popular part of many thermal solutions. Increased airflow significantly lowers the temperature of critical components, while providing additional cooling for all the components in the enclosure. Increased airflow also increases the cooling efficiency of heat sinks, allowing a smaller or less efficient heat sink to perform adequately.

Figure 12 is a three-dimensional block diagram of a computer workstation showing the airflow using an external fan.

Industry tests have shown that more heat is dissipated when a fan blows cool outside air into a personal computer than when it sucks warm air from inside the chassis and moves it outside. The amount of heat a fan dissipates depends on the volume of air the fan moves, the ambient temperature, and the difference between the chip temperature and the ambient temperature. Ideally, the temperature in a PC should be only 10 to 20°C warmer than the air outside the PC. Figure 13 shows the positive impact a fan (i.e., forced air cooling) provides in conjunction with a heat spreader, and Table 13 presents various IC package styles and the improvement in junction-to-ambient thermal resistance (θja) that occurs with the addition of a fan.
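The dependence on air volume can be estimated from a simple energy balance on the air stream: heat removed equals air density times specific heat times volumetric flow times the rise in air temperature. The sketch below uses approximate room-temperature properties of air and a hypothetical 100 W heat load; neither is taken from the text, and the balance covers only the bulk air, not the chip-to-air temperature difference.

```python
AIR_DENSITY_KG_M3 = 1.2            # approximate, near room temperature at sea level
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # approximate
M3_PER_S_TO_CFM = 2118.88          # cubic feet per minute in one m^3/s

def required_airflow_m3_per_s(heat_w: float, air_temp_rise_c: float) -> float:
    """Airflow needed to carry away heat_w with the given rise in air temperature."""
    return heat_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * air_temp_rise_c)

HEAT_LOAD_W = 100.0  # hypothetical heat dissipated inside the enclosure

# The 10 and 20 °C rises match the range suggested in the text.
for rise_c in (10.0, 20.0):
    flow = required_airflow_m3_per_s(HEAT_LOAD_W, rise_c)
    print(f"{rise_c:.0f} °C rise: {flow:.4f} m^3/s ({flow * M3_PER_S_TO_CFM:.1f} CFM)")
```

Halving the allowed air temperature rise doubles the airflow the fan must deliver, which is one reason tight internal temperature targets drive fan size and noise.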

The decision to add a fan to a system depends on a number of considerations. Mechanical operation makes fans inherently less reliable than a passive system. In small enclosures, the pressure drop between the inside and the outside of the enclosure can limit the efficiency of the fan. In battery-powered applications, such as a notebook computer, the current drawn by the fan can reduce battery life, thus reducing the perceived quality of the product.

Despite these drawbacks, fans often are able to provide efficient, reliable cooling for many applications. While fans move volumes of air, some PC systems also require blowers to generate air pressure. What happens is that as the air moves through the system its flow is hindered by the ridges of the add-on cards and the like. Even a PC designed with intake and exhaust fans may still require a blower to push out the warm, still air.

Fan design continues to evolve. For example, instead of simply spinning at a fixed rate, fans may include a connector to the power supply as well as an embedded thermal sensor to vary the speed as required by specific operating conditions.

(coming soon) TABLE 13 Junction-to-Ambient Thermal Resistance as a Function of Air Cooling for Various IC Package Styles


FIGURE 14 A heat pipe is a highly directional heat transport mechanism (not a heat spreading system) to direct heat to a remote location.

6.5 Heat Pipes

While heat spreaders, heat sinks, fans, and blowers are the predominant means of thermal management, heat pipes are becoming more common, especially in notebook PCs. A heat pipe (Fig. 14) is a tube within a tube filled with a low-boiling-point liquid. Heat at the end of the pipe on the chip boils the liquid, and the vapor carries that heat to the other end of the tube. Releasing the heat, the vapor condenses, and the liquid returns to the area of the chip via a wick. As this cycle continues, heat is pulled continuously from the chip. A heat pipe absorbs heat generated by components such as CPUs deep inside the enclosed chassis, then transfers the heat to a convenient location for discharge. With no power-consuming mechanical parts, a heat pipe provides a silent, lightweight, space-saving, and maintenance-free thermal solution. A heat pipe usually conducts heat about three times more efficiently than does a copper heat sink. Heat pipes are generally less than 1 ft long, have variable widths, and can dissipate up to 50 W of power.

6.6 Other Cooling Methods

Spray Cooling

A recent means of accomplishing liquid cooling is called spray cooling. In this technique an inert fluid such as a fluorocarbon (like 3M's PF-5060 or FC-72) is applied either directly to the surface of an IC or externally to an individual IC package and heat sink. Spray cooling can also be applied to an entire PWA, but the enclosure required becomes large and costly, takes up valuable space, and adds weight. Figure 15 shows a schematic diagram of the spray cooling technique applied to an IC package, while Figure 16 shows the technique being applied to an enclosed multichip module (MCM). Notice the sealed hermetic package required. In selecting a liquid cooling technique such as spray cooling, one has to compare the complexity of the heat transfer solution and technical issues involved (such as spray velocity, mass flow, rate of spray, droplet size and distribution, fluid/vapor pump to generate pressure, control system with feedback to pump, heat exchanger to ambient fluid reservoir, and hermetic environment) with the increased cost and added weight.


FIGURE 15 Schematic diagram of spray cooling system.


FIGURE 16 Spray cooling a multichip module package.


FIGURE 17 Typical single-stage thermoelectric cooler.

Thermoelectric Coolers

A thermoelectric cooler (TEC) is a small heat pump that is used in various applications where space limitations and reliability are paramount. The TEC operates on direct current and may be used for heating or cooling by reversing the direction of current flow: the current moves heat from one side of the module to the other. A typical single-stage cooler (Fig. 17) consists of two ceramic plates with p- and n-type semiconductor material (bismuth telluride) between the plates. The elements of semiconductor material are connected electrically in series and thermally in parallel.

When a positive DC voltage is applied to the n-type thermoelement, electrons pass from the p- to the n-type thermoelement, and the cold side temperature will decrease as heat is absorbed. The heat absorption (cooling) is proportional to the current and the number of thermoelectric couples. This heat is transferred to the hot side of the cooler, where it is dissipated into the heat sink and surrounding environment. Design and selection of the heat sink are crucial to the overall thermoelectric system operation and cooler selection. For proper thermoelectric management, all TECs require a heat sink and will be destroyed if operated without one. One typical single-stage TEC can achieve temperature differences up to 70°C, and transfer heat at a rate of 125 W.
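One reason the heat sink is so critical is that the hot side must reject not only the heat pumped from the cold side but also the electrical power fed into the module. A minimal sketch of that energy balance follows; the heat load, drive voltage and current, heat sink resistance, and ambient temperature are all hypothetical values.

```python
def tec_hot_side_heat_w(cold_side_heat_w: float, voltage_v: float, current_a: float) -> float:
    """Heat rejected at the hot side: pumped heat plus electrical input power."""
    return cold_side_heat_w + voltage_v * current_a

def heat_sink_temperature_c(ambient_c: float, heat_w: float, theta_sa_c_per_w: float) -> float:
    """Heat sink temperature for a given sink-to-ambient thermal resistance."""
    return ambient_c + heat_w * theta_sa_c_per_w

# All values below are assumed for illustration only.
COLD_SIDE_HEAT_W = 20.0  # heat absorbed from the cooled device
DRIVE_VOLTAGE_V = 12.0
DRIVE_CURRENT_A = 3.0
THETA_SA_C_PER_W = 0.5   # heat sink to ambient
AMBIENT_C = 25.0

hot_side_heat_w = tec_hot_side_heat_w(COLD_SIDE_HEAT_W, DRIVE_VOLTAGE_V, DRIVE_CURRENT_A)
sink_temp_c = heat_sink_temperature_c(AMBIENT_C, hot_side_heat_w, THETA_SA_C_PER_W)

print(f"Hot-side heat load: {hot_side_heat_w:.0f} W")
print(f"Heat sink temperature: {sink_temp_c:.0f} °C")
```

In this example the heat sink has to handle nearly three times the heat actually removed from the device, so an undersized sink raises the hot-side temperature and erodes the achievable cold-side temperature.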

The theories behind the operation of thermoelectric cooling can be traced back to the early 1800s. Jean Peltier discovered the existence of a heating/cooling effect when electric current passes through two conductors. Thomas Seebeck found that two dissimilar conductors at different temperatures would create an electromotive force or voltage. William Thomson (Lord Kelvin) showed that over a temperature gradient, a single conductor with current flow will have reversible heating and cooling. With these principles in mind and the introduction of semiconductor materials in the late 1950s, thermoelectric cooling has become a viable technology for small cooling applications.


FIGURE 18 Mounting a TEC with solder.


FIGURE 19 Effectiveness of various thermal cooling techniques.

Thermoelectric coolers (TECs) are mounted using one of three methods: adhesive bonding, compression using thermal grease, or solder. Figure 18 shows a TEC with an attached heat sink being mounted with solder.

Metal Backplanes

Metal-core printed circuit boards, stamped plates on the underside of a laptop keyboard, and large copper pads on the surface of a printed circuit board all employ large metallic areas to dissipate heat. A metal-core circuit board turns the entire substrate into a heat sink, augmenting heat transfer when space is at a premium. While effective in cooling hot components, the heat spreading of this technique also warms cooler devices, potentially shortening their lifespan. The 75% increase in cost over conventional substrates is another drawback of metal core circuit boards.

When used with a heat pipe, stamped plates are a cost-effective way to cool laptop computers. Stamped aluminum plates also can cool power supplies and other heat-dissipating devices. Large copper pads incorporated into the printed circuit board design also can dissipate heat. However, copper pads must be large to dissipate even small amounts of heat. Therefore, they are not real estate efficient.

FIGURE 20 Comparison of heat transfer coefficients for various cooling techniques.

Thermal Interfaces

The interface between the device and the thermal product used to cool it is an important factor in implementing a thermal solution. For example, a heat sink attached to a plastic package using double-sided tape cannot dissipate the same amount of heat as the same heat sink directly in contact with a thermal transfer plate on a similar package.

Microscopic air gaps between a semiconductor package and the heat sink caused by surface nonuniformity can degrade thermal performance. This degradation increases at higher operating temperatures. Interface materials appropriate to the package type reduce the variability induced by varying surface roughness.

Since the interface thermal resistance is dependent upon applied force, the contact pressure becomes an integral design parameter of the thermal solution. If a package/device can withstand a limited amount of contact pressure, it is important that thermal calculations use the appropriate thermal resistance for that pressure. The chemical compatibility of the interface materials with the package type is another important factor. Plastic packages, especially those made using mold release agents, may compromise the adherence of tape-applied heat sinks.
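The effect of the interface on the overall solution can be illustrated by treating the heat path as thermal resistances in series, from junction to case, case to heat sink (the interface), and heat sink to ambient. The sketch below uses invented resistance values, including the two interface values standing in for low and high contact pressure; real figures would come from the interface material supplier's data for the pressure the package can tolerate.

```python
def junction_temperature_c(ambient_c, power_w, theta_jc, theta_interface, theta_sa):
    """Junction temperature for resistances in series: Tj = Ta + P * sum(theta)."""
    return ambient_c + power_w * (theta_jc + theta_interface + theta_sa)

# All numbers are hypothetical and serve only to show the sensitivity.
AMBIENT_C = 40.0
POWER_W = 10.0
THETA_JC = 1.0  # junction to case, °C/W
THETA_SA = 2.5  # heat sink to ambient, °C/W
interface_theta_by_pressure = {
    "low contact pressure": 1.0,   # assumed °C/W
    "high contact pressure": 0.2,  # assumed °C/W
}

for label, theta_interface in interface_theta_by_pressure.items():
    tj = junction_temperature_c(AMBIENT_C, POWER_W, THETA_JC, theta_interface, THETA_SA)
    print(f"{label}: Tj = {tj:.0f} °C")
```

With these example numbers the choice of interface resistance alone shifts the junction temperature by 8°C, which is why the calculation should use the value that corresponds to the pressure actually applied.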

In summary, the selected thermal management solution for a specific application will be determined by the cost and performance requirements of that particular application. Manufacturing and assembly requirements also influence selection. Economic justification will always be a key consideration. Figure 19, a nomograph, summarizes the effectiveness of various cooling solutions, showing the progression from natural convection to liquid cooling and the resulting reduction in both thermal resistance and heat sink volume. Figure 20 compares the heat transfer coefficient of various cooling techniques from free air convection to spray cooling.

7. OTHER CONSIDERATIONS

7.1 Thermal Versus Electromagnetic Compatibility Design Tradeoffs

Often traditional thermal and electromagnetic compatibility (EMC) solutions are at odds with one another. For example, for high-frequency processor signals, EMC requirements call for some kind of enclosure, limiting cooling air to hot devices. Since EMC is a regulatory requirement necessary to sell a product and thermal management is not, it is most often the thermal solutions that must be innovative (e.g., use of enclosure surfaces as heat sinks or shielding layers in the PWA as heat spreaders), especially in the shortened design cycles encountered today.

At the package level, a design that may be excellent for electrical performance may be thermally poor or may be far too expensive to manufacture. Tradeoffs are inevitable in this situation, and software tools are becoming available that allow package designers to use a common database to assess the effect of design changes on both thermal and electrical performance.

7.2 Overall System Compatibility

In the telecommunications and networking industries, a cabinet of high-speed switching equipment may be installed in close proximity to other equipment and so must be shielded to prevent both emission and reception of electromagnetic radiation. Unfortunately, the act of shielding the shelves has an adverse effect on the natural convection cooling, which is becoming a requirement for designers.

These divergent needs can only be met with a concurrent design process involving the electrical designer, the EMC engineer, the mechanical designer, and the thermal engineer. Thermal design must therefore be part of the product design process from concept to final product testing.

ACKNOWLEDGMENTS

Portions of Section 2 were excerpted from Ref. 1. Used by permission of the author.

I would also like to thank Future Circuits International for permission to use material in this section.

REFERENCES

1. Addison S. Emerging trends in thermal modeling. Electronic Packaging and Production, April 1999.

2. Konstad R. Thermal design of personal computer chassis. Future Circuits Int 2(1).

3. Ascierto J, Clenderin M. Itanium in hot seat as power issues boil over. Electronic Engineering Times, August 13, 2001.

FURTHER READING

1. Case in Point. Airflow modeling helps when customers turn up heat. EPP, April 2002.

2. Electronics Cooling magazine.

See also www.electronics-cooling.com.
