
New 3D thermal quantities help designers address thermal problems as they arise.

Electronics thermal management is the discipline of designing electronics systems to facilitate the effective removal of heat from the active surface of integrated circuits to a colder ambient environment. In doing so, heat passes from the package both directly to the surrounding air and via the PCB on which the IC is mounted. The PCB and, to a lesser extent, the surrounding air thermally couple the various heat sources.

Heat coupling increases as components and PCBs become smaller and more powerful. Designers must take remedial action to bring all components within their respective thermal specifications, but this step is becoming more challenging and constrained, even when preventative measures are taken early in the design process.

For the past 20 years, computational fluid dynamics (CFD) techniques have provided 3D thermal simulations that include views of the air-side heat transfer that predict component junction and case temperatures under actual operating conditions. Designers routinely use these predicted temperatures to judge thermal compliance simply by comparing the simulated temperatures to maximum rated operating temperatures. If the operating temperature exceeds the maximum rated value, there will be at least a potential degradation in the performance of the packaged IC, and at worst, an unacceptable risk of thermo-mechanical failure.

Simulated 3D temperature and flow fields provide detailed and useful information, but give little physical insight into why the temperature field is the way it is. Examining heat flux vectors can yield some insight into the heat removal paths. But the heat flux vector direction and magnitude data do not provide a measure of the ease with which heat leaves the system. Nor do they provide insight as to where and how the heat flux distribution might be better balanced or reconfigured to improve performance.

How easily heat passes from the various sources to the ambient will determine the temperature rise at the sources and all points in between. Heat flow paths are complex and three-dimensional, carrying portions of the heat with varying degrees of ease. Paths that carry a lot of heat but offer large resistances to that heat flow represent bottlenecks. A redesign can relieve these bottlenecks, permitting heat to pass to the ambient more easily and reducing temperature rises along the heat flow path all the way back to the heat source. In addition, there may be unrealized opportunities to introduce new heat flow paths that would permit heat to pass to colder areas and out to the ambient. So a redesign informed by the right information can do more than alleviate bottlenecks; it can also introduce thermal “shortcuts” to bypass them.

Without a way to identify such thermal bottlenecks and shortcut opportunities within a design, PCB design teams have faced a stark choice: either bring in thermal experts to resolve thermal problems, or rely on being able to add heat sinks later. Lacking direction from the simulation results as to appropriate remedial action, thermal engineers have traditionally relied on experience and engineering judgment to guide their search for design improvements. Today their work is often supplemented by design of experiments and automatic design optimization capabilities within the thermal simulator. Such approaches take time.

An innovative way to view the thermal behavior of a populated PCB uses two new 3D thermal quantities, aptly known as the BottleNeck (Bn) Number and the ShortCut (Sc) Number, that taken together guide designers to appropriate, targeted remedial action as thermal problems are encountered.

Vectors, Bottlenecks and Shortcuts

Heat flow can be defined as the rate of heat transfer through a given cross-sectional area; this measure is known as heat flux. The presence of a heat flux vector will always be accompanied by a temperature gradient vector. The temperature gradient field is taken to be an indicator of conductive thermal resistance because, for a given heat flux, the greater the temperature gradient, the larger the thermal resistance.

The dimensionalized Bn number is the dot product of these two vector quantities. At each point in space (Figure 1) where there exists a heat flux vector and temperature gradient vector, the Bn scalar at that point is calculated as:

Bn = Heat flux magnitude × Temperature gradient magnitude × cosine of the angle between the two vectors

Large Bn values occur where a large heat flow coincides with a large thermal resistance – that is, at the existing bottlenecks – and one might expect the largest Bn value on the board to mark the single most important bottleneck to relieve first.

This is not always true, of course, especially for multiple heat sources (as found on almost every PCB) where heat flow topologies for widely separated components can be quite independent of each other.
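To make the definition concrete, here is a minimal sketch (not the simulator's own code) of how the Bn scalar field could be computed from exported heat flux and temperature gradient vector fields; the array names, shapes and values are assumptions for illustration only.

```python
import numpy as np

# Minimal sketch: compute the Bn scalar field as the dot product of the heat
# flux vector field and the temperature gradient vector field at every cell of
# a 3D grid. q and grad_t are hypothetical arrays exported from a CFD thermal
# solution, shaped (nx, ny, nz, 3) with the last axis holding x, y, z components.

def bottleneck_number(q, grad_t):
    """Bn = |q| * |gradT| * cos(angle) = q . gradT at each grid point."""
    return np.einsum("...i,...i->...", q, grad_t)

# Example with made-up field data.
nx, ny, nz = 40, 30, 8
rng = np.random.default_rng(0)
q = rng.normal(size=(nx, ny, nz, 3))        # W/m^2, placeholder values
grad_t = rng.normal(size=(nx, ny, nz, 3))   # K/m, placeholder values

bn = bottleneck_number(q, grad_t)
# Cells with large positive Bn are candidates for bottleneck-relieving redesign.
print(np.unravel_index(np.argmax(bn), bn.shape))
```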

To illustrate how this process works in practice, consider a typical air-cooled PCB (Figure 2). The central BGA has the highest temperature rise above its specification, followed by the two TO-263s above and to the right.



Though it depicts the same PCB, Figure 3 is not the same kind of thermal view as Figure 2. Instead, it shows the Sc number distribution mapped at a point just above the tops of the packages on the board. Although in Figure 3 the largest Sc numbers are associated with the hottest component, this is not always the case. A component might be hot due to the temperature of the surrounding air, rather than its own internal power dissipation. In the case of this centrally located BGA, the large Sc values on its top surface indicate relatively efficient convective heat transfer locally. Therefore, the obvious remedial design action is to add a heat sink. The heat sink acts as an area extender, making it even easier for heat to leave the top of the component and to be carried away by the air. Introducing this design change reduces the BGA’s junction temperature rise by 70%, taking it well below its maximum safe operating limit. With the BGA running thermally compliant, let’s turn our attention to the TO-263 components.



Figure 4 is a Bn plot depicting the Bn distribution in the top signal layer of the PCB. We can see that, after adding the BGA heat sink, the largest thermal bottlenecks exist near the tabs of the two TO-263 devices. Recall that large Bn values do not mean this is the hottest area. Instead, the Bn figures and the plot reveal areas in which a lot of heat flows “downstream” from the heat source, and is highly restricted. Knowing exactly where the bottleneck is, a large copper pad can be added to cover that high bottleneck area, providing a targeted solution to a specifically identified problem. That is effective, efficient engineering.

Having made this modification, best practice methods call for an updated thermal simulation and inspection of the new Bn and Sc distributions. Figure 5 shows Sc on a cross-section through one of the two TO-263s after the addition of the copper pad. The expanded inset view shows large Sc values on the signal layer and the power and ground planes below the new copper pad and TO-263 tabs, indicating a shortcut opportunity between these layers. This agrees with a designer’s intuition, since heat spreads readily in the metallic layers of the PCB, while the dielectric’s low thermal conductivity acts as an effective barrier to heat transfer. Adding thermal vias to create a new heat transfer path down to the buried ground plane is an excellent, practical way to take advantage of this shortcut opportunity. Note that the Sc field pinpoints exactly where the thermal vias should be added for maximum effect.

By examining the Bn and Sc variations in and around the TO-263, the exact shape of the copper pad and the location of an array of thermal vias (shown schematically in Figure 6) can be determined quickly, without resorting to numerous “what if” studies. In this case, adding the pad and vias yielded a 30% drop in the temperature rise of the TO-263 devices, again taking them below their maximum rated temperatures.



The Bn and Sc fields together provide invaluable insight and comprehension about temperature distribution and behavior in an electronics system. By detecting and mapping both thermal constrictions (bottlenecks) and potential shortcuts for more efficient heat transfer, these parameters enable engineers to quickly determine the most promising thermal design changes – those most likely to provide the most efficient, effective results – without years of thermal experience and intuition.
In the example, three Bn- and Sc-inspired thermal design changes were identified quickly, and the resulting “fixes” dramatically reduced the temperature of the three overheating components discovered in the initial simulation. It is an approach that delivers important gains in simulation productivity. Rather than simulating all possible remedial actions for thermal problems and choosing the best one, engineers can see immediately where they need to focus their thermal design effort.

Byron Blackmore, John Parry and Robin Bornoff are with Mentor Graphics’ Mechanical Analysis Division (mentor.com).

The second of a two-part series looks at how timing and PCB trace lengths affect different real systems, and design tricks for tuning timing.

On topology diagrams, we can easily visualize or specify the delays between any driver/receiver pair on multi-point nets. Some standards specify PCB design rules this way, for example, DDR-SDRAM DIMM memories (various Jedec JESD21-C documents) or Chipset Design Guides. Some design programs specify the constraints on these diagrams, like the Cadence Allegro Signal Explorer. The topology may be defined graphically, or as a spreadsheet for the point-to-point min./max. or relative length rules.12



Add-in cards: If a bus is routed through multiple boards, then the timing and length rules have to be correct for the whole system together (Figure 20). If different individuals or companies design the boards, they have to agree on how to divide the constraints between the boards, as part of a form-factor standard. In the case of a clock tree, if the add-in card clock trace length is nearly the same for all cards, then the skew can be controlled by the motherboard design alone.



To control the reference/data signal arrival times at the capture flip-flop, control the delays on the PCB traces. If two signals have similar drivers and trace lengths (matched), then the propagation delay and the transition delay will also be very similar, and propagation delay matching is ensured by simple trace length matching.

The PLL (phase-locked loop, an on-chip device) can be used on continuously running clocks to introduce phase delays, negative delays or frequency multiplication. PLLs usually contain some kind of modifier element in their feedback loop. If this modifier element is a frequency divider (by M), then the PLL will generate an output clock with a frequency of M*f_in. If the modifier is a Δφ phase delay element, then the PLL inputs will be delayed by Δφ from the output, so the output will be delayed by 360° − Δφ from the input. If we use a PCB trace as the modifier element, then it will cause the inputs to be late by t_pd compared to the output, so it looks as if the output were delayed by −1*t_pd from the input.
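Written out symbolically, as a sketch of the relationships just described (not taken from any particular data sheet):

```latex
% Sketch of the PLL feedback-modifier relations described above
\begin{align*}
  f_{\mathrm{out}} &= M \, f_{\mathrm{in}}
      && \text{frequency divider (by $M$) in the feedback loop}\\
  \text{output delay} &= 360^{\circ} - \Delta\varphi
      && \text{phase-delay element ($\Delta\varphi$) in the feedback loop}\\
  \text{apparent output delay} &= -\,t_{pd}
      && \text{PCB trace of delay $t_{pd}$ in the feedback loop (``negative delay'')}
\end{align*}
```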

The DLL (Delay Locked Loop, on-chip device) is a fixed or adjustable delay element. It has “taps”; each tap has a unit delay value. The number of taps connected into the signal path determines the DLL delay.

To maintain the best timing for all on-chip data paths, the chips contain clock networks as balanced trees, so the clock will arrive at each flip-flop within a tight t_pd range. Normally there is an option to place a PLL before the clock tree to achieve zero clock propagation delay through the clock network. In some I/O applications this might be useful, for example if the clock network delay is a lot longer than the data path delay.

On bidirectional buses, minimize the clock skew between all the chips. If the clock is generated inside one of the chips, then the clock propagation delay to that chip would be zero, while to the other chips it would be set by the on-board routing. To avoid that, we introduce the same clock propagation delay to the chip generating the clock by using a feedback clock, which simply routes the clock back to the same chip.

A data path with a certain delay in it can be divided into two separate paths by inserting a register or flip-flop into it. Both parts will have the same available time for signal propagation as the original path had, but with only a part of the original delay. Note that the data will arrive one bit time or one clock cycle later at the final capture flip-flop, but it will be captured with better timing margins. This technique is usually used in high-speed on-chip data processing and in registered DIMM (RDIMM) memory card designs.



Usually the DLL/PLL delays are controllable, and we can also insert register stages in the datapath. These can be fixed by chip design or can be software programmable, but in most of the cases, they are adjusted automatically by a state machine. Examples are the DDR3-SDRAM memory Read and Write Leveling features.2

On-Chip Timing Design

Some of the above methods are really chip-design methods. The chip or ASIC/FPGA designers have to design their I/O interfaces to be operational with realistic board design. To achieve this, they set up timing constraints, use guided logic placement or floorplanning, do careful chip pinout design, use DLLs/PLLs, use localized high-speed IO clock networks, use asynchronous FIFOs and design clever architecture for backend data paths.8,11,13

The different devices in synchronous systems use the same clock source to run their I/O and on-chip flip-flops. There is always one clock generator, and its output is distributed to every device on the same bus. If the bus is bidirectional, then the best way to balance the read/write setup/hold margins is to balance/match the clock propagation delays. If there is a clock skew between two chips, then one of the margins is decreased by the value of the clock skew. The clock skew may be known as uncertainty (peak to peak), or as an absolute value (with a sign).

DDR-SDRAM memory interfaces have source-synchronous data buses (lanes) and a unidirectional synchronous address/command/control (ACC) bus. There are two types of implementations: the socketed DIMM card, and the memory-down, where the chips are soldered onto the motherboard. Whether designing a DIMM card or a memory-down, we usually follow the layout design rules specified for the appropriate card type in the JEDEC JESD21-C standard.2,9,12

The data bus timing is valid within each lane separately, between the DQ/DM signals and the DQS strobe. The DQS path has a DLL delay in the memory controller chip, so the DQS is delayed before entering the PCB for write transactions, while for reads it is delayed only after it has arrived at the controller chip.

The address/command/control (ACC) bus is sampled by the memory chips at the rising edges of the clock signal provided by the memory controller. In 2T clocking mode, every second rising edge is used to sample the ACC bus. The ACC bus is routed to every memory chip in a memory channel, so it can have very heavy loading, which creates very slow transitions. If the load is above a certain value, then registered/buffered DIMM memories are needed. The DDR1 and DDR2 standards use a balanced-tree clock/ACC topology to make sure all chips get the clock/ACC at the same time, while DDR3 uses the fly-by topology to minimize SSN (simultaneous switching noise) and to have only one end at which to terminate the signals.



Reference-Reference Timing

Although the main I/O matching is between the data-strobe and the clock-ACC, there are also clock-to-strobe design rules. These are based on the chip timing design behind the I/O flip-flops.

The memory chips expect the first valid data bit to arrive a certain time after they have captured a write command. The controller puts the first data bit onto the bus with the right timing, but the board design has to make sure that this timing is still maintained when the signals arrive at the memory. This requires length matching between the clock and the strobe signals. For this, there is a guaranteed output skew timing parameter in the controller data sheet, and a maximum input skew parameter in the memory chip data sheet. This input parameter of the memory is t_DQSS, which is +/- t_clk/4, between the rising edge of the clock and the rising edge of the DQS signal. They also specify a clk-rising to DQS-falling-edge input rule, given by the t_DSS and t_DSH parameters together. For DDR3 memories, the write leveling feature can compensate for this.
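As a worked example, assuming a hypothetical DDR3-800 interface with a 2.5 ns clock period:

```latex
% Illustrative numbers only: DDR3-800, t_clk = 2.5 ns
t_{DQSS} = \pm\,\frac{t_{clk}}{4} = \pm\,\frac{2.5\ \text{ns}}{4} = \pm\,0.625\ \text{ns}
```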

The memory controller has to pass the captured data from the DQS clock domain to the internal clock domain. This clock domain crossing requires the data to arrive at the controller within a specified time window. This limits the maximum length of the bus, since if the bus is longer, the data arrives later, decreasing the setup margin at the backend flip-flops. The memory chip data sheet specifies the maximum skew between the input clock and the output DQS as t_DQSCK. The controller data sheet specifies a maximum skew between the output clock and the input DQS. Both the clock and the DQS trace lengths add to this skew.

Some FPGA implementations handle this by calibrating the delay with DLLs and registers for all the read DQ/DQS signals.9

Timing calibration. We can include delay circuits in the DQ/DQS paths. These can be fixed, or adjusted by a hardware state machine or by software to achieve optimal timing. For example, if we extend the delay of a reference signal to t_clk, then the effect is as if the reference signal were not delayed at all (from the perspective of STA), although the controller has to expect the data in the next cycle (from the perspective of the protocol). The board/chip delays are mostly static for a given board, although they vary between boards and over temperature; that is why we calibrate after power-up. We can measure signal quality by adjusting DLL delays step by step, capturing the data and looking for the DLL values where the captured data differs from the previous step. This way we can find the boundaries of the data valid window; then we set the final delays in the middle of that region.
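The sketch below illustrates this sweep-and-center procedure. set_dll_tap() and capture_matches_expected() are hypothetical hardware accessors (write a DLL tap value; capture data and compare it to the expected pattern); they are not part of any particular controller's API.

```python
# Minimal sketch of the power-up timing calibration described above.

def calibrate_dll(num_taps, set_dll_tap, capture_matches_expected):
    # Sweep every tap and record whether the captured data was correct.
    results = []
    for tap in range(num_taps):
        set_dll_tap(tap)
        results.append(capture_matches_expected())

    # Find the data valid window: the longest run of taps with correct capture.
    best_start, best_len, run_start = None, 0, None
    for tap, ok in enumerate(results + [False]):   # sentinel closes the last run
        if ok and run_start is None:
            run_start = tap
        elif not ok and run_start is not None:
            if tap - run_start > best_len:
                best_start, best_len = run_start, tap - run_start
            run_start = None

    if best_start is None:
        raise RuntimeError("no data valid window found")

    # Park the DLL in the middle of the valid region.
    final_tap = best_start + best_len // 2
    set_dll_tap(final_tap)
    return final_tap

# Example with a simulated 32-tap DLL whose valid window is taps 10..20.
state = {"tap": 0}
print(calibrate_dll(32, lambda t: state.update(tap=t),
                    lambda: 10 <= state["tap"] <= 20))   # -> 15
```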

Write leveling. This process compensates for clock-to-strobe matching issues and for the skew caused by the fly-by ACC topology. The controller puts the memory chips into write leveling mode. The memory then samples the CLK using DQS edges and sends the captured value back to the controller on the DQ0 line. The controller finds the two DLL values where the sampled value changes, then sets the DLL halfway between them.

Read leveling. This process balances the data bus read setup/hold margins by adjusting the DQS delay. In read leveling mode, the controller writes a fixed test pattern into the general purpose register in the memory, then reads it back again and again, looking for the minimum and maximum delays at which it can still read the correct data. It then sets the DLL halfway between them.

All DQ/DQS DLL calibration: FPGA-based memory controller implementations can have a separate DLL on each data line. This way we can compensate on-chip for board/chip mismatch.9



Arbitrary examples. In Compact-PCI systems, a single board computer may be in a system controller or in a peripheral slot. In system controller mode, it has to supply the clocks to all other cards, and in peripheral mode, it has to take the clock from the backplane to clock its backplane I/O circuits. In both cases the clock signals have to be matched with a given tolerance. The system controller slot has 3 to 7 clock output signals, each routed to a different peripheral slot on the backplane with a length of 6.3"+/-0.04". The peripheral cards have to route this clock to their backplane interface circuits with 2.5"+/-0.04" length.

The MB86065 D/A converter from Fujitsu receives the data as LVDS differential signals from the host (e.g., an FPGA), and provides the I/O bit clock to the host. The DAC requires the data and the clock to be in phase + 90° at the DAC pins. The trick is to use a PLL feedback net on the PCB with a delay equal to the clock+data length on the PCB, creating a negative delay for the launch flip-flop. The PLL needs to have a 0° and a 90° output: the 0° output for the feedback loop, and the 90° output for the launch flip-flop for the extra alignment. This is a unidirectional synchronous interface, but the clock is provided by the receiver chip.14



When multiple lanes are used in the high-speed serial interfaces, in the receiver chip each lane has its own CDR (Clock-Data-Recovery) circuit, so each lane’s SerDes will clock its parallel output with a different clock. These have a phase relationship based on the lane-to-lane skew on-board and on-chip. The parallel data are passed to the core clock domain. If that clock domain is derived, for example, from Lane-0 clock, then it will capture the Lane-0 parallel data with proper timing, but the other lanes will be early/late by the lane-to-lane skew. This is usually handled by a clock-DLL for lower speeds or by using asynchronous FIFOs for each lane. In case of a DLL, the max lane-to-lane skew is defined by STA at the clock domain crossing. In case of FIFOs, the maximum lane-to-lane skew is limited by the FIFO depth and the protocol. Some protocols define FIFO under/overflow control by transmitting align characters. The max skew can be t_skew < N * k * t_bit_serial, where they use “k” bits per symbol, and “N” is half the portion of FIFO depth allocated for deskew.12

Calculating PCB Trace Length Constraints

Trace length constraints can be calculated from the timing margins of the pre-layout timing analysis. These constraints are specified to ensure certain propagation delays. For multi-point buses, define pin-to-pin delay rules, or rules for “all pin pairs.” Sometimes the signal travels through a series element: for example, a damping resistor or an AC coupling capacitor. The design program has to be able to measure the pin-to-pin lengths even in these cases.

Specify min./max. absolute or relative (matching) trace propagation delay or trace length rules, depending on the interface type. For the absolute data signal lengths, consider an already specified (by floorplanning) or routed reference signal length. Matching rules cannot be used for them, since the matching offset+tolerance would depend on the reference signal’s length. The relative constraints for data signals specify trace length difference from the reference signal. For them, the reference length need not be specified in advance.

The min./max. data propagation delay can be derived directly from timing margins, since the margins have been calculated using t_pd_data = 0 for absolute rules, or delta_t_pd = 0 for relative rules. Transform the smaller of the RD/WR margins to t_pd by checking what would cause zero margin. If the t_pd_data is a degrading parameter, then transform t_SU_MAR => t_pd_data_max. If t_pd_data is an improving parameter, then transform the -1*t_H_MAR => t_pd_data_min.
In case of timing graphs with existing propagation delays, increase/decrease any PCB trace by the above in the data path, or by the opposite for the reference path. If the two traces are on different types of layers, then they cannot be length matched; they have to be propagation delay matched. If a signal is partially routed on different layers, divide the t_pd for the two layers and calculate lengths separately.
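A compact sketch of this margin-to-delay transformation; the function name and numbers are illustrative, not from any particular tool.

```python
# Sketch of converting pre-layout timing margins into data-trace propagation
# delay limits, per the rule above: the setup margin becomes the maximum
# allowed t_pd_data, and minus the hold margin becomes the minimum allowed
# t_pd_data. All values are in nanoseconds.

def t_pd_limits(setup_margin_ns, hold_margin_ns):
    t_pd_data_max = setup_margin_ns         # t_SU_MAR -> t_pd_data_max
    t_pd_data_min = -1.0 * hold_margin_ns   # -1 * t_H_MAR -> t_pd_data_min
    return t_pd_data_min, t_pd_data_max

# Example: a 1.2 ns setup margin and a -0.3 ns hold margin (both computed with
# t_pd_data = 0) allow 0.3 ns <= t_pd_data <= 1.2 ns of board delay.
print(t_pd_limits(1.2, -0.3))
```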

For chips in bigger packages, like x86 chipsets or large FPGAs, the manufacturer provides “package length” information. This is a spreadsheet of routing lengths inside the package for every signal pin. For board design, the package lengths have to be included in the total length. For example, the Cadence Allegro PCB design software handles it as “pin delay.”

Length constraints also can be signal quality-based, for example, to minimize crosstalk, reflections, stub-length and losses. The crosstalk noise voltage and the insertion loss are proportional to the trace length, and are normally simulated as per-unit-length PCB trace parameters. The SI-based rules are much less sensitive to the exact length than the t_pd based rules.

We can simulate two parallel traces at a unit length to get the crosstalk as an S-parameter in dB, then considering the maximum crosstalk-noise voltage we would permit, calculate a maximum parallel-segment length:
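A sketch of one way to express this, assuming the crosstalk noise voltage scales roughly linearly with the coupled length; L_unit is the simulated unit length, S_xtalk the simulated coupling in dB and V_aggressor the aggressor swing.

```latex
% Sketch: crosstalk noise assumed to scale ~linearly with coupled length
L_{\mathrm{parallel,max}} \approx L_{\mathrm{unit}} \cdot
  \frac{V_{\mathrm{noise,max}}}
       {V_{\mathrm{aggressor}} \cdot 10^{\,S_{\mathrm{xtalk}}[\mathrm{dB}]/20}}
```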



Loss-based length constraints use the per-unit-length insertion loss at the signalling frequency:
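One way to express this, as a sketch in which the total allowed insertion loss (IL_budget) is simply divided by the simulated per-unit-length loss:

```latex
% Sketch: length limit from an insertion-loss budget at the signalling frequency
L_{\mathrm{max}} \approx
  \frac{IL_{\mathrm{budget}}\ [\mathrm{dB}]}
       {IL_{\mathrm{per\ unit\ length}}\ [\mathrm{dB/inch}]}
```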

Longer PCB traces also have stronger inter-symbol interference, which affects propagation delays through increased transition times. Differential-pair phase tolerance (intra-pair skew) slows down the differential slew rate, closing the eye from the corners. If the skew exceeds the rise time, it closes the eye from the sides as well.

Typical PCB Design Rules

Usually the reference path and data path are handled separately. Specify a maximum clock skew (in the case of a central clock source), or just calculate min./max. data length based on the already routed clock length (if the clock is supplied by one of the chips). To have all the constraints in advance, specify, based on the floor plan, a clock length that is the shortest possible while still easily routable, and set its value as a tight absolute length range. Then use t_pd_clk_min./max. as input parameters to the timing margin calculations. The amount of clock skew tolerable (in the case of a central clock source) can be calculated from a pre-layout timing margin computed with zero skew, permitting, say, 10% of that margin to be consumed by clock skew.

Usual PCB design rules:

  • Min./max. data bus length.
  • Min./max. clock trace length or max clock skew.

An asynchronous interface also has min./max. absolute length rules. The reference signal is always supplied by the master chip. The design rules are min./max. trace lengths for the data signals based on predefined strobe trace lengths.

Usual PCB design rules:

  • Min./max. data bus length.
  • Min./max. strobe trace length.

Source synchronous systems are designed in such a way that the data and reference signal (strobe) paths have similar delays on-board and on-chip, except for the DLL inserted into the reference path. This means the goal is to keep the data signal length within a +/-delta_length window around the strobe trace length. This is the simplest to design, since we are not restricted to a predefined reference length.

Usual PCB design rules:

  • Maximum strobe-to-data skew: expressed as a relative length compared to the strobe signal’s length. As speed increases, both the min./max. delta length values get closer to zero. In a typical DDR3 memory interface, specify a maximum 0.125 mm delta length.

The usual design constraint is “matching with an offset.” A simple explanatory equation can be derived from the generalized setup and hold equations:
The data t_pd has to be roughly between the clock t_pd and the clock t_pd plus the clock period.
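Written as an inequality, this window is:

```latex
% The "matching with an offset" window, stated above in words
t_{pd,\,clk} \;\lesssim\; t_{pd,\,data} \;\lesssim\; t_{pd,\,clk} + t_{clk}
```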

Calculate minimum and maximum length difference of the data signal trace length relative to the clock trace length. It will be asymmetric.

Usual PCB design rules:

  • Maximum clock-to-data skew.
  • Clock skew: If the clock generator is not inside the transmitter chip, then we have to balance the setup/hold margins with clock delay control.

Clock forwarding interfaces work in the same way as the unidirectional synchronous type, just that they support both read and write operations with separate clock signals for them.

Usual PCB design rules:

  • Maximum clock-to-data skew, separately for read and write.

For high-speed serial interfaces, the only trace length rules are signal quality-based and lane-to-lane matching rules.

First calculate min./max. propagation delays for the data signals based on the table below, then calculate lengths. Finally, apply some overdesign so that after the layout has been completed, comfortably greater-than-zero timing margins can be expected.

Source synchronous system timing can be handled in an absolute or in a relative way. The equations can be written in the same way as for synchronous systems, with the improving t_pd parameters changed to degrading parameters and multiplied by –1. After this, both the data and the reference t_pd will be degrading, sitting next to each other in the equations. Define delta_t_pd(+) = t_pd_data - t_pd_str, and replace the t_pd terms with these.

Steps for absolute rules:
1. Choose an absolute reference signal length (with a tolerance) or a maximum clock skew constraint.
2. Determine the reference signal t_pd from signal integrity simulation.
3. Determine the transition delay of the data signal using an estimated trace length.
4. Calculate all the timing margins, where the data signal t_pd is zero. Use t_pd_ref_min./max. as input parameters. If t_pd_clk is improving, then use minimum value, otherwise use maximum.
5. Convert the timing margins to min./max. t_pd for the data signals based on Table 2.
6. Calculate min./max. lengths for the data signals based on t_pd.

Steps for relative rules:
1. Calculate all timing margins, where the data signal and reference signal t_pd both are zero.
2. Transform the timing margins to min./max. t_pd for the data signals, relative to the reference signal.
3. Determine the transition delays for both the data and the reference signal, based on estimated trace lengths.
4. Calculate min./max. delta_lengths for the data signals.

The length calculation. If the driver/receiver circuits of the data and reference signals are the same, then exclude the transition delay from the relative length calculation, since their transition delays will be nearly equal. This way, propagation delay matching simplifies to trace length matching. If L_min > L_max, or L_max < 0, then it is not possible to design a board with the given parameters.

Steps:
1. Get the transition delays at the receiver by a signal integrity simulation using estimated trace lengths, both minimum and maximum.
2. The propagation velocity (v) has to be calculated at the signalling frequency: v = c / sqrt(εr_eff), where c is the speed of light (3*10^8 meter/sec) and εr_eff is the effective dielectric constant of the materials surrounding the PCB trace.4
3. For absolute rules Length_min = v * (t_pd_min – transition_delay_min) and Length_max = v * (t_pd_max – transition_delay_max). For relative rules delta_Length(+) = v * (t_pd_max - transition_delay_data + transition_delay_ref)  and delta_Length(-) = v * (t_pd_min - transition_delay_ref + transition_delay_data). Use min transition delay for maximum length, and maximum transition delay for minimum length, but only if the two signals are not driven by the same chip. Otherwise, both min. or both max.

The overdesign factor (OVDF). The trace length constraints calculated so far ensure only minimum (zero) timing margins; to make the system more robust, apply some overdesign. Here we introduce the overdesign factor (OVDF = {1.1…20}) for the tightening:
Length_range = Length_max – Length_min
Length_min_new = Length_min + 0.5 * Length_range * (1 - 1/OVDF)
Length_max_new = Length_max - 0.5 * Length_range * (1 - 1/OVDF)
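A small sketch pulling together step 3 of the absolute length calculation (for the case where the two signals are not driven by the same chip, so the maximum transition delay is used for the minimum length and vice versa) and the OVDF tightening above; all numeric inputs are illustrative.

```python
# Units: seconds and meters; numbers are illustrative only.

def propagation_velocity(er_eff, c=3.0e8):
    # v = c / sqrt(er_eff) at the signalling frequency
    return c / er_eff ** 0.5

def absolute_lengths(t_pd_min, t_pd_max, trans_min, trans_max, er_eff):
    v = propagation_velocity(er_eff)
    length_min = v * (t_pd_min - trans_max)   # max transition delay -> minimum length
    length_max = v * (t_pd_max - trans_min)   # min transition delay -> maximum length
    return length_min, length_max

def apply_ovdf(length_min, length_max, ovdf):
    # Shrink the allowed range symmetrically around its centre.
    length_range = length_max - length_min
    tighten = 0.5 * length_range * (1.0 - 1.0 / ovdf)
    return length_min + tighten, length_max - tighten

# Example: 0.2-1.0 ns of allowed trace delay, 50-100 ps transition delays,
# er_eff = 3.5, OVDF = 2.
lmin, lmax = absolute_lengths(0.2e-9, 1.0e-9, 50e-12, 100e-12, 3.5)
print(apply_ovdf(lmin, lmax, 2.0))
```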

Transforming and summing constraints is simple algebra, but the bookkeeping might not be obvious. To transform a min./max. length rule to an Offset+/-Delta rule, use the simple formulae:

Offset = (length_min + length_max)/2

Delta = length_max – Offset

The second case calls for merging two constraints. For example, the chipset design guide provides direct trace length rules for interfacing a DIMM memory to the processor, and we want to design a memory-down layout based on JESD21C guidelines. In such cases, transform both constraints to Offset+/-Delta description, then sum the offsets and deltas separately.
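A minimal sketch of this constraint algebra; the rule values are illustrative, not taken from any actual chipset guide or JEDEC document.

```python
# Convert min/max length rules to Offset +/- Delta form, then merge two such
# rules by summing offsets and deltas separately. Lengths in millimetres.

def to_offset_delta(length_min, length_max):
    offset = (length_min + length_max) / 2.0
    delta = length_max - offset
    return offset, delta

def merge(rule_a, rule_b):
    (off_a, d_a), (off_b, d_b) = rule_a, rule_b
    return off_a + off_b, d_a + d_b

# e.g. a chipset-guide DIMM rule and a JESD21-C-style memory-down rule
chipset_rule = to_offset_delta(50.0, 60.0)     # -> (55.0, 5.0)
memdown_rule = to_offset_delta(20.0, 24.0)     # -> (22.0, 2.0)
print(merge(chipset_rule, memdown_rule))       # -> (77.0, 7.0)
```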

Conclusions

High-speed digital board design requires control of the pin-to-pin trace lengths on multipoint signal nets. To achieve this, software supports detailed, complex trace length constraints. Sometimes designers can use standard trace length rules specified by chip manufacturers or standards; other times they calculate them from pre-layout timing analysis. If the board designer does not use proper length constraints, the boards may never even start up in the lab. Often, timing parameters for the chips on the board are needed but simply not available; in those cases, timing parameters defined at the package pins can be used. What post-layout timing analysis reveals is not whether the prototype board will start in the lab, but whether it will operate reliably in the field at all times. If this verification is absent from product development, the risk is that untraceable errors in products will be detected by customers.

1. J. Bhasker and R. Chadha, Static Timing Analysis for Nanometer Designs, Springer, April 2009.
2. JEDEC, JESD79-xx, “DDR-SDRAM Memory Standards,” jedec.org.
4. Dielectric Constant Frequency Compensation Calculator, buenos.extra.hu/iromanyok/E_r_frequency_compensation.xls.
9. Xilinx DDR-SDRAM controller application notes XAPP858 and XAPP802, xilinx.com/support/documentation/application_notes.htm.
11. David Robert Stauffer et al., High Speed Serdes Devices and Applications, Springer, October 2008.
12. JEDEC, JESD21-C, “JEDEC Configurations for Solid State Memories,” jedec.org.
13. Steve Kilts, Advanced FPGA Design, Wiley-Interscience, March 2006.
14. Xilinx DAC/ADC interfacing application notes XAPP873 and XAPP866.

Istvan Nagy is with Bluechip Technology (bluechiptechnology.co.uk).

Save real-estate using digital pre-distortion to linearize power amplifiers.

In multiple-channel applications of next-generation communication systems, it is often desirable to combine many antennas into fewer antennas by using modern wide-band antennas and power amplifiers shared among multiple transmitters. This can be extremely helpful because in many locations, the deck of a vessel for instance, real estate is so limited it is not possible to mount every antenna required for each possible frequency range and application. Through the use of wide-band antennas and power amplifiers (PA), the power, weight and the amount of real estate consumed by antennas and electronics can be significantly reduced. This would permit smaller vessels to have the same capability as larger vessels and increase their tactical advantage.

To maximize efficiency, modern wide-band power amplifiers must operate in a nonlinear region. However, this creates large variations in the instantaneous output power, a condition described as high peak-to-average ratio (PAR). As a result, signals are distorted. To compensate for this distortion, linearization techniques must be applied to minimize spectral re-growth or intermodulation products created by the nonlinearity of wide-band amplifiers. A method called digital pre-distortion is used to distort the signal prior to the input of the power amplifier. The signal is distorted such that the composite output of the power amplifier appears to have linear amplification over the desired frequency range, without distortion. Figure 1 shows the digital pre-distortion method, which is used to linearize power amplifiers over very broad bandwidths. Figure 2 shows an example of the effects of power amplifier nonlinearities on a transmitted spectrum and the potential spectral benefits of applying digital pre-distortion.

Figure 1. Linearization of a PA using digital pre-distortion. Courtesy Northrop Grumman.

Figure 2. a) Ideal (undistorted) spectrum; b) spectrum distorted by nonlinear amplifier; and c) post-amplification spectrum with ideal RF pre-distortion.

The benefit of digital adaptive pre-distortion is that it enables linearization of high-power amplifiers, permitting multiple transmitters to be combined across the entire VHF and UHF ranges. Harmonic and intermodulation products are reduced by more than 15 dB. Additionally, the cancellation improves receiver dynamic range.
In the proposed antenna combining, pre-distortion is performed by digital signal processing techniques using open architectures that reduce cost and increase the flexibility of the system. System-level control, switch control for communication equipment selection, oscillator control and antenna selection are supplied by a standard single board computer utilizing a Versa Module European (VME) bus-based chassis. Also populated in this VME chassis is a software-defined radio board: a wideband transceiver board consisting of field-programmable gate arrays, an analog-to-digital converter and a digital-to-analog converter (DAC).

The digital signal processing for the pre-distorter is performed by FPGAs using a down-converted and sampled signal. The signal is mixed down to an intermediate frequency, sampled by an analog-to-digital converter and fed to the FPGA for filtering. The filtered digital data are then sent to a DAC, whose output is mixed back up to the carrier frequency for input to the power amplifier. Thus an undistorted signal is output to the antenna for improved signal integrity. If the bandwidth of the pre-distorter is large enough, then multiple communication systems within the same frequency range can be combined to utilize the same amplifier and antenna set. Some precautions need to be taken to ensure the communication systems are not utilizing the same frequencies and have ample channel separation. Since the pre-distorter is implemented in reconfigurable FPGA hardware, modifications to the pre-distorter are possible even after the system has been produced.
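As a purely illustrative sketch of the pre-distortion idea (the article does not disclose the actual algorithm), a memoryless polynomial pre-distorter applied to complex baseband samples might look like the following; the coefficients are placeholders that would normally be adapted from feedback of the PA output.

```python
import numpy as np

# Illustrative only: a memoryless polynomial pre-distorter applied to complex
# baseband samples before they are sent to the DAC and up-converted. This is
# not the implementation described in the article.

def predistort(x, coeffs):
    """Apply y = sum_k c_k * x * |x|^(2k) to each complex sample."""
    y = np.zeros_like(x, dtype=complex)
    for k, c in enumerate(coeffs):
        y += c * x * np.abs(x) ** (2 * k)
    return y

# Example: a weakly expanding characteristic to pre-compensate PA compression.
samples = 0.5 * np.exp(1j * np.linspace(0, 2 * np.pi, 8))
print(predistort(samples, coeffs=[1.0, 0.15, 0.05]))
```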

ACI Technologies Inc. (aciusa.org) is the National Center of Excellence in Electronics Manufacturing, specializing in manufacturing services, IPC standards and manufacturing training, failure analysis and other analytical services. This column appears monthly.

Budget cuts will directly affect the US PCB supply chain.

With the US deficit running above $13 trillion and continuing economic uncertainty, the federal government has begun instituting significant cuts in military programs. These cuts will have a direct effect on the North American interconnect industry, both PCB fabricators and assemblers.

The just-issued Quadrennial Defense Review (QDR) foresees major cuts across the board in defense hardware. To some extent this includes already announced cuts, such as reductions in F-22 fighter plane acquisitions (to 180 total airframes) and to F-35 aircraft as well. Inventories of aircraft such as the A-10 attack plane, B-1B bomber, and the venerable B-52 also will be reduced.

At sea, while the government states its commitment to a 313-ship Navy, analysis of projected budgets over the next 10 years suggests that, between replacement construction and new classes of ship, the US Navy more likely will be reduced to approximately 270 vessels. The ASDS submarine program is in jeopardy, while carrier battle groups are projected to shrink to 10 from 15. Overall capital budgets for military programs may drop as much as 30% over the next five years, much of this in electronics.

Cost overruns on major programs have been horrific as well. The Littoral Combat Ship (LCS) program has in three years’ time ballooned from $220 million/ship to $480 million/ship – and that’s before the contracts have even been awarded. The F-35 fighter program budget has increased almost $100 billion over the past decade, and development remains 2.5 years behind schedule. While some of this is related to a massive flow of change orders and technology development issues, much of the problem lies with a hollowed-out workforce and knowledge base. These overruns cut further into available funds for new hardware. Former Defense Secretary Donald Rumsfeld entered office with a mandate to streamline the Pentagon procurement process. Instead, his efforts were devoured by the military-industrial complex.

What will be the effect on the US interconnect manufacturing base? The North American industry has shrunk from a high of $10 billion in 2000 to approximately $2.8 billion today. One reason it did not shrink further was the cushion of defense electronics. Many manufacturers went through the rigorous steps to qualify for MIL-PRF-31032, MIL-55110 and other standards, and the shift from commercial to military production kept many companies alive through the recession. These trends will require manufacturers to anticipate, adapt and innovate.

At the same time, many military systems have been commoditized, some successfully, others not so much. In other cases, subsystem manufacturing has been shifted offshore either because of commercial off-the-shelf technology (COTS) programs or efforts by some contractors to reduce costs.

Bluetooth and iPhone-type technologies in military applications have been growing exponentially, but the loss of the handheld market to Asia means that much of the hardware technology has migrated offshore. This will have a long-term effect on programs such as Land Warrior. Because of security concerns, several major defense contractors have established secure communications projects, but these will have minimal impact on the demand for interconnect product.

Indeed, the proliferation of consumer-level high technology is amazing. The capabilities of handheld devices – whether for communications, data recording, conversion and transmission, or imaging – are beyond Buck Rogers. The mythical Star Trek tricorder today is close to reality. And yet the defense technology establishment has lost its connection with the underlying electromechanical roadmap.

Discussions with key defense interconnect manufacturers have established there are significant opportunities, especially at the leading edge. Yet there is cognitive dissonance at the top. The role of enabling technology, especially at the interconnect level, has been forgotten. In addition, technology security has become a significant issue.

On the security side, the cyberwarfare arena has, according to many reports, been underfunded and misunderstood. Recent headlines in the telecommunications market, where governments have expressed deep concerns about potential technology breaches when purchasing networking gear that may be susceptible to trapdoors, viruses, and remotely executable programs, illustrate real-world issues with non-trusted components and hardware.

From every challenge springs an opportunity. There are real interconnect hardware-based solutions available both to protect systems and provide enhanced security, not only from penetration, but of intellectual property.

High-density interconnect combined with embedded active and passive component technology offers a security platform only dreamed of even five years ago. If critical IP or security is on an embedded chip, it can be shielded both physically and electronically, increasing the difficulty of corruption by orders of magnitude. Hard encryption keys can be made even more secure. Not only that, it makes a better, more reliable, more cost-effective product. Ask Apple or RIM or Sony.

Espionage efforts against the North American technology base have cost our national defense trillions of dollars when considering the loss of advantages in fields ranging from nuclear weapons design to aircraft design to under-the-hood applications such as the interconnect. Even simple applications such as software keys can be made more secure using a technology that, while slightly more expensive, saves much more on the back end. Think of the cost of the recent Wikileaks scandal. If adversaries cannot penetrate systems and/or copy technology or documents, how much is that worth in the long run?

The downsizing of the military electronics market is virtually inevitable. However, there is a chance to redefine and rebrand the interconnect, and with it, to add real value to the end-user. The North American industry, for once, can build on the experience of others. The inflection point for economically competitive leading-edge technology with reduced risks of implementation is close at hand.

The 1990s saw massive outsourcing and then offshoring based on the commoditization of the interconnect. The reality even then was that the interconnect is the cardiovascular system of every single electronics device in the world.

The electronics interconnect industry owes much of its life to the military sphere when OEMs such as Ford, Philco and Motorola found new and vital applications such as artillery fuses during World War II for what were then called printed wiring boards. Integrated design, engineering and manufacturing was critical to the success, and to an extent, to the winning of the war. We may have come full circle. Defining the interconnect as an active component changes the game and will require a much closer relationship between the manufacturing floor, the packaging designer, and the systems designer. This is one potential avenue for the interconnect industry, but will require significant re-engineering of management thinking and investment in plant and equipment.

However, the payoff in jobs, in productivity, in profits, is real and demonstrable. Over $3 trillion was spent in stimulus funding and to stabilize the US banking system. Virtually every dime spent, though, has been to maintain the status quo. We must spend our defense dollars especially wisely today. At the same time, innovation has taken a massive hit. Innovation will help grow our economy for the long term, and electronics are at the heart of the modern economy. The interconnect is the heart of the electronics device. Where better to invest?

The alternatives are not especially palatable. While electronics content will continue to grow as a percentage of overall systems cost, the hardware portion has to a great extent plateaued. In addition, with commoditization and design rules that are in many cases 30 years old, where is the potential for profit that keeps a company viable? Where will growth, or even stability, come from in a declining market?

Matthew Holzmann is president of Christopher Associates (www.christopherweb.com).

Some finishes are more resistant than others, but lack of a standard hinders testing.

A coworker asked yesterday why he was chosen to work on a defect that occurs at less than a fraction of a percent. It was like finding a needle in a haystack, he said. I know he was looking for sympathy, but I could not offer it, as my entire team has been working on creep corrosion for the past four years. For some, creep corrosion also occurs at less than a fraction of a percent defect rate. There is no industry standardized test to replicate it, and at the onset, the only thing known was that it predominately occurred in paper mills, fertilizer plants, tire factories and car modeling studios. The answer to the question is that even one unhappy customer is reason for concern.

When this column was turned over to me by John Swanson, he wisely instructed me not to tackle anything crazy on my first attempt. “Nothing like creep corrosion,” he said. This is the day I offer about 1,000 words on the matter. I can and will talk at great length on this subject, so catch me at the next conference for the full dissertation.

Creep corrosion is a migration of copper sulfide across a circuit board. The SEM and EDS images (Figure 1) show a section of an annular ring. The pad is plated with silver, with creep corrosion (green) growing over the soldermask. The creep analyzes as rich in copper (red) and void of silver. Additional EDS mapping shows the creep area is also rich in sulfur.


Figure 1. SEM (left) and EDS photos of annular ring show creep corrosion (in green) growing over soldermask. The creep analyzes as rich in copper (red) and void of silver.

The defect reveals itself as an electrical short or open. Opens occur when the volume of migrated copper has reduced the volume of a copper trace or via. A short occurs, of course, when copper sulfide travels to a neighboring trace or pad. For creep corrosion to occur, there must be exposed metal in the final circuit; this is normally an area that was not soldered or not completely covered by solder. The environmental exposure must be high in sulfur and humidity for the corrosion to manifest. As stated, the greatest challenge is that a standardized test does not exist to recreate creep. IPC conducts a biweekly meeting to attack this very issue. After researching the environments where electronics were located, even those expected to be “clean” (such as data storage centers), it was determined there is a lot of bad air out there.

Initial attempts to create a controlled test for creep corrosion will be conducted in mixed flowing gas (MFG) chambers. The test centers on Battelle Class 4 conditions, but with almost 10x the concentration of hydrogen sulfide: Battelle uses 200 ppb of H2S, while this test uses 1,500 to 2,000 ppb. Also, the operating temperature will be increased from 35°C to 40°C. Unfortunately, not every MFG chamber can handle these test conditions. Some detectors are not calibrated to measure high levels of sulfur properly, some equipment cannot resist corroding under these conditions, and some lab managers just don’t want to expose their workers. At this point, a red flag should go up, and the alarm should be ringing in your head. It is difficult to accept, but our environments are getting worse, and these aggressive conditions are turning into reality. Electronics life expectations are increasing, so prolonged exposure will ultimately lead to high levels of corrosion on unprotected parts.

Of the surface finishes, immersion silver sees the greatest threat from creep corrosion. Topcoats have been formulated to mitigate creep. These topcoats act as barriers for the metal from condensation and contamination in the air. The topcoats are designed to protect both the silver surface and any copper metal that may be exposed. All these attributes have been achieved without detriment to the other functional performance characteristics of immersion silver. The immersion silver plus topcoat parts withstand all traditional MFG exposure. They even resist corrosion in more aggressive versions of MFG that veer from traditional conditions. Due to the limitations of some MFG chambers and the high cost of the test, OEMs and chemical suppliers have created internal chambers to recreate creep corrosion in a lab environment. These tests accelerate the conditions of mixed flowing gas and introduce sulfur in particulate form. Elemental sulfur is common to many of the environments where creep corrosion has occurred. It is a point that should be taken into consideration during test development.

The performance and acceptance of these topcoats has been well received, but some will hesitate until the real-world environments are truly understood and can be replicated. Questions are continuously raised about the ISA classification these coatings withstand, but perhaps the better question is, Should these classifications be revisited?

Lenora Toscano is final finish product manager at MacDermid (macdermid.com). Her column appears quarterly.

Altium’s latest acquisition pushes PCB CAD to a new plane.

On Sept. 16, Altium (altium.com) announced plans to acquire Morfik Technology, a provider of cloud-based software applications, in an all-stock transaction worth an estimated AU$3.3 million. (The deal is pending due diligence.) The move is groundbreaking in that it reveals the full extent of the EDA company’s commitment to developing so-called cloud-based computing, whereby resources, including software, are shared on demand via the Web, vs. stored on millions of discrete company-specific servers.

PCD&F’s Mike Buetow spoke with Altium associate director of corporate communications Alan Smith the day of the announcement.

PCDF: How many employees does Morfik have?
AS: Twelve total, nine of whom are software developers. 

PCDF: When is the deal scheduled to close?
AS: We expect this to be in the next few weeks, but we can’t be precise because of the due diligence that forms part of the process. Having said that, we are also accelerating the acquisition by bringing the Morfik development team into Altium’s offices to work with the Altium team on the development of the ecosystem infrastructure that we now plan to build into our software.

PCDF: Will it be accretive to revenues within the next fiscal year?
AS: No, there’s no significant impact on Altium’s revenues.

PCDF: What specifically made Morfik so attractive to Altium?
AS: Altium has been working with Morfik for a number of years, so Morfik’s attraction, based on what it has done with the development of tools to create dynamic cloud-based ecosystems, is well understood. It has to do with Morfik’s software engineering approach to the creation of object-oriented connectivity to the Internet. The two key phrases here are dynamic and object-oriented. These are essential for managing large amounts of content, which is what you have in electronics design: content down to the scale of individual components.

PCDF: What are the top technical challenges behind moving electronics design to cloud-based architectures, and how does the Morfik acquisition help Altium overcome them?
AS: Two points: First, tactically, to help Altium establish its ecosystem today for subscription-based content delivery. Second, longer-term, Morfik is about developing cloud-based applications; these are the applications that will be the highest layer, running on the “sea of connected devices” in this ecosystem. If we are to deliver the tools and solution to help the designer move from device design to ecosystem design, then cloud-application development is a fundamental part of this solution.
PCDF: When it comes to cloud computing, Altium is on record as seeing Google as being superior to Microsoft. Will customers have to choose a platform based on one or the other, or will electronics design in the cloud be platform-independent?
AS: We don’t really know. Amazon (Amazon Web Services) is more significant than either Google or Microsoft because it provides a much broader industrial-scale infrastructure for cloud-based application that is provided as a utility service for organizations that want to provide these kinds of solutions.

PCDF: Would a cloud-based infrastructure force any change to the subscription model that most ECAD companies now rely on?
AS: Not specifically. Cloud-based (Internet-based) models allow a real-time connection between customers and suppliers, as well as between customers and customers. This creates the possibility of having a much stronger and timelier value exchange between these parties. In the future, they can move the basis of software subscriptions away from being primarily based on software upgrades (new features, etc.) to a much stronger focus on content that can be delivered over time. In electronics design this includes component models, designs, price and availability from part suppliers, as well as feature upgrades being delivered in a “plug-in” model.

PCDF: Would cloud-based electronics design help resolve the persistent library and documentation issues that today cost designers extraordinary amounts of time?
AS: An emphatic “yes.” Vastly improved data management is already a big focus of the next release of Altium Designer, in beta today, and cloud-based data exchange is a big part of this. Initial focus is on management of data between the board-design process, and the fabricators and assemblers of those boards, especially when the parties are in different regions and speak different languages.



Seeing clearly: Alan Smith
