Length Matching Routing for PCB Busses

April 09, 2022

Shortening and folding traces takes creativity and persistence, as long as the timing budget is met.

PRINTED CIRCUIT BOARDS are becoming more complex, with high-speed interfaces more common. Whether the interface is PCIe, Ethernet, USB or memory of some kind, clock nets proliferate across the board. Those clocks have kindred spirits in the nets that need to arrive at the receiver in step with the ticking clock.

Crucial parameters of a group of traces include the target length or maximum. Less is more. Most other signals on the board switch only intermittently. Meanwhile, the clock switches all the time. The clock uses the same voltage, but the constant stream of “10101010101…” concentrates its energy at the clock frequency and its harmonics, so it radiates more strongly than a seemingly random sequence of ones and zeros. These constantly shifting clock net fields are the reason we shield the clock, giving it space to do its thing.

Shorter traces equal lower electromagnetic emissions. Shorter clocks have comparatively lower emissions and are less lossy. That is the reason to spend the available length-matching tolerance on minimizing the length of the clock, starting with finding the longest member of the group. Look at that net; locate any extra bends or places where it can be shortened.

Remember 45° angles are nice but not necessary, provided acute angles are avoided in the routing. Stretch that trace like a rubber band around the obstacles until it is as short as possible. Ideally, it falls to second place or farther down the list, sorted by length. Now, can you make the second longest one any shorter using the same process? Keep massaging the traces with a focus on shortening the longest ones. Once those longer traces are optimized, the ideal clock length can be calculated by subtracting the tolerance of the timing budget from the length of the longest connection.

FIGURE 1. One of my favorite routing tasks: single-ended with 128 lanes, using two out of 12 layers. Due to placement, there was a lot of tuning, with two more layers full of differential pairs.

For example, suppose the longest trace in the group is 18.5mm, and the length-matching requirement is that every trace equals the clock length plus or minus 0.5mm. That indicates a clock length of 18mm. Why wouldn’t we match all the lengths exactly? For one thing, that’s beyond the spec. Second, it would compel the naturally shorter traces to grow to the full 18.5mm, rather than meandering only until they reach 17.5mm, which is the 18mm of the clock minus the 0.5mm tolerance. The full range of the traces is 17.5 to 18.5mm, with the clock in the middle. Again, this ideal length for the clock is calculated by subtracting the tolerance (or most of it) from the longest trace once everything is optimized. The caveat is that any later editing of the clock, or of traces at the edge of the tolerance band, is likely to upset the timing budget.
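The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names and the net names and lengths in `routed` are hypothetical, and the only inputs taken from the text are the 18.5mm longest trace and the 0.5mm tolerance.

```python
def clock_target(lengths_mm, tolerance_mm):
    """Return (clock_length, shortest_allowed, longest_allowed) in mm.

    The clock is set to the longest optimized trace minus the tolerance,
    so the longest trace sits exactly at the top of the allowed band.
    """
    longest = max(lengths_mm.values())
    clock = longest - tolerance_mm
    return clock, clock - tolerance_mm, clock + tolerance_mm


def tuning_needed(lengths_mm, tolerance_mm):
    """Extra length (mm) each net needs to reach the bottom of the band."""
    _, lower, _ = clock_target(lengths_mm, tolerance_mm)
    return {net: max(0.0, lower - length) for net, length in lengths_mm.items()}


# Hypothetical routed lengths, after the longest nets have been shortened.
routed = {"D0": 18.5, "D1": 17.9, "D2": 16.2, "D3": 17.5}

clock, lower, upper = clock_target(routed, 0.5)
print(f"clock {clock} mm, band {lower} to {upper} mm")
for net, extra in tuning_needed(routed, 0.5).items():
    print(f"{net}: add {extra:.1f} mm")
```

Only nets that fall below the bottom of the band receive any meander, which is where the copper savings come from.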

The upside is this uses the minimum amount of copper, every segment of which is a potential emission concern. This template could use the least amount of real estate as well, folding the traces to take up blank spaces in the overall maze. Once the traces meet the timing budget, it’s not hard to add a wrinkle here and take one away from the same trace elsewhere. The wrinkle that goes away leaves room for the next trace to follow the new contours. It may seem tedious, but it’s one of my favorite PCB design tasks, as it rewards creativity and persistence.

FIGURE 2. Some situations call for outer-layer routing, such as this DDR3 implementation, where the microcontroller is pinned out to mostly match the memory chip.

FIGURE 3. Another outer-layer approach driven by the series elements over a 4-layer PCB routing solution.

Match every trace in the group as closely as possible. At times, the length tolerance is so small as to render these gains irrelevant. One example is eMMC, where the total number of wires is six, and only five of those are matched. Those five are tightly matched, and this is one of those occasions in which I want every trace to be tightly constrained.

In that event, the game plan is to make every trace the same length as the longest natural trace. Placement becomes critical, so that the connections follow similar paths. Signal integrity engineers generally prefer that critical traces like these keep all their tolerance available, meaning zero – or nearly zero – slack in the matching rules. Call it risk aversion, but sometimes you only get one chance to shine for the customer.
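As a rough sketch of that tighter rule, the snippet below pads every net up to the longest natural trace. The function name, the optional `slack_mm` parameter, and the net names and lengths are all hypothetical, chosen only to show the bookkeeping for a five-net matched group.

```python
def padding_to_longest(lengths_mm, slack_mm=0.0):
    """Meander length (mm) to add so each net lands within slack_mm of the longest."""
    target = max(lengths_mm.values()) - slack_mm
    return {net: max(0.0, target - length) for net, length in lengths_mm.items()}


# Hypothetical natural (untuned) lengths for a five-net matched group.
natural = {"NET_A": 12.4, "NET_B": 13.1, "NET_C": 12.9, "NET_D": 13.0, "NET_E": 12.2}

padding = padding_to_longest(natural)
for net, extra in sorted(padding.items(), key=lambda item: item[1], reverse=True):
    print(f"{net}: add {extra:.2f} mm")
print(f"total added copper: {sum(padding.values()):.2f} mm")
```

The total at the end is the price of zero slack. The closer the natural lengths are after placement, the smaller that number gets, which is why placement matters so much here.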

Time of flight rather than length of trace. Thus far, this discussion has been about the length of traces. In absolute terms, what we’re really talking about is propagation delay. Delay is not measured in millimeters; it’s measured in picoseconds. When the tolerances get unusually thin, we want to account for the physics in play: traces on the outer layers are partly surrounded by air, so signals travel faster on them than on traces buried in the dielectric of the internal layers.

Calculating the time of flight involves taking the topology into account, which is slightly more complicated than measuring the trace lengths. We typically prefer the routing to go on inner layers to reduce electromagnetic interference, even though the outer layers are faster in terms of propagation delay. Limiting the exposed traces to fanout areas is a simple way to manage the disparity.
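To make the difference concrete, here is a minimal Python sketch that sums delay per segment using the textbook relation: delay equals length times the square root of the effective dielectric constant, divided by the speed of light. The effective dielectric constants below are assumed round numbers for illustration, not values from any particular stackup. Via delay is ignored, and the function names are hypothetical.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm per picosecond

# Assumed effective dielectric constants, for illustration only.
# Real values come from the stackup and a field solver.
ER_EFF = {"outer": 3.0, "inner": 4.0}


def delay_ps(length_mm, er_eff):
    """Propagation delay in picoseconds for one segment."""
    return length_mm * math.sqrt(er_eff) / C_MM_PER_PS


def net_delay_ps(segments):
    """Total delay for a net given as (layer, length_mm) segments; vias ignored."""
    return sum(delay_ps(length, ER_EFF[layer]) for layer, length in segments)


# Two hypothetical nets with the same 18 mm physical length.
fanout_heavy = [("outer", 4.0), ("inner", 14.0)]
all_inner = [("inner", 18.0)]

skew = net_delay_ps(all_inner) - net_delay_ps(fanout_heavy)
print(f"fanout-heavy net: {net_delay_ps(fanout_heavy):.1f} ps")
print(f"all-inner net:    {net_delay_ps(all_inner):.1f} ps")
print(f"skew at equal length: {skew:.1f} ps")
```

Under these assumptions, two nets of identical length still arrive a few picoseconds apart. Keeping the outer-layer portion of each net short and similar is what keeps the disparity manageable.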