A between-pair skew compensator for parallel data communications is presented. It detects the time skew between two independent data sequences using continuous-time correlation and then automatically aligns the two using a voltage-controlled wide-bandwidth data delay line. A 5Gb/s sub-bit between-pair skew compensator in 0.13μm CMOS occupies 0.03mm² of active die area and dissipates 22.5mW.

Introduction

Parallel links are widely used in high-performance computing systems to enable high-throughput data communications. However, mismatches between channels can cause data transmitted in one channel to arrive at a different time from data transmitted in another channel. This is referred to as between-pair skew (BPS) or inter-pair skew. BPS can significantly reduce the receiver timing margin and limit the data transmission rate. To achieve a higher per-pin data rate, either BPS must be reduced by enforcing restrictions on channel matching (which limits transmission distance or requires high-cost, high-quality transmission media), or per-channel clock and data recovery (which consumes considerable power and area) is required.

To avoid these issues, previous designs [1-3] have used one local core clock to generate multiple phases. For each channel, the individual clock phase is determined during a calibration period in which clock signals are sent along all data lines as training signals. Additional circuitry, and hence additional power, is then needed to align the recovered data from all channels. This paper presents a BPS compensator that automatically detects the skew between two independent data sequences and aligns the two. With the proposed compensator, a single clock recovered from a reference channel can be used to sample all data channels directly, and the recovered data from all channels are naturally aligned.

Circuit Design

Fig. 1 presents the proposed automatic BPS compensation scheme.
It addresses sub-bit BPS compensation only, as the integral-bit portion of the BPS can be compensated in a higher communication layer. The compensator includes a continuous-time voltage-controlled data delay line (VCDDL) for data de-skew. It also includes a BPS detector that detects the skew and generates an appropriate feedback control voltage Vctrl for the VCDDL. The reference channel can carry either a clock or a data sequence.

Fig. 2 presents the differential VCDDL design. Since data signals have a much wider bandwidth than clock signals, the VCDDL needs to provide a flat magnitude response and a tunable group delay across the major part of the signal spectrum to avoid distortion when delaying data signals. Wide-bandwidth delay elements have previously been studied as delay units for tapped delay lines in continuous-time FIR equalizers [4, 5]. In those designs, however, varying the group delay inevitably changes the magnitude response. In this application, it is desirable to have a delay element that maintains its magnitude when its group delay is adjusted.

The proposed VCDDL consists of six cascaded differential delay elements (DE) that provide a maximum tunable delay of 200ps, which is one full bit period at 5Gb/s and therefore covers the entire sub-bit BPS range. Each DE has a fast path (-gm1 with R1C1) and a slow path (-gm2 with R2Cvar followed by -gm3 with R1C1). The R2Cvar term is intended to provide the adjustable delay and is much larger than R1C1. The magnitude of the DE transfer function H(ω) is determined only by R1C1, because the magnitudes of the 1±jωR2Cvar terms cancel. Meanwhile, the group delay is mainly determined by the R2Cvar term. A cross-coupled differential pair with a source-degeneration capacitor C0 improves the signal bandwidth.

Fig. 1. Automatic sub-bit between-pair skew compensation scheme.
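
The behavior of a single DE described above can be checked numerically with a minimal sketch. The text does not give H(ω) explicitly, so the form used here is an assumption inferred from the statement that the 1±jωR2Cvar magnitudes cancel: H(ω) = A(1 - jωR2Cvar) / ((1 + jωR2Cvar)(1 + jωR1C1)). Summing the fast and slow paths yields this form only under a gain-matching condition (gm2·gm3·R2 = 2·gm1), which is likewise assumed rather than taken from the paper. The cross-coupled pair and C0 are ignored, and the function names and time-constant values are hypothetical, chosen only so that six stages give roughly one 200ps bit period at low frequency.

```python
import cmath
import math

BIT_RATE = 5e9                    # 5 Gb/s, from the text
UNIT_INTERVAL = 1.0 / BIT_RATE    # one bit period: 200 ps
NUM_STAGES = 6                    # six cascaded delay elements, from the text
TAU_FIXED = 3e-12                 # R1*C1, hypothetical illustrative value


def de_response(freq_hz, tau_fixed, tau_var, dc_gain=1.0):
    """Assumed DE response (not given in the source):
    H = A * (1 - jw*tau_var) / ((1 + jw*tau_var) * (1 + jw*tau_fixed))."""
    jw = 1j * 2.0 * math.pi * freq_hz
    return dc_gain * (1 - jw * tau_var) / ((1 + jw * tau_var) * (1 + jw * tau_fixed))


def de_group_delay(freq_hz, tau_fixed, tau_var):
    """Closed-form group delay (-d(phase)/d(omega)) of the assumed response."""
    w = 2.0 * math.pi * freq_hz
    return (2.0 * tau_var / (1.0 + (w * tau_var) ** 2)
            + tau_fixed / (1.0 + (w * tau_fixed) ** 2))


def sweep(freq_hz, tau_var_values):
    """Return (tau_var, |H|, total delay of the six-stage line) per setting."""
    rows = []
    for tau_var in tau_var_values:
        mag = abs(de_response(freq_hz, TAU_FIXED, tau_var))
        total_delay = NUM_STAGES * de_group_delay(freq_hz, TAU_FIXED, tau_var)
        rows.append((tau_var, mag, total_delay))
    return rows


if __name__ == "__main__":
    # Hypothetical R2*Cvar settings; evaluated at 1 GHz.
    for tau_var, mag, delay in sweep(1e9, (5e-12, 10e-12, 15e-12)):
        print(f"R2Cvar={tau_var * 1e12:5.1f} ps  |H|={mag:.6f}  "
              f"line delay={delay * 1e12:6.1f} ps")
```

Under these assumptions, the magnitude column is identical for every R2Cvar setting while the line delay grows with R2Cvar, which is the property the DE is designed for. The group delay of this model falls off as ωR2Cvar approaches 1, so the sketch says nothing about how flat the delay of the actual circuit is across the data spectrum.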