

## 19.7 A CMOS Transceiver Analog Front-End for Gigabit Ethernet over CAT-5 Cables

Pierte Roo, Sehat Sutardja, Shuran Wei, Farbod Aram, Yi Cheng

Marvell Semiconductor, Inc., Sunnyvale, CA

As Fast Ethernet (100Base-T) proliferates and becomes the leading standard in local area networks (LANs), demand for Gigabit Ethernet increases rapidly. Although fiber-optic gigabit transceivers (1000FX) are available, the high cost of fiber modules and fiber cables limits deployment of 1000FX transceivers to switch uplinks and backbone connections. An integrated CMOS transceiver for Gigabit Ethernet over unshielded twisted-pair (UTP) category-5 (CAT-5) cables significantly reduces cost while providing high data bandwidth at relatively low power consumption. Full-duplex transmission allows up to 2Gb/s data throughput. Furthermore, compatibility with existing cabling infrastructure allows rapid deployment of 1000Base-T transceivers on desktop and laptop PCs.

To transmit 1Gb/s of data over 4 pairs of 100m CAT-5 cables while maintaining sufficient signal-to-noise ratio (SNR) for a bit-error rate (BER) less than $10^{-10}$, the transmit signal requires multi-level signaling combined with forward error correction (FEC), which is referred to as four-dimensional 5-level pulse amplitude modulation (4D-PAM5) [1]. An 8b word is trellis coded into 9b using 2b-to-3b convolutional coding. The 9b word is then mapped into 4 groups of data samples. Each sample takes one of five possible discrete levels, $\{+2, +1, 0, -1, -2\}$. The data samples are filtered by a partial-response filter ($0.75+0.25D$) and then transmitted at 125MSample/s to maintain backward compatibility with existing 100Base-T. The channels are transformer-coupled to remove DC coupling between transceivers. Transmitting high-frequency signals over twisted-pair cabling results in transmission-line effects where the complex line impedance delays and attenuates the transmitted signals; transmitting in duplex mode over the same pair of wires produces echoes; not shielding the twisted pairs causes crosstalk; and finally, using transformer coupling results in baseline wander. All those effects combined make data detection difficult. Trellis coding and extensive digital signal processing (DSP) make it possible to recover data in the presence of signal attenuation, echoes, crosstalk, baseline wander, and other alien noise [2]. To achieve the specified performance, the analog front-end must share the tasks of echo cancellation, baseline correction, and low-pass filtering.
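The signaling arithmetic above can be checked with a short sketch. It applies the $0.75+0.25D$ partial-response filter to PAM5 samples, counts the 4D-PAM5 points available for the 9b code words, and recomputes the aggregate data rate. The function and variable names are illustrative, and the filter state before the first sample is assumed to be zero.

```python
from itertools import product

PAM5_LEVELS = (-2, -1, 0, 1, 2)
SYMBOL_RATE_HZ = 125e6       # per-pair sample rate from the text
DATA_BITS_PER_SYMBOL = 8     # one 8b word per 4D symbol (before trellis coding)

def partial_response(samples, taps=(0.75, 0.25)):
    """Apply y[n] = 0.75*x[n] + 0.25*x[n-1]; x[-1] is assumed to be 0."""
    out = []
    prev = 0
    for x in samples:
        out.append(taps[0] * x + taps[1] * prev)
        prev = x
    return out

# Points in the 4D-PAM5 constellation (5 levels on each of 4 pairs).
constellation_size = len(list(product(PAM5_LEVELS, repeat=4)))

# Aggregate data rate: one 8b word per 125 MHz symbol period.
throughput_bps = SYMBOL_RATE_HZ * DATA_BITS_PER_SYMBOL

# Distinct amplitudes the filter can produce from two consecutive PAM5 samples.
output_levels = sorted({0.75 * a + 0.25 * b
                        for a in PAM5_LEVELS for b in PAM5_LEVELS})

print(partial_response([2, -1, 0]))
print(constellation_size, throughput_bps, len(output_levels))
```

The 625 constellation points exceed the 512 code words a 9b word needs, and the filter spreads the five input levels over 17 evenly spaced output amplitudes.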

Each 1000Base-T transceiver contains 4 identical channels of transmitters and receivers, as shown in Figure 19.7.1. Four receive paths converge into one sequence detector. Each channel contains an active echo canceller, a baseline-correction circuit, one digital echo canceller, three cross-talk cancellers, a feed-forward equalizer, and a feed-back equalizer, as shown in Figure 19.7.2. When a signal is received, the active echo canceller removes the transmit signal and the analog baseline-correction circuit restores the baseline. An anti-aliasing low-pass filter (LPF) with self-calibration shapes the signal before the analog-to-digital converter (ADC). In the digital domain, the echo and cross-talk components are removed first. A finite impulse response (FIR) filter and a decision feedback equalizer (DFE) restore the signal to five discrete levels. The digital filter taps are adapted using the least-mean-square (LMS) algorithm. All four equalized signals converge into one sequence detector that decodes the trellis code.
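As an illustration of LMS tap adaptation, the following minimal sketch trains a 3-tap digital echo canceller against a known transmit sequence. It uses the textbook LMS update, not the transceiver's actual implementation; the echo path, step size `mu`, far-end attenuation, and sequence length are all hypothetical values chosen for the example.

```python
import random

def lms_echo_canceller(tx, rx, num_taps=3, mu=0.002):
    """Adapt FIR taps w so that sum(w[k]*tx[n-k]) tracks the echo in rx."""
    w = [0.0] * num_taps
    history = [0.0] * num_taps            # tx[n], tx[n-1], ...
    residual = []
    for x, r in zip(tx, rx):
        history = [x] + history[:-1]
        echo_estimate = sum(wk * hk for wk, hk in zip(w, history))
        e = r - echo_estimate             # received sample minus estimated echo
        w = [wk + mu * e * hk for wk, hk in zip(w, history)]   # LMS update
        residual.append(e)
    return w, residual

rng = random.Random(0)
levels = (-2, -1, 0, 1, 2)
echo_path = [0.5, -0.2, 0.1]              # hypothetical echo impulse response
tx = [rng.choice(levels) for _ in range(20000)]
far_end = [0.3 * rng.choice(levels) for _ in range(20000)]  # attenuated remote signal

rx = []
for n in range(len(tx)):
    echo = sum(h * tx[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    rx.append(far_end[n] + echo)

taps, residual = lms_echo_canceller(tx, rx)
print([round(t, 2) for t in taps])
```

The far-end signal acts as noise on the adaptation, so the taps settle near, rather than exactly at, the assumed echo path; a smaller `mu` trades convergence speed for lower tap jitter.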

Several echo cancellers have been proposed in different hybrid configurations of transformers and resistors [3]. Cost and SNR degradation make those solutions undesirable. An active echo canceller using a replica transmit signal effectively removes the transmit signal from the receive signal while maintaining signal integrity. In a current-drive transmitter, a digital-to-analog converter (DAC) converts digital transmit codes into transmit currents, as shown in Figure 19.7.3. A replica driver produces a replica transmit signal. Because the replica driver mirrors the main driver, any pulse shaping resulting from the DAC and C (Figure 19.7.3) is replicated in the replica signal. A transmit pulse is shown in Figure 19.7.4a. The receive signal is obtained by subtracting the replica signal from the transmit signal, $V_{RCV} = V_{TX} - V_{TXR}$. The receive signal contains about a 10% residue from the transmit signal, as shown in Figure 19.7.4b. Two large pulses at the front result from mismatches in rise and fall times between the main transmit and replica transmit signals. The transmit signal contains high-frequency boost from the transformer that is difficult to replicate, and therefore, the residual pulses must be reduced by an LPF and removed later by the digital echo canceller.
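The effect of an edge-rate mismatch on $V_{RCV} = V_{TX} - V_{TXR}$ can be sketched numerically. The model below is a simplification, not the circuit in Figure 19.7.3: each edge is treated as a first-order step response, and the two time constants and the far-end level are hypothetical values.

```python
import math

def step_response(t, tau):
    """First-order settling of a unit step, used as a stand-in for a driver edge."""
    return 1.0 - math.exp(-t / tau) if t >= 0 else 0.0

TAU_MAIN = 1.0e-9       # hypothetical main-driver edge time constant
TAU_REPLICA = 1.2e-9    # hypothetical replica edge time constant (20% slower)
FAR_END = 0.1           # hypothetical constant far-end signal level

times = [n * 0.05e-9 for n in range(400)]                      # 0 to about 20 ns
v_tx = [step_response(t, TAU_MAIN) + FAR_END for t in times]   # line: own edge + far end
v_txr = [step_response(t, TAU_REPLICA) for t in times]         # replica of own edge
v_rcv = [a - b for a, b in zip(v_tx, v_txr)]                   # V_RCV = V_TX - V_TXR

peak_residue = max(v_rcv) - FAR_END     # transient error just after the edge
settled_error = v_rcv[-1] - FAR_END     # error once both edges have settled

print(f"peak residue {peak_residue:.3f}, settled error {settled_error:.1e}")
```

Under these assumptions the subtraction is essentially exact once both edges settle, and the residue is concentrated in a short pulse at each transition, which is the component left for the LPF and the digital echo canceller.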

The DSP calculates the gradients for timing recovery and baseline correction. Baseline wander is caused by the high-pass response of transformer coupling, where the time constant is a function of the inductance and termination resistance,  $\tau = L/R$ . Since this time constant is on the order of microseconds, a relatively slow transient response, an analog integrator is suitable. Also, removing the baseline wander in the analog domain is desirable, because it reduces the signal dynamic range before the ADC. The analog baseline correction circuit consists of a charge-pump integrator and a gain stage. The digital gradient is integrated in analog circuits using a charge-pump integrator. The integrated value represents a scaled magnitude of the baseline wander. A gain stage scales the correction value, which is then summed with the receive signal to restore baseline.
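A behavioral sketch of this loop follows. It passes a long run of identical symbols through a first-order high-pass with $\tau = L/R$ and restores the baseline with a sign-driven integrator. The inductance, resistance, pump step, and gain are hypothetical, and using the sign of the slicer error as the gradient is an assumption, since the text does not specify how the DSP computes it.

```python
L_TRANSFORMER = 350e-6    # hypothetical transformer inductance, henries
R_TERMINATION = 100.0     # hypothetical termination resistance, ohms
T_SAMPLE = 8e-9           # 125 MSample/s sample period from the text
PAM5_LEVELS = (-2, -1, 0, 1, 2)
PUMP_STEP = 0.01          # hypothetical charge-pump increment per sample
GAIN = 1.0                # hypothetical gain-stage scaling

tau = L_TRANSFORMER / R_TERMINATION       # tau = L/R, 3.5 us for these values
alpha = tau / (tau + T_SAMPLE)            # discrete first-order high-pass coefficient

def simulate(num_samples=2000, level=1.0):
    """Send a long run of one level through the high-pass and correct the droop."""
    uncorrected, corrected = [], []
    hp_out, prev_in, integrator = 0.0, 0.0, 0.0
    for _ in range(num_samples):
        hp_out = alpha * (hp_out + level - prev_in)    # transformer droop
        prev_in = level
        restored = hp_out + GAIN * integrator          # correction summed with signal
        decision = min(PAM5_LEVELS, key=lambda lv: abs(restored - lv))
        error = restored - decision
        gradient = (error > 0) - (error < 0)           # assumed gradient: sign of error
        integrator -= PUMP_STEP * gradient             # charge-pump integration
        uncorrected.append(hp_out)
        corrected.append(restored)
    return uncorrected, corrected

uncorrected, corrected = simulate()
print(f"tau = {tau * 1e6:.1f} us")
print(f"after 16 us: uncorrected {uncorrected[-1]:.3f}, corrected {corrected[-1]:.3f}")
```

Because the droop per 8ns sample is a small fraction of a level, a slow integrator can track it, and the corrected signal stays close to its nominal level while the uncorrected one decays toward zero.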

Before converting the analog signal to a digital signal, an anti-aliasing filter is needed to remove high-frequency components as well as attenuate any other out-of-band noise. Since the sampling frequency is four times the dominant data component, a second-order LPF is sufficient to shape the receive signal. However, variation in process and a wide range of operating temperature make the filter cutoff frequency unpredictable, and therefore, self-calibration is required. A calibration circuit provides the correct C for a desired RC product, using a fixed R and a variable C as shown in Figure 19.7.5. The RC product corresponds to the desired bandwidth. In the calibration, a switched-capacitor circuit, controlled by two non-overlapping clocks $\Phi 1$ and $\Phi 2$, uses a constant current $I_c$ to charge $C_1$ for a fixed time T to produce a voltage, $V_c = I_c T / C_1$. The difference between $V_c$ and a reference voltage is integrated. A switched-capacitor integrator adjusts the current until the integration loop reaches equilibrium [4]. This current is mirrored by a factor k and applied to a resistor to obtain a voltage, $V_r = k I_c R_1$, that is compared against a reference voltage, the same voltage used for the integrator. Digital logic uses the comparison result to adjust the value of $C_1$ in discrete steps. $R_1 C_1 = kT$ when the calibration loop settles to a stable condition and the correction code stops changing. The correction code is applied continuously to the filter and thus compensates for temperature variation as well. The desired RC product is set by choosing values of $R_1$ and $C_1$. The RC product is chosen to correspond to a cutoff frequency of around 30MHz. A low cutoff frequency helps reduce noise and clock jitter by removing high-frequency components from transmit pulses, echoes, and crosstalk. However, an excessively low cutoff frequency can reduce signal magnitude to a point where SNR is reduced and overall performance degrades.
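The calibration loop can be sketched behaviorally. Taking the two voltage expressions exactly as written and setting both equal to the shared reference gives $k R_1 C_1 = T$; the $R_1 C_1 = kT$ form corresponds to the mirror ratio being defined in the opposite direction. The sketch uses the former. All component values, the 6b code range, the upward-only search, and the first-order estimate $f_c = 1/(2\pi R_1 C_1)$ are assumptions made for illustration.

```python
import math

T_CHARGE = 8e-9       # hypothetical charge time T (one 125 MHz period)
K_MIRROR = 1.5        # hypothetical current-mirror ratio k
V_REF = 1.0           # hypothetical reference voltage, volts
C_MIN = 1.0e-12       # hypothetical capacitance at code 0, farads
C_STEP = 0.1e-12      # hypothetical capacitance per code step, farads
MAX_CODE = 63         # hypothetical 6b correction code

def c1_from_code(code):
    """Capacitance selected by a correction code."""
    return C_MIN + code * C_STEP

def calibrate(r1):
    """Step the C1 code up until the comparator sees V_r reach V_ref."""
    for code in range(MAX_CODE + 1):
        c1 = c1_from_code(code)
        i_c = V_REF * c1 / T_CHARGE       # integrator settles where I_c*T/C1 = V_ref
        v_r = K_MIRROR * i_c * r1         # V_r = k*I_c*R1
        if v_r >= V_REF:                  # comparator flips: code stops changing
            return code
    return MAX_CODE                       # out of range: saturate at the largest C1

def cutoff_hz(r1, code):
    """First-order estimate of the cutoff set by the R1*C1 product."""
    return 1.0 / (2.0 * math.pi * r1 * c1_from_code(code))

# Resistor at nominal and at -20% / +20% to mimic process spread.
results = {r1: cutoff_hz(r1, calibrate(r1)) for r1 in (1600.0, 2000.0, 2400.0)}
for r1, fc in results.items():
    print(f"R1 = {r1:.0f} ohm -> cutoff about {fc / 1e6:.1f} MHz")
```

With these values the calibrated cutoff stays within roughly 1MHz of its target across a ±20% resistor spread, whereas a fixed capacitor code would let it move in proportion to the resistor value.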

After the LPF, a 9b pipeline ADC converts the analog signal to a 9b binary code. The ADC architecture is based on a 1.5b/stage design with digital error correction [5]. To achieve differential non-linearity (DNL) less than a quarter of the least significant bit (LSB) at 9b level, the opamps must settle beyond 11b accuracy; therefore, the opamps must have DC gain $>65\,\text{dB}$ and settle within 3.5ns. To achieve this, a two-stage opamp with cascode compensation is used. An opamp with cascode compensation requires smaller feedback capacitance than that of Miller compensation, and therefore, this opamp can operate at lower power. Using a 7.8125MHz input sinusoidal tone, the ADC achieves a DNL $<0.25$ LSB, an integral non-linearity (INL) $<0.8$ LSB, and a signal-to-noise and distortion ratio (SNDR) of 50.56dB.
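The ADC figures can be related with some standard rules of thumb, sketched here. The conversions assume a full-scale sine input for the effective-number-of-bits formula, a static gain error of about $1/A$ with unity feedback factor, and single-pole linear settling; none of these assumptions comes from the text.

```python
import math

def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02

def min_dc_gain_db(bits):
    """Gain needed for a static error of about 1/A below 2**-bits (unity feedback assumed)."""
    return 20.0 * math.log10(2 ** bits)

def settling_time_constants(bits):
    """Time constants needed for single-pole settling to within 2**-bits."""
    return bits * math.log(2)

enob = enob_from_sndr(50.56)              # SNDR from the measured results
gain_db = min_dc_gain_db(11)              # 11b settling accuracy from the text
n_tau = settling_time_constants(11)
tau_max = 3.5e-9 / n_tau                  # largest time constant that fits in 3.5 ns

print(f"ENOB about {enob:.2f} bits")
print(f"gain for 11b accuracy about {gain_db:.1f} dB")
print(f"{n_tau:.2f} time constants, tau below about {tau_max * 1e12:.0f} ps")
```

Under these assumptions the 11b requirement maps to about 66dB of gain, in the same range as the $>65\,\text{dB}$ quoted, and to a closed-loop time constant under roughly half a nanosecond.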

Transceiver test results show 170m maximum transmit distance at a BER $<10^{-10}$, exceeding the specification [1]. The eye diagrams for one channel before and after signal processing are captured in Figure 19.7.6, for a cable length of 140m. Before signal processing, the signal levels are incoherent and show no discernible information. However, after signal processing, the eye diagram shows 5 distinct discrete levels, a PAM5 signal. The spaces between the discrete levels in the eye diagram indicate that sufficient SNR remains to ensure low BER. The test results show that the mixed-signal processing circuits remove all the channel impairments associated with data transmission over CAT-5 cables. Transceiver measured performance is summarized in Table 19.7.1, and a chip micrograph is shown in Figure 19.7.7. Internal regulation allows for operation with a single 3.3V power supply. The chip is fully compliant with IEEE Standard 802.3ab specified for 1000Base-T [6].

#### References:

- [1] "Physical Layer Parameters and Specifications for 1000 Mb/s Operation Over 4-Pair of Category 5 Balanced Copper Cabling, Type 1000Base-T", IEEE Std. 802.3ab, 1999.
- [2] He, R., et al., "A DSP-Based Receiver for Gigabit Ethernet over CAT-5 Cables," ISSCC Digest of Technical Papers, Feb. 2001.
- [3] ISSCC99 Short Course on Gigabit Ethernet over CAT-5 cabling, 1999.
- [4] Gregorian, R. and G. C. Temes, Analog MOS Integrated Circuits for Signal Processing. New York: Wiley-Interscience Publication, pp 270-276, 1986.
- [5] Lewis, S., "A Pipelined 9-stage Video-rate Analog-to-Digital Converter," IEEE J. Solid-State Circuits, vol.27, pp.351-358, March 1992.
- [6] University of New Hampshire InterOperability Lab Report, June 2000.



Figure 19.7.1: Gigabit transceiver system block diagram.



Figure 19.7.2: Single channel block diagram.



Figure 19.7.3: Transmitter with replica driver for active echo cancellation.



Figure 19.7.4: Transmit signal before active echo cancellation and receive signal after echo cancellation.







Figure 19.7.5: LPF calibration.



Figure 19.7.6: Measured eye diagram before and after signal processing at 140m (CAT-5).

Figure 19.7.7: Chip micrograph.

| Parameter                                   | Value                   |
|---------------------------------------------|-------------------------|
| BER at 100m                                 | $<10^{-14}$             |
| BER at 170m                                 | $<10^{-10}$             |
| Max. cable length                           | 170m                    |
| Transmit jitter (pk-to-pk)                  | 300ps                   |
| ADC DNL/INL ( $f_{in} = 7.8125\text{MHz}$ ) | 0.25 LSB / 0.8 LSB      |
| ADC SNDR ( $f_{in} = 7.8125\text{MHz}$ )    | 50.56dB                 |
| Power supply                                | 3.3V / 1.8V             |
| Total chip power dissipation                | 1.8W                    |
| Analog power dissipation                    | 1.1W                    |
| Process                                     | 0.18 $\mu\text{m}$ CMOS |
| Chip die size                               | $<25\,\text{mm}^2$      |

Table 19.7.1: Measured performance summary.


