

# *Chapter 6*

---

# CMOS COMPARATORS

*In this chapter we shall deal with the design of CMOS comparators. A comparator is the basic component mainly used in analog-to-digital converters. Ideally, it generates an output logic signal as an instant response to the sign of an analog input (voltage or current). Obviously, a real circuit doesn't achieve the ideal function. The most important limits are the finite sensitivity, the offset and the finite speed. All the above limitations affect the performance of systems where comparators are used, especially when it is required to achieve high speed (or a high conversion rate) and high resolution.*

*Analog designers must know how to face the various design issues properly. This chapter provides a number of suitable guidelines for that purpose.*

## **6.1 INTRODUCTION**

The electrical function of a comparator is to generate an output voltage whose value is *high* or *low* depending on whether the sign of the input is positive or negative (Fig. 6.1). We can have two different types of input: voltage or current. In the former case the input voltage is measured with respect to a given reference level. Therefore, the comparator determines whether or not the amplitude of the input is higher than a reference. When the current is the input variable, the comparator determines whether the input current is flowing into or out of the input terminal.

Fig. 6.1 - Symbol and ideal transfer characteristics of a comparator.

A logic signal denotes the output. The amplitude of electrical representation of the *high* or the *low* state should match the convention used in the associated digital logic to clearly distinguish between a logic *1* and a logic *0*.

### REMEMBER

A comparator can be “continuous time” or “sampled data”. In the latter case the use of a clock to control the comparator operation permits us to achieve higher speed and to properly handle the offset issue.

When a comparator is used in a sampled data system a clock controls the action of the circuit. The comparator provides the output with a given periodicity, synchronous with the clock. Therefore, a given time interval is available to achieve the result. Often, the fast variation of the input signal and the finite speed of the circuit used determine the need to separate the two functions inherent to the comparison process: to “catch” the value of the input signal and to generate the logic output. A sampled data system favours this separation: the clock period can be divided into two (or more) phases, one that completes the sampling of the input and another that transforms the result into the logic signal. The latter nonlinear operation can take advantage of the use of a latch. A latch effects a regenerative amplification of the input and, thanks to its positive feedback, preserves the achieved output.

## 6.2 PERFORMANCE CHARACTERISTICS

Fig. 6.2 shows a quite general architecture of a sampled data comparator. It is the cascade connection of three blocks: a sample and hold, an amplifier and a latch. The circuit samples the input during $\Phi_1$. The gain stage amplifies the held value during $\Phi_2$, and at the end of phase $\Phi_2$ the regenerative action of the latch starts. In addition, during $\Phi_1$ or $\Phi_2$ the circuit can take some action against limits like the offset, or the saturation caused by a previous overdrive.

Fig. 6.2 - Block diagram of a clocked comparator and its phase control scheme.

Only a few applications require continuous-time operation. In such cases the architecture of the comparator becomes a simple amplifier whose gain is large enough to achieve an output voltage representing the logic levels.

We discuss below the most important features of the comparator. We will mainly refer to the architecture of Fig. 6.2. However, the features that apply to the amplification block characterize continuous-time comparators as well.

**Sensitivity:** It is the minimum input voltage (or current) that produces a consistent output signal within the expected comparison time. Modern applications require a very fine sensitivity. For example, a 10 bit data converter requires a sensitivity of $1/2 \text{ LSB}$ (least significant bit), which corresponds to $1/2048$ of the reference voltage (or current) used. Thus, for 10 bit and a 1 V reference we need about 0.5 mV sensitivity.
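The arithmetic behind the 1/2 LSB requirement can be checked with a short Python sketch. The function name `half_lsb` and the list of resolutions are illustrative choices, not part of the text; only the 10 bit, 1 V case comes from the example above.

```python
def half_lsb(v_ref: float, n_bits: int) -> float:
    """Return 1/2 LSB (in volts) for an n_bits converter with reference v_ref."""
    # One LSB is v_ref / 2**n_bits, so half of it is v_ref / 2**(n_bits + 1).
    return v_ref / 2 ** (n_bits + 1)


# The 10 bit case reproduces the 1/2048 factor quoted in the text;
# the 8 and 12 bit cases are added only for comparison.
for bits in (8, 10, 12):
    print(f"{bits} bit, 1 V reference: {half_lsb(1.0, bits) * 1e3:.3f} mV")
```

Each additional bit of resolution halves the required sensitivity, which is why the offset and noise issues discussed later become dominant at high resolution.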

**Input offset ( $V_{os}$ ):** It is the voltage that must be applied to the input to obtain the crossing point between low and high logic level. The feature is analogous to the offset of op-amps. Even the causes of the limitation are similar.

**Amplifier response time ( $t_r$ ):** It is the minimum time-interval required to achieve the proper logic output as a response to a minimum input step. In the architecture of Fig. 6.2 the latch needs a given signal at its inputs to ensure the logic output. The pre-amplifier stage achieves this level in a time that depends on the input step amplitude: the step response of a gain stage is a ramp during the slewing that turns into an exponential in the linear region.

**Overdrive recovery time ( $t_{rec}$ ):** When the input signal is large the gain stage (or part of it) saturates to the positive or negative rail. If the input then becomes small with the opposite sign, the gain stage takes some time to react and to generate the voltage required to produce the output logic. The time required is longer than the response time, and the extra time is called the overdrive recovery time. Often, the recovery from overdrive is much more time consuming than the amplification. Therefore, designers should pay special attention to this limitation.

**Latching compatibility:** The output of the gain stage should properly drive the latch. Therefore, depending on the specific scheme used it is necessary to provide the effective voltage levels at the input of the latch.

**Power supply rejection:** This is a feature equivalent to the one already discussed for op-amps. Spurious signals affecting the power supply lines can modify the input sample and produce wrong outputs. The power supply rejection describes the ability of the circuit to suppress this effect.

**Power Consumption:** The clocked operation leads, in addition to the static power consumption, to a dynamic contribution. It depends, like in digital circuits, on the clock frequency and the capacitances that the preamplifier and the latch are required to charge and discharge.

**Hysteresis:** The comparison threshold for input signals changing from low to high can be different from the threshold for signals changing in the opposite direction. The difference between the two thresholds is the hysteresis. The effect can be a limit or a benefit, depending on the application. For continuous-time applications the crossing of a noisy input signal through the reference may produce many high and low transitions of the output voltage. A hysteresis larger than the noise level avoids the effect and is therefore beneficial. When the comparator is used in data converters, hysteresis is a limitation: it can cause different output codes depending on the sign of the input signal derivative.
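The benefit of hysteresis with a noisy input can be illustrated by a behavioural sketch in Python. This is an idealized model, not a circuit simulation: the slow ramp, the 1 mV rms Gaussian noise, the 10 mV hysteresis and the function name `count_transitions` are all assumptions chosen for illustration.

```python
import random


def count_transitions(samples, threshold=0.0, hysteresis=0.0):
    """Count the output toggles of an ideal comparator with symmetric hysteresis."""
    upper = threshold + hysteresis / 2  # threshold for low-to-high transitions
    lower = threshold - hysteresis / 2  # threshold for high-to-low transitions
    state = samples[0] > threshold
    toggles = 0
    for v in samples[1:]:
        if not state and v > upper:
            state, toggles = True, toggles + 1
        elif state and v < lower:
            state, toggles = False, toggles + 1
    return toggles


rng = random.Random(0)  # fixed seed so that the run is repeatable
# Slow ramp from -10 mV to +10 mV with 1 mV rms noise (illustrative values).
ramp = [-10e-3 + 20e-3 * k / 1999 + rng.gauss(0.0, 1e-3) for k in range(2000)]

print("transitions without hysteresis:", count_transitions(ramp))
print("transitions with 10 mV hysteresis:", count_transitions(ramp, hysteresis=10e-3))
```

Without hysteresis the output chatters while the ramp is within the noise band around the threshold; with a hysteresis several times larger than the noise the output switches only once.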

### ***Example 6.1***

*A simple comparator is made by the cascade of four digital inverters. Perform simulations using the digital inverter of an available library and estimate the response time and the delay due to the recovery from overdrive. Simulate the response time using a step input changing by $\pm 2$ mV around the threshold. For the recovery from overdrive use a step input that jumps from 30 mV below to 2 mV above the threshold.*

#### ***Solution:***

*A reader who does not have access to a digital library should design the inverter herself. Use an n-channel transistor whose W/L is two times the minimum allowed. The p-channel transistor must be sized to bring the inverter threshold to 1.65 V. Use a 3.3 V supply voltage.*

*The figure on the next page shows the schematic used for simulations. A proper bias circuit drives the cascade of the four inverters. It includes an inverter whose output is connected to its input to provide the threshold voltage. A voltage-controlled voltage source (VCVS) copies the threshold and a second VCVS adds the input. The schematic includes two equal versions of the same circuit. The network on the top measures the response time; the one on the bottom determines the overdrive delay.*

*The figure below shows the simulation results. OUT2 indicates that the response time is 1.7 nsec, a pretty low value. However, we have to remember that the input step is as large as 4 mV. The second trace (OUT1) marks the low to high transition at 3.15 nsec. Therefore, the recovery from overdrive is 1.45 nsec, a significant fraction of the response time.*

*Additional simulations show that as the overdrive becomes more negative the recovery time increases. When the overdrive reaches about -300 mV the recovery time saturates at 2.2 nsec. Below that input level the output of the first inverter becomes  $V_{DD}$ .*

*The input waveform exhibits an inverter threshold of 1.481 V, a bit lower than $V_{DD}/2$.*

---

## 6.3 GENERAL DESIGN ISSUES

Many applications ask for a comparator sensitivity equal to or below 1 mV. However, the offset of a typical CMOS amplifier is larger than 1 mV (the systematic and the random contributions come to 5-10 mV or so). Therefore, offset compensation is one of the most important design problems. We will shortly study appropriate techniques that alleviate the problem.

An important limitation to the speed is the time to recover from overdrive. When a gain stage goes into saturation it requires current and time to charge (or discharge) nodes that have been pushed close to $V_{DD}$ or ground. A frequent remedy consists in clipping the voltage of critical nodes. The clipping limits the swing of the nodes and prevents saturation. Moreover, when the timing of the system allows a dedicated time-slot, the problem can be solved by resetting the critical nodes before every comparison.

Continuous-time comparators are difficult to design because they cannot take advantage of the regenerative action of a latch. The gain must be large enough to generate the logic signals with the minimum input. By contrast, a sampled-data comparator requires less effort: using a latch, the gain stage of Fig. 6.2 must generate relatively low outputs (often differential). Typically a differential signal of $\pm 100$ mV is large enough to drive the latch and to account for any possible mismatch. Accordingly, the amplification of the gain stage can be relaxed by 20 to 30 dB with respect to the continuous-time case.
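A rough Python estimate shows where a figure of this order comes from. The 0.5 mV minimum input and the 3.3 V logic swing are assumptions borrowed from the examples of this chapter; only the 100 mV latch drive level is taken from the paragraph above.

```python
import math


def gain_db(v_out: float, v_in_min: float) -> float:
    """Voltage gain, in dB, needed to turn v_in_min into v_out."""
    return 20.0 * math.log10(v_out / v_in_min)


v_in_min = 0.5e-3  # assumed sensitivity: 1/2 LSB at 10 bit with a 1 V reference
v_logic = 3.3      # assumed full logic swing for the continuous-time case
v_latch = 100e-3   # latch drive level quoted in the text

g_continuous = gain_db(v_logic, v_in_min)
g_sampled = gain_db(v_latch, v_in_min)

print(f"continuous-time gain: {g_continuous:.1f} dB")
print(f"gain ahead of a latch: {g_sampled:.1f} dB")
print(f"relaxation: {g_continuous - g_sampled:.1f} dB")
```

With these assumed numbers the relaxation is close to 30 dB; a lower logic swing or a larger latch drive level moves the result toward the lower end of the quoted range.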

For high frequency applications the static gain is not the key parameter. The time required to obtain a proper output level is normally much less than the time constant of the gain stage. Therefore, just the initial part of the step response is effective. What is important is not the asymptotic value but the voltage amplitude that the output reaches in the time-slot available.

Noise over the power supply (or the substrate) and the noise of electronic components affect the output. An op-amp is required to control the noise over the entire band of the signal. Instead, a clocked comparator must minimize the noise when the latch starts up. Therefore, it is important to use a proper clock timing. Namely, the clock used by the latch must be suitably apart from the clock controlling digital sections.

### CALL UP

Very high speed comparators exploit just the initial part of the gain stage's time response. Successive stages and a latch take care of further signal amplification.

### 6.3.1 Architecture of the Gain Stage

Two different approaches permit us to obtain the required gain: to employ a single complex amplifier or to use the cascade of many simple stages. In the design of op-amps (or OTA's) stability requirements limit the number of stages that we can use. By contrast a comparator operates in open loop conditions. Therefore, it is possible to use a cascade with any number of stages.

The small signal equivalent circuit describes the speed limitation of a linear network. A comparator mainly handles large signals. However, to study the speed performance of comparators we assume the signals small enough to justify a small signal analysis.

Let us consider the cascade of  $n$  equal gain stages. The one-pole equivalent circuit of Fig. 6.3 represents the small-signal behaviour of each stage. The response to an input step is an exponential with time constant  $R_o C_o$

$$V_o(t) = V_i g_m R_o (1 - e^{-t/R_o C_o}) \quad (6.1)$$

For high-speed applications it is not possible to wait for a long time and, typically, only the initial part of the response is used. Therefore, (6.1) can be approximated by

$$V_o(t) \approx V_i t \frac{g_m}{C_o} \quad \text{for} \quad t \ll R_o C_o \quad (6.2)$$

showing that the output voltage changes linearly with slope  $g_m/C_o$ . Observe that the output resistance (and the DC gain) does not influence the transient response in the initial part.
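The quality of the approximation (6.2) can be checked numerically. In the Python sketch below the element values ($g_m$ = 1 mA/V, $R_o$ = 100 kΩ, $C_o$ = 100 fF) are hypothetical and serve only to give a time constant of 10 ns.

```python
import math


def step_exact(t, v_i, g_m, r_o, c_o):
    """Single-pole step response, eq. (6.1)."""
    return v_i * g_m * r_o * (1.0 - math.exp(-t / (r_o * c_o)))


def step_initial(t, v_i, g_m, c_o):
    """Initial-slope approximation, eq. (6.2); independent of r_o."""
    return v_i * t * g_m / c_o


g_m, r_o, c_o = 1e-3, 100e3, 100e-15  # hypothetical values
tau = r_o * c_o

for fraction in (0.01, 0.1, 0.5):
    t = fraction * tau
    exact = step_exact(t, 1.0, g_m, r_o, c_o)
    approx = step_initial(t, 1.0, g_m, c_o)
    print(f"t = {fraction:4.2f} tau: exact/approx = {exact / approx:.3f}")
```

The ratio stays close to one as long as the observation time is a small fraction of $R_o C_o$, which supports the use of the ramp model in the analysis that follows.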

The output for two equal stages is the convolution of an exponential with an exponential or, for times lower than the time constant, the convolution of a ramp with a ramp

Fig. 6.3 - One-pole equivalent circuit of a single gain stage.

$$V_{o,2}(t) = V_i \frac{(g_m/C_o)^2}{2} t^2 \quad (6.3)$$

in general, for  $n$  equal stages the output voltage is

$$V_{o,n}(t) = V_i \frac{(g_m/C_o)^n}{n!} t^n \quad (6.4)$$

Fig. 6.4 represents equation (6.4) for 1, 2, 3, and 4 stages. At the very beginning the output of the single stage is higher than the others. At $t = 2C_o/g_m$ the output of the two-stage cascade becomes the higher one; later, at $t = 3C_o/g_m$, the output of the three-stage cascade takes over, and so on. It turns out that there is an optimum number of stages that achieves a given required gain in the minimum time.

Equation (6.4) shows that the cascade of $n$ stages is the fastest solution up to the output level

$$V_{o,n,opt} = V_i \frac{(n+1)^n}{n!} \quad (6.5)$$

that occurs at the time

$$t_{n,opt} = (n+1) \frac{C_o}{g_m} \quad (6.6)$$

showing that, for example, a small-signal gain larger than 10.6 and lower than 26 is optimally achieved with the cascade of 4 stages.
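The selection of the number of stages can be sketched in a few lines of Python built on (6.4) and (6.5). The function names are illustrative, and the time is normalized to $C_o/g_m$, so no element values are needed.

```python
from math import factorial


def gain_limit(n: int) -> float:
    """Eq. (6.5): largest gain for which n stages are the fastest choice."""
    return (n + 1) ** n / factorial(n)


def optimum_stages(gain: float) -> int:
    """Smallest n whose gain limit (6.5) is not exceeded by the target gain."""
    n = 1
    while gain > gain_limit(n):
        n += 1
    return n


def time_to_gain(n: int, gain: float) -> float:
    """Normalized time t*g_m/C_o at which n stages reach the gain (inverse of 6.4)."""
    return (gain * factorial(n)) ** (1.0 / n)


for n in range(1, 6):
    print(f"n = {n}: optimum up to a gain of {gain_limit(n):.2f}")

target = 20.0  # illustrative gain inside the 10.6 to 26 interval
print("optimum number of stages:", optimum_stages(target))
for n in (3, 4, 5):
    print(f"  n = {n}: normalized time {time_to_gain(n, target):.2f}")
```

For a gain of 20 the four-stage cascade needs the shortest normalized time, and the neighbouring choices are only slightly slower. This confirms that the result is a design hint rather than a sharp optimum.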

The above analysis provides just a design hint: the equations used imply small signal operation. Since the sensitivity of typical applications is in the order of fractions of a mV, the above results are useful for gains up to a few tens.



Fig. 6.4 - Step response of the cascade of $n$ equal stages. A single pole model describes each stage. The normalized time is $t g_m/C_o$; the normalized output is $V_o/V_{in}$.

## 6.4 OFFSET COMPENSATION

The auto-zero technique is the basis of all the schemes used to compensate the offset. Fig. 6.5 depicts the underlying concept. The approach is appropriate for sampled-data operation because two non-overlapping phases control the scheme. During phase 1 the sample-and-hold reads the offset, $V_{os}$. During phase 2 the input signal, $V_{in}$, and the stored offset are summed and applied to the inverting terminal of the gain stage. Therefore the differential input becomes

$$V_d = V_{os} - (V_{in} + V_{os}) = -V_{in} \quad (6.7)$$

showing an inverting operation and, more important, the cancellation of the offset contributions.

Equation (6.7) assumes that the offset at the sampling time and the one during phase 2 are equal. A more precise analysis leads to

$$V_d\left(nT + \frac{1}{2}T\right) = V_{os}\left(nT + \frac{T}{2}\right) - V_{in}\left(nT + \frac{T}{2}\right) - V_{os}(nT) \quad (6.8)$$

showing a subtraction of the actual offset and its delayed version.

Fig. 6.5 - Conceptual scheme of the offset cancellation method.

The method is effective anyhow: the offset is a *dc* or a very slowly varying signal. It mainly comes from static errors (geometrical mismatch or technological non-uniformity) and temperature drifts. However, if we incorporate the effect of the input referred noise, a frequency dependence results. The delay between the two offset terms in (6.8) leads to the following offset transfer function

$$H_{os}(s) = 1 - e^{-sT/2} \quad (6.9)$$

or, using the z-transform

$$H_{os}(z) = 1 - z^{-1/2} \quad (6.10)$$

that, using $z = e^{j\omega T}$, leads to

$$H_{os}(\omega) = 2j\, e^{-j\omega T/4} \sin\left(\frac{\omega T}{4}\right) \quad (6.11)$$



Fig. 6.6 - Offset transfer function of the circuit in Fig. 6.5.

Equation (6.11) denotes a high pass transfer function: the *dc* component vanishes while the low frequency terms are significantly attenuated. By contrast, at $f = 1/T$ the noise transfer function (depicted in Fig. 6.6) shows an amplification by a factor 2.
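The high pass behaviour can be verified numerically by evaluating (6.9) on the $j\omega$ axis. The Python sketch below compares the result with the closed form $2\,|\sin(\omega T/4)|$, which follows from factoring $e^{-j\omega T/4}$ out of (6.9); the clock period of 1 µs is an arbitrary assumption.

```python
import cmath
import math


def h_os(f: float, period: float) -> complex:
    """Eq. (6.9) evaluated at s = j*2*pi*f: offset minus its copy delayed by T/2."""
    return 1.0 - cmath.exp(-1j * 2.0 * math.pi * f * period / 2.0)


def h_os_magnitude(f: float, period: float) -> float:
    """Closed-form magnitude 2*|sin(omega*T/4)| with omega = 2*pi*f."""
    return 2.0 * abs(math.sin(math.pi * f * period / 2.0))


T = 1e-6  # assumed clock period
for f_norm in (0.0, 0.01, 0.1, 0.5, 1.0):
    f = f_norm / T
    print(f"f*T = {f_norm:4.2f}: |H| = {abs(h_os(f, T)):.4f}")
```

The magnitude is zero at *dc*, grows almost linearly at low frequency and reaches 2 when $\omega T/4 = \pi/2$.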

The described method is often named the *correlated double sampling technique*. As a matter of fact, the circuit samples the offset two times: once by the sample-and-hold and a second time by the input terminal of the gain stage. The correlated part of the two samples is cancelled out. The other name, *auto-zero* technique, describes the capability of the method to measure and zero the offset without any external help.

### TAKE NOTE

The correlated double sampling technique (or auto-zero) yields a high pass transfer function. The technique is beneficial only for the low frequency components of the input referred disturbances.

### 6.4.1 Implementation of the Auto-zero Technique

Fig. 6.7 shows a possible circuit able to implement the auto-zero technique. The circuit uses a gain stage, a capacitor $C_A$ and three switches controlled by two non-overlapping phases. During phase 1 the switch $S_1$ connects the gain stage in the unity gain configuration. Assuming the gain is large enough, the voltage of the inverting terminal equals the offset, $V_{os}$. During phase 2 the gain stage goes into the open loop configuration and the switch $S_3$ connects the left terminal of $C_A$ to the input voltage, $V_{in}$. We assume that, to a first approximation, the capacitor operates like a level shifter. Therefore, the voltage at the inverting terminal of the gain stage is a replica of the input voltage shifted by $V_{os}$, as requested.

Fig. 6.7 - Block diagram used to implement the offset cancellation approach.

Strictly speaking, the voltage at the inverting terminal during phase 1, denoted here by $V_{-}$, is not the offset but

$$V_{-} = \frac{V_{os}(A_0 - 1)}{A_0} \quad (6.12)$$

therefore the circuit doesn't perform a complete compensation of the offset. The residual offset is

$$V_{os, res} = \frac{V_{os}}{A_0} \quad (6.13)$$

which is negligible if the gain of the stage is large enough.
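The effect of a finite gain can be quantified with a short Python sketch. The first function applies (6.13) directly; the second uses the exact solution of an ideal unity gain loop, $V_{os}/(1 + A_0)$, which is an assumption of this sketch (ideal amplifier with gain $A_0$, no other errors) and not a formula from the text. The 5 mV offset is illustrative.

```python
def residual_offset(v_os: float, a0: float) -> float:
    """Residual offset after auto-zero, eq. (6.13)."""
    return v_os / a0


def residual_offset_ideal_loop(v_os: float, a0: float) -> float:
    """Residual for an ideal unity gain loop, V_os/(1 + A0); assumption of this sketch."""
    return v_os / (1.0 + a0)


v_os = 5e-3  # illustrative offset of 5 mV
for a0 in (10.0, 100.0, 1000.0):
    r_text = residual_offset(v_os, a0) * 1e6
    r_loop = residual_offset_ideal_loop(v_os, a0) * 1e6
    print(f"A0 = {a0:6.0f}: {r_text:7.1f} uV (6.13), {r_loop:7.1f} uV (ideal loop)")
```

The two estimates converge for large $A_0$. With a gain of only 10 the residual is still a sizeable fraction of a millivolt, which anticipates the limits of the technique in low-gain stages.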

The circuit in Fig. 6.7 has two drawbacks:

- During phase 1 the stage is connected in the unity gain configuration: this may require frequency compensation of the stage.
- The opening of the switch $S_1$ causes an injection of charge on $C_A$ because of the clock feedthrough effect.

The first limit can be detrimental to the speed of the circuit. During the phase 1 the switch  $S_1$  connects the circuit as a unity gain buffer, calling for a compensation capacitor. During the phase 2 the stage works as a comparator. Thus, every clock period the output node swings from analog ground to a large positive or negative level causing a periodic charging and discharging of the compensation capacitance.

Fig. 6.8 shows a possible remedy to the problem. Assume that the gain stage is a single stage architecture. The capacitor $C_C$ loading the output node ensures stability. Since during phase 2 the gain stage is in the open loop condition, no compensation is required. The switch $S_4$ disconnects $C_C$, thus avoiding the charging and discharging of $C_C$. The same solution applied to a two-stage amplifier requires disconnecting the pole-splitting capacitance during phase 2.

The second limit affecting the circuit of Fig. 6.7 concerns the clock feedthrough. The critical node of the network is the inverting terminal. When the switch $S_1$ opens it injects a charge that is trapped in the inverting terminal node and reduces the effectiveness of the offset cancellation. By contrast, the charge injected on the other terminal of $C_A$ by the opening of $S_2$ doesn't affect the operation of the circuit. After a short period of time $S_3$ closes and the low-impedance input generator $V_{in}$ takes control of the voltage on the left side of $C_A$.

Fig. 6.8 - Disconnecting the compensation capacitance during $\Phi_2$ speeds up the circuit.

**Fig. 6.9** - Circuit for estimating the residual offset caused by the clock feedthrough from $S_1$.

The amount of charge injected by $S_1$ depends on the fall time of the clock phase controlling $S_1$ and on the boundary conditions on the two sides of the switch. Namely, the boundary conditions at the virtual ground side depend on whether $S_2$ is closed or open. Since $S_1$ and $S_2$ are controlled by the same phase, it may happen that when $S_1$ opens $S_2$ is in an undefined condition. It is therefore convenient to control $S_1$ and $S_2$ with slightly delayed phases. The two switches then do not open at the same time and the boundary conditions for $S_1$ are firmly established. It is recommended to open $S_1$ first, so that the left plate of $C_A$ is secured to ground by the closed $S_2$. Otherwise, the left plate of $C_A$ can be at a voltage that is not so well controlled.

The parallel connection of the capacitance $C_A$ and the parasitic load $C_p$ (see Fig. 6.9) receives the charge, $Q_{inj}$, injected by $S_1$. The residual offset is

$$V_{os, res, inj} = \frac{Q_{inj}}{(C_A + C_p)} \quad (6.14)$$

Moreover, the parasitic capacitance $C_p$ loading the inverting input attenuates the input signal by the factor $C_A/(C_A + C_p)$.

---

### Example 6.2

Determine, with Spice simulations, the residual offset caused by the clock feedthrough. Use the scheme of Fig. 6.8 and $C_A = 1 \text{ pF}$; $C_C = 2 \text{ pF}$. Moreover, use the op-amp of Example 5.4 with a supply voltage of 3.3 V. Introduce an artificial offset equal to 4 mV by a proper mismatch of the active load transistors. Implement the switches considering the following three cases: complementary transistors whose aspect ratios are $(W/L)_n = 5\mu/0.3\mu$ and $(W/L)_p = 5\mu/0.3\mu$; only the n-channel transistor; only the p-channel transistor.

### Solution:

The op-amp of Example 5.4 requires a compensation of 3 pF. The proposed solution provides the same output load: 1 pF given by $C_A$ and 2 pF by $C_C$. The control of the switches requires non-overlapping complementary phases. The reader can obtain them with a set of pulse generators or with a non-overlapping phase generator driven by a master clock. The above figure shows the circuit diagram. The switch constitutes a subcircuit. This choice facilitates the replacement of the complementary transistors with an n-channel or a p-channel transistor as required.

A number of trial simulations determine the mismatch that brings the offset to 4 mV: the width of one of the active load transistors must be reduced to $334\mu$.

The charge injected by each transistor of the switch is estimated by

$$Q_{inj} = \frac{1}{2} C_{ox} WL \cdot V_{DD} = \frac{2.1 \cdot 5 \cdot 0.3 \cdot 3.3}{2} \cdot 10^{-15} = 5.2 \cdot 10^{-15} \text{ C}$$

where the gate oxide specific capacitance is $2.1 \text{ fF/}\mu\text{m}^2$. The charge $Q_{\text{inj}}$ integrated over $C_A = 1 \text{ pF}$ produces an offset of $5.2 \text{ mV}$. The use of an n-channel element leads to a negative residual offset and the use of a p-channel transistor causes a positive offset, while the use of complementary transistors leads, to a first approximation, to a compensation of the charge injections. The simulations give the following results: only p-MOS $5.63 \text{ mV}$; only n-MOS $-5.71 \text{ mV}$; complementary transistors $-0.14 \text{ mV}$.
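The hand estimate of this example, together with (6.14), can be reproduced with a short Python sketch. The function names are illustrative, and the 0.1 pF parasitic used in the last lines is a hypothetical value added to show the effect of $C_p$; the example itself neglects it.

```python
def injected_charge(c_ox: float, w: float, l: float, v_dd: float) -> float:
    """Charge estimate used in the example: Q = C_ox*W*L*V_DD/2."""
    return 0.5 * c_ox * w * l * v_dd


def injection_offset(q_inj: float, c_a: float, c_p: float = 0.0) -> float:
    """Residual offset caused by the injected charge, eq. (6.14)."""
    return q_inj / (c_a + c_p)


def input_attenuation(c_a: float, c_p: float) -> float:
    """Signal attenuation of the capacitive divider, C_A/(C_A + C_p)."""
    return c_a / (c_a + c_p)


# C_ox in F/um^2, W and L in um, as in the example.
q = injected_charge(c_ox=2.1e-15, w=5.0, l=0.3, v_dd=3.3)
print(f"Q_inj = {q * 1e15:.2f} fC")
print(f"offset on C_A = 1 pF: {injection_offset(q, 1e-12) * 1e3:.2f} mV")

c_p = 0.1e-12  # hypothetical parasitic, not given in the example
print(f"offset with C_p = 0.1 pF: {injection_offset(q, 1e-12, c_p) * 1e3:.2f} mV")
print(f"input attenuation: {input_attenuation(1e-12, c_p):.3f}")
```

The estimate of about 5.2 mV is within 10% of the simulated values for the single-transistor switches, which is as much as can be expected from such a simple model of the charge injection.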

### 6.4.2 Auto-zero in Multi-stages Architectures

The amplifier of the block diagram in Fig. 6.2 can be a single amplifier whose gain is large enough or a cascade of gain stages with a relatively low gain. In the latter case the auto-zero technique is not very effective because, according to equation (6.13), the residual offset is inversely proportional to the gain. Moreover, the offset of the second stage is referred to the input divided by $A_1$. Therefore, the attenuation of $V_{os,2}$ can be insufficient. When the value of the residual offset is not acceptable it is necessary to use the auto-zero both in the first and in the second stage.

A straightforward use of the auto-zero technique would lead to the schematic of Fig. 6.10 a). An auto-zero network made by the auto-zero capacitor and three switches operates on the second amplifier. However, one can observe that the left plate of capacitor $C_2$ is connected to the output of $A_1$ during phase 1 and to the analog ground during phase 2. These two voltages are the ones that the output node of $A_1$ develops, assuming that the effect of $V_{os,1}$ is negligible. Therefore, it is convenient to remove the two switches in the second auto-zero network to obtain the schematic of Fig. 6.10 b).

Fig. 6.10 - a) Direct use of the auto-zero in a cascade of gain stages. b) Improved solution.

Fig. 6.11 - Phase $\Phi_1$ and $\Phi'_1$ useful for the control of the switches $S_1$ and $S_2$ of Fig. 6.10 b).

The circuit of Fig. 6.10 b) is not only less complex than the one in Fig. 6.10 a); it also provides an additional benefit. In reality, during phase 1 the capacitor $C_2$ is charged to the difference between the offset of the second stage and the offset of the first stage. Therefore, it samples at the same time the offset of the second stage and the consequence of $V_{os,res,1}$. It turns out that $V_{os,res,1}$ is auto-zeroed by $C_2$ and the total residual offset becomes

$$V_{os,res} = \frac{V_{os,2}}{A_1 A_2} \quad (6.15)$$

The opening of the switches  $S_1$  and  $S_2$  produces a clock feedthrough injection into the auto-zero capacitors  $C_1$  and  $C_2$ . The injected charges cause an additional input referred residual offset

$$V_{os,res,ck} = \frac{Q_{inj,1}}{C_1} + \frac{Q_{inj,2}}{A_1 C_2} \quad (6.16)$$

### COMMENT ON FIG. 6.10 B

The residual offset of the first stage is auto-zeroed only if its effect is stored on the auto-zero capacitor of the second stage. This, in turn, imposes the use of the phase scheme of Fig. 6.11.

Fig. 6.12 - Possible circuit implementation of the auto zeroed comparator of Fig. 6.10 b).

The already discussed benefit of the circuit of Fig. 6.10 b) can be exploited to cancel the first term in (6.16). For this it is just necessary to use the clock phases $\Phi_1$ and $\Phi'_1$ shown in Fig. 6.11 to drive $S_1$ and $S_2$ respectively. The following operation results: $S_1$ opens first and injects its clock feedthrough charge, $Q_{inj,1}$, into $C_1$. The residual offset $V_{res,inj,1} = Q_{inj,1}/C_1$, amplified by $A_1$, appears at the output of $A_1$. Assuming that the residual offset and the gain $A_1$ are not too large, the output of $A_1$ remains out of the region where the gain drops. Moreover, since $S_2$ is still closed the auto-zero capacitor $C_2$ samples $A_1 V_{res,inj,1}$. Therefore, the auto-zero operation cancels the residual offset of the first stage, the clock feedthrough of the first stage and, partially, the offset of the second stage. Accounting for all the discussed effects, one obtains an input referred offset given by

$$V_{os,res} = \frac{V_{os,2}}{A_1 A_2} + \frac{Q_{inj,2}}{A_1 C_2} \quad (6.17)$$
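The advantage of the delayed phase can be made concrete by comparing (6.15) plus (6.16) with (6.17). The Python sketch below uses hypothetical values: two stages with a gain of 10, a 5 mV second-stage offset, 1 pF auto-zero capacitors and the 5.2 fC injected charge estimated in Example 6.2.

```python
def residual_same_phase(v_os2, a1, a2, q_inj1, c1, q_inj2, c2):
    """Eq. (6.15) plus eq. (6.16): S1 and S2 driven by the same phase."""
    return v_os2 / (a1 * a2) + q_inj1 / c1 + q_inj2 / (a1 * c2)


def residual_delayed_phase(v_os2, a1, a2, q_inj2, c2):
    """Eq. (6.17): S1 opens first, so C2 also stores the effect of Q_inj,1."""
    return v_os2 / (a1 * a2) + q_inj2 / (a1 * c2)


# Hypothetical values chosen for illustration.
v_os2, a1, a2 = 5e-3, 10.0, 10.0
q_inj, c_az = 5.2e-15, 1e-12

same = residual_same_phase(v_os2, a1, a2, q_inj, c_az, q_inj, c_az)
delayed = residual_delayed_phase(v_os2, a1, a2, q_inj, c_az)
print(f"same phase for S1 and S2: {same * 1e3:.2f} mV")
print(f"delayed phase for S2:     {delayed * 1e3:.2f} mV")
```

With these numbers the clock feedthrough of the first switch dominates the residual when the two switches open together, while the delayed phase leaves only terms divided by $A_1$.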

Fig. 6.12 shows a possible circuit implementation of the scheme of Fig. 6.10 b). Simple inverters realize the gain stages. They have only one input (not a differential input) as outlined in Fig. 6.10. Actually, a differential input is not necessary, the feedback connections established by  $S_1$  and  $S_2$  bring about the offsets at the input terminals anyway. Moreover, a reference connection to the virtual ground is not necessary. The auto-zero capacitor provides a possible level shift between the input of the gain stage and the node at which the right plate is connected.

The gain of the inverters used in Fig. 6.12 is typically small (around 10). Moreover, in order to keep low the power consumption, the designer must use pretty long transistors. Another limit of the circuit in Fig. 6.12 is the poor power supply rejection. On the other hand, the circuit doesn't need any bias voltage. Therefore, the circuit in Fig. 6.12 is suitable for architectures that use a large number of comparators like, for example, flash converters but don't require a very high accuracy.

### 6.4.3 Fully Differential Implementation

A fully differential architecture improves the immunity of op-amps against common mode disturbances. The same technique can be conveniently used with comparators. Fig. 6.13 shows the fully differential version of the auto-zero pre-amplifier. It uses a fully differential gain stage and duplicates the auto-zero network. The injection of charge at the opening of $S_1$ and $S_2$ causes a common mode signal at the input that the common mode rejection of the stage suppresses. Only a possible mismatch between the switches $S_1$ and $S_2$ may affect the circuit. Different injected charges produce a differential signal that generates an offset. Assuming that the matching accuracy of minimum size transistors is in the order of 5%, the fully differential architecture reduces by a factor 20 the residual offset generated by the clock feedthrough.

Fig. 6.13 - Fully-differential auto-zeroed gain comparator.

The use of a fully differential architecture provides four input terminals. In Fig. 6.13 two of them are connected to the analog ground and the remaining two are used for the inputs. However, the designer can use the four inputs to compare a combination of four voltages: the difference of the voltages applied to the positive input compared to the difference of the voltages applied to the negative input.

An important design issue concerns the choice of the gain-stage architecture. It can be made by the cascade of low-gain stages or by the use of stages with moderate-high gain. Fig. 6.14 a) shows a simple gain stage. Its gain is

$$A_v = \frac{g_{m1}}{g_{m3}} = \sqrt{\frac{\mu_n(W/L)_1}{\mu_p(W/L)_3}} \quad (6.18)$$

A proper choice of the transistors' aspect ratio easily leads to gain in the order of 10.
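Equation (6.18) can be turned into a quick sizing check. In the Python sketch below the mobility ratio of 3 and the two aspect ratios are hypothetical values, chosen only to land on a gain of 10; they do not come from the text.

```python
import math


def diode_load_gain(mu_ratio: float, wl_1: float, wl_3: float) -> float:
    """Eq. (6.18): gain of a differential pair loaded by diode connected transistors.

    mu_ratio is mu_n/mu_p; wl_1 and wl_3 are the aspect ratios of M1 and M3.
    """
    return math.sqrt(mu_ratio * wl_1 / wl_3)


def required_wl_ratio(gain: float, mu_ratio: float) -> float:
    """Aspect ratio quotient (W/L)_1/(W/L)_3 needed for a target gain."""
    return gain ** 2 / mu_ratio


print(f"gain: {diode_load_gain(3.0, 50.0, 1.5):.1f}")
print(f"(W/L)_1/(W/L)_3 for a gain of 10: {required_wl_ratio(10.0, 3.0):.1f}")
```

Because the gain grows only with the square root of the aspect ratio quotient, values much above 10 quickly demand impractical transistor sizes.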

An interesting feature of the circuit in Fig. 6.14 a) is that the output impedance is relatively low. The diode connected elements $M_3$ and $M_4$ control the output quiescent voltage; therefore, the common mode feedback is not necessary. We know that for high-speed applications, it is not the gain that is the most important parameter but the $g_m/C_L$ ratio. The load capacitance of the stage in Fig. 6.14 a) mainly comes from the loading capacitors $C_1$ and the gate capacitance of $M_3$-$M_4$.

$$C_L = C_{ox}(WL)_3 + C_{db,3} + C_1 \quad (6.19)$$

Fig. 6.14 - a) Fully differential stage with low gain. b) Fully differential stage with moderate gain and common mode feedback.

The gate area of  $M_3$  can be large, thus limiting the speed performances. The use of the circuit of Fig. 6.14 b) partially solves the problem. Fig. 6.14 b) is a single stage differential amplifier with active load. As it is known the gain is given by

$$A_v = \frac{g_{m1}}{g_{ds,3}} \quad (6.20)$$

so there is no longer any need to use long active loads. Moreover, the parasitic contribution of transistor $M_3$ to $C_L$ is just from $C_{db,3}$. However, since the impedance of the output node is pretty high, a common mode feedback is necessary. The circuit of Fig. 6.14 b) uses one of the continuous-time solutions discussed in the previous chapter. Transistors $M_A$ and $M_B$ degenerate the tail current generator, $M_5$. A differential output doesn't affect the current, while a common mode output changes the bias conditions.

## OBSERVATION

In order to speed up the operation of a comparator it is essential to minimize the capacitances that integrate the signal current. Loads made of small transistors and a careful layout help in achieving the target.

Fig. 6.15 - Fully differential gain stage with common mode feedback after decoupling buffers.

Observe that the common mode feedback loads the output, thus affecting $C_L$. Since $M_A$ and $M_B$ operate in the triode region, we have

$$C_L = C_{db1} + C_{db3} + \frac{1}{2}C_{ox}(WL)_A + C_1 \quad (6.21)$$

This is likely smaller than the load capacitance expressed by (6.19). In fact, the drain-to-bulk capacitances are normally negligible and the area of transistor $M_A$ can be smaller than that of $M_3$ in Fig. 6.14 a).
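The comparison between (6.19) and (6.21) can be made concrete with a few numbers. The following sketch uses purely hypothetical values (oxide capacitance, gate areas, junction and external capacitances are all assumptions chosen for illustration, not taken from the models of the appendices).

```python
# All values are hypothetical, expressed in fF and um^2.
C_OX = 5.0        # assumed gate-oxide capacitance, fF/um^2
C_DB = 5.0        # assumed drain-to-bulk capacitance of each device, fF
C_EXT = 50.0      # assumed external loading capacitor C_1, fF
AREA_M3 = 20.0    # assumed gate area (WL)_3 of the diode-connected load, um^2
AREA_MA = 4.0     # assumed gate area (WL)_A of the triode CMFB device, um^2

def load_diode_stage(c_ox, area_3, c_db3, c_1):
    """Load capacitance of the diode-loaded stage, eq. (6.19)."""
    return c_ox * area_3 + c_db3 + c_1

def load_active_stage(c_db1, c_db3, c_ox, area_a, c_1):
    """Load capacitance of the active-load stage with triode CMFB, eq. (6.21)."""
    return c_db1 + c_db3 + 0.5 * c_ox * area_a + c_1

cl_a = load_diode_stage(C_OX, AREA_M3, C_DB, C_EXT)
cl_b = load_active_stage(C_DB, C_DB, C_OX, AREA_MA, C_EXT)
print(f"C_L (6.19) = {cl_a:.0f} fF, C_L (6.21) = {cl_b:.0f} fF")
```

With these assumed numbers the gate of $M_3$ dominates (6.19), while in (6.21) the external capacitor is the main term.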

Fig. 6.15 shows another realization. It uses a differential amplifier with active load, like the solution of Fig. 6.14 b), but it employs a different common mode feedback. Two source followers replicate the outputs. The replicas drive the source-coupled pair $M_9-M_{10}$, whose common source, in turn, controls the gate of $M_5$. The circuit of Fig. 6.15 enjoys two benefits. The common mode feedback no longer loads the output nodes of the gain stage (nodes A and B). Moreover, the use of the source followers leads to minimum capacitances loading nodes A and B. Up to their unity gain frequency the followers provide a good replica of the input. Therefore, a bootstrap on $C_{gs7}$ and $C_{gs9}$ results. The capacitances seen from nodes A and B become

$$\begin{aligned} C_{gs7,boot} &= C_{gs7}(1 - A_B) \\ C_{gs9,boot} &= C_{gs9}(1 - A_B) \end{aligned} \quad (6.22)$$

where $A_B$ is the gain of the source follower. The capacitance loading nodes A and B becomes

$$C_L = C_{db1} + C_{db3} + C_{gs7}(1 - A_B) \quad (6.23)$$
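To get a feeling for the bootstrap effect in (6.22) and (6.23), the sketch below evaluates the load for an assumed follower gain. The capacitance values and the gain $A_B = 0.9$ are hypothetical and serve only to illustrate the formulas.

```python
def bootstrapped_cap(c_gs, a_b):
    """Effective gate-source capacitance seen at the follower input, eq. (6.22)."""
    return c_gs * (1.0 - a_b)

def follower_loaded_cl(c_db1, c_db3, c_gs7, a_b):
    """Load capacitance at nodes A and B with the follower in place, eq. (6.23)."""
    return c_db1 + c_db3 + bootstrapped_cap(c_gs7, a_b)

# Hypothetical values in fF; A_B = 0.9 is an assumed source-follower gain.
c_l = follower_loaded_cl(c_db1=5.0, c_db3=5.0, c_gs7=40.0, a_b=0.9)
print(f"C_gs7 seen from node A: {bootstrapped_cap(40.0, 0.9):.1f} fF")
print(f"C_L = {c_l:.1f} fF")
```

A follower gain close to one makes the gate-source capacitance almost disappear from the load, so the drain-to-bulk terms become the dominant contribution.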

Remember that the common mode feedback used in Fig. 6.15 has quite a limited region of linear operation. A relatively small differential signal completely unbalances the stage. One transistor turns off and the common source follows the gate of the other. This malfunction can cause a significant change of the tail current.

### Example 6.3

Design the fully differential gain stage shown in Fig. 6.15. It is required to obtain an output signal 50 times bigger than the input in 50 nsec. Estimate the effect of a capacitance loading the nodes A and B. Use the Spice models of Appendix C.

#### Solution:

The transient response of the gain stage depends on the transconductance of the input pair and the capacitive loads on the nodes A and B (Fig. 6.15). The time available is pretty small; therefore, we expect to use only the initial part of the exponential transient. The use of equation (6.2)

$$V_o(t) \approx V_i t \frac{g_m}{C_o} \quad \text{for} \quad t \ll R_o C_o$$

permits us to determine the required value: $g_m/C_L = 50/(50\text{ nsec}) = 10^9\text{ s}^{-1}$.
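The same estimate can be scripted. The sketch below applies the initial-slope approximation of equation (6.2) and also computes the capacitance budget for the transconductance of 1.7 mA/V quoted later in this example; the function names are illustrative.

```python
def required_gm_over_c(gain, t_avail):
    """Invert V_o/V_i = (g_m/C_o) * t, valid for t << R_o*C_o; result in 1/s."""
    return gain / t_avail

def max_load_capacitance(gm, gain, t_avail):
    """Largest total capacitance (F) that still meets the gain in the given time."""
    return gm / required_gm_over_c(gain, t_avail)

ratio = required_gm_over_c(gain=50.0, t_avail=50e-9)
c_max = max_load_capacitance(gm=1.7e-3, gain=50.0, t_avail=50e-9)
print(f"g_m/C_L = {ratio:.3g} 1/s, C_L budget = {c_max * 1e12:.2f} pF")
```

The budget includes every capacitance at nodes A and B, parasitic and external, and it holds only as long as the linear part of the transient is used.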

The transconductance of the input pair increases with the square root of the input pair aspect ratio. However, the load capacitance increases because of the  $C_{db}$  contribution. Therefore, there is a trade-off between the transconductance increase and the worsening of the load capacitance.

The above considerations are (on purpose) not accounted for in the design used here. The reader is asked to work on the schematic and improve the performance of the circuit.

The figure below shows the schematic. Observe that the aspect ratio of the input pair is as large as $(450\mu/1\mu)$. That size and a tail current of $200\mu\text{A}$ lead to $g_m = 1.7\text{ mA/V}$.

The use of small transistors in the common mode feedback network determines a large overdrive of the source coupled transistors. This, in turn, produces a reasonably large range of operation of the common mode feedback.

The figure below shows the transient response for a $\pm 1\text{ mV}$ step input. The curves refer to different capacitive loads at nodes A and B. The results indicate that, in order to reach $\pm 50\text{ mV}$, the circuit needs 30 nsec with zero load, 55 nsec with $0.5\text{ pF}$, 80 nsec with $1\text{ pF}$, and so forth. These results permit us to estimate the value of $C_L$: since each additional $0.5\text{ pF}$ costs 25 nsec, the 30 nsec needed with zero external load corresponds to approximately $0.6\text{ pF}$. The relatively large value of $C_L$ suggests that, as already mentioned, there is some room for improvement.
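The estimate of $C_L$ follows from a straight-line fit of the settling time versus the external load. The sketch below is one possible way to do it, assuming the linear model $t = G\,(C_L + C_{ext})/g_m$ with $G = 50$; the three data points are the ones quoted in the text.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

ext_load_pf = [0.0, 0.5, 1.0]     # external capacitance at nodes A and B
time_ns = [30.0, 55.0, 80.0]      # time needed to reach +/-50 mV

slope, intercept = fit_line(ext_load_pf, time_ns)   # ns/pF and ns
c_intrinsic_pf = intercept / slope                  # load already present with zero external C
gm_effective_ma_v = 50.0 / slope                    # from slope = G/g_m; pF/ns equals mA/V
print(f"C_L = {c_intrinsic_pf:.2f} pF, effective g_m = {gm_effective_ma_v:.2f} mA/V")
```

Under this idealised model the fit returns the 0.6 pF quoted above. The implied effective transconductance is lower than the hand-calculated value, which is not surprising since the simple formula ignores the finite output resistance and other second-order effects.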



#### 6.4.4 Use of an Auxiliary Stage

The auto-zero technique studied in the previous sub-section is based on the measurement of the offset at the input of the gain stage. An alternative possibility is to short the inputs and to measure the effect at the output. The gain stage amplifies the offset, and this makes the compensation easier. The only limit is that the offset and the gain must be small enough to avoid pushing the output voltage close to the supply voltage limits.

Fig. 6.16 - Conceptual scheme of the offset cancellation technique with auxiliary stage.

Fig. 6.16 shows the conceptual scheme of the idea. The switch $S_1$ connects the inverting terminal to the non-inverting one (assumed tied to the virtual ground). At the output of $A_1$ a feedback loop processes the signal. Assuming $S_2$ closed, the gain stage $A_2$ equalizes its two inputs. Therefore, to a first approximation the output voltage becomes $V_{os,2}$. That means that $A_2$ generates a voltage capable of almost compensating $V_{os,1}$ amplified by $A_1$. A more accurate analysis of the circuit leads to the following balance equation

$$A_1 V_{os,1} + A_2 (V_{os,2} - V_o) = V_o \quad (6.24)$$

That results in

$$V_o = \frac{A_1}{1 + A_2} V_{os,1} + \frac{A_2}{1 + A_2} V_{os,2} \quad (6.25)$$
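A short numerical check confirms that (6.25) is the solution of the balance equation (6.24). The gains and offsets used below are hypothetical values chosen only for the verification.

```python
def autozero_output(a1, a2, vos1, vos2):
    """Output voltage during the auto-zero phase, eq. (6.25)."""
    return a1 / (1.0 + a2) * vos1 + a2 / (1.0 + a2) * vos2

def balance_residual(a1, a2, vos1, vos2, vo):
    """Left-hand side minus right-hand side of the balance equation (6.24)."""
    return a1 * vos1 + a2 * (vos2 - vo) - vo

# Hypothetical values: A1 = 100, A2 = 20, offsets of 5 mV and 8 mV.
A1, A2, VOS1, VOS2 = 100.0, 20.0, 5e-3, 8e-3
vo = autozero_output(A1, A2, VOS1, VOS2)
print(f"V_o = {vo * 1e3:.2f} mV, residual of (6.24) = {balance_residual(A1, A2, VOS1, VOS2, vo):.1e} V")
```

With these numbers the uncompensated output would be $A_1 V_{os,1} = 0.5$ V, while the loop holds the output at a few tens of millivolts.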

The role of the two inputs can be interchanged: the negative terminal can be connected to the analog ground and the positive input to the input signal. Moreover, the switch $S_3$ connects the input generator to the input of the comparator. The impedance of the input generator is finite; therefore, a possible charge injected when $S_1$ opens is discharged through $S_3$, and the clock feedthrough produced by $S_1$ leads just to a glitch in the response. To obtain this result it is necessary to open $S_1$ after $S_2$; otherwise the effect of the charge injection of $S_1$ is captured by the feedback loop and stored on $C_S$. The opening of the switch $S_2$ causes a charge injection on the storing element $C_S$. The injected charge, $Q_{inj}$, is integrated over $C_S$ and amplified by $A_2$. When it is referred to the input, it determines a residual offset given by



Fig. 6.17 - Four input OTA suitable for offset cancellation by the technique that uses an auxiliary stage.

$$V_{os,res} = \frac{Q_{inj}}{C_S} \left( \frac{A_2}{A_1} \right) \quad (6.26)$$

The designer minimizes the residual offset by using a pretty large capacitor  $C_S$  and designing the gain  $A_1$  larger than the one of the auxiliary amplifier,  $A_2$ . When the gain stage  $A_2$  is a single stage amplifier, capacitor  $C_S$  operates as its compensation element during the unity gain connection.
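The effect of the two design knobs in (6.26) can be seen with a quick calculation. The injected charge, the capacitor sizes and the gains below are hypothetical values used only for illustration.

```python
def residual_offset(q_inj, c_s, a1, a2):
    """Input-referred residual offset caused by the opening of S_2, eq. (6.26)."""
    return (q_inj / c_s) * (a2 / a1)

# Hypothetical values: 10 fC injected charge, A1 = 100, A2 = 10.
Q_INJ, A1, A2 = 10e-15, 100.0, 10.0
for c_s in (1e-12, 5e-12):
    v_res = residual_offset(Q_INJ, c_s, A1, A2)
    print(f"C_S = {c_s * 1e12:.0f} pF -> residual offset = {v_res * 1e3:.2f} mV")
```

Increasing $C_S$ by a given factor or reducing $A_2/A_1$ by the same factor gives the same improvement; the limit on $C_S$ is the speed of the auxiliary amplifier that must drive it.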

### NOTICE

The main benefit of using an auxiliary stage for the auto-zero is the separation between the storing element and the input node: we can connect the input signal directly and we can use fairly large storing capacitors to minimize the residual offset (assuming that the auxiliary amplifier is fast enough).

The technique discussed above can be implemented with a one-to-one translation of the basic scheme of Fig. 6.16. However, it is possible to identify solutions that achieve a more efficient implementation. Fig. 6.17 shows a fully differential solution that implements the two amplifiers of the scheme in Fig. 6.16. The two differential pairs realize the input stage of the main amplifier and that of the auxiliary amplifier. The output currents of the two stages are combined and transformed into a voltage thanks to the high resistance of the two output nodes. Therefore, the circuit, instead of generating voltages that must be summed afterwards, combines the output currents and then transforms the result into a voltage. This strategy significantly simplifies the circuit schematic. The transistor sizes and the tail currents of the two differential stages control the two gains. Therefore, the designer can easily fulfil the request to have $A_1 > A_2$.

Fig. 6.18 - Four input OTA based on the current mirror degeneration.

Fig. 6.18 shows a second circuit solution. It is based on a mirrored differential gain stage. The circuit includes two additional transistors, $M_7 - M_8$, that degenerate the current mirror $M_5 - M_6$. A possible unbalance of the auxiliary inputs changes the mirror factor, thus modifying the output voltage.

The small signal differential gain of the main amplifier depends on the transconductance of the input pair, the mirror factor of  $M_4 - M_{11}$  and the output resistance

$$A_1 = g_{m1} \frac{(W/L)_{11}}{(W/L)_4} r_{out} \quad (6.27)$$

The small signal differential gain of the auxiliary stage depends on the transconductance of the additional transistors  $M_7 - M_8$ .

$$A_2 = g_{m8} r_{out} \quad (6.28)$$

Since  $M_7$  and  $M_8$  operate in the triode region, their transconductance is typically lower than the one of the input pair. Therefore, as expected,  $A_1 > A_2$ .

## 6.5 LATCHES

The last block of the architecture in Fig. 6.2 is the latch. It furnishes supplementary gain to generate the logic output. In addition, it provides a stable output synchronous with the clock that brings into operation the regenerative loop.



Fig. 6.19 - Simple latch configuration.

Fig. 6.19 shows a simple latch implementation. Assume that the strobe signal is low. The section $M_5 - M_6 - M_7$ of the circuit doesn't operate and the transistor pairs $M_1 - M_3$ and $M_2 - M_4$ form two inverters with active load. The voltages of nodes 1 and 2 are close to $V_{DD}$ if the input voltages induce currents in $M_1$ and $M_2$ lower than the ones supplied by $M_3$ and $M_4$. When the common mode input exceeds a given level the currents in $M_1$ and $M_2$ become large and the voltages of nodes 1 and 2 drop down to ground.

Let us assume that the common mode input is below the above-mentioned level and that the two inputs differ by a given extent. When the strobe control goes up the transistors $M_5$ and $M_6$ become active and they start the regenerative operation. Since the voltages of nodes 1 and 2 are quite high, the action of $M_5$ and $M_6$ will be significant. However, one of the voltages is more effective than the other and becomes more and more dominant.

If the input voltages are such that nodes 1 and 2 are at a low level (at the limit, below the threshold of $M_5$ and $M_6$), when the strobe control is active the cross-coupled pair $M_5$ and $M_6$ doesn't react properly or, at the limit, remains in the sub-threshold region of operation. Therefore, the circuit in Fig. 6.19 works properly only for a given range of the common-mode input.

The nominal currents in  $M_3$  and  $M_4$  control the power consumption of the circuit.

---

### Example 6.4

Design the latch of Fig. 6.19. The nominal current in $M_3$ and $M_4$ is $100 \mu\text{A}$; the supply voltage is $3.3 \text{ V}$. The input signals differ by $100 \text{ mV}$. The circuit must operate properly with a common mode input voltage ranging from $0.95 \text{ V}$ to $1.25 \text{ V}$. Determine the time required to get to the logic level low ($0.3 \text{ V}$) and the logic level high ($3 \text{ V}$). Use the Spice models of Appendix A.

#### Solution:

The description of the behaviour given above recommends a high voltage level for nodes 1 and 2 in the preset phase. This requires the use of relatively small input transistors. The highest input voltage must generate less than the nominal current of $M_3$ and $M_4$. Some Spice simulations determine $(W/L)_1 = (W/L)_2 = 4\mu/1\mu$. Transistors $M_5$ and $M_6$ are chosen larger than $M_1$ and $M_2$ to ensure a solid regenerative action. The resulting Spice list is shown below.

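```
LATCH
* Input devices (n-channel), gates driven by VINP and VINN
M1 1 3 0 0 MODN W=4U L=1U
M2 2 4 0 0 MODN W=4U L=1U
* p-channel loads, biased by the diode-connected reference MBP
M3 1 7 6 6 MODP W=4U L=1U
M4 2 7 6 6 MODP W=4U L=1U
* Cross-coupled regenerative pair (M5, M6) and strobe switch (M7)
M5 1 2 5 0 MODN W=8U L=1U
M6 2 1 5 0 MODN W=8U L=1U
M7 5 8 0 0 MODN W=8U L=1U
MBP 7 7 6 6 MODP W=4U L=1U

* 100 uA bias current, supply and strobe pulse
IBP 7 0 0.1M
VDD 6 0 3.3
V8 8 0 PULSE (0 3.3 0.1n 0.02n 0.1n 50n 100n)

* Inputs: 100 mV difference around a 0.95 V common mode
VINP 3 0 0.9
VINN 4 0 1.0

.MODEL . . . . .
```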

The figure displays the output responses for three different cases. They show proper operation at the two limits of the required range of operation. The time required to achieve the high level is between 1 nsec and 2 nsec. The swing in the downward direction is faster, between 0.9 nsec and 1.4 nsec. Observe that when the strobe goes high the voltages of nodes 1 and 2 both drop down, but after a while the stronger one takes over and goes to $V_{DD}$.

The third pair of curves shows the response for input voltages higher than the required range. It corresponds to a critical point of the transition region. The voltages of nodes 1 and 2 during the preset phase are down to 0.5 V and 1.5 V. This is not enough for a quick establishment of the logic level. The higher voltage is still below 1.5 V at 4 nsec.

Fig. 6.20 shows another latch configuration. It uses p-channel transistors for the regenerative loop and n-channel transistors for the preset phase. The circuit works as follows. When the strobe is high (that corresponds to the preset phase: the strobe controls the gate of a p-channel transistor) the differential pair $M_1$ and $M_2$ discharges the output nodes. The current generator $M_5$ determines the bias current in the circuit. A possible unbalance of the input voltages ties one node down to ground better than the other. When the strobe goes down, this unbalance makes the initial values of the outputs different, thus driving the positive feedback in the right direction. In the latch phase the current of $M_5$ controls the power consumption.

In the circuit of Fig. 6.21 the input signal discharges the drains of $M_2$ and $M_3$ during the preset phase. Also, the drains of $M_8$ and $M_9$ are brought to $V_{DD}$ by the action of $M_7$ and $M_{10}$, whose gates are controlled by the strobe signal. During the latch phase two regenerative feedback loops reinforce each other to obtain the output voltage.

Fig. 6.20 - Simple latch configuration with the control of the tail current.

Fig. 6.21 - Latch with a double positive feedback.

The circuit of Fig. 6.22 is a combination of a gain stage and latch. It achieves its gain during the preset phase thanks to the transconductance of the pair  $M_1 - M_2$  and the resistance at the output nodes and performs the latch function during the next phase by the use of  $M_6 - M_7$ . Transistors  $M_5$  and  $M_6$  switch the tail current established by  $M_9$  toward the input pair  $M_1 - M_2$  or toward the cross-coupled pair  $M_6 - M_7$ . Transistor  $M_9$  controls the power consumption in the two phases of operation. In order to secure a proper gain in the preset phase the circuit requires the use of a common mode feedback (not shown in the circuit).



Fig. 6.22 - Combination of a gain stage and a latch.

### Example 6.5

Study, with Spice simulations, the cascade of a pre-amplifier and a latch shown in the figure. Verify that the transistor sizes determine a pre-amplifier gain of 16. Determine the output response for 1 mV differential input. Estimate the kick-back on the output of the pre-amplifier that the latch activation causes.



#### Solution:

Equation (6.18) provides the gain of the pre-amplifier

$$A_v = \frac{g_{m1}}{g_{m3}} = \sqrt{\frac{\mu_n(W/L)_1}{\mu_p(W/L)_3}}$$

where $M_1$ and $M_3$ are the input pair and the diode connected active load respectively. Assuming that the electron mobility is twice that of holes, the above equation leads to a gain of 53. The simulation reveals a considerably lower result ($A_v = 16$), showing the limit of the approximate equation for rather large transistors.

All the transistors of the latch have the same sizes ( $10\mu/0.3\mu$ ). This choice facilitates the layout of the circuit.

The transient simulation of the circuit gives the result shown in the next figure. The first trace displays the strobe signal. The second one shows the outputs of the pre-amplifier and the last trace plots the output of the latch. The results show that after waking up the latch the output voltages drop down. This brings down the outputs of the pre-amplifier from the initial levels (1.048 V, 1.064 V) by 15 mV or so. After about 0.5 nsec the outputs of the latch reach the logic levels. By contrast, the recovery of the outputs of the pre-amplifier is quite slow. The curves even cross each other. The next diagram shows that the transient lasts for about 20 nsec. Therefore, the kick-back produced by the latch disables the normal function of the pre-amplifier, thus affecting the next comparison phase.




## 6.7 PROBLEMS

- **6.1** Repeat Example 6.1 but use an inverter with active load as basic gain stage. Use the following design parameters: $(W/L)_n = 20\mu/0.3\mu$; $(W/L)_p = 30\mu/0.4\mu$; $V_{DD} = 1.8$ V. The n-channel element is the input device. Use the transistor models of Appendix C.
- **6.2** Simulate with Spice the cascade of three identical gain stages modelled by the equivalent circuit of Fig. 6.3. Determine the normalized response to a step input.
- **6.3** Repeat Example 6.2 but use the two-stage op-amp designed in Example 5.2. Estimate the limitation to speed caused by the compensation network and show the benefit resulting from switching off the compensation network during the comparison phase.
- **6.4** Simulate the circuit in Fig. 6.10 b). Use the gain stage of Example 6.1 and $C_1 = C_2 = 1$ pF. The switches are n-channel transistors whose sizes are $(W/L) = 3\mu/0.4\mu$. Verify that the residual offset (including the one caused by the clock feedthrough) of the first stage is cancelled by the use of phases like the ones shown in Fig. 6.11.
- **6.5** Design, using Spice and the models of Appendix C, the gain stage of Fig. 6.14 b). The gain must be larger than 25 and the output voltage must swing around $2.4\text{ V}$. $V_{DD} = 3.3\text{ V}$.
- **6.6** Repeat Example 6.3. Modify the circuit design for optimum speed. The current of the pre-amplifier can increase by a factor of 2.
- **6.7** Design the latch of Fig. 6.20. The nominal current of $M_5$ is $150\text{ mA}$. The input unbalance is $30\text{ mV}$ around $V_{DD}/2$. Using the models of Appendix A and $V_{DD} = 5\text{ V}$, try to obtain a latching time below $0.7\text{ nsec}$.
- **6.8** Design the combination gain stage-latch of Fig. 6.22. The sizes of the transistors $M_1 - M_2$ are $(W/L) = 40\mu/0.3\mu$ and those of $M_6 - M_7$ are $(W/L) = 10\mu/0.3\mu$. The tail current is $50\text{ }\mu\text{A}$. Determine the sizes of $M_3$ and $M_4$ that lead to $A_I = 50$. Design a proper common mode feedback and estimate the latch time.