

# Written examination in **Integrated Circuit Design MCC091**

Monday August 24, 2015, at 14.00-18.00 at the Mechanical Engineering Bldg

**Staff on duty:** Lena Peterson, D&IT, phone ext: 1822, or mobile 0706-268907. Will visit around 14.30 and 17.00.

**Administration:** Send exams to Lena Peterson D&IT, and send lists to Susannah Carlsson, MC2.

**Technical aids for students:** This is a closed-book, no-calculator examination. Pen and paper allowed.

**The results** from the examination will be sent to you via the Ladok system within three weeks. The reviews of this exam will take place Friday September 11 2015 12.30-13.30 in room 4128 and Monday September 14 12.30-13.10 in the same room. Solutions will be posted on the course web site in PingPong shortly after the examination. Any student who does not have access to the 2014-2015 PingPong page can contact Lena Peterson (via e-mail to [lenap@chalmers.se](mailto:lenap@chalmers.se)) to obtain the solution.

---

The written examination contains six problems, each worth 10 points. You need 30 points to pass, 40 points for grade “4” and 50 points for grade “5”. Any bonus points from the fall 2014 course instance will be added for the higher grades.

---

**1) Layout, static CMOS logic, compound gates.** A fundamental concept in this course is the correspondence between logical function, circuit schematic and layout. In each task of this problem you are given one of these and are to supply one of the others.

a) Below is the circuit schematic for a compound gate that implements a 5-input logical function. Draw the corresponding layout in the template supplied. The layout should correspond exactly to the schematic – logical equivalence is not enough. Label all inputs and outputs. Indicate clearly any contacts to metal. Copies of the template suitable for hand-in with your other solutions can be found at the end of the exam thesis. (5 p)



b) Below is the layout for a compound static CMOS implementation of a 4-input logical function. Draw the circuit schematic for the gate and find the Boolean expression for the logical function. (5 p)



2) **Inverters.** In this task you will be given some simulated characteristics of NMOS and PMOS devices and you will be asked to extract certain circuit parameters from the diagrams.

a) Diagram A below shows some output characteristic data for an n-channel MOSFET used in an NMOS inverter/amplifier circuit with a resistive load. The resistive load line is also shown in the diagram. What is the voltage gain of this amplifier if biased with an input voltage of 0.6 V? (2 p)



b) Diagrams B and C, on the next page, repeat the same simulated n-channel MOSFET data together with the corresponding data simulated for a p-channel device in the same CMOS technology. What would be the maximum voltage gain of an inverter/amplifier designed with these two devices? (3 p)

c) Draw a diagram of the complete voltage transfer characteristic (VTC) for the CMOS inverter/amplifier circuit. (3 p)

d) List two advantages that CMOS gates have over their nMOS counterparts. (2 p)



3) **Inverter delay, power and frequency** A ring oscillator comprises an odd number of inverters connected in a ring, as shown in the figure below. The period  $T$  of the oscillation is  $T = 2 N t_{pd}$  where  $t_{pd}$  is the propagation time through one inverter and  $N$  is the number of inverters. The factor 2 is there because a full cycle requires both a low-to-high and a high-to-low transition. Assume for this task that in a 65-nm CMOS process we have  $\tau = 5$  ps and that X4 inverters have an input capacitance  $C_{X4} = 1.5$  fF. For simplicity we assume that the inverter output parasitic capacitance is the same size as its input capacitance. Finally, we assume that the square-law MOSFET model can be used to model the MOSFET devices. The supply voltage  $V_{DD}$  is 1.2 V, and the transistors in the inverters have  $V_{Tn} = -V_{Tp} = 0.3$  V (that is  $V_{DD}/4$ ).



- a) Calculate the oscillation frequency and dynamic power consumption for a ring oscillator made up of five X4 inverters. (2 p)
- b) What if we, in the same CMOS process, used X32 inverters instead of X4 inverters? How would the oscillation frequency and the dynamic power consumption change? (2 p)
- c) What if we lowered  $V_{DD}$  from 1.2 V to 1.0 V? How much would the oscillation frequency and power consumption decrease relative to the previous values obtained for  $V_{DD}=1.2$  V? (6 p)

4) **Wire delay, wire and inverter delay** The figure below shows parts of a clock tree from a chip. The width of all wire segments is 200 nm. The wire capacitance for 200 nm wide wires is 0.1 fF/ $\mu$ m. For the inverters we can assume that the input and output capacitances are equal, and their values for X2 and X8 inverters are 5 fF and 20 fF, respectively. The FO4 delay in this CMOS process is 28 ps.



- a) Assuming that the wire segments do not have any resistance, i.e., the sheet resistance  $R_s=0$ , what would be the propagation delay between nodes A and C? Draw the circuit model on which you base your calculations. (3 p)
- b) What if we replaced the X8 inverter with an X16 inverter? How is your circuit model modified and how would the propagation delay from A to C change? (2 p)
- c) Assuming a wire sheet resistance of  $0.1 \Omega/\square$ , what would be the propagation delay between nodes A and C? Show clearly the circuit model on which your calculations are based! (3 p)
- d) Under the same assumptions as in task c), how much longer is the propagation delay from A to D, if one assumes a fanout of 4 at node D? (2 p)

5) **Logical effort, path delay, gate sizing** The figure below shows a path from node A to node B formed by NAND, NOR and 2-1 OR-AND-INVERT gates. Your task as a designer is to size the gates used in this path for minimum delay. You already know that a 2-input NAND gate has a logical effort  $g_{\text{NAND}} = 4/3$  for both of its inputs, and that a 2-input NOR gate has a logical effort  $g_{\text{NOR}} = 5/3$  for both of its inputs, when designed with MOSFETs sized for equal pull-up and pull-down resistances in all paths (assuming the driving capability of a pMOSFET being half that of an nMOSFET).



- a) Draw the circuit diagram for the 2-1 OR-AND-INVERT gate and size the MOSFETs the same way MOSFETs were sized in the NAND and NOR gates for equal pull-up and pull-down resistances in all paths. Then calculate the logical effort for the input used in the circuit above. (2 p)
- b) Calculate the optimal stage effort for minimum delay for the path from A to B. (4 p)
- c) Find the input capacitances for the gates in the path required to achieve the optimal stage effort found in task b). (4 p)

6) **Adders** In the figure below you see a bit-serial adder made up of only two cells. As usual, the full-adder cell, FA, has three input signals,  $a$ ,  $b$ , and  $c_i$ , and two output signals,  $s$  and  $c_o$ . In this type of adder the bits making up the input data are applied serially and the outputs are also serial. The other cell in the adder is a 1-bit register (a D flip-flop). Its output is set equal to its input when the clock signal goes high and it keeps its value until the clock signal goes high the next time. Assume that we have designed the two cells so that the equivalent resistance of the pull-up and pull-down networks in both the adder and the register cells is  $R_{eff} = 10 \text{ k}\Omega$ . Also assume that the capacitive load is  $C = 20 \text{ fF}$  at the register input as well as at the adder carry input.



- a) Determine the maximum operating clock frequency. (1 p)
- b) Assume a word length of 32 bits. Determine the total propagation delay,  $t_{add}$ , to process a complete bit-serial addition of the two input signals,  $a$  and  $b$ . (1 p)
- c) The designers found that the clock signal with the maximum operating clock frequency was difficult to distribute on the chip. Therefore, they decided to design a new digit-serial adder, using 8-bit digits. Show the straightforward architecture for such a digit-serial adder implemented with only the two types of cells shown in the figure above. (2 p)
- d) Determine the total propagation delay,  $t_{add}$ , to process the complete 8-bit digit-serial addition of the 32-bit inputs,  $a$  and  $b$ . Use the architecture from task c) above and assume the same values for  $R_{eff}$  and  $C$  as before. (2 p)
- e) Drawing from your knowledge of more advanced adders, describe one way to modify your solution for task c) so that the total propagation delay,  $t_{add}$ , calculated in task d) decreases. Now you are allowed to use any other cells. Draw a diagram of your proposal and briefly explain why your solution has a lower total propagation delay,  $t_{add}$ , than does your solution for task c). Detailed delay calculations are not required! (4 p)

THE END!

# Solution to Exam Integrated Circuit Design MCC091

## Monday August 24, 2015

---

### 1. Layout, static CMOS logic, compound gates

a) There are several solutions to this layout problem. Below you see one possibility. The schematic is repeated for your convenience.



b) Below is the schematic with the order of the transistors shown. The logical function of the gate is

$Y = \overline{ABC + D(A + B + C)}$ . The reason the circuit is symmetrical is that the direct inverse of this function is  $\overline{Y} = (A + B + C)(D + (ABC)) = D(A + B + C) + ABC$



### 2. Inverters

- a) The small-signal voltage gain can be found as the slope of the transfer diagram. The maximum gain value is the steepest slope of the transfer diagram. From diagram A we find the steepest part as:  $|A_{V_{max,NMOS}}| = |(0.4-0.64)/(0.65-0.6)| = 0.24/0.05 \approx 5$ .
- b) From diagram C we calculate  $|A_{V_{max,CMOS}}| = |(0.15-0.85)/(0.6-0.575)| = 0.6/0.025 \approx 24$ . We would need even more curves to get close to the real maximum (see diagrams below), but it is still quite a bit higher than the gain value found in a).
- c) The two resulting transfer diagrams for the CMOS and NMOS inverters are both shown in the diagram below for completeness. The blue curve is the one asked for in task c). These curves are the results from simulations, not curves drawn from the supplied diagrams.



If we take the derivative of the entire curves for both the NMOS and CMOS inverters we get the two curves below. Here, too, we see that  $|A_{V_{max,NMOS}}| \approx 5$  whereas  $|A_{V_{max,CMOS}}| \approx 55$ , which is higher than what we found in b).



- d) Some disadvantages of NMOS gates, and hence advantages of CMOS gates, are: For high input voltages the current through the NMOS inverter is much higher than that of the CMOS inverter, making the static power dissipation of NMOS gates much worse. The NMOS inverter has much lower gain and a much wider transition region. Therefore its noise margin is much worse than that of the CMOS inverter. (Remember that one should look at the points where the voltage gain is -1.) Resistors take a lot of chip area if their resistances are to be large, so NMOS logic is not as compact as CMOS.

### 3. Inverter delay, power and frequency

- a) Because  $p_{inv}$  is 1,  $t_{pd}$  is  $2\tau$  (because the inverter load is  $2C_{in_{inv}}$ ), which is 10 ps. The oscillation period is  $10t_{pd} = 100$  ps. The oscillation frequency is then 10 GHz. The general expression for the dynamic power consumption is  $\alpha f C V_{DD}^2$ . Because the oscillator inverters always switch, the activity factor is  $\alpha = 1$ . Each inverter contributes 1.5 + 1.5 fF of capacitance that has to be recharged, so the total capacitance for the five-inverter oscillator is 15 fF. Consequently, the power consumption is  $1 \times 10 \times 15 \times 1.44$  [GHz·fF·V<sup>2</sup>] = 216  $\mu$W.
- b) The oscillation frequency will remain the same because the load capacitance increases by the same factor as does the charging and discharging current, but the dynamic power consumption will increase by a factor of 8 since the capacitance increases by this factor.
- c) With the quadratic current equation the expression for the drain current in saturation is  $I_D = k(V_{GS} - V_T)^2$ . The maximum current we can have, when either of the two transistors is fully on, is  $I_D = k(V_{DD} - V_T)^2$ . We also have  $\tau = R_{eff}C$  with  $R_{eff} = 0.7 V_{DD}/I_D$ . Thus, we can write  $R_{eff} = 0.7 V_{DD}/\left(k(V_{DD} - V_T)^2\right) = K V_{DD}/(V_{DD} - V_T)^2$  with  $K = 0.7/k$ . With  $V_T = V_{DD}/4$  we have  $R_{eff}$  for the original  $V_{DD}$ :  $R_{eff1} = K V_{DD}/(V_{DD} - V_{DD}/4)^2 = (K/V_{DD}) \times 16/9 \approx 1.8 \, K/V_{DD}$ . With the new supply voltage, which is  $5/6$  of the old one while  $V_T$  is unchanged, we get  $R_{eff2} = K \, (5/6) V_{DD}/\left((5/6)V_{DD} - V_{DD}/4\right)^2 = (K/V_{DD}) \times (5/6)/(20/24 - 6/24)^2 = (K/V_{DD}) \times 120/49 \approx 2.4 \, K/V_{DD}$ . So the resistance with the lower  $V_{DD}$ ,  $R_{eff2}$ , is around  $4/3$  of the one with the original  $V_{DD}$ . Thus,  $\tau$  will be that much longer (since the gate capacitance does not change) and the oscillation frequency for the ring oscillator with the lower  $V_{DD}$  will be about  $3/4$  of the one with the original  $V_{DD}$ . The power will decrease since both  $V_{DD}$  and  $f$  decrease. With the original power being  $P_1 = fCV_{DD}^2$  we have  $P_2 = (3/4) f C \left((5/6)V_{DD}\right)^2 = (25/48) P_1 \approx P_1/2$ . In conclusion, we save about half the dynamic power but lose about a quarter of the speed (if the quadratic current model is valid).
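The numbers in tasks a) and c) can be checked with a short script. The following is only a sketch of the arithmetic above: the function and variable names are illustrative, the inverter delay is taken as $2\tau$ as stated in a), and the threshold voltage is assumed to stay at one quarter of the original supply voltage. Exact fractions are used for the supply-voltage scaling so that the rounded ratios $4/3$ and $3/4$ can be compared with the unrounded ones.

```python
from fractions import Fraction

def ring_oscillator(n_stages, c_in_ff, vdd, tau_ps=5.0):
    """Return (frequency in GHz, dynamic power in uW) for the ring oscillator.

    Assumes each inverter drives one identical inverter plus its own
    parasitic output capacitance, so t_pd = 2 * tau (as in task a).
    """
    t_pd_ps = 2 * tau_ps
    period_ps = 2 * n_stages * t_pd_ps
    f_ghz = 1e3 / period_ps
    c_total_ff = n_stages * 2 * c_in_ff          # input + output capacitance per stage
    power_uw = f_ghz * c_total_ff * vdd ** 2     # GHz * fF * V^2 = uW, alpha = 1
    return f_ghz, power_uw

def reff_factor(vdd_ratio, vt_ratio=Fraction(1, 4)):
    """R_eff in units of K / V_DD(original), square-law model."""
    return vdd_ratio / (vdd_ratio - vt_ratio) ** 2

f_ghz, power_uw = ring_oscillator(n_stages=5, c_in_ff=1.5, vdd=1.2)

r1 = reff_factor(Fraction(1))            # original supply
r2 = reff_factor(Fraction(5, 6))         # supply lowered from 1.2 V to 1.0 V
freq_ratio = r1 / r2                     # f2 / f1
power_ratio = freq_ratio * Fraction(5, 6) ** 2   # P2 / P1

print(f"f = {f_ghz:.1f} GHz, P = {power_uw:.0f} uW")
print(f"R_eff1 = {r1}, R_eff2 = {r2}")
print(f"f2/f1 = {float(freq_ratio):.3f}, P2/P1 = {float(power_ratio):.3f}")
```

Without the intermediate rounding the frequency ratio comes out slightly below $3/4$ and the power ratio close to $1/2$, which agrees with the conclusion above.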

### 4. Wire delay



a) For the equivalent electric circuit schematic, see figure below.

Neglecting wire resistance, with  $R = 0.4 \text{ k}\Omega$  and capacitances in fF: Elmore  $RC = R \times (20+190+2 \times 5) = 0.4 \times 220 = 88$  ps.  
Hence, the delay is  $0.7 \times 88 = 61.6$  ps.



b) With X16 driver inverter and same parasitic output capacitance, the delay would be cut in half, i.e. 30.8 ps. But if we upscale the parasitic capacitance accordingly, from 20 fF to 40 fF, we get another  $0.7 \times 0.20 \times 20 = 2.8$  ps delay. Total delay is then  $0.7 \times 0.20 \times 240 = 33.6$  ps.

c) Wire resistance per unit length is  $r=0.5 \text{ k}\Omega/\text{mm}$  (a 200 nm wide wire has 5000 squares per mm, each  $0.1 \Omega$ ). Wire capacitance is  $c=100 \text{ fF/mm}$ .  
(I choose to stay with  $\text{k}\Omega$ , fF and ps, because I know the product will give me picoseconds.)



Elmore RC = 88 ps from above + RC due to wire resistance  $= 88 + 0.5 \times 150 + 0.25 \times 28 = 88 + 82 = 170$  ps.  
Delay  $= 0.7 \times 170 = 119$  ps  $\approx 120$  ps.

d) In this case we just add the FO4 delay to the wire delay. Hence, we add 28 ps. Total delay $\approx 150$  ps.
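The arithmetic in tasks a) to d) can be reproduced with the sketch below. Two things in it are assumptions rather than part of the solution: the driver resistance of 0.4 kΩ is derived here from the FO4 delay (an X8 inverter driving four X8 inputs plus its own output capacitance), which is one way to arrive at the value used above, and the two wire-resistance terms in c) are copied directly from the solution as kΩ·fF products because the figure with the segment lengths is not reproduced here. All names are illustrative.

```python
# Units: kOhm * fF = ps
FO4_PS = 28.0
C_X8_FF = 20.0      # X8 input = output capacitance
C_X2_FF = 5.0       # X2 input capacitance
C_WIRE_FF = 190.0   # total wire capacitance used in the solution

# Assumption: FO4 = 0.7 * R * (C_out + 4 * C_in) for an X8 inverter.
r_x8_kohm = FO4_PS / (0.7 * 5 * C_X8_FF)

def delay_no_wire_resistance(r_kohm, c_par_ff):
    """0.7 * R * C with lumped load: parasitic + wire + two X2 inputs."""
    return 0.7 * r_kohm * (c_par_ff + C_WIRE_FF + 2 * C_X2_FF)

delay_a = delay_no_wire_resistance(r_x8_kohm, C_X8_FF)
# X16 driver: half the resistance, twice the parasitic capacitance.
delay_b = delay_no_wire_resistance(r_x8_kohm / 2, 2 * C_X8_FF)

# Wire-resistance terms taken as given in the solution to c).
elmore_c = r_x8_kohm * (C_X8_FF + C_WIRE_FF + 2 * C_X2_FF) + 0.5 * 150 + 0.25 * 28
delay_c = 0.7 * elmore_c
delay_d = delay_c + FO4_PS

print(f"R(X8) = {r_x8_kohm:.2f} kOhm")
print(f"a) {delay_a:.1f} ps  b) {delay_b:.1f} ps  c) {delay_c:.0f} ps  d) {delay_d:.0f} ps")
```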

### 5. Logical effort, path delay, gate sizing

a) The circuit diagram for the 2-1 OAI was drawn as the solution to task 1a). The resulting transistor sizes are shown in the figure below.



The logical efforts are  $6/3 = 2$  for the OR inputs (B and C above) and  $4/3$  for the AND input (A in the figure above).

b) The optimal stage effort is  $f=4$ . For the path we have  $G = 2 \times 4/3 \times 2 \times 5/3 = 80/9$ ,  $B = 3 \times 4 = 12$  and  $H = 2.4C/C = 2.4$ . Thus the path effort is  $F = G \times B \times H = 80/9 \times 12 \times 2.4 = 256$ . It is optimal to have the same stage effort in all stages. Since there are four stages in the path we have  $f^4 = 256 \Rightarrow f=4$ .

c) The resulting input capacitances are shown in the figure below. They can be found by starting either at the input or the output and applying the condition that  $f=gh=4$  in each stage. From the input of the path we get: Stage 1)  $2 h_1 = 4$  gives  $h_1 = 2$ , so the total load capacitance should be  $2C$ , but this capacitance is divided among the three NAND gates so each of them gets  $2/3C$ . Stage 2)  $4/3 \times h_2 = 4$  gives  $h_2 = 3$ , so the load capacitance should be  $2/3C \times 3 = 2C$ . Stage 3)  $2 h_3 = 4$  gives  $h_3 = 2$ , so the total load capacitance becomes  $4C$ , but that capacitance is divided among the four NOR gates, which get  $C$  each. Stage 4)  $5/3 \times h_4 = 4$  gives  $h_4 = 12/5 = 2.4$ , and the load capacitance should be  $2.4C$ , which it is.
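The sizing procedure above can be sketched in a few lines using exact fractions. The stage list (logical effort and branching factor per stage) is read off the solution text; the names and the choice to express all capacitances in units of the path input capacitance $C$ are illustrative assumptions.

```python
from fractions import Fraction as F
from math import prod

# (gate, logical effort g, branching factor b at the gate output)
STAGES = [
    ("OAI",   F(2),    3),   # drives three NAND gates
    ("NAND2", F(4, 3), 1),
    ("OAI",   F(2),    4),   # drives four NOR gates
    ("NOR2",  F(5, 3), 1),
]
H = F(12, 5)                 # path electrical effort, 2.4C / C

G = prod(g for _, g, _ in STAGES)
B = prod(b for _, _, b in STAGES)
path_effort = G * B * H

# Equal stage effort in all N stages: f = F^(1/N)
f_opt = F(float(path_effort) ** (1 / len(STAGES))).limit_denominator(1000)

# Walk from the input: each stage has h = f / g, and its load is shared
# among b identical gates. Capacitances are in units of C.
loads = []
c_in = F(1)
for _, g, b in STAGES:
    h = f_opt / g
    c_in = c_in * h / b      # capacitance seen at each driven gate input
    loads.append(c_in)

print(f"G = {G}, B = {B}, F = {path_effort}, f = {f_opt}")
print("per-gate load after each stage (in C):", [str(c) for c in loads])
```

The last entry of `loads` should equal the path load of $2.4C$, which serves as a consistency check on the stage-by-stage sizing.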



### 6. Adders



a) 
$$f_{max} = \frac{1}{2 \times 0.7 R_{eff} C} = \frac{1}{2 \times 0.7 \times 10000 \times 20 \times 10^{-15}} = 3.57 \text{ GHz}$$

b) You have to repeat the 1-bit operation  $n$  times to get the result of the entire addition. Therefore, the time for one 32-bit addition is:

$$t_{add32} = 32 \times t_{add1} = \frac{32}{f_{max}} = 32 \times 2 \times 0.7 R_{eff} C = 8.96 \text{ ns} \approx 9 \text{ ns}$$

c) The 8-bit digit-serial architecture is shown in the figure below.



d) Since the capacitance at each point in the carry chain is (somewhat unrealistically) still exactly the same, the time to add the 32-bit words is the same as in b).  
e) There are many possibilities, for example to use an 8-bit Sklansky adder for each 8-bit word.
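
The values in a) and b) can be verified numerically. This is only a check of the two formulas above with the given  $R_{eff}$  and  $C$ ; the variable names are illustrative.

```python
R_EFF_OHM = 10e3      # equivalent resistance, 10 kOhm
C_LOAD_F = 20e-15     # load capacitance, 20 fF
WORD_LENGTH = 32

# Clock period per bit as used in a): 2 * 0.7 * R_eff * C
t_bit_s = 2 * 0.7 * R_EFF_OHM * C_LOAD_F
f_max_hz = 1 / t_bit_s

# b) one bit per clock cycle, so 32 cycles for a 32-bit word
t_add32_s = WORD_LENGTH * t_bit_s

print(f"t_bit = {t_bit_s * 1e12:.0f} ps")
print(f"f_max = {f_max_hz / 1e9:.2f} GHz")
print(f"t_add32 = {t_add32_s * 1e9:.2f} ns")
```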

Anonymous code: \_\_\_\_\_

Label all inputs and the output clearly!

