# A DIGITAL-TO-TIME CONVERTER FOR TIME-MODE SUCCESSIVE-APPROXIMATION REGISTER ANALOG-TO-DIGITAL CONVERTERS

by

#### Daniel Junehee Lee

Bachelor of Engineering: Electrical, McGill University, Montreal, Quebec, 2014

#### A thesis presented to Ryerson University

in partial fulfillment of the requirements for the degree of Master of Applied Science in the Program of Electrical and Computer Engineering

Toronto, Ontario, Canada, 2019

© Daniel Junehee Lee 2019

#### AUTHOR'S DECLARATION FOR ELECTRONIC SUBMISSION OF A THESIS

I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.

I authorize Ryerson University to lend this thesis to other institutions or individuals for the purpose of scholarly research.

I further authorize Ryerson University to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.

I understand that my dissertation may be made electronically available to the public.

A Digital-to-Time Converter for Time-Mode Successive-Approximation Register Analog-to-Digital Converters, Master of Applied Science, 2019, Daniel Junehee Lee, Electrical and Computer Engineering, Ryerson University

#### Abstract

An 8-bit digital-to-time converter (DTC) for a time-mode successive-approximation register analog-to-digital converter (SAR ADC), designed for minimum power consumption and silicon area, is presented. The architecture and the drawbacks of a conventional voltage-mode SAR ADC are discussed. The principle of time-mode circuits and the benefits of applying them to mixed-signal circuits are explained. The architecture of a time-mode SAR ADC is presented. The need for an area- and power-efficient DTC for a time-mode SAR ADC is discussed. The principle of a DTC is explained and prior work on DTCs is reviewed.
The principle of a phase interpolator (PI), to be used for a DTC, is explained and prior work on digital PIs is reviewed. The design of the proposed DTC is presented. Each block of the proposed DTC is explained using schematic and layout views. The optimal slope of the input of the PI and the condition for linear phase interpolation are investigated. Simulation results of the proposed DTC designed in TSMC 65 nm 1.0 V CMOS technology are provided. According to simulation results with BSIM4.4 device models only, a time resolution of 0.33 ps, a maximum operation frequency of 2.53 GHz, a power consumption of 1.38 mW, and peak differential nonlinearity (DNL) and integral nonlinearity (INL) of less than 0.14 least significant bit (LSB) and 0.49 LSB, respectively, are achieved at the nominal process corner (TT) and a temperature of 27 °C.

#### Acknowledgements

I would like to acknowledge Professor Fei Yuan for his supervision during my master's studies. Ever since I was a novice in the field of semiconductors, he has provided me with a great amount of support. I would like to acknowledge Professor Gul N. Khan for having supported me as a co-supervisor during my master's studies. I was able to learn the fundamentals of digital circuits from the exceptional teaching of Professor Andy G. Ye, Professor Lev Kirischian, and Professor Vadim Geurkov. I thank my colleagues Yue Li, Parth Parekh, and Rashed Siddiqui for providing me with prompt and genuine assistance during my master's studies. I thank my wife Hyerin Kim and my family for always supporting me.
# Contents

| No. | Title |
|---|---|
| | Declaration |
| | Abstract |
| | Acknowledgements |
| | List of Tables |
| | List of Figures |
| 1 | Introduction |
| 1.1 | Challenges in SAR ADCs |
| 1.2 | Time-Mode Circuits |
| 1.3 | Time-Mode SAR ADCs |
| 2 | Digital-to-Time Converters |
| 2.1 | Delay-Line-Based DTCs |
| 2.2 | VDL-Based DTCs |
| 2.3 | PI-Based DTCs |
| 2.4 | Current Source Array (CSA)-Based DTCs |
| 2.5 | Capacitor Array-Based DTCs |
| 2.6 | ON-Resistance-Based DTCs |
| 2.7 | Multipath Pre-Skewed Delay Line DTCs |
| 2.8 | Multi-Step DTCs |
| 2.9 | Performance Comparison |
| 3 | Digital Phase Interpolators |
| 3.1 | Gated Inverter PIs |
| 3.2 | Harmonic Rejection PIs |
| 3.3 | Voltage Division PIs |
| 3.4 | Pipelined PIs |
| 3.5 | Design Considerations |
| 3.5.1 | Input Signaling |
| 3.5.2 | Output Signaling |
| 3.5.3 | Input-Output Isolation |
| 3.5.4 | Shoot-Through |
| 3.6 | Performance Comparison |
| 4 | The Proposed DTC |
| 4.1 | The Prior Design |
| 4.2 | The Architecture of the Proposed DTC |
| 4.2.1 | The Pre-Skewed Delay Line |
| 4.2.2 | The TG-Based 17:2 Multiplexer |
| 4.2.3 | The Intermediate Buffer Stage |
| 4.2.4 | The Slope Control |
| 4.2.5 | The PI |
| 4.3 | Analysis |
| 4.3.1 | Optimal Slope of the Input of the PI |
| 4.3.2 | Linear Time Interpolation |
| 5 | Simulation Results |
| 5.1 | The Prior Design |
| 5.1.1 | Simulation with BSIM4.4 Device Models |
| 5.2 | The Proposed Design |
| 5.2.1 | Simulation with BSIM4.4 Device Models |
| 5.2.2 | Post-Layout Simulation |
| 5.3 | Performance Comparison |
| 6 | Conclusions and Future Work |
| 6.1 | Conclusions |
| 6.2 | Future Work |

# List of Tables

| No. | Caption |
|---|---|
| 2.1 | Per-stage delays with 0/1/2/3 pre-skewing signals |
| 2.2 | Performance comparison of DTCs |
| 3.1 | Performance comparison of digital PIs |
| 5.1 | Parameters of the proposed DTC obtained from simulation with BSIM4.4 device models |
| 5.2 | Performance comparison of DTCs |

# List of Figures

| No. | Caption |
|---|---|
| 1.1 | Architectures of a flash analog-to-digital converter (ADC) (left), a delta-sigma ADC (top-right), and a SAR ADC (bot-right) |
| 1.2 | Architectures of flash (left) and VDL flash TDCs (right) [24] |
| 1.3 | An architecture of an all-digital PLL [22] |
| 1.4 | The timing diagram of a 2-bit TDC using a binary digital output representation |
| 1.5 | An architecture of a SAR TDC |
| 1.6 | An architecture of a SAR TDC using a CTDSA method [26] |
| 2.1 | A time variable generated by a DTC |
| 2.2 | A conventional architecture of a delay-line-based DTC [28] |
| 2.3 | An architecture of a VDL-based DTC |
| 2.4 | An architecture of a conventional PI-based DTC |
| 2.5 | An architecture of a CSA-based DTC |
| 2.6 | An architecture of a capacitor array-based DTC |
| 2.7 | An architecture of an ON-resistance-based DTC |
| 2.8 | A pre-skewed delay line |
| 2.9 | Dependence of per-stage delay and power consumption of a pre-skewed delay line on the number of pre-skewing signals |
| 2.10 | The DNL of a pre-skewed delay line. Legends: TT (dashed), FF (hashed), SS (dotted) |
| 2.11 | The INL of a pre-skewed delay line. Legends: TT (dashed), FF (hashed), SS (dotted) |
| 2.12 | An architecture of a DTC that consists of a coarse time tuning stage followed by a fine time tuning stage |
| 2.13 | The 5-bit DTC that consists of coarse and fine time tuning stages |
| 2.14 | Output of the MUX of the DTC |
| 2.15 | Output of the DTC with 4-level time interpolation |
| 2.16 | Output of the DTC with 4-level time interpolation. Legends: Inputs (solid), TT (dashed), FF corner (dotted) and SS corner (dash-dot) |
| 2.17 | DNL and INL of the 5-bit DTC with 4-level time interpolation. Legends: TT (solid line), FF (dashed line), SS (dotted line) |
| 3.1 | Time interpolation using digitally weighted biasing current of differential amplifiers. Left - Switched current arrays [53]. Right - Switched differential pairs [52] |
| 3.2 | Digital PIs using gated inverters. (a) Gated static inverters [54]. (b) Switched static inverters [55]. (c) Tri-state static inverters [38, 49]. (d) Current-starved inverters [37, 56] |
| 3.3 | Output of PIs. Legends: Inputs (solid), TT (dashed), FF corner (dotted) and SS corner (dash-dot) |
| 3.4 | |
| 3.5 | |
| 3.6 | Digital PIs using gated static inverters [49] |
| 3.7 | Digital PIs using gated static inverters [54] |
| 3.8 | Harmonic rejection PIs [38] |
| 3.9 | |
| 3.10 | (a) Hierarchical-tree PIs [18]. (b) Gated inverter array PIs [49] |
| 3.11 | |
| 3.12 | (a) Nonlinearity due to small slope of signals to be interpolated. (b) Linearity improvement using slope control |
| 3.13 | (a) Shoot-through. (b) Elimination of shoot-through in tri-state inverter interpolating cell [49]. (c) Elimination of shoot-through in current-starved inverter interpolating cell [51] |
| 4.1 | The architecture of the proposed DTC |
| 4.2 | The layout of the proposed DTC |
| 4.3 | The architecture of the previously designed DTC |
| 4.4 | The schematic of the pre-skewed delay line with 2 pre-skewing signals. Circuit parameters of a unit-size inverter: $W_p/L_p = 0.27\ \mu m/0.06\ \mu m$, $W_n/L_n = 0.12\ \mu m/0.06\ \mu m$ |
| 4.5 | The layout of the pre-skewed inverter |
| 4.6 | The layout of the pre-skewed delay line with 2 pre-skewing signals |
| 4.7 | The schematic of the TG-based 17:2 MUX. Circuit parameters: a unit-size inverter - $W_p/L_p = 0.27\ \mu m/0.06\ \mu m$, $W_n/L_n = 0.12\ \mu m/0.06\ \mu m$; a transmission gate (TG) - $W_p/L_p = 0.24\ \mu m/0.06\ \mu m$, $W_n/L_n = 0.12\ \mu m/0.06\ \mu m$ |
| 4.8 | The layout of the TG |
| 4.9 | The layout of the TG-based 17:2 MUX |
| 4.10 | The layout of the min. sized inverter used for a buffer |
| 4.11 | The layout of the inverter with circuit parameters of $W_p/L_p = 0.27\ \mu m/0.06\ \mu m$ and $W_n/L_n = 0.12\ \mu m/0.06\ \mu m$ with 4 fingers |
| 4.12 | The layouts of the inverters with circuit parameters of $W_p/L_p = 4.32\ \mu m/0.06\ \mu m$ and $W_n/L_n = 1.92\ \mu m/0.06\ \mu m$ (left) and $W_p/L_p = 4.32\ \mu m/0.06\ \mu m$, $W_n/L_n = 1.92\ \mu m/0.06\ \mu m$ with 4 fingers (right) |
| 4.13 | The characteristics of the current density at the output of the slope control block using a low-voltage cascode current mirror (top), a cascode current mirror (middle), and a basic current mirror (bot). Legends: current (solid), time-interpolated output (dotted), output of slope control block (dashed) |
| 4.14 | The schematic of the slope control block. Circuit parameters: $W_{1,2}/L_{1,2} = 0.12\ \mu m/0.06\ \mu m$, $W_{3,4}/L_{3,4} = 0.27\ \mu m/0.06\ \mu m$, $W_{5,6}/L_{5,6} = 1.2\ \mu m/0.06\ \mu m$, $W_{7,8}/L_{7,8} = 2.7\ \mu m/0.06\ \mu m$, $W_{9,10}/L_{9,10} = 1.2\ \mu m/0.06\ \mu m$, $W_{11,12,13}/L_{11,12,13} = 2.7\ \mu m/0.06\ \mu m$ with 10 fingers, $W_{14,15,16}/L_{14,15,16} = 1.2\ \mu m/0.06\ \mu m$ with 10 fingers |
| 4.15 | The layout of the slope control block |
| 4.16 | The layout of $M1$ and $M2$ in Fig. 4.15 |
| 4.17 | The layout of $R1$ and $R2$ in Fig. 4.15 |
| 4.18 | The layout of $M3$ and $M4$ in Fig. 4.15 |
| 4.19 | The layout of $M5$, $M6$, $M9$, $M10$ in Fig. 4.15 |
| 4.20 | The layout of $M7$ and $M8$ in Fig. 4.15 |
| 4.21 | The layout of $M11$, $M12$, and $M13$ in Fig. 4.15 |
| 4.22 | The layout of $M14$, $M15$, and $M16$ in Fig. 4.15 |
| 4.23 | The characteristics of the input signals of the 4b time interpolator with the new cell in this work (left) and a tri-state inverter cell from Fig. 3.2 (c) (right) as LSBs change |
| 4.24 | The differences between each time when the input signal of the 4b time interpolator becomes 0.8 V from 0 V and the average of the times over the change of LSBs (left) and INL of the 4b time interpolator (right). Legends: the cell in this work (solid), the cell from Fig. 3.2 (c) (dashed) |
| 4.25 | The schematics of the 4-bit PI (left) and the PI cell (right). Circuit parameters: $W_{1,2,3}/L_{1,2,3} = 0.27\ \mu m/0.06\ \mu m$ and $W_{4,5,6}/L_{4,5,6} = 0.12\ \mu m/0.06\ \mu m$ |
| 4.26 | The layout of the PI cell |
| 4.27 | The layout of the 4-bit PI |
| 4.28 | Analyze the impact of slope of inputs of the PI. The transistors of PIs are assumed to operate in saturation when $0 \le v_{in} \le V_{DD}$ |
| 4.29 | Analyze the impact of slope of inputs of time interpolator. The NMOS and PMOS transistors of interpolators are assumed to operate in saturation when the input falls $V_T \leq v_{in} \leq V_{DD}$ and $0 \leq v_{in} \leq V_{DD} - V_T$, respectively |
| 4.30 | Dependence of INL of a PI on the slope of the inputs |
| 4.31 | DNL (left) and INL (right) of the PI with $T_{in}=5.2$ ps and $\tau=17$ ps |
| 5.1 | Outputs of the previously designed DTCs with a regular delay line and a pre-skewed … |
| 5.2 | DNL of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal. Legends: TT (solid line), FF (dashed line), SS (dotted line) |
| 5.3 | INL of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal. Legends: TT (solid line), FF (dashed line), SS (dotted line) |
| 5.4 | DNL and INL of the proposed DTC obtained using Spectre with BSIM4.4 models. Legends: TT (solid line), FF (dashed line), SS (dotted line) |
| 5.5 | Output of the 4-bit pre-skewed delay line with 2 pre-skewing signals |
| 5.6 | Output of the 17:2 MUX |
| 5.7 | Output of the slope control |
| 5.8 | Output of the PI |
| 5.9 | A zoom-in view of output of the PI |
| 5.10 | DNL and INL of the proposed DTC obtained from post-layout simulation |
| 6.1 | A schematic of a sense amplifier flip-flop [61] |

## Chapter 1

## Introduction

ADCs play an important role in virtually every aspect of our daily life. Among various architectures of ADCs, a flash ADC, a delta-sigma ADC, and a SAR ADC are dominant. Architectures of these ADCs are shown in Fig. 1.1.
Flash ADCs offer the highest conversion rate and are widely used in applications such as high-speed data links over electrical and optical channels. The exponential rise of silicon area and power consumption with bit resolution, together with the stiff challenges encountered in combating comparator mismatch, limits the resolution of these ADCs. Delta-sigma ADCs offer the highest resolution by means of oversampling and noise shaping. These ADCs are attractive only if the signal bandwidth is small. Delta-sigma ADCs are widely used in applications such as cellular communications. SAR ADCs, which were introduced in the 1950s and debuted in complementary metal-oxide-semiconductor (CMOS) technology in the 1970s, convert an analog signal to a digital code via a binary search over all quantization levels [1, 2]. SAR ADCs have played a major role in advancing the state of the art of ADCs since their inception [3]. Although traditionally popular in telephony and instrumentation, where the data rate is typically low, SAR ADCs have re-established themselves as a promising ADC architecture inherently suited to modern CMOS technologies. The emerging applications of SAR ADCs range from biomedical instruments, where power consumption is of great importance [4], to high-speed data links, where conversion rate is pivotal [5, 6], largely owing to the full compatibility of SAR ADCs with technology scaling [7]. For example, in the low-power realm, SAR ADCs with kS/s rates and 10-bit resolution dissipate only a few microwatts [8, 9, 10, 11]. In the high-speed realm, an array of 64 time-interleaved SAR ADCs is capable of achieving 90 GS/s [12].

#### 1.1 Challenges in SAR ADCs

A key building block of SAR ADCs is a digital-to-analog converter (DAC), typically realized using a binary-weighted capacitor array. These DACs are known as charge-scaling DACs. Charge-scaling DACs suffer from two intrinsic drawbacks, namely an excessive amount of dynamic power consumption and a large silicon area.
Lowering the unit capacitance of the capacitor array, though effective in reducing the overall power consumption and silicon area, is sharply confronted with a rising noise floor and the deteriorating effect of clock feed-through and charge injection. Although techniques such as multi-stage binary-weighted capacitor arrays [13] and C-2C capacitor arrays [14] are effective in lowering silicon area and power consumption, they suffer from performance degradation arising from the detrimental effect of the parasitic capacitances of the scaling capacitors.

Figure 1.1: Architectures of a flash ADC (left), a delta-sigma ADC (top-right), and a SAR ADC (bot-right).

#### 1.2 Time-Mode Circuits

The advance of CMOS technology has been mainly geared towards optimizing the performance of digital systems. As a result, analog circuits not only continue to lose the benefits of specialized and process-controlled components, but must also cope with a rapidly shrinking voltage headroom, deteriorating device mismatch, and worsening linearity while satisfying ever more stringent performance specifications. Although the scaling-induced performance degradation of analog circuits can be compensated to a certain degree using digital means, these approaches are often not only costly in terms of both silicon area and power consumption but also negatively impact the performance of the compensated analog circuits. Technology scaling, on the other hand, has greatly improved the switching time of digital circuits, to the point where their time resolution has well surpassed the voltage resolution of analog circuits. For example, the propagation delay of static CMOS inverters in nanometer technologies is only a few picoseconds. Time-mode signal processing, where analog information is represented by the time difference between the occurrence of two digital events (e.g. the rising or falling edges of two digital signals) and processed using digital circuits only, offers a viable and technology-friendly means to combat the technology-scaling-induced difficulties encountered in the design of mixed analog-digital systems such as ADCs [15, 16, 17, 18, 19, 20, 21] and phase-locked loops (PLLs) [22, 23]. For example, flash ADCs can be realized in time mode using a flash time-to-digital converter (TDC) or a vernier delay-line (VDL) flash TDC, as shown in Fig. 1.2 (left and right, respectively) [24]. PLLs can be realized using digital circuits such as a TDC instead of analog circuits, as shown in Fig. 1.3 [22]. The timing diagram of a 2-bit TDC using a binary digital output representation is shown in Fig. 1.4. Time-mode circuits possess a number of unique characteristics, such as full compatibility with technology scaling, full programmability and portability, and a rapid design turn-around time, which are not possessed by their analog counterparts [25].

Figure 1.2: Architectures of flash (left) and VDL flash TDCs (right) [24].

Figure 1.3: An architecture of an all-digital PLL [22].

Figure 1.4: The timing diagram of a 2-bit TDC using a binary digital output representation.

#### 1.3 Time-Mode SAR ADCs

A time-mode SAR ADC, called a successive-approximation register time-to-digital converter (SAR TDC), can be implemented as shown in Fig. 1.5. An analog voltage signal to be digitized is first sampled and held by a sample-and-hold (S/H) block. The sampled input voltage is then converted to a time variable using a voltage-to-time converter (VTC). The resultant time variable is digitized using a SAR TDC consisting of a time comparator, a successive-approximation register (SAR), a DTC, and control logic. The DTC maps the output of the SAR to a time variable, which is compared with the output of the VTC. Since the VTC is outside the feedback loop of the SAR TDC, its nonidealities, in particular its nonlinearity, are not suppressed by the feedback.
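The conversion loop described above can be sketched behaviorally. The following is a minimal model, assuming an ideal DTC whose delay is `code * tau` and an ideal time comparator with no offset or dead zone; the function name `sar_tdc_convert` and its parameters are hypothetical and are not taken from the thesis.

```python
def sar_tdc_convert(t_in, n_bits, tau):
    """Digitize a time interval t_in (seconds) with an idealized N-bit SAR TDC.

    Assumptions (illustrative only): the DTC delay is exactly code * tau,
    and the time comparator reports which of the two edges arrives first.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)   # SAR tentatively sets the current bit
        t_dtc = trial * tau         # ideal DTC maps the trial code to a delay
        if t_in >= t_dtc:           # comparator: input edge is not earlier than DTC edge
            code = trial            # keep the bit, otherwise it stays cleared
    return code


tau = 0.33e-12                      # unit time, taken from the reported resolution
print(sar_tdc_convert(100.4 * tau, 8, tau))
```

One DTC evaluation and one comparison are needed per bit, so an N-bit conversion takes N cycles. Inputs beyond the full-scale range simply saturate at the all-ones code in this model.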
Each block of the SAR TDC shown in Fig. 1.5 is analogous to a block of a SAR ADC. For example, a DTC is analogous to a DAC and a time comparator to a voltage comparator. A DTC combined with a VTC is analogous to a charge-redistribution DAC. The core of time-mode SAR ADCs is the DTC that maps the output of the SAR to a time variable whose full-scale value is given by $2^N \tau$, where $N$ is the number of bits of the SAR and $\tau$ is the unit time of the DTC. Similar to DACs, the performance of DTCs is measured by parameters such as dynamic range and differential and integral nonlinearity. One method of realizing a SAR TDC is the cyclic time domain successive approximation (CTDSA) method; an architecture of a SAR TDC using the CTDSA method is shown in Fig. 1.6 [26]. With this architecture, the timing of the feedback signals $FB_1$ and $FB_2$ is adjusted according to the output of a phase detector (PD).

Figure 1.5: An architecture of a SAR TDC.

Figure 1.6: An architecture of a SAR TDC using a CTDSA method [26].

This thesis presents an area- and power-efficient DTC to be used for a time-mode SAR ADC, along with comprehensive reviews of the state of the art of DTCs and of digital PIs to be used for DTCs. The proposed DTC utilizes a delay line based on a pre-skewed inverter [27]; a DTC using such a delay line has not been reported yet. The objective is to propose a DTC that is suitable for time-mode SAR ADCs with minimum power consumption and silicon area constraints while using a delay line based on a pre-skewed inverter. The remainder of the thesis is organized as follows: Chapter 2 presents a comprehensive review of the state of the art of DTCs. In Chapter 3, a collection of state-of-the-art digital PIs is compiled. Chapter 4 presents the design of the proposed DTC, which is suitable for SAR TDCs with minimum power consumption and silicon area constraints.
In Chapter 5, simulation results of the proposed DTC designed in TSMC 65 nm 1.0 V CMOS technology are provided. The thesis is concluded and future work is discussed in Chapter 6.

## Chapter 2

## Digital-to-Time Converters

This chapter presents a comprehensive review of the state of the art of DTCs. A DTC, or a digital-to-phase converter, converts a digital code to a time difference between the arrivals of a reference clock and a data signal, as shown in Fig. 2.1 [28]. A DTC can be used in applications such as a fractional-N all-digital PLL [29, 30, 31], a source-synchronous interface [32], a TDC [26], and a radar [33]. A DTC can be implemented using architectures such as a delay line, a VDL, a voltage-controlled oscillator (VCO), a capacitor array, a CSA, a PI, an inverter with a tunable ON-resistance, and a multipath pre-skewed delay line. If a finer time resolution is desired, more than one of the aforementioned architectures can be cascaded and used together as one DTC.

Figure 2.1: A time variable generated by a DTC.

#### 2.1 Delay-Line-Based DTCs

A conventional architecture of a delay-line-based DTC is shown in Fig. 2.2 [28, 31]. The reference clock can be generated from a PLL. The propagation delay of each delay element is set to be equal. The delay-locked loop (DLL) ensures that the per-stage delay of the delay line is precisely defined with respect to the period of the reference clock. The outputs of the delay stages are fed to an N:1 multiplexer (MUX) whose output is selected according to the digital input applied to its select lines. The output of the DTC is defined as the time between the rising edge of the reference clock and that of the output of the multiplexer. Since the output of the DTC is linearly proportional to the number of delay stages of the delay line, a total of $2^N$ stages are needed to generate a delay of $2^N \tau$, where $\tau$ is the per-stage delay of the delay line.

Figure 2.2: A conventional architecture of a delay-line-based DTC [28].
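The tap-selection behavior described above can be illustrated with a short behavioral sketch. It assumes, purely for illustration, that the DLL locks the total delay of the line to one reference period, so that $\tau = T_{ref}/2^N$, and that code $k$ selects the output of stage $k+1$; the function names are hypothetical.

```python
import math


def delay_line_stage_count(n_bits):
    """Number of delay stages an N-bit delay-line DTC needs (2^N)."""
    return 2 ** n_bits


def delay_line_dtc(code, n_bits, t_ref):
    """Delay (seconds) of the tap selected by `code` in an ideal delay line.

    Assumptions (not from the thesis): the DLL locks the whole line to one
    reference period t_ref, and code k selects the output of stage k + 1.
    """
    n_stages = delay_line_stage_count(n_bits)
    if not 0 <= code < n_stages:
        raise ValueError("code out of range")
    tau = t_ref / n_stages          # per-stage delay fixed by the DLL
    return (code + 1) * tau         # delay accumulates linearly along the line


# A 3-bit line locked to a 8 ns reference has 8 stages of 1 ns each.
print(delay_line_stage_count(3), delay_line_dtc(7, 3, 8e-9))
```

The stage count doubles with every added bit, which is the area and power cost that motivates the finer-resolution architectures reviewed in the rest of this chapter.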
#### 2.2 VDL-Based DTCs

The resolution of a delay-line-based DTC is lower-bounded by the per-stage delay of the delay line. To obtain a finer resolution, the VDL-based DTC shown in Fig. 2.3 can be used [33, 34, 35]. A conventional VDL consists of two delay lines whose per-stage delays are $\tau_1$ and $\tau_2$, where $\tau_1 < \tau_2$. The resolution of a VDL is $\tau_2 - \tau_1$. Compared with a delay-line DTC, a VDL-based DTC enjoys the intrinsic advantage of finer resolution [36].

#### 2.3 PI-Based DTCs

An architecture of a conventional PI-based DTC is shown in Fig. 2.4. A PI-based DTC generates a desired rising/falling edge located between two adjacent rising/falling edges [32, 37, 38] and uses it as the output of the DTC. The two adjacent rising/falling edges can be generated using a delay line, a VCO, or a multi-modulus divider (MMD). The two edges are fed into a slope control block, which ensures that the transition times of the two input edges are about 6 times the time interval between them. The time-interpolated output $y$ is given by

$$y = \alpha x_1 + (1 - \alpha)x_2 \tag{2.1}$$

where $x_1$ and $x_2$ are the input signals to be interpolated and $0 < \alpha < 1$ is the weighting factor. By varying $\alpha$, a set of time-interpolated edges between $x_1$ and $x_2$ is obtained.

Figure 2.3: An architecture of a VDL-based DTC.

#### 2.4 CSA-Based DTCs

CSA-based DTCs use a DAC based on a binary-weighted CSA to generate time outputs [39, 40]. An architecture of a CSA-based DTC is shown in Fig. 2.5. A CSA-based DTC can be implemented using either a variable-slope [39] or a constant-slope [40] charging technique. With the variable-slope charging technique, the slope of the charging/discharging voltage ramp at the load of the voltage ramp generator varies with the digital input, which determines the current provided by the DAC. The load capacitance of the voltage ramp generator is fixed.
A comparator, an inverter in this case, begins to flip its output when the level of the charging/discharging voltage ramp at the load of the voltage ramp generator crosses its threshold. The instant at which the comparator begins to flip its output is used as the output of the DTC. With the constant-slope charging technique, the slope of the charging/discharging voltage ramp at the load of the voltage ramp generator does not vary, while the initial level of the ramp varies with the digital input. This gives different instants at which the level of the ramp crosses the threshold of the comparator. As pointed out in [40], a DTC using the variable-slope charging technique suffers from poor integral nonlinearity (INL) arising from (i) the finite bandwidth of the comparator and (ii) the different modes of operation of the comparator for different slopes of its input [41]. For a digitally controlled ramping input, the propagation delay of the comparator varies with the digital code in a nonlinear fashion, deteriorating INL [42].

Figure 2.4: An architecture of a conventional PI-based DTC.

Figure 2.5: An architecture of a CSA-based DTC.

#### 2.5 Capacitor Array-Based DTCs

Capacitor array-based DTCs use a DAC based on a binary-weighted capacitor array to generate time outputs [26, 29, 30, 43, 44, 45]. An architecture of a capacitor array-based DTC is shown in Fig. 2.6. A voltage ramp generator tunes either its load capacitance, by controlling a capacitor DAC, or its initial voltage level, by using a charge-redistribution method. The charging/discharging current of the voltage ramp generator is fixed.
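As a rough first-order illustration of the load-capacitance tuning described above, the delay of an ideal ramp generator can be written as $t = C V_{th} / I$. The sketch below assumes an ideal constant current source, a ramp starting from 0 V, and a comparator with a fixed threshold and zero delay; all names and numeric values are hypothetical and are not taken from the thesis or the cited works.

```python
import math


def cap_array_dtc_delay(code, c_unit, c_fixed, i_charge, v_th):
    """First-order delay (seconds) of a capacitor-array DTC.

    Assumptions (illustrative only): constant charging current i_charge,
    ramp starting at 0 V, ideal comparator with threshold v_th, and a load
    of c_fixed plus `code` unit capacitors of value c_unit.
    """
    c_load = c_fixed + code * c_unit      # capacitor DAC sets the load
    return c_load * v_th / i_charge       # time for the ramp to reach v_th


# Hypothetical values: 1 fF unit capacitor, 10 fF fixed load, 100 uA, 0.5 V.
step = cap_array_dtc_delay(1, 1e-15, 10e-15, 100e-6, 0.5) \
     - cap_array_dtc_delay(0, 1e-15, 10e-15, 100e-6, 0.5)
print(step)
```

In this idealized model the delay is linear in the code, with a step of $C_{unit} V_{th} / I$ per LSB. Real implementations deviate from it through the finite output resistance of the current source and the comparator behavior discussed above.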
Although it enjoys a simple configuration, the capacitor array-based DTC suffers from the following drawbacks: (i) the charging or discharging current of the capacitors must be generated from a cascode-configured current source in order to minimize the effect of the voltage of the capacitors on the current, and (ii) the power consumption is high and the area is large due to the need for a binary-weighted capacitor array.

Figure 2.6: An architecture of a capacitor array-based DTC.

#### 2.6 ON-Resistance-Based DTCs

ON-resistance-based DTCs tune the ON-resistance of an inverter to generate time outputs [46]. An architecture of an ON-resistance-based DTC is shown in Fig. 2.7. The ON-resistance of an inverter can be tuned by varying its length (the number of stacked devices) [46] or its width (the number of fingers).

Figure 2.7: An architecture of an ON-resistance-based DTC.

#### 2.7 Multipath Pre-Skewed Delay Line DTCs

The minimum per-stage delay of delay lines is set by technology and is typically much larger than the desired time resolution of SAR TDCs. For example, the FO1 and FO4 delays of TSMC 65 nm technology, i.e. the propagation delays of a minimum-sized inverter loaded with one identical inverter and with 4 identical inverters, are 9 ps and 16 ps, respectively. In order to achieve a high time resolution, two effective approaches are at our disposal: (i) time interpolation, which generates a set of fine transitions between two given transitions [47], and (ii) pre-skewing, which utilizes the output of an early stage of the delay line, which arrives earlier than the input of the current stage, to pre-charge/pre-discharge the load capacitor of the current stage [27]. The per-stage delay can be further reduced if multiple pre-skewing signals from earlier stages are used. This is known as multipath pre-skewing [48]. Pre-skewing has been used extensively in gated ring oscillator (GRO) ADCs to increase the frequency of the oscillators so as to improve the conversion rate of GRO ADCs [16].
Although a fine resolution can be achieved using a high degree of time interpolation, this comes at the cost of high power consumption and a larger silicon area. Also, the nonlinearity of time interpolation arising from mismatches between interpolating cells sets the maximum achievable time resolution. Lowering the per-stage delay of the delay line prior to time interpolation becomes critical in avoiding these difficulties.

To investigate the relation between the number of pre-skewing paths and the per-stage delay of the delay line, a 16-stage delay line with each stage made of two cascaded identical minimum-sized static inverters, as shown in Fig. 2.8, is analyzed with 0/1/2/3 pre-skewing signals (power supply = 1.2 V), and the results are shown in Table 2.1. The per-stage delay of the delay line without pre-skewing is 10.8 ps. It is reduced to 5.8 ps when 1 pre-skewing signal is used, a 46.4 % reduction of the per-stage delay. The per-stage delay is further reduced to 3.9 ps and 2.9 ps when 2 and 3 pre-skewing signals are used, representing a further 32.7 % and 24.8 % reduction of the per-stage delay, respectively. The above observations show that the reduction in per-stage delay levels off as the number of pre-skewing signals increases. The reason for this is the rising load capacitance. Fig. 2.9 plots the dependence of the per-stage delay and power consumption of the delay line on the number of pre-skewing signals. It is seen that the relation between the per-stage delay/power consumption and the number of pre-skewing signals is nonlinear. A trade-off between delay reduction and power/silicon consumption needs to be made. The DNL and INL of the delay line at the TT, FF, and SS process corners are plotted in Figs. 2.10 and 2.11, respectively. The DNL and the INL of the delay line increase as the number of pre-skewing paths increases. They are less than 0.04 LSB. The power consumption of the delay line increases as the number of pre-skewing paths increases, a price paid for a smaller per-stage delay. Silicon area also increases due to the increased number of inverters and the associated routing cost.

Figure 2.8: A pre-skewed delay line. Panel (c): 3 pre-skewing signals per inverter.

Table 2.1: Per-stage delays with 0/1/2/3 pre-skewing signals.

| No. of pre-skewing signals | Per-stage delay (ps) | INL/DNL (LSB) | Power ($\mu$W) |
|----------------------------|----------------------|--------------------------|--------|
| 0 | 10.8 | $0.3/0.6 \times 10^{-3}$ | 58.25 |
| 1 | 5.8 | $-4/4 \times 10^{-3}$ | 130.9 |
| 2 | 3.9 | $-32/-32 \times 10^{-3}$ | 374.8 |
| 3 | 2.9 | $-22/-22 \times 10^{-3}$ | 759.1 |

Figure 2.9: Dependence of per-stage delay and power consumption of a pre-skewed delay line on the number of pre-skewing signals.

Figure 2.10: The DNL of a pre-skewed delay line. Legends: TT (dashed), FF (hashed), SS (dotted).

Figure 2.11: The INL of a pre-skewed delay line. Legends: TT (dashed), FF (hashed), SS (dotted).

#### 2.8 Multi-Step DTCs

DTCs can be cascaded to generate a finer time resolution [29, 38, 39, 43, 46, 49]. An example of this is a DTC that consists of a coarse time tuning stage followed by a fine time tuning stage, as shown in Fig. 2.12. In Fig. 2.12, coarse time edges $t_0, t_1, ..., t_{N-1}$ are generated by the delay line and fed into the N:2 MUX. Among the N coarse time edges, two adjacent coarse time edges are selected according to the most significant bits (MSBs) and propagated to the PI. Using the two selected adjacent coarse time edges, a corresponding fine time edge $t_{out}$ is generated by the PI according to the LSBs.

Figure 2.12: An architecture of a DTC that consists of a coarse time tuning stage followed by a fine time tuning stage.

Fig. 2.13 shows a 5-bit DTC that consists of coarse and fine time tuning stages designed in TSMC 65 nm 1.0 V CMOS technology. This DTC has 8 identical delay stages, each consisting of 2 cascaded minimum-sized inverters. An 8:2 MUX is used to select two adjacent outputs of the DTC to be interpolated. The MUX is TG-based with buffers at the output to restore voltage swing. To minimize the load effect of the MUX, a voltage buffer is added at the output of each stage of the DTC. Fig. 2.14 shows the output of the DTC and that of the MUX. It is seen that the output of the MUX is the same as that of the DTC. The per-stage delay of the DTC is approximately 14 ps. Also observed is that the rise time of the output of the MUX is comparable to the spacing between two adjacent outputs of the DTC. A slope control block ensuring that the rise time of the signals to be interpolated is at least 5 times the spacing between them is therefore added [50, 51].

Figure 2.13: The 5-bit DTC that consists of coarse and fine time tuning stages.

Figure 2.14: Output of the MUX of the DTC.

Fig. 2.15 plots the output of the DTC with 4-level time interpolation. It is seen that the rise time of the output of the slope control block is approximately 5 times the spacing between the interpolated signals. Also observed is that the outputs of the interpolator are spaced evenly between the interpolated signals. The resolution of the DTC is 3.5 ps. Fig. 2.16 plots the output of the interpolation. Fig. 2.17 plots the DNL and INL of the interpolation. It is seen that both DNL and INL are less than 0.5 LSB.

Figure 2.15: Output of the DTC with 4-level time interpolation.

Figure 2.16: Output of the DTC with 4-level time interpolation. Legends: Inputs (solid), TT (dashed), FF corner (dotted) and SS corner (dash-dot).

## 2.9 Performance Comparison

Table 2.2 compares the performance of recently published DTCs.

In this chapter, a comprehensive review of the state-of-the-art of DTCs was provided. According to the review, DTCs can be implemented using a delay line, a VDL, a VCO, a capacitor array, a CSA, a PI, an inverter with a tunable ON-resistance, or a multipath pre-skewed delay line.
To achieve a fine time resolution of a DTC efficiently, more than one of the aforementioned architectures can be cascaded and used together. The performance of the reported DTCs was compared.

Figure 2.17: DNL and INL of the 5-bit DTC with 4-level time interpolation. Legends: TT (solid line), FF (dashed line), SS (dotted line).

Table 2.2: Performance comparison of DTCs.

| Ref. | Tech. | Res. | INL | Power | Remarks |
|------|--------|----------|-----------|---------------------|-------------------------|
| [32] | 130 nm | - | ±12 ps | 15 mW @ 1 GHz | Interpolation |
| [46] | 180 nm | 0.78 ps | -0.15 LSB | 0.8 mW @ 2.5 GHz | C-array |
| [38] | 65 nm | 8-bit | 1.33 LSB | 4.3 mW @ 1.5 GHz | Interpolation |
| [40] | 65 nm | 0.019 ps | - | 1.8 mW @ 55 MHz | Constant-slope charging |
| [33] | 65 nm | 1.5 ps | - | - | Vernier |
| [31] | 40 nm | 21.5 ps | 3.14 | 0.0137 mW @ 32 MHz | Delay line |
| [49] | 28 nm | 0.244 ps | 1.2 ps | 19.8 mW @ 2 GHz | Interpolation |
| [44] | 28 nm | 0.55 ps | 1.8 | 0.5 mW @ 40 MHz | Variable-slope charging |
| [45] | 28 nm | 0.1 ps | 0.075 ps | 0.015 mW @ 40 MHz | Constant-slope charging |

## Chapter 3

## Digital Phase Interpolators

PIs are widely used in DTCs for clock and data recovery in serial data links [49, 52] and ADCs [18]. PIs can be loosely classified into analog PIs and digital PIs. Analog PIs perform phase interpolation using the digitally weighted biasing current of differential amplifiers, while digital PIs perform phase interpolation using the digitally weighted current of inverters. Fig. 3.1 (left) shows the schematic of a current-steering analog PI [53].
This PI uses the sum of the digitally-tuned tail currents, implemented using a set of gated identical current sources, to generate the desired transitions between In1 and In2. The interpolation words gating the current sources are complementary to each other. Analog PIs enjoy the advantage of high-speed operation as In1 and In2 need not be full-swing but suffer from the following drawbacks [47]: (i) High power consumption arising from both the static current of the interpolator itself and the need for an amplifier to amplify the output of the interpolator. (ii) The finite output resistance of the gated current-source transistors negatively affects linearity. (iii) Since the gated tail current of the differential pairs of an N-bit interpolator ranges from I to $(2^N - 1) * I$, where I is the current of one gated tail current source, the differential-pair transistors might fall into triode mode. (iv) The differential configuration of the interpolators does not make them immune to the effect of mismatches. (v) Poor scalability with technology. The PIs shown in Fig. 3.1 (right) perform interpolation by gating the entire unit differential pair, thereby lessening the aforementioned detrimental effects [52].

Technology scaling has sharply improved the switching characteristics of digital circuits. Digital PIs, where phase interpolation is realized using digital circuits, benefit fully from the merits of technology scaling including a fast switching time, low power consumption, a high degree of integration, programmability, portability, and a short design turn-around time. This chapter presents the architectures and design techniques of recently emerged digital PIs.

#### 3.1 Gated Inverter PIs

Phase interpolation can be performed by digitally tuning the current that charges the load capacitor of static inverters, as shown in Fig. 3.2.
The interpolated output is given by $y = \alpha x_1 + (1 - \alpha)x_2$, where $x_1$ and $x_2$ are the transitions to be interpolated and $0 < \alpha < 1$ is the interpolation weighting factor. By varying the interpolation weighting factor $\alpha$, a set of fine transitions between $x_1$ and $x_2$ can be obtained. The number of interpolated transitions is set by the number of bits of the weighting factor word. The adjustment of the weighting factor is typically realized by gating a set of identical inverters. Interpolating inverters can be implemented using gated static inverters [54], switched static inverters [55], tri-state static inverters [38, 49], or current-starved inverters [37, 50, 56], as shown in Fig. 3.2.

Figure 3.1: Time interpolation using digitally weighted biasing current of differential amplifiers. Left - Switched current arrays [53]. Right - Switched differential pairs [52].

The weighting factor is a thermometer-coded word. For example, for 3-bit interpolation, a total of 16 identical gated inverters are needed, 8 for each of the two transitions to be interpolated. The weighting factor of $x_1$ is set by the thermometer-coded word $W_{x1}$, whose content ranges over 00000001 $\sim$ 01111111, corresponding to $\alpha = 1/8 \sim 7/8$. Similarly, the weighting factor of $x_2$ is set by the thermometer-coded word $W_{x2}$, whose content ranges over 11111110 $\sim$ 10000000, corresponding to $1 - \alpha = 7/8 \sim 1/8$. Note that $W_{x1}$ and $W_{x2}$ are complementary to each other. It is seen from Fig. 3.2 that, in the vicinity of the threshold crossing of the output of the interpolator, the transistors of the inverters operate in saturation. As a result, each interpolation cell functions as a current source. Since all interpolation cells are identical, the output current of the PI is the sum of the currents of the individual interpolation cells, ideally yielding a good linear relation between the interpolation weighting factor and the interpolated time.
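The thermometer-coded weighting described above can be sketched in a few lines of Python. This is only an idealized illustration: the function names (`thermometer_word`, `complement`, `interpolation_weights`, `interpolate`) and the convention of filling ones from the right are assumptions made for this sketch and are not taken from the cited designs.

```python
from fractions import Fraction


def thermometer_word(ones: int, width: int) -> str:
    """Return a thermometer-coded word of `width` bits with `ones` ones.

    Assumption for this sketch: ones are filled from the right-hand side.
    """
    if not 0 <= ones <= width:
        raise ValueError("ones must lie in [0, width]")
    return "0" * (width - ones) + "1" * ones


def complement(word: str) -> str:
    """Bitwise complement of a binary string."""
    return "".join("1" if bit == "0" else "0" for bit in word)


def interpolation_weights(k: int, n_bits: int):
    """Return (W_x1, W_x2, alpha) when k of the 2**n_bits cells on x1 are enabled.

    W_x2 is the complement of W_x1, so the cells enabled on x2 carry
    the remaining weight 1 - alpha.
    """
    cells = 2 ** n_bits
    w_x1 = thermometer_word(k, cells)
    w_x2 = complement(w_x1)
    alpha = Fraction(w_x1.count("1"), cells)
    return w_x1, w_x2, alpha


def interpolate(alpha, x1, x2):
    """Ideal interpolated output y = alpha * x1 + (1 - alpha) * x2."""
    return alpha * x1 + (1 - alpha) * x2


if __name__ == "__main__":
    # 3-bit interpolation: alpha sweeps 1/8 ... 7/8.
    for k in range(1, 8):
        w1, w2, a = interpolation_weights(k, 3)
        print(w1, w2, a, 1 - a)
```

For 3-bit interpolation, `interpolation_weights(1, 3)` returns the pair `00000001` and `11111110` with $\alpha = 1/8$, and `interpolation_weights(7, 3)` returns `01111111` and `10000000` with $\alpha = 7/8$. Real interpolators deviate from this ideal weighting because of the effects discussed in the design considerations.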
The outputs of the 2-bit to 5-bit PIs using gated static inverters, shown in Fig. 3.2 (a), under nominal process conditions and at the FF (fast n-type metal-oxide-semiconductor (NMOS)/fast p-type metal-oxide-semiconductor (PMOS)) and SS (slow NMOS/slow PMOS) process corners are shown in Fig. 3.3 [47]. The signals to be interpolated are a pair of ideal voltage ramps with a rise time of 45 ps. The load of the interpolator is a minimum-sized inverter. It is seen that process spread has a significant impact on the delay of the interpolators. It has, however, less impact on the profile of the interpolated outputs. The DNL and INL of the interpolators are shown in Figs. 3.4 and 3.5, respectively. It is seen that both DNL and INL deteriorate with the increase in the level of interpolation.

Figure 3.2: Digital PIs using gated inverters. (a) Gated static inverters [54]. (b) Switched static inverters [55]. (c) Tri-state static inverters [38, 49]. (d) Current-starved inverters [37, 56]. (e) Cascode tri-state inverters.

Figure 3.3: Output of PIs. Legends: Inputs (solid), TT (dashed), FF corner (dotted) and SS corner (dash-dot).

A number of inverter-based digital PIs have emerged recently. For example, the 7-bit digitally controlled edge interpolator shown in Fig. 3.6 was proposed in [49]. It uses 128 unit interpolation cells placed in an array of 8 x 16 cells, thermometrically gated to yield monotonic behaviour. All interpolation cells operate on the same input signals to be interpolated and their outputs are tied together to the common interpolation output node. Each interpolation cell consists of two gated tri-state inverters, one operating on In1 and the other on In2. To minimize the loading effect of the interpolator on the input signals, input drivers were inserted. Similarly, output buffers were also used.

Figure 3.4: DNLs of the PI. Legends: TT (dashed), FF (dash-dot), SS (dotted).

Another example of inverter-based digital time interpolation is shown in Fig. 3.7 [54].
The inputs of the PIs are 8 coarse time transitions separated by 45°. Adjacent transitions are interpolated to generate 4 sub-transitions. Each time interpolator consists of 2 segments of gated inverters, each having 4 identical inverters whose schematic is shown in Fig. 3.2 (a). Inverters 1-4 and 5-8 operate on In1 and In2, respectively. The outputs of the gated inverters are connected together to drive the same load capacitor.

## 3.2 Harmonic Rejection PIs

It is seen in Fig. 3.2 that, in the vicinity of the threshold crossing, the transistors of the interpolator operate in saturation and function as current sources. The interpolator in this case functions as an integrator and exhibits better linearity than in the case where it functions as an RC network [38]. The latter is the case where the load capacitor is not sufficiently large. Though offering good linearity, a large load capacitor brings the drawbacks of an attenuated output, a low conversion rate, and high power consumption. An effective way to improve linearity is to remove the odd harmonics of the output of PIs if the input is a sinusoid. For an 8-phase clock whose phases $0^{\circ}$, $45^{\circ}$, $90^{\circ}$, $135^{\circ}$, $180^{\circ}$, $225^{\circ}$, $270^{\circ}$, and $315^{\circ}$ are to be interpolated, the phasor diagram of the 3rd harmonic component of the output of the time interpolator is shown in Fig. 3.8. It is seen that the sum of $\phi_1$ and $\phi_3$ cancels out $\phi_2$. The same holds for the 5th harmonic component of the output of the time interpolator as well. This observation reveals that if the transitions to be interpolated are properly chosen, the summing operation shown in Fig. 3.8 will be 3rd/5th harmonic-free and yield better linearity.

Figure 3.5: INLs of the PI.

## 3.3 Voltage Division PIs

Phase interpolation can be performed using resistor-based voltage dividers, as shown in Fig. 3.9 [57, 58].
Fine time can be obtained by interpolating the outputs of two adjacent stages that have the same switching direction. Assuming that all resistors are identical, it can be shown that

$$\begin{aligned}
v_{11+} &= v_{2+} + \frac{3}{4} (v_{1+} - v_{2+}), \\
v_{12+} &= v_{2+} + \frac{1}{2} (v_{1+} - v_{2+}), \\
v_{13+} &= v_{2+} + \frac{1}{4} (v_{1+} - v_{2+}).
\end{aligned} \tag{3.1}$$

The temporal displacement of $v_{1+}$ and $v_{2+}$ and the spatial placement of the interpolated outputs allow $v_{11+}$, $v_{12+}$, and $v_{13+}$ to be spaced evenly in time. Since the number of interpolation levels using resistor-based dividers is typically small, the resolution obtained in this way is rather limited. Mismatches between resistors also have a detrimental impact on linearity. Moreover, since during interpolation $v_{1+}$ always rises earlier than $v_{2+}$, a current flows through the resistors, resulting in additional power consumption. To minimize power consumption, the value of the resistors should be large. This, however, is costly in terms of silicon area. If diffusion resistors are used to realize the resistors, both the nonlinearity and the large parasitic capacitance of the diffusion resistors will have an adverse impact on the performance of the interpolator.

Figure 3.6: Digital PIs using tri-state static inverters [49].

Figure 3.7: Digital PIs using gated static inverters [54].

## 3.4 Pipelined PIs

Digital PIs typically assume one of two architectures: (i) the hierarchical-tree architecture shown in Fig. 3.10 (a), where each interpolator only generates one interpolated transition [18], and (ii) the gated inverter array architecture shown in Fig. 3.10 (b), where multiple gated interpolating cells are connected in parallel [49]. The former enjoys the advantage of a simple configuration and a low loading effect per interpolator but suffers from a large number of interpolator cells and, subsequently, a large silicon area and a high level of power consumption. The latter features a moderate number of interpolation cells and, subsequently, a moderate silicon area and a relatively low level of power consumption but suffers from a severe loading effect at both the input and output of the interpolator.

Figure 3.8: Harmonic rejection PIs [38].

Figure 3.9: Digital phase interpolation using resistor voltage division [57, 58].

In order to utilize the advantages of both hierarchical-tree PIs and gated inverter array PIs, Narayanan et al. [56] showed that since in applications such as PLLs only one fine transition (phase) is needed for phase comparison at a time, there is no need to generate all fine phases simultaneously as hierarchical-tree and gated inverter array time interpolators do. If only the desired fine phase is generated, interpolators can be greatly simplified. The pipelined architectures shown in Fig. 3.11 only generate the wanted fine phase [56]. It is seen from the two examples shown in the figure that the PIs provide 3-level interpolation. At each interpolation level, three identical inverter-based PIs and a switching network are used. Among the three interpolators, only one performs interpolation while the remaining two only forward their inputs. The purpose of employing two forwarding PIs is to ensure synchronization. For N-bit interpolation, N+1 interpolation cells are needed. If a hierarchical-tree time interpolator is used, the number of interpolation cells will be $2^N-1$.

Figure 3.10: (a) Hierarchical-tree PIs [18]. (b) Gated inverter array PIs [49].

## 3.5 Design Considerations

As phase interpolation provides fine times between two coarse times, a linear relation between the weighting factor of the PIs and the interpolated time is critical.
The linearity of digital PIs is affected by (i) the time transitions to be interpolated (input signaling), (ii) the charging process of the load capacitor of the interpolator (output signaling), (iii) the isolation between the coarse times to be interpolated and the interpolated fine times, and (iv) the effect of shoot-through.

### 3.5.1 Input Signaling

One important contributor to the nonlinearity of PIs arises when the rise time of the transitions to be interpolated is small compared with the time difference between them, as shown in Fig. 3.12 (a). It is seen that in the time interval $[t_1, t_2]$, the one-to-one mapping between voltage and time vanishes. To prevent this from occurring, the rise time of the transitions to be interpolated should be at least 5 times the delay time between them [50]. It was shown in [32] that a notable improvement in linearity was observed when the rise time is 3 times the delay time. When such a condition cannot be met naturally, a slope control block preceding the interpolator that controls the slope of the transitions to be interpolated, as shown in Fig. 3.12 (b), should be employed [50, 51]. A number of challenges exist in the design of the slope control block: (i) The slope of the output of the slope control block needs to be constant. To achieve this, these blocks are often realized using cascode current sources and a constant load capacitor. (ii) As the output of the slope control block is fed to the PIs, lowering the slope of the signals to be interpolated will increase the power consumption of the PIs due to a large short-circuit current of the input inverters of the interpolator. (iii) An overly small slope will increase the interpolation time.

Figure 3.11: Pipelined PIs [56].

### 3.5.2 Output Signaling

At the threshold crossing of the load inverter, the transistors of the interpolating inverters operate in saturation.
The interpolator exhibits better linearity if the current provided by the interpolating inverters exhibits less variation [38]. The output resistance of the interpolating inverters varies with their inputs, adversely affecting the linearity of the interpolator. Increasing the load capacitance of the interpolator improves linearity. This, however, is at the cost of worsening timing jitter and interpolation speed [32]. To minimize the effect of the varying output resistance, a series resistor whose resistance is larger than the output resistance of the interpolating inverters can be added at the output of the interpolator. For PIs with a large number of interpolating cells, the resistance can be programmed to improve the linearity of the interpolator [38].

### 3.5.3 Input-Output Isolation

When the interpolation cell of Fig. 3.2 (b) is not activated, its output is still influenced by its input via the gate-to-drain capacitance of the transistors [53]. Such an influence affects the output of the interpolator and, subsequently, its linearity. To minimize this detrimental effect, the interpolating inverter of Fig. 3.2 (a) or Fig. 3.2 (c) should be used, with Fig. 3.2 (c) preferred [38, 50]. This is because in Fig. 3.2 (c) there is no switching activity at the input node. As the input nodes are connected together in the gated inverter array in Fig. 3.2 (b), the switching activity of one interpolating inverter impacts the others.

Figure 3.12: (a) Nonlinearity due to small slope of signals to be interpolated. (b) Linearity improvement using slope control.

### 3.5.4 Shoot-Through

When the rise time of the transitions to be interpolated is sufficiently large compared with the time difference between them, the situation where the logic states of $x_1$ and $x_2$ differ from each other should not occur. If such a case does occur, a short-circuit path between the power and ground rails will exist, as shown in Fig. 3.13, deteriorating linearity.
To eliminate this possibility, the logic gates controlled by y shown in Fig. 3.13 (b) and Fig. 3.13 (c) ensure that when $x_1$ and $x_2$ have different logic states, no direct path from $V_{DD}$ to ground exists.

Figure 3.13: (a) Shoot-through. (b) Elimination of shoot-through in tri-state inverter interpolating cell [49]. (c) Elimination of shoot-through in current-starved inverter interpolating cell [51].

## 3.6 Performance Comparison

Table 3.1 compares the performance of recently reported digital PIs.

Table 3.1: Performance comparison of digital PIs.

| Ref. | Tech. | Freq. | Res. | INL/DNL |
|------|--------|---------|----------|-------------------|
| [32] | 130 nm | 1.5 GHz | 4.1 ps | 0.1 ps/12 ps |
| [58] | 90 nm | - | 4.7 ps | 0.6 LSB/1.2 LSB |
| [54] | 65 nm | 3 GHz | 10.4 ps | 0.5 ps |
| [38] | 65 nm | 1.5 GHz | 3.37 ps | 1.33 LSB/0.5 |
| [50] | 65 nm | - | 6B | 0.41 LSB/1.25 LSB |
| [37] | 32 nm | 6 GHz | 2.6 ps | - |
| [49] | 28 nm | 2 GHz | 0.244 ps | 1.2 LSB/0.3 ps |

In this chapter, a comprehensive review of the state-of-the-art of digital PIs was provided. Digital PIs can be implemented using gated inverters, a harmonic rejection technique, a voltage-division method, and a pipelined method. The design considerations for digital PIs were discussed. The performance of the reported digital PIs was compared.

# Chapter 4

# The Proposed DTC

The proposed DTC is an 8-bit 2-step DTC that consists of a pre-skewed delay line for coarse time tuning and a digital PI for fine time tuning. The proposed DTC is designed using TSMC 65 nm 1.0 V CMOS technology. The schematic and the layout of the proposed DTC are shown in Fig. 4.1 and Fig. 4.2.
The coarse time tuning stage consists of a 4-bit pre-skewed delay line with 2 pre-skewing signals and a 17:2 MUX, while the fine time tuning stage consists of a 4-bit cascode tri-state inverter-based PI with a slope control block.

## 4.1 The Prior Design

The proposed DTC is based on previously designed DTCs employing a 6-bit pre-skewed delay line with 0 (regular delay line)/1 pre-skewing signals and a 2-bit PI using the tri-state static inverter of Fig. 3.2 (c) as the PI cell. The previously designed DTC consists of a 6-bit pre-skewed delay line with 0/1 pre-skewing signals, a 65:2 MUX, an intermediate buffer stage, a slope control block using a current mirror, and a 2-bit PI, as shown in Fig. 4.3.

## 4.2 The Architecture of the Proposed DTC

### 4.2.1 The Pre-Skewed Delay Line

Fig. 4.4 shows the schematic of the 4-bit pre-skewed delay line with 2 pre-skewing signals. The layouts of the pre-skewed inverter and the pre-skewed delay line with 2 pre-skewing signals are shown in Figs. 4.5 and 4.6. The pre-skewed delay line consists of a chain of unit-size inverters that have pre-skewing signals. The pre-skewed delay line generates 17 coarse phases $t_0 - t_{16}$, which are fed into the 17:2 MUX. For example, each pre-skewed delay element shown in Fig. 4.1 consists of 2 unit-size inverters that have pre-skewing signals. Tens of pre-skewed delay elements, whose output edges are not fed into the 17:2 MUX, are placed between the reference clock and phase $t_0$ so that the time differences between the coarse phases are reasonably stable.

Figure 4.1: The architecture of the proposed DTC.

Figure 4.2: The layout of the proposed DTC.

Figure 4.3: The architecture of the previously designed DTC.

Figure 4.4: The schematic of the pre-skewed delay line with 2 pre-skewing signals. Circuit parameters of a unit-size inverter: $W_p/L_p=0.27\ \mu m/0.06\ \mu m,\ W_n/L_n=0.12\ \mu m/0.06\ \mu m.$

Figure 4.5: The layout of the pre-skewed inverter.
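The way an 8-bit input word could be mapped onto the coarse and fine stages can be sketched behaviorally as follows. This is a minimal, idealized model and not the circuit itself: the function names, the uniform coarse step `step_ps`, and the convention that the output moves from $t_i$ toward $t_{i+1}$ as the LSB word increases are all assumptions made for illustration.

```python
def split_code(code: int, msb_bits: int = 4, lsb_bits: int = 4):
    """Split a DTC input word into (MSB selector, LSB selector)."""
    if not 0 <= code < 2 ** (msb_bits + lsb_bits):
        raise ValueError("code out of range")
    msb = code >> lsb_bits
    lsb = code & ((1 << lsb_bits) - 1)
    return msb, lsb


def coarse_phases(t0_ps: float, step_ps: float, count: int = 17):
    """Ideal coarse phases t_0 ... t_16 with a uniform (hypothetical) step."""
    return [t0_ps + i * step_ps for i in range(count)]


def dtc_output(code: int, phases, lsb_bits: int = 4) -> float:
    """Ideal two-step output: pick t_i and t_{i+1}, then interpolate linearly.

    Assumption: LSB word k places the output k / 2**lsb_bits of the way
    from t_i to t_{i+1}.
    """
    msb, lsb = split_code(code, msb_bits=4, lsb_bits=lsb_bits)
    t_i, t_next = phases[msb], phases[msb + 1]
    return t_i + (lsb / 2 ** lsb_bits) * (t_next - t_i)


if __name__ == "__main__":
    # step_ps = 16.0 is a made-up value chosen only to keep the arithmetic exact.
    phases = coarse_phases(t0_ps=0.0, step_ps=16.0)
    for code in (0, 1, 16, 255):
        print(code, split_code(code), dtc_output(code, phases))
```

In this model the largest MSB selector value, 15, selects $t_{15}$ and $t_{16}$, which is one way to see why 17 coarse phases are fed to the MUX for a 4-bit coarse stage.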
## 4.2.2 The TG-Based 17:2 Multiplexer

The TG-based 17:2 MUX propagates 2 adjacent coarse phases $t_i$ and $t_{i+1}$ depending on the MSB selector word of the DTC. The schematic of the TG-based 17:2 MUX is shown in Fig. 4.7. The layouts of the TG and the TG-based 17:2 MUX are shown in Figs. 4.8 and 4.9. One dummy stage is placed next to each of the TGs propagating $t_0$ and $t_{16}$ to keep the load symmetrical. To prevent the loading effect, a buffer is placed between the pre-skewed delay line and the multiplexer. Buffers are also placed at the output to restore voltage swing. The layout of the minimum-sized inverter used for a buffer is shown in Fig. 4.10.

Figure 4.6: The layout of the pre-skewed delay line with 2 pre-skewing signals.

## 4.2.3 The Intermediate Buffer Stage

The 2 adjacent coarse phases $t_i$ and $t_{i+1}$ from the MUX are transmitted to the fine phase stage. First, $t_i$ and $t_{i+1}$ are fed into the input of the slope control block. The input transistors of the slope control block are larger than the buffer at the output of the MUX, so a significant loading effect exists at the input of the slope control block. To solve this, an intermediate buffer stage with increased driving capability is introduced. The intermediate buffer stage consists of 3 cascaded inverters that are placed between the MUX and the slope control block. According to the captions of Figs. 4.11 and 4.12, the circuit parameters of the three inverters are $(W_p/L_p = 0.27 \ \mu m/0.06 \ \mu m, \ W_n/L_n = 0.12 \ \mu m/0.06 \ \mu m)$ with 4 fingers, $(W_p/L_p = 4.32 \ \mu m/0.06 \ \mu m, \ W_n/L_n = 1.92 \ \mu m/0.06 \ \mu m)$, and $(W_p/L_p = 4.32 \ \mu m/0.06 \ \mu m, \ W_n/L_n = 1.92 \ \mu m/0.06 \ \mu m)$ with 4 fingers, respectively. The intention of gradually increasing the width or the number of fingers of an inverter by a factor of 4 is to minimize the propagation delays between inverters. The layouts of the inverters used in the intermediate buffer stage are shown in Figs. 4.11 and 4.12.
#### 4.2.4 The Slope Control

The slope control block is a current-starved inverter where the current source is provided by a low-voltage cascode current mirror, as shown in Fig. 4.14. The purpose of the slope control block is to reduce the slope of the incoming signals $t_i$ and $t_{i+1}$ so that their rise/fall time is at least 3 times greater than the time difference between their arrivals [50]. It has been reported that the linearity of a PI deteriorates if this condition is not met. The low-voltage cascode current mirror was chosen as it provides a more linear charging/discharging process at the load than other current mirrors while satisfying the limited voltage headroom and providing the needed driving capability. For example, Fig. 4.13 shows the characteristics of the current density at the output of the slope control block using a low-voltage cascode current mirror (top), a cascode current mirror (middle), and a basic current mirror (bottom) during the time interpolation process. The current provided by the low-voltage cascode current mirror is shown to be large enough and to stay constant during the time interpolation process. Meanwhile, the current density provided by the cascode current mirror is significantly smaller due to the difficulty of keeping the transistors in saturation, which results from the limited voltage headroom. The current provided by the basic current mirror attenuates significantly during the time interpolation process, which is not desirable for a linear charging/discharging process.

Figure 4.7: The schematic of the TG-based 17:2 MUX. Circuit parameters: a unit-size inverter - $W_p/L_p = 0.27~\mu m/0.06~\mu m$, $W_n/L_n = 0.12~\mu m/0.06~\mu m$; a TG - $W_p/L_p = 0.24~\mu m/0.06~\mu m$, $W_n/L_n = 0.12~\mu m/0.06~\mu m$.

Figure 4.8: The layout of the TG.

Figure 4.9: The layout of the TG-based 17:2 MUX.

Figure 4.10: The layout of the minimum-sized inverter used for a buffer.
Figure 4.11: The layout of of the inverter with a circuit parameter of $W_p/L_p=0.27\ um/0.06\ um$ and $W_n/L_n=0.12\ um/0.06\ um$ with 4 fingers. The layouts of circuit components in the slope control block are shown in Fig. 4.16, 4.17, 4.18, 4.19, 4.20, 4.21, and 4.22. Figure 4.12: The layouts of of the inverters with circuit parameters of $W_p/L_p=4.32\ um/0.06\ um$ and $W_n/L_n=1.92\ um/0.06\ um$ (left) and $W_p/L_p=4.32\ um/0.06\ um$ , $W_n/L_n=1.92\ um/0.06\ um$ with 4 fingers (right). ### 4.2.5 The PI The 4-bit PI consists of 32 identical PI cell whose 16 cells receive $t_i$ and rest 16 cells $t_{i+1}$ at the inputs as shown in Fig. 4.25. The layouts of the PI cell and the 4-bit PI are shown in Figs. 4.26 and 4.27. The number of total active cells during the time interpolation is always 16 out of 32 and the interpolation weighting factor $\alpha$ is determined by the LSB selector word of the DTC. The output of the 32 cells are tied together. The PI generates 16 fine phases between $t_i$ and $t_{i+1}$ . The PI cell used in this work is the tri-state static inverter where each of 1 NMOS and 1 PMOS is placed in extra between the switching transistor and the input transistor, as shown in Fig. 3.2. (e). The placed NMOS Figure 4.13: The characteristics of the current density at the output of slope control block using a low-voltage cascode current mirror (top), a cascode current mirror (middle), and a basic current mirror (bot). Legends: current (solid), time-interpolated output (dotted), output of slope control block (dashed). Figure 4.14: The schematic of the slope control block. Circuit parameters : $W_{1,2}/L_{1,2} = 0.12 \ um/0.06 \ um$ , $W_{3,4}/L_{3,4} = 0.27 \ um/0.06 \ um$ , $W_{5,6}/L_{5,6} = 1.2 \ um/0.06 \ um$ , $W_{7,8}/L_{7,8} = 2.7 \ um/0.06 \ um$ , $W_{9,10}/L_{9,10} = 1.2 \ um/0.06 \ um$ , $W_{11,12,13}/L_{11,12,13} = 2.7 \ um/0.06 \ um$ with 10 fingers, $W_{14,15,16}/L_{14,15,16} = 1.2 \ um/0.06 \ um$ with 10 fingers. 
and PMOS are intended to isolate the input and the output better by attenuating the signals coming from the output of the cell and the LSB selector word into the input of the cell. Fig. 4.23 shows the characteristics of the input signals of the 4-bit PI using the cell in this work and a tri-state inverter cell from Fig. 3.2. (c) over the change of the LSBs, respectively. Fig. 4.24 shows the differences between each time when the input signal of the 4-bit PI becomes 0.8 V from 0 V and the average of the times over the change of LSBs (left) and INL of the 4-bit PI (right). It is shown that the cell in this work more attenuates unwanted signal coming from the output and the LSB selector word. At the output of the PI is followed by 2 min. sized inverting buffers for a better charging/discharging shape. ## 4.3 Analysis ### 4.3.1 Optimal Slope of the Input of the PI The slope control block adjusts the slope of the signals to be interpolated so as to maximize both the speed and linearity of the PI. It was observed in the preceding section that the slope of the input of the PI greatly affects the speed of the PI subsequently the throughput of the ADC. It is highly desirable to increase the slope of the signals to be interpolated so as to maximize the speed of the DTC. Prior studies, however, showed that the slope of the input to be interpolated needs to be at least 5 times the Figure 4.15: The layout of the slope control block. delay time between the inputs to be interpolated so as to minimize the nonlinearity of the PI [32, 50, 51]. No theoretical analysis, however, exists to backup the constraint imposed on the minimum slope of the signals. In this section, we investigate factors contributing to the nonlinearity of the PI so as to allow Figure 4.16: The layout of M1 and M2 in Fig. 4.15. Figure 4.17: The layout of R1 and R2 in Fig. 4.15. Figure 4.18: The layout of M3 and M4 in Fig. 4.15. 
us to find the maximum slope of the inputs of the PI that minimizes the delay of the interpolation without sacrificing linearity. Since the output of the PI is picked up by a static inverter whose threshold voltage is ideally $V_{DD}/2$, interpolation should be performed in the region where the signals $x_1$ and $x_2$ to be interpolated vary linearly with time, such that the output of the interpolator can be calculated using $y = \alpha x_1 + (1 - \alpha)x_2$.

Figure 4.19: The layout of M5, M6, M9, M10 in Fig. 4.15.

Figure 4.20: The layout of M7 and M8 in Fig. 4.15.

Figure 4.21: The layout of M11, M12, and M13 in Fig. 4.15.

Figure 4.22: The layout of M14, M15, and M16 in Fig. 4.15.

Figure 4.23: The characteristics of the input signals of the 4-bit time interpolator with the new cell in this work (left) and a tri-state inverter cell from Fig. 3.2 (c) (right) as the LSBs change.

Figure 4.24: The differences between each time when the input signal of the 4-bit time interpolator becomes 0.8 V from 0 V and the average of the times over the change of LSBs (left) and the INL of the 4-bit time interpolator (right). Legend: the cell in this work (solid), the cell from Fig. 3.2 (c) (dashed).

Figure 4.25: The schematics of the 4-bit PI (left) and the PI cell (right). Circuit parameters: $W_{1,2,3}/L_{1,2,3}=0.27\ \mu m/0.06\ \mu m$ and $W_{4,5,6}/L_{4,5,6}=0.12\ \mu m/0.06\ \mu m$.

#### Simple Analysis

To simplify the analysis, assume that interpolation is performed in the vicinity of $V_{DD}/2$ only. Further assume that both the NMOS and PMOS transistors of the PI operate in saturation throughout the entire input range $0 \sim V_{DD}$. In Fig. 4.28 (a), for a time instant between A and C, $x_1$ varies linearly with time but $x_2$ remains at 0. Similarly, for a time instant between D and B, $x_2$ varies linearly with time but $x_1$ remains at $V_{DD}$. In both cases, no linear relation between $y$ and $x_1$ or $x_2$ exists. As a result, the PI exhibits a high degree of nonlinearity. In Fig.
4.28 (b), for a time instant between A and B, both $x_1$ and $x_2$ vary linearly with time, so $y$ has a linear relation with $x_1$ and $x_2$. In Fig. 4.28 (c), for a time instant between A and B, both $x_1$ and $x_2$ also vary linearly with time, so $y$ again has a linear relation with $x_1$ and $x_2$. The slope of the signals to be interpolated is, however, unnecessarily low, resulting in an overly long interpolation time. Fig. 4.28 (b) thus defines the maximum slope at which the speed of the PI is maximized with no linearity degradation. The maximum slopes of $x_1$ and $x_2$, denoted by $s_1$ and $s_2$, respectively, are given for $V_{DD} = 1$ V by

$$s_{1,2} = \frac{\frac{1}{2}V_{DD}}{T_{in}} = \frac{0.5}{T_{in}}.\tag{4.1}$$

The rise time of the input $\tau_{rise}$, i.e., the amount of time for the input to rise from 0.1 V to 0.9 V, is therefore obtained from

$$\tau_{rise} = \frac{0.8}{s_{1,2}} = 1.6T_{in}.\tag{4.2}$$

Figure 4.26: The layout of the PI cell.

Figure 4.27: The layout of the 4-bit PI.

#### Better Analysis

Despite its simplicity, the preceding analysis provides an important insight into how the slope of the inputs affects the linearity of the PI. In what follows, a more practical analysis is provided. As mentioned earlier, the transistors of the PI need to operate in saturation so as to provide better linearity. Assume $V_{Tn} = |V_{Tp}| = V_T$. Using the pinch-off condition, one can show that the NMOS and PMOS transistors of the PI are in saturation when $V_T \leq v_{in} \leq v_o + V_T$ and $v_o - V_T \leq v_{in} \leq V_{DD} - V_T$, respectively.

Figure 4.28: Analysis of the impact of the slope of the inputs of the PI. The transistors of the PI are assumed to operate in saturation when $0 \le v_{in} \le V_{DD}$.

As the output of the PI varies with its input, to simplify the analysis, we assume that the NMOS and PMOS transistors of the PI are in saturation when $V_T \leq v_{in} \leq V_{DD}$ and $0 \leq v_{in} \leq V_{DD} - V_T$, respectively.
This is a better approximation as compared with the preceding analysis. The input voltage range in which both NMOS and PMOS transistors are in saturation is highlighted in Fig. 4.29. In Fig. 4.29 (a), for a time instant between A and D, the NMOS transistor of the inverter whose input is $x_2$ is not in saturation. Similarly, for a time instant between C and B, the PMOS transistor of the inverter whose input is $x_1$ is not in saturation. In both cases, no linear relation between $y$ and $x_1$ or $x_2$ exists. As a result, the PI exhibits a high degree of nonlinearity. In Fig. 4.29 (b), for a time instant between A and B, the input of the PI falls into the range in which both NMOS and PMOS transistors of the interpolator are in saturation, so $y$ has a linear relation with $x_1$ and $x_2$. In Fig. 4.29 (c), for a time instant between A and B, the corresponding voltage also falls into the range in which both NMOS and PMOS transistors of the interpolator are in saturation. As a result, $y$ has a linear relation with $x_1$ and $x_2$. The preceding analysis shows that Fig. 4.29 (b) is the boundary at which the transistors of the PI operate in saturation and therefore defines the maximum slope at which the speed of the PI is maximized with no linearity degradation. Fig. 4.29 (c) is the case where the slope of the input of the PI is overly small, which gives rise to a long interpolation time.

Figure 4.29: Analysis of the impact of the slope of the inputs of the time interpolator. The NMOS and PMOS transistors of the interpolator are assumed to operate in saturation when the input falls within $V_T \leq v_{in} \leq V_{DD}$ and $0 \leq v_{in} \leq V_{DD} - V_T$, respectively.

To obtain the maximum slope of the input at which no linearity degradation occurs, consider Fig. 4.29 (b). The slope of $x_1$, denoted by $s_1$, and that of $x_2$, denoted by $s_2$, are given by

$$s_1 = \frac{\frac{1}{2}V_{DD}}{T_{in} + \tau}, \quad s_2 = \frac{V_T}{\tau}.\tag{4.3}$$
Since $s_1 = s_2$, we have

$$\tau = \left(\frac{V_T}{\frac{1}{2}V_{DD} - V_T}\right)T_{in}.\tag{4.4}$$

For a 65 nm CMOS technology, using $V_{DD}=1$ V and $V_{T}=0.36$ V to estimate $\tau$ gives $\tau=2.57T_{in}$. The slope of the input is therefore obtained from

$$s_{1,2} = \frac{\frac{1}{2}V_{DD}}{T_{in} + \tau} = \frac{0.14}{T_{in}}.\tag{4.5}$$

The rise time of the input $\tau_{rise}$ is therefore obtained from

$$\tau_{rise} = \frac{0.8}{s_{1,2}} = 5.7T_{in}.\tag{4.6}$$

The rise time of the inputs of the PI needs to be approximately 6 times the time spacing between the inputs of the PI in order to ensure that the transistors of the PI operate in saturation, which is consistent with the design guidelines on the slope of the signals to be interpolated given in [32, 50].

#### Practical Analysis

The NMOS and PMOS transistors of the PI will operate in saturation if $V_T \leq v_{in} \leq v_o + V_T$ and $v_o - V_T \leq v_{in} \leq V_{DD} - V_T$, respectively. Since $x_1$ and $x_2$ rise from 0 V, $v_o = V_{DD}$ before $x_1$ and $x_2$ surpass the threshold voltage of the inverter. For $0 < x_1, x_2 < V_{DD}/2$, the NMOS transistor of the PI will operate in saturation if $V_T \leq v_{in} \leq V_{DD} + V_T$ while the PMOS transistor of the inverter operates in triode. Similarly, for $V_{DD}/2 < x_1, x_2 < V_{DD}$, $v_o=0$. The NMOS transistors of the PI operate in triode while the PMOS transistor of the inverter operates in saturation. Although it appears at first glance that a region where both the NMOS and PMOS transistors of the interpolator operate in saturation does not exist, in reality, the inverters of the interpolator will start to switch when the gate-source voltage exceeds $V_T$. An in-depth analysis can be carried out to find the maximum slope of the input of the interpolator at which the latency of the PI is minimized without sacrificing linearity. Such an analysis is rather involved and will not be presented here.
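As a numerical check of Eqs. (4.1)-(4.2) and (4.4)-(4.6), the short Python sketch below evaluates the slope limits and the corresponding rise times. It is illustrative only and is not part of the original design flow: the function names are hypothetical, the rise time is taken over a 0.8 V swing (0.1 V to 0.9 V) as in the text, and all times are normalized to $T_{in}$ by default.

```python
def simple_slope_limit(v_dd=1.0, t_in=1.0, swing=0.8):
    """Simple analysis, Eqs. (4.1)-(4.2).

    Returns (slope, rise_time). With t_in = 1 the results are
    expressed in units of T_in. `swing` is the voltage window
    over which the rise time is measured (assumed 0.1 V to 0.9 V).
    """
    slope = 0.5 * v_dd / t_in          # Eq. (4.1)
    rise_time = swing / slope          # Eq. (4.2)
    return slope, rise_time


def saturation_slope_limit(v_dd=1.0, v_t=0.36, t_in=1.0, swing=0.8):
    """Better analysis, Eqs. (4.4)-(4.6).

    Returns (tau, slope, rise_time) for the boundary case of
    Fig. 4.29 (b), where s1 = s2.
    """
    if not 0.0 < v_t < 0.5 * v_dd:
        raise ValueError("Eq. (4.4) requires 0 < V_T < V_DD / 2")
    tau = v_t / (0.5 * v_dd - v_t) * t_in   # Eq. (4.4)
    slope = 0.5 * v_dd / (t_in + tau)       # Eq. (4.5)
    rise_time = swing / slope               # Eq. (4.6)
    return tau, slope, rise_time


if __name__ == "__main__":
    s, tr = simple_slope_limit()
    print(f"simple:  slope = {s:.2f}/T_in, rise time = {tr:.2f} T_in")
    tau, s, tr = saturation_slope_limit()
    print(f"better:  tau = {tau:.2f} T_in, slope = {s:.2f}/T_in, "
          f"rise time = {tr:.2f} T_in")
```

Keeping $V_T$ as a parameter makes it easy to see how strongly the rise-time constraint depends on the ratio $V_T/V_{DD}$: as $V_T$ approaches $V_{DD}/2$, the window in which both transistors are assumed to be in saturation closes and the required rise time grows without bound.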
#### Simulation Result

The proposed DTC, designed in a TSMC 65 nm technology and analyzed using Spectre with BSIM4.4 device models, was used to verify the aforementioned analysis. Fig. 4.30 plots the dependence of the INL of the PI on $T_{in}/\tau$. It is seen that the INL deteriorates exponentially as $\tau$ decreases or, equivalently, as the slope of the input of the PI increases. This agrees well with the theoretical analysis presented earlier. Fig. 4.31 plots the dependence of the DNL and INL of the PI on the interpolation selection word (weight factor) with $T_{in} = 5.2$ ps and $\tau = 17$ ps. It is seen that the DNL and INL are less than 0.1 LSB and 0.18 LSB, respectively. The DNL and INL of the DTC are less than 0.16 LSB and 0.48 LSB, respectively.

Figure 4.30: Dependence of the INL of a PI on the slope of the inputs.

Figure 4.31: DNL (left) and INL (right) of the PI with $T_{in} = 5.2$ ps and $\tau = 17$ ps.

### 4.3.2 Linear Time Interpolation

This section briefly discusses the condition for linear time interpolation, which leads to a design consideration for the slope of the input of the time interpolator. Linear time interpolation can be achieved when the following relations are met:

$$t_{out,n_1} - t_{out,n_1+1} = t_{out,n_2+1} - t_{out,n_2} = \alpha, \tag{4.7}$$

$$t_{MSB} = t_{V\_In2=0+\epsilon} - t_{V\_In1=0+\epsilon} = \beta, \tag{4.8}$$

$$t_{LSB} = \frac{\beta}{k} = \alpha, \tag{4.9}$$

where $t_{out,n_1}$ and $t_{out,n_2}$ are the times at which the output of the time interpolator reaches $v_{out}$, with $n_1$ and $n_1+1$ time interpolation cells receiving $In_1$ being active, $n_1=1,2,\ldots,k$, $n_2=k-n_1$, and $k$ is half the total number of time interpolation cells. $t_{V\_In1=0+\epsilon}$ and $t_{V\_In2=0+\epsilon}$ are the times at which the input edges $In_1$ and $In_2$ arrive at the input of the time interpolator, and $\alpha$ is the time difference between two neighbouring output edges of the time interpolator as the LSB selector word changes by 1.
$t_{MSB}$ is the coarse time resolution and is equal to the time difference between $t_{V\_In1=0+\epsilon}$ and $t_{V\_In2=0+\epsilon}$, where $t_{V\_In2=0+\epsilon}-t_{V\_In1=0+\epsilon}>0$. $t_{MSB}$ and $t_{LSB}$ must not change during time interpolation. For linear time interpolation, eqs. (4.7)-(4.9) need to be satisfied for all LSB selector words.

Linear time interpolation may be achieved if all input NMOS and PMOS transistors of the active time interpolation cells receiving $In_1$ and $In_2$ as inputs operate in saturation mode for a duration of $t>0$. This may happen if $t_{V\_In2=Vth,n} < t_{V\_In2=Vdd/2-\Delta} < t_{V\_In1=Vdd/2+\Delta}$, where $t_{V\_Ini=Vth,n}$ is the time at which the input transistors of the active time interpolation cells receiving $In_i$ enter the linear region, and $t_{V\_Ini=Vdd/2-\Delta}$ and $t_{V\_Ini=Vdd/2+\Delta}$ are, respectively, the lower and upper boundaries of the region in which both the NMOS and PMOS input transistors of the active time interpolation cells receiving $In_i$ operate in saturation mode. Here $\Delta>0$ is an arbitrary number defining the saturation range of the input transistors of the time interpolator.

The following assumptions are made: $I_{d,sat}=I_{d,In1,sat}=I_{d,In2,sat}=m \gg I_{d,linear}$ when $t_{V\_Ini=Vdd/2-\Delta} < t < t_{V\_Ini=Vdd/2+\Delta}$, where $m>0$ is a constant; $I_{dp,Ini,sat} \approx I_{dn,Ini,linear}$ when $t_{V\_Ini=Vdd/2+\Delta} < t < t_{V\_Ini=Vdd-Vth,p}$; and $I_{dn,Ini,sat} \approx I_{dp,Ini,linear}$ when $t_{V\_Ini=Vth,n} < t < t_{V\_Ini=Vdd/2-\Delta}$.

For $0 < t < t_{V\_In2=Vdd/2-\Delta}$, the drain current at the output of the time interpolator, $I_d = I_{d,In1} + I_{d,In2}$, where $I_{d,In1} = I_{d,In1,sat} + I_{d,In1,linear}$ and $I_{d,In2} = I_{d,In2,sat} + I_{d,In2,linear}$, can be approximated by $I_d \approx I_{d,In1,sat}$ since $I_{d,In1} \approx I_{d,In1,sat}$ and $I_{d,In2} \approx 0$.
The discharging process at the output of the time interpolator is therefore dominated by $I_{d,In1}$ for $0 < t < t_{V\_In2=Vdd/2-\Delta}$. For $t_{V\_In2=Vdd/2-\Delta} < t < t_{V\_In1=Vdd/2+\Delta}$, $I_d$ can be approximated by $I_d \approx I_{d,In1,sat} + I_{d,In2,sat}$ since $I_{d,In1} \approx I_{d,In1,sat}$ and $I_{d,In2} \approx I_{d,In2,sat}$. The discharging process at the output of the time interpolator is therefore dominated by $I_{d,In1}$ and $I_{d,In2}$ for $t_{V\_In2=Vdd/2-\Delta} < t < t_{V\_In1=Vdd/2+\Delta}$. Then $t_{out,n_1} - t_{out,n_1+1}$, the time difference between the two events at which two neighbouring output edges of the time interpolator reach $V_{out}$ from Vdd, assuming a rising edge at the input of the PI, can be described as

$$\begin{split} V_{out} = & Vdd - \frac{n_1 * I_{d,In1,sat} * (t_{V\_In2 = Vdd/2 - \Delta} - t_{V\_In1 = Vdd/2 - \Delta})}{C_{load}} \\ & - \frac{(n_1 + n_2) * I_{d,In1,sat} * (t_{V\_In1 = Vdd/2 + \Delta} - t_{V\_In1 = Vdd/2 - \Delta})}{C_{load}}, \end{split}$$

$$t_{out} = t_{V\_In1=Vdd/2+\Delta}.\tag{4.10}$$

From eq. (4.10), the time interval between phase-interpolated edges can be described as

$$\begin{split} t_{out,n=n1} - t_{out,n=(n1+1)} \\ &= \frac{C_{load}}{k*I_{d,In1,sat}} * \frac{I_{d,In1,sat}*(t_{V\_In2=Vdd/2-\Delta} - t_{V\_In1=Vdd/2-\Delta})}{C_{load}} \\ &= \frac{(t_{V\_In2=Vdd/2-\Delta} - t_{V\_In1=Vdd/2-\Delta})}{k} \\ &= \frac{(t_{V\_In2=0+\epsilon} - t_{V\_In1=0+\epsilon})}{k}, \end{split}$$

$$t_{out,n=n1} - t_{out,n=(n1+1)} = \alpha = \frac{\beta}{k}.\tag{4.11}$$

From eq. (4.11), eqs. (4.7)-(4.9) are satisfied. Therefore, linear time interpolation may occur when $t_{V\_In2=Vth,n} < t_{V\_In2=Vdd/2-\Delta} < t_{V\_In1=Vdd/2+\Delta}$.

In this chapter, the architecture of the proposed 8-bit area- and power-efficient DTC was presented. The proposed DTC consists of a 4-bit coarse time tuning stage and a 4-bit fine time tuning stage.
The coarse time tuning stage is implemented with a pre-skewed delay line with 2 pre-skewing signals. The fine time tuning stage is implemented with a cascode tri-state inverter-based PI. The slope control block is implemented with a current-starved inverter that uses a low-voltage cascode current mirror as its supply current source. The design considerations for the slope control block and the phase interpolation were provided.

# Chapter 5

# Simulation Results

The performance of a DTC can be characterized using testing techniques similar to those applied to a DAC. For example, DC parameters such as the offset, the longest response time, and the full-scale range (FSR), and linearity parameters such as DNL, INL, and signal-to-noise ratio [59, 60] can be measured. The power consumption and silicon area of a DTC are also key metrics. In this work, the longest response time, FSR, DNL, INL, and power consumption of the proposed DTC are measured. Both simulation with BSIM4.4 models and post-layout simulation are performed. A voltage ramp is used as the stimulus for the DTC, and 256 test cases generated by a Verilog-A model are used to cover all digital input possibilities. The generated digital inputs are thermometer-coded. For $I_1$, $I_2$, and $I_{ref}$ in Fig. 4.15, transistors with ideal voltages applied at their gates are used.

## 5.1 The Prior Design

### 5.1.1 Simulation with BSIM4.4 Device Models

The previously designed DTC shown in Fig. 4.3 was designed in a TSMC 65 nm technology and analyzed using Spectre with BSIM4.4 device models. Fig. 5.1 plots the outputs of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal. The time resolution of the previously designed DTC with the regular delay line is 3.2 ps and its latency is 1.1 ns, with a power consumption of 256.7 $\mu$W.
The time resolution of the previously designed DTC with the pre-skewed delay line with 1 pre-skewing signal is 1.5 ps and its latency is 651.5 ps, with a power consumption of 330.7 $\mu$W. Figs. 5.2 and 5.3 plot the DNL and INL of both DTCs. The peak DNL and INL of both DTCs are less than 0.5 LSB.

Figure 5.1: Outputs of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal.

Figure 5.2: DNL of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal. Legend: TT (solid line), FF (dashed line), SS (dotted line).

Figure 5.3: INL of the previously designed DTCs with a regular delay line and a pre-skewed delay line with 1 pre-skewing signal. Legend: TT (solid line), FF (dashed line), SS (dotted line).

## 5.2 The Proposed Design

### 5.2.1 Simulation with BSIM4.4 Device Models

The simulation results of the proposed DTC obtained using Spectre with BSIM4.4 models are presented in this section. A total of 257 selector words were tested, of which the 4 MSBs were fed into the MUX and the 4 LSBs into the PI as a thermometer code. The maximum response time of the DTC is 395 ps, which corresponds to an operating frequency of 2.53 GHz, and the time resolution is 0.33 ps for the nominal process and temperature condition (27 °C). For the FF and SS process corners, maximum response times of 303 ps (3.3 GHz) and 561 ps (1.78 GHz) and time resolutions of 0.26 ps and 0.44 ps are obtained, respectively. Fig. 5.4 plots the DNL and INL of the DTC. The DNL and INL are less than 0.14 LSB and 0.49 LSB, respectively, for the nominal condition and less than 0.26 LSB and 0.72 LSB for the process corners. The DTC was also tested at temperatures of 40 °C and -20 °C, for which the DNL and INL are less than 0.16 LSB and 0.51 LSB. The power consumption of the DTC is 1.38 mW for the nominal condition, where the power consumption of the pre-skewed delay line, the 17:2 MUX, the slope control block, and the PI is 548 $\mu$W, 71.7 $\mu$W, 717 $\mu$W, and 4.75 $\mu$W, respectively.
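For reference, the following Python sketch shows one common way to compute the DNL and INL from a list of simulated DTC output edge times, one per selector word. It rests on an assumption: an endpoint-fit definition is used, with the LSB taken as the average step between the first and last codes, whereas the exact definition applied to the reported results is not stated here. The function name `dnl_inl` and the example data are hypothetical.

```python
def dnl_inl(edge_times):
    """Compute (lsb, dnl, inl) from output edge times.

    edge_times[i] is the output edge time for selector word i,
    with words listed in increasing order. All results except
    `lsb` are expressed in units of LSB.
    Assumption: endpoint fit, i.e. LSB = average step size.
    """
    n = len(edge_times)
    if n < 3:
        raise ValueError("need at least 3 edge times")
    lsb = (edge_times[-1] - edge_times[0]) / (n - 1)
    if lsb <= 0:
        raise ValueError("edge times must span a positive range")
    # DNL: deviation of each step from the ideal 1 LSB step
    dnl = [(edge_times[i + 1] - edge_times[i]) / lsb - 1.0
           for i in range(n - 1)]
    # INL: deviation of each edge from the ideal straight line
    inl = [(edge_times[i] - edge_times[0]) / lsb - i
           for i in range(n)]
    return lsb, dnl, inl


if __name__ == "__main__":
    # Hypothetical edge times (arbitrary units) with one late edge
    times = [0.0, 1.0, 2.5, 3.0, 4.0]
    lsb, dnl, inl = dnl_inl(times)
    print("LSB      :", lsb)
    print("peak DNL :", max(abs(d) for d in dnl))
    print("peak INL :", max(abs(v) for v in inl))
```

With an endpoint fit, the INL is zero at the first and last codes by construction, so the peak INL reflects only the bow of the transfer curve between them; a best-fit-line definition would generally give somewhat smaller peak values.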
Table 5.1 shows the parameters of the proposed DTC. The simulated outputs of the pre-skewed delay line with 2 pre-skewing signals, the 17:2 MUX, the slope control block, and the PI, together with a zoom-in view of the PI output, are shown in Figs. 5.5, 5.6, 5.7, 5.8, and 5.9.

Figure 5.4: DNL and INL of the proposed DTC obtained using Spectre with BSIM4.4 models. Legend: TT (solid line), FF (dashed line), SS (dotted line).

Table 5.1: Parameters of the proposed DTC obtained from simulation with BSIM4.4 device models.

| Parameter | Value |
|---|---|
| Technology | 65 nm |
| Power Supply | 1 V |
| Frequency | 2.53 GHz |
| Max. Response Time | 395 ps |
| Time Resolution | 0.33 ps |
| DNL/INL at 27 °C (at 40 °C and -20 °C) | 0.14/0.49 (0.16/0.51) LSB |
| Power Consumption | 1.38 mW |

Figure 5.5: Output of the 4-bit pre-skewed delay line with 2 pre-skewing signals.

Figure 5.6: Output of the 17:2 MUX.

Figure 5.7: Output of the slope control.

Figure 5.8: Output of the PI.

Figure 5.9: A zoom-in view of the output of the PI.

### 5.2.2 Post-Layout Simulation

The post-layout simulation is performed using Mentor Graphics Calibre. The layout of the proposed DTC is done using Cadence Virtuoso. Parasitic resistances and capacitances (R + C + CC) of the metal interconnects and power rails are extracted. With the extracted R + C + CC taken into account, the time resolution of the proposed DTC is 1.72 ps and the FSR is 442 ps. The peak DNL and INL are -0.73/0.66 LSB and -2.4/2.4 LSB, respectively. The post-layout DNL and INL of the proposed DTC are shown in Fig. 5.10.

Figure 5.10: DNL and INL of the proposed DTC obtained from post-layout simulation.

## 5.3 Performance Comparison

Table 5.2 compares the performance of the DTC in this work with recently published DTCs.
It should be mentioned that the performance parameters of this work are based on simulation results obtained using BSIM4.4 models only, without the layout being taken into account. The DTCs in [38], [46], and [49] were fabricated.

Table 5.2: Performance comparison of DTCs.

| Ref. | Tech. (nm) | Res. (ps) | INL (LSB) | Power (mW) |
|---|---|---|---|---|
| This work (without layout) | 65 | 0.33 | 0.49 | 1.38 @ 2.53 GHz |
| [46] | 180 | 0.78 | -0.15 | 0.8 @ 2.5 GHz |
| [38] | 65 | - | 1.33 | 4.3 @ 1.5 GHz |
| [49] | 28 | 0.244 | 1.2 ps | 19.8 @ 2 GHz |

In this chapter, simulation results of the proposed DTC were provided. According to simulation results with BSIM4.4 device models only, a time resolution of 0.33 ps, a maximum operation frequency of 2.53 GHz, a power consumption of 1.38 mW, and peak DNL and INL of less than 0.14 LSB and 0.49 LSB, respectively, were achieved for the nominal process (TT) and temperature condition (27 °C). The simulation was performed over process and temperature corners as well. With post-layout simulation, the time resolution of the proposed DTC increased to 1.72 ps and the peak DNL and INL were -0.73/0.66 LSB and -2.4/2.4 LSB, respectively. The performance of this work, obtained using BSIM4.4 models only and without the layout being taken into account, was compared with recently published DTCs that were fabricated.

# Chapter 6

# Conclusions and Future Work

## 6.1 Conclusions

An 8-bit DTC to be used for a time-mode SAR ADC with minimum power consumption and silicon area was presented. The architecture and the drawbacks of a conventional voltage-mode SAR ADC were discussed. The principle of time-mode circuits and the benefits of their application to mixed-signal circuits were explained. The architecture of a time-mode SAR ADC was presented. The need for an area- and power-efficient DTC to be used for a time-mode SAR ADC was discussed.
The principle of a DTC was explained and prior works on DTCs were reviewed. The principle of a PI was explained and prior works on digital PIs were reviewed. The design of the proposed DTC was presented. The proposed DTC is an 8-bit two-step DTC that consists of a coarse tuning stage and a fine tuning stage. Each block of the proposed DTC was presented using schematic and layout views. The optimal slope of the input of the PI and the condition for linear phase interpolation were investigated. Simulation results of the proposed DTC designed in TSMC 65 nm 1.0 V CMOS technology were provided. According to simulation results with BSIM4.4 device models only, a time resolution of 0.33 ps, a maximum operation frequency of 2.53 GHz, a power consumption of 1.38 mW, and peak DNL and INL of less than 0.14 LSB and 0.49 LSB, respectively, were achieved for the nominal process and temperature condition (27 °C). The DNL and INL of the proposed DTC deteriorated in the post-layout simulation. Mismatches between the interconnects may be the reason for this.

## 6.2 Future Work

In a SAR TDC, the arrival time of a DTC's output is compared with that of a VTC by a time comparator. Examples of time comparators are D flip-flops and sense amplifier flip-flops [61, 62]. A schematic of a sense amplifier flip-flop from [61] is shown in Fig. 6.1. The minimum difference in the arrival times of the inputs that a time comparator can detect is lower-bounded by the setup time of the time comparator. It is reported that few flip-flops in CMOS technologies of 65 nm or larger have a setup time of less than 10 ps [63]. For a high-speed SAR ADC with a sampling rate of more than 1 giga-sample per second, a time resolution of less than 10 ps is required. Therefore, it is imperative that a time comparator with a setup time of less than 10 ps be achieved. Alternatively, a time amplifier (TA) can be used to meet the setup time requirement of a time comparator [64].
For example, a TA can receive the outputs of a DTC and a VTC, amplify the time difference in their arrivals, and propagate them to a time comparator. In this way, a time comparator can detect the time difference between the arrivals of the outputs of a DTC and a VTC without being limited by its setup time.

Figure 6.1: A schematic of a sense amplifier flip-flop [61].

# Acronyms

- **PD** phase detector.
- **PI** phase interpolator.
- **ADC** analog-to-digital converter.
- **CMOS** complementary metal-oxide-semiconductor.
- **CSA** current source array.
- **CTDSA** cyclic time domain successive approximation.
- **DAC** digital-to-analog converter.
- **DLL** delay-locked loop.
- **DNL** differential nonlinearity.
- **DTC** digital-to-time converter.
- **FSR** full scale range.
- **GRO** gated ring oscillator.
- **INL** integral nonlinearity.
- **LSB** least significant bit.
- **MMD** multi-modulus divider.
- **MSB** most significant bit.
- **MUX** multiplexer.
- **NMOS** n-type metal-oxide-semiconductor.
- **PLL** phase-locked loop.
- **PMOS** p-type metal-oxide-semiconductor.
- **S/H** sample-and-hold.
- **SAR** successive-approximation register.
- **SAR ADC** successive-approximation register analog-to-digital converter.
- **SAR TDC** successive-approximation register time-to-digital converter.
- **TA** time amplifier.
- **TDC** time-to-digital converter.
- **TG** transmission gate.
- **VCO** voltage-controlled oscillator.
- **VDL** vernier delay-line.
- **VTC** voltage-to-time converter.

# References

- [1] J. McCreary and P. Gray, "All-MOS charge redistribution analog-to-digital conversion techniques part I," *IEEE J. Solid-State Circuits*, vol. SC-10, no. 6, pp. 371–378, Dec. 1975.
- [2] J. McCreary, P. Gray, and D. Hodges, "All-MOS charge redistribution analog-to-digital conversion techniques part II," *IEEE J. Solid-State Circuits*, vol. SC-10, no. 6, pp. 379–385, Dec. 1975.
- [3] P.
Harpe, "Successive approximation analog-to-digital converters," *IEEE Solid-State Circuits Magazine*, pp. 64–73, Fall 2016.
- [4] H. Tang, Z. Sun, K. Chew, and L. Siek, "A 5.8 nW 9.1-ENOB 1-kS/s local asynchronous successive approximation register ADC for implantable medical device," *IEEE Trans. VLSI Systems*, vol. 22, no. 10, pp. 2221–2225, Oct. 2014.
- [5] S. Palermo, S. Hoyos, A. Shafik, E. Z. Tabasy, S. Cai, S. Kiran, and K. Lee, "CMOS ADC-based receivers for high-speed electrical and optical links," *IEEE Comm. Magazine*, vol. 54, no. 10, pp. 168–175, Oct. 2016.
- [6] M. Le, J. Gorecki, J. Riani, J. Pernillo, A. Tan, K. Gopalakrishnan, B. Helal, P. Khandelwal, C. Loi, I. Quek, P. Wong, and A. Buchwald, "A background calibrated 28 GS/s 8b interleaved SAR ADC in 28nm CMOS," in *Proc. IEEE Custom Integrated Circuits Conf.*, April 2017, pp. 1–4.
- [7] B. Murmann, "The successive approximation register ADC: A versatile building block for ultra-low power to ultra-high-speed applications," *IEEE Comm. Magazine*, vol. 54, no. 4, pp. 78–83, Apr. 2016.
- [8] P. Harpe, H. Gao, R. van Dommele, E. Cantatore, and A. van Roermund, "A 3 nW signal acquisition IC integrating an amplifier with 2.1 NEF and a 1.5 fJ/conv-step ADC," in *IEEE Int'l Solid-State Circuits Conf. Dig. Tech. Papers*, 2015, pp. 382–383.
- [9] P. Harikumar, J. Wikner, and A. Alvandpour, "A 0.4-V sub-nano-watt 8-bit 1-kS/s SAR ADC in 65-nm CMOS for wireless sensor applications," *IEEE Trans. Circuits and Systems II*, vol. 63, no. 8, pp. 743–747, 2016.
- [10] D. Zhang and A. Alvandpour, "A 12.5-ENOB 10-kS/s redundant SAR ADC in 65-nm CMOS," *IEEE Trans. Circuits and Systems II*, vol. 63, no. 3, pp. 244–248, Mar. 2016.
- [11] Y. Zhang, E. Bonizzoni, and F. Maloberti, "A 10-b 200-kS/s 250-nA self-clocked coarse-fine SAR ADC," *IEEE Trans. Circuits and Systems II*, vol. 63, no. 10, pp. 924–928, Oct. 2016.
- [12] L. Kull, T. Toifl, M. Schmatz, P. Francese, C. Menolfi, M. Braendli, M. Kossel, T.
Morf, T. Andersen, and Y. Leblebici, "A 90 GS/s 8b 667 mW 64x interleaved SAR ADC in 32 nm digital SOI CMOS," in *IEEE Int'l Solid-State Circuits Conf. Dig. Tech. Papers*, 2014, pp. 378–379.
- [13] Y. Yee, L. Terman, and L. Heller, "A two-stage weighted capacitor network for D/A-A/D conversion," *IEEE J. Solid-State Circuits*, vol. SC-14, no. 4, pp. 778–781, Aug. 1979.
- [14] S. Singh, A. Prabhakar, and A. Bhattcharyya, "C-2C ladder-based D/A converters for PCM codes," *IEEE J. Solid-State Circuits*, vol. 22, no. 6, pp. 1197–1200, Dec. 1987.
- [15] G. Li, Y. Tousi, A. Hassibi, and E. Afshari, "Delay-line-based analog-to-digital converters," *IEEE Trans. Circuits Syst. II*, vol. 56, no. 6, pp. 464–468, June 2009.
- [16] M. Straayer and M. Perrott, "A 12-bit, 10-MHz bandwidth, continuous-time $\Delta\Sigma$ ADC with a 5-bit, 950-MS/s VCO-based quantizer," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 805–814, Apr. 2008.
- [17] M. Park and M. Perrott, "A single-slope 80 Ms/s ADC using two-step time-to-digital conversion," in *IEEE Int'l Symp. Circuits Syst.*, 2009, pp. 1125–1128.
- [18] T. K. Jang, J. Kim, Y. G. Yoon, and S. Cho, "A highly-digital VCO-based analog-to-digital converter using phase interpolator and digital calibration," *IEEE Trans. VLSI Systems*, vol. 20, no. 8, pp. 1368–1372, Aug. 2012.
- [19] W. Yu, J. Kim, K. Kim, and S. Cho, "A time-domain high-order MASH $\Delta\Sigma$ ADC using voltage-controlled gated-ring oscillator," *IEEE Trans. Circuits Syst. I*, vol. 60, no. 4, pp. 856–866, Aug. 2013.
- [20] W. Yu, K. Kim, and S. Cho, "A 148 $fs_{rms}$ integrated noise 4 MHz bandwidth second-order $\Delta\Sigma$ time-to-digital converter with gated switched-ring oscillator," *IEEE Trans. Circuits Syst. I*, vol. 61, no. 8, pp. 2281–2289, Aug. 2014.
- [21] J. Kim, Y. Kim, K. Kim, W. Yu, and S. Cho, "A hybrid-domain two-step time-to-digital converter using a switch-based time-to-voltage converter and SAR ADC," *IEEE Trans. Circuits Syst. II*, vol. 62, pp.
631–635, Jul. 2017.
- [22] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, "A 2.1-to-2.8-GHz low-phase-noise all-digital frequency synthesizer with a time-windowed time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582–2590, Dec. 2010.
- [23] J. Hong, S. Kim, J. Liu, N. Xing, T. Jang, J. Park, J. Kim, T. Kim, and H. Park, "A 0.004 mm$^2$ 250 $\mu$W $\Delta\Sigma$ TDC with time-difference accumulator and a 0.012 mm$^2$ 2.5 mW bang-bang digital PLL using PRNG for low-power SoC applications," in *IEEE Int'l Conf. Solid-State Circuits Dig. Tech. Papers*, 2012, pp. 240–242.
- [24] P. Levine and G. Roberts, "High-resolution flash time-to-digital conversion and calibration for system-on-chip testing," *IEE Proc. Comput. Digit. Tech.*, vol. 152, no. 3, pp. 415–426, May 2005.
- [25] F. Yuan, Ed., *CMOS Time-Mode Circuits and Systems: Fundamentals and Applications*. New York: CRC Press, 2015.
- [26] A. Mantyniemi, T. Rahkonen, and J. Kostamovaara, "A CMOS time-to-digital converter (TDC) based on a cyclic time domain successive approximation interpolation method," *IEEE J. Solid-State Circuits*, vol. 44, no. 11, pp. 3067–3078, Nov. 2009.
- [27] S. Lee, B. Kim, and K. Lee, "A novel high-speed ring oscillator for multiphase clock generation using negative skewed delay scheme," *IEEE J. Solid-State Circuits*, vol. 32, no. 2, pp. 289–291, Feb. 1997.
- [28] G. Roberts and M. Ali-Bakhshian, "A brief introduction to time-to-digital and digital-to-time converters," *IEEE J. Solid-State Circuits*, vol. 57, no. 3, pp. 153–157, March 2010.
- [29] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. Lacaita, "A 2.9-to-4.0-GHz fractional-N digital PLL with bang-bang phase detector and 560-fs$_{rms}$ integrated jitter at 4.5-mW power," *IEEE J. of Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011.
- [30] N. Pavlovic and J.
Bergervoet, "A 5.3 GHz digital-to-time-based fractional-N all-digital PLL," in *IEEE Int'l Solid-State Circuits Conf. Dig. Tech. Papers*, 2011, pp. 54–55.
- [31] V. Chillara, Y. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. Staszewski, "9.8 an 860 μW 2.1-to-2.7 GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and Zigbee) applications," in *IEEE Int'l Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2014, pp. 172–173.
- [32] P. Hanumolu, V. Kratyuk, G. Wei, and U. Moon, "A sub-picosecond resolution 0.5-1.5 GHz digital-to-time converter," *IEEE J. Solid-State Circuits*, vol. 43, no. 2, pp. 414–424, Feb. 2008.
- [33] Y. Kao and T. Chu, "A direct-sampling pulsed time-of-flight radar with frequency-defined vernier digital-to-time converter in 65 nm CMOS," *IEEE J. of Solid-State Circuits*, vol. 50, no. 11, pp. 2665–2677, Nov. 2015.
- [34] S. Li and C. Salthouse, "Digital-to-time converter for fluorescence lifetime imaging," in *Proc. IEEE Int'l Instrum. and Measur. Tech. Conf.*, 2012, pp. 894–897.
- [35] S. Tseng, H. Chou, B. Hu, Y. Kao, Y. Huang, and T. Chu, "Equivalent-time direct-sampling impulse-radio radar with rotatable Cyclic Vernier digital-to-time converter for wireless sensor network localization," *IEEE Trans. on Microwave Theory and Techniques*, vol. 66, no. 1, pp. 485–508, Jan. 2018.
- [36] K. Chu, T. Chen, and C. Wei, "A 10-bit segmented digital-to-time converter with 10-ps-level resolution and offset calibration circuits," in *Proc. Int'l Symp. on Next-Generation Electronics*, May 2016, pp. 1–4.
- [37] S. Joshi, J. T. S. Liao, Y. Fan, S. Hyvonen, M. Nagarajan, J. Rizk, H. J. Lee, and I. Young, "A 12-Gb/s transceiver in 32-nm bulk CMOS," in *Symp. VLSI Circuits Dig. Tech. Papers*, June 2009, pp. 52–53.
- [38] M. Chen, A. Hafez, and C. Yang, "A 0.1-1.5 GHz 8-bit inverter-based digital-to-phase converter using harmonic rejection," *IEEE J.
Solid-State Circuits*, vol. 48, no. 11, pp. 2681–2692, Nov. 2013.
- [39] S. Alahdab, A. Mäntyniemi, and J. Kostamovaara, "A 12-bit digital-to-time converter (DTC) with sub-ps-level resolution using current DAC and differential switch for time-to-digital converter (TDC)," in *Proc. IEEE Int'l Instrument. and Meas. Tech. Conf.*, May 2012, pp. 2668–2671.
- [40] J. Ru, C. Palattella, P. Geraedts, E. Klumperink, and B. Nauta, "A high-linearity digital-to-time converter technique: Constant-slope charging," *IEEE J. Solid-State Circuits*, vol. 50, no. 6, pp. 1412–1423, June 2015.
- [41] J. Daga and D. Auvergne, "A comprehensive delay macro modeling for submicrometer CMOS logics," *IEEE J. Solid-State Circuits*, vol. 34, no. 1, pp. 42–55, Jan. 1999.
- [42] R. Mita, G. Palumbo, and M. Poli, "Propagation delay of an RC-chain with a ramp input," *IEEE Trans. Circuits Syst. II*, vol. 54, no. 1, pp. 66–70, Jan. 2007.
- [43] G. Nagaraj, S. Miller, B. Stengel, G. Cafaro, T. Gradishar, S. Olson, and R. Hekmann, "A self-calibrating sub-picosecond resolution digital-to-time converter," in *Proc. IEEE Int'l Microwave Symp.*, 2009, pp. 2201–2204.
- [44] N. Markulic, K. Raczkowski, P. Wambacq, and J. Craninckx, "A 10-bit, 550-fs step digital-to-time converter in 28nm CMOS," in *Proc. European Solid State Circuits Conf.*, Sep. 2014, pp. 79–82.
- [45] P. Chen, F. Zhang, Z. Zong, H. Zheng, T. Siriburanon, and R. B. Staszewski, "A 15-μW, 103-fs step, 5-bit capacitor-DAC-based constant-slope digital-to-time converter in 28nm CMOS," in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, Nov. 2017, pp. 93–96.
- [46] Y. Choi, S. Yoo, and H. Yoo, "A full digital polar transmitter using a digital-to-time converter for high data rate system," in *Proc. IEEE Int'l Symp. Radio-Frequency Integration Tech.*, 2009, pp. 56–59.
- [47] D. Lee, G. Khan, and F. Yuan, "Architectures and design techniques of digital time interpolators," in *IEEE Int'l Conf.
Integrated Circuits and Microsystems*, Nov. 2018, pp. 15–20.
- [48] S. Mohan, W. Chan, D. Colleran, S. Greenwood, J. Gamble, and I. Kouznetsov, "Differential ring oscillators with multipath delay stages," in *Proc. IEEE Custom Integ. Circuits Conf.*, Sep. 2009, pp. 503–506.
- [49] S. Sievert, O. Degani, A. Ben-Bassat, R. Banin, A. Ravi, W. Thomann, B. Klepser, Z. Boos, and D. Schmitt-Landsiedel, "A 2 GHz 244 fs-resolution 1.2 ps-peak-INL edge interpolator-based digital-to-time converter in 28 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 2992–3004, Dec. 2016.
- [50] S. Kumaki, A. H. Johari, T. Matsubara, I. Hayashi, and H. Ishikuro, "A 0.5V 6-bit scalable phase interpolator," in *Proc. IEEE Asia Pacific Conf. Circuits Syst.*, Dec. 2010, pp. 1019–1022.
- [51] R. Nandwana, T. Anand, S. Saxena, S. J. Kim, M. Talegaonkar, A. Elkholy, W. S. Choi, A. Elshazly, and P. K. Hanumolu, "A calibration-free fractional-N ring PLL using hybrid phase/current-mode phase interpolation method," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 882–895, Apr. 2015.
- [52] G. Wu, D. Huang, J. Li, P. Gui, T. Liu, S. Guo, R. Wang, Y. Fan, S. Chakraborty, and M. Morgan, "A 1-16-Gb/s all-digital clock and data recovery with a wideband, high-linearity phase interpolator," *IEEE Trans. VLSI Systems*, vol. 24, no. 7, pp. 2511–2520, Jul. 2016.
- [53] S. Sidiropoulos and M. Horowitz, "A semidigital dual delay-locked loop," *IEEE J. Solid-State Circuits*, vol. 32, no. 11, pp. 1683–1692, Nov. 1997.
- [54] A. Tsimpos, G. Souliotis, A. Demartinos, and S. Vlassis, "All digital phase interpolator," in *Int'l Conf. Design Tech. Integrated Syst. in Nanoscale Era*, Apr. 2015, pp. 1–6.
- [55] A. Nicholson, J. Jenkins, A. Schaik, T. Hamilton, and T. Lehmann, "A 1.2V 2-bit phase interpolator for 65nm CMOS," in *Proc. IEEE Int'l Symp. Circuits Syst.*, May 2012, pp. 2039–2042.
- [56] A. Narayanan, M. Katsuragi, K. Kimura, S. Kondo, K. K. Tokgoz, K. Nakata, W. Deng, K. Okada, and A.
Matsuzawa, "A fractional-N sub-sampling PLL using a pipelined phase-interpolator with an FoM of -250 dB," *IEEE J. Solid-State Circuits*, vol. 51, no. 7, pp. 1630–1640, Jul. 2016.
- [57] J. Chou, Y. Hsieh, and J. Wu, "Phase averaging and interpolation using resistor strings or resistor rings for multi-phase clock generation," *IEEE Trans. Circuits Syst. I*, vol. 53, no. 5, pp. 984–991, May 2006.
- [58] S. Henzler, S. Koeppe, W. Kamp, and D. Schmitt-Landsiedel, "90 nm 4.7 ps-resolution 0.7-LSB single-shot precision and 19 pJ-per-shot local passive interpolation time-to-digital converter with on-chip characterization," in *IEEE Int'l Solid-State Circuits Conf. Dig. Tech. Papers*, 2008, pp. 548–635.
- [59] M. Baker, *Demystifying Mixed Signal Test Methods*. Netherlands: Elsevier, 2003.
- [60] G. Roberts, F. Taenzler, and M. Burns, *Introduction to Mixed-Signal IC Test and Measurement (2nd Edition)*. Oxford University Press.
- [61] B. Nikolic, V. Oklobdzija, V. Stojanovic, W. Jia, J. Chiu, and M. Leung, "Improved sense-amplifier-based flip-flop: design and measurements," *IEEE J. Solid-State Circuits*, vol. 35, no. 6, pp. 876–884, June 2000.
- [62] I. Yi, M. Chae, S. Hyun, S. Bae, J. Choi, S. Jang, B. Kim, J. Sim, and H. Park, "A time-based receiver with 2-tap decision feedback equalizer for single-ended mobile DRAM interface," *IEEE J. Solid-State Circuits*, vol. 53, no. 1, pp. 144–154, Jan. 2018.
- [63] A. Strollo, D. D. Caro, E. Napoli, and N. Petra, "A novel high-speed sense-amplifier-based flip-flop," *IEEE Trans. VLSI Syst.*, vol. 13, no. 11, pp. 1266–1274, Nov. 2005.
- [64] M. Lee and A. Abidi, "A 9B, 1.25 ps resolution coarse-fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, Apr. 2008.