# A 65nm CMOS I/Q RF Power DAC with 24 - 42dB 3<sup>rd</sup> Harmonic Cancellation and up to 18dB Mixed-Signal Filtering

Bonjern Yang, Eric Y. Chang, Ali Niknejad, Borivoje Nikolić, Elad Alon

Berkeley Wireless Research Center, University of California, Berkeley, CA 94704, USA

## Abstract

This paper presents an RF DAC transmitter (TX) with integrated, programmable harmonic cancellation as well as mixed-signal filtering at a peak power of 25.6dBm. The 65nm CMOS prototype uses device stacking and transformer combining and demonstrates 24dB to 42dB HD3 reduction across a frequency range of 0.7GHz to 2GHz, and up to 18dB of notching at a 40MHz offset.

## Introduction

RF DACs are amenable to reconfigurable RF front ends due to their ability to translate directly from digital bits to RF, but generate undesired spectral emissions. An integrated, programmable solution for these issues is needed in order to achieve a compact reconfigurable RF front end. This work expands upon techniques for cancelling harmonic emissions [1] and quantization noise [2] to realize a wideband RF DAC with reduced spectral emissions. In contrast to previous work, this work demonstrates harmonic cancellation across a wide frequency range as well as for modulated data, and demonstrates improved filtering (notching) results even without the use of predistortion.

The TX is partitioned into sub-PAs whose outputs are summed. For harmonic cancellation, these sub-PAs receive the same data but phase-shifted LO signals. By setting appropriate sub-PA weights and phase shifts, different harmonics can be nulled. Similarly, the mixed-signal filter drives different sub-PAs with the same LO but differently delayed data, implementing a filter with positive coefficients. Both techniques are strongly sensitive to the linearity of the sub-PA output network, which drove the selection of a switched-capacitor power amplifier (SCPA) topology [3] with relatively constant output impedance across input code.
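The phase-shift nulling described above can be illustrated with a minimal numerical sketch. It assumes ideal, perfectly matched sub-PAs and that an LO phase shift acts as a pure time delay, so the k-th harmonic is rotated by k times the fundamental phase shift. The function names `harmonic_gain` and `nulling_shift_deg` are illustrative and do not come from the design.

```python
import cmath
import math


def harmonic_gain(k, phases_deg, weights):
    """Normalized magnitude of the k-th harmonic after summing sub-PA outputs.

    Assumption: each path's k-th harmonic is rotated by k times its LO phase
    shift, and the paths are otherwise identical. A value of 1.0 corresponds
    to an in-phase sum of all paths.
    """
    total = sum(
        w * cmath.exp(1j * k * math.radians(p))
        for p, w in zip(phases_deg, weights)
    )
    return abs(total) / sum(weights)


def nulling_shift_deg(k):
    """Phase shift between two equal paths that places the k-th harmonic
    of the second path 180 degrees away from that of the first."""
    return 180.0 / k


for k in (3, 5):
    shift = nulling_shift_deg(k)
    fundamental = harmonic_gain(1, [0.0, shift], [1.0, 1.0])
    harmonic = harmonic_gain(k, [0.0, shift], [1.0, 1.0])
    print(f"k={k}: shift={shift:.1f} deg, "
          f"fundamental gain={fundamental:.3f}, harmonic gain={harmonic:.2e}")
```

With two equal paths, a shift of 180°/k nulls the k-th harmonic at the cost of a smaller fundamental; in hardware the achievable null depth is limited by gain and phase mismatch between the paths.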

## **Circuit Implementation**

The TX (Fig. 1) is implemented in a Cartesian architecture, in which I and Q are combined using 25% duty cycle clocks and cell sharing [4]. Relative to the standard Cartesian architecture, cell sharing achieves higher peak efficiency. The TX is divided into 2 LO phases, which are further divided into 4 equally-weighted sub-PAs for mixed-signal filtering (Fig. 2), for a total of 8 sub-PAs. The 2 LO phases are equally weighted, and phase shifted by (nominally) 60° in order to null the 3<sup>rd</sup> harmonic. Each set of 4 filter sub-PAs is summed by connecting their outputs, while the two LO phases are summed using an on-chip, 2:1 series stacked transformer. The sub-PA (whose layout was generated using a custom PyCell), transformer layout, and board output network have been designed to be as symmetric as possible in order to maximize the matching between the two LO phases, as this critically impacts the effectiveness of the harmonic cancellation.

Each sub-PA is a differential stacked SCPA (Fig. 3), segmented with 4 binary and 4 thermometer bits. The use of a stacked PA necessitates on-chip level shifters, which are implemented using AC coupled inverters.

In order to achieve cancellation across a wide frequency range, high resolution, wideband phase interpolators (PI) are required. They are implemented on chip as 2 pairs of 9-bit PIs, with a Gilbert-cell based topology [5]. A pair of PIs is needed in order to generate 25% I and Q LO signals for each phase. The data delay line is implemented using a chain of flip-flops operating at the data rate, with a maximum delay of 25 clock periods. Both the phase shift and delay amounts are set through a scan chain.
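To give a feel for why fine phase resolution matters, the sketch below estimates how a residual phase error limits the null depth. It rests on assumptions that are not stated in the text: each 9-bit PI is taken to span a full 360° in uniform steps, the two LO phases are perfectly amplitude matched, and the uncancelled baseline is an in-phase sum. The helper names are hypothetical.

```python
import math


def pi_step_deg(bits, span_deg=360.0):
    """Phase step of an ideal PI, assuming uniform steps over span_deg."""
    return span_deg / (2 ** bits)


def hd_reduction_db(k, phase_error_deg):
    """Reduction of the k-th harmonic (in dB) for two equal paths whose LO
    phase shift deviates from the ideal 180/k degrees by phase_error_deg.

    The residual is |1 - exp(j*k*err)| / 2 = |sin(k*err / 2)|, relative to
    an in-phase sum of the two paths.
    """
    residual = abs(math.sin(math.radians(k * phase_error_deg) / 2.0))
    if residual == 0.0:
        return math.inf
    return -20.0 * math.log10(residual)


step = pi_step_deg(9)
worst_case_error = step / 2.0  # nearest code is at most half a step away
print(f"PI step: {step:.4f} deg")
print(f"HD3 reduction bound at half-step error: "
      f"{hd_reduction_db(3, worst_case_error):.1f} dB")
```

Under these idealizations the phase quantization alone bounds the third-harmonic null; amplitude mismatch and code-dependent effects, which the sketch ignores, would reduce it further.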

## **Measurement Results**

The TX test chip is implemented in TSMC's 65nm technology and is bump limited to 3.4mm × 2.6mm. Using a 2.4V PA supply and 1.2V analog/digital supplies, at a sample rate of 500MS/s, the TX achieves a peak output power of 25.6dBm with harmonic cancelling enabled (Fig. 6). The TX achieves a 3dB bandwidth of 0.7GHz to 1.6GHz and a peak total efficiency of 20% with cancellation enabled, and 25% without cancellation (Fig. 7). The total efficiency is defined as the output power divided by the power consumed by the entire chip. The test board has a differential output, each side of which is connected through one of a matched pair of cables to an oscilloscope, effectively driving a $100\Omega$ differential load.

HD3 reduction is defined as the difference between the nominal (no cancellation) HD3 and the HD3 at the optimal phase shift for cancellation. For each carrier frequency, the phase settings of one pair of PIs are swept while the other pair is held constant, and the PI setting that minimizes HD3 is then found. The prototype demonstrates an HD3 improvement of 24dB to 42dB for maximum-power CW measurements, achieving HD3 as low as -57dB (Fig. 4), as well as a 32dB HD3 reduction at a 900MHz carrier for 20MHz LTE data with 9dB peak-to-average power ratio (PAPR) (Fig. 5(a)). The reduction in cancellation between the CW and modulated-data cases is due to the optimal PI codes differing slightly at different I/Q data codes. With 20MHz bandwidth LTE data, the prototype also achieves up to 18dB of filtering at a 40MHz offset by placing notches at 31.25MHz and 41.67MHz (Fig. 5(b)). The prototype also achieves an EVM of 4.94% for a 16QAM constellation operating at a data rate of 125MS/s (Fig. 9).
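The two notch frequencies can be related to data delays with a short sketch. The tap delays used here are an assumption, not a setting reported in the text: with the 500MS/s sample rate quoted for the power measurement, a two-tap sum 1 + z^-D has its first null at fs/(2D), so D = 8 and D = 6 give 31.25MHz and 41.67MHz. Cascading the two yields four equal-weight taps at delays 0, 6, 8 and 14 samples, which would fit four equally weighted sub-PAs and the 25-period delay line. The names below are illustrative.

```python
import cmath
import math

FS_HZ = 500e6  # sample rate quoted in the measurements; assumed to apply here

# Hypothetical tap delays (in samples) for the four filter sub-PAs:
# (1 + z^-8) * (1 + z^-6) = 1 + z^-6 + z^-8 + z^-14
ASSUMED_DELAYS = (0, 6, 8, 14)


def notch_frequency_hz(delay_samples, fs_hz=FS_HZ):
    """First null of the two-tap response 1 + z^-D, at fs / (2 * D)."""
    return fs_hz / (2 * delay_samples)


def filter_gain(freq_hz, tap_delays, fs_hz=FS_HZ):
    """Normalized magnitude of an equal-weight FIR filter with the given
    tap delays, evaluated at a baseband offset frequency."""
    taps = [
        cmath.exp(-2j * math.pi * freq_hz * d / fs_hz) for d in tap_delays
    ]
    return abs(sum(taps)) / len(tap_delays)


for delay in (8, 6):
    f_notch = notch_frequency_hz(delay)
    gain = filter_gain(f_notch, ASSUMED_DELAYS)
    print(f"delay={delay}: notch at {f_notch / 1e6:.2f} MHz, "
          f"ideal gain={gain:.2e}")
```

An ideal filter of this form has infinitely deep nulls; the finite notch depth measured on silicon reflects mismatch between sub-PAs and other non-idealities that the sketch does not model.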

The efficiency of this PA is somewhat lower than that of other recent designs, so it is worthwhile quantifying how much of this is a consequence of the techniques used to improve spectral quality. The harmonic cancellation technique nominally reduces the output power by 1.2dB, which accounts only for the drop in efficiency from 25% to 20%. Compared to a Class E PA, the SCPA used here incurs additional loss from the PMOS devices. In this design, removing the capacitive losses from the PMOS devices and their associated drivers would bring the peak efficiency up from 25% to 30%. As mentioned previously, however, the varying output impedance of a Class E PA can substantially limit the effectiveness of the cancellation techniques. Finally, it is worth noting that this particular TX was designed to be relatively wideband, and thus a low loaded quality factor (which limits efficiency) was selected for the output network.
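As a rough consistency check on the 1.2dB figure, assume two ideal, equally weighted LO phases separated by exactly 60° (an idealization, not a measured result). The fundamental of the summed output, relative to an in-phase sum, is then

$$
\frac{\left|1 + e^{j\pi/3}\right|}{2} = \cos\!\left(\frac{\pi}{6}\right) \approx 0.866, \qquad 20\log_{10}(0.866) \approx -1.25\,\text{dB}.
$$

The corresponding power ratio is $\cos^2(\pi/6) = 0.75$. If the supply power is further assumed to be unchanged when cancellation is enabled, the efficiency scales by the same factor, $25\% \times 0.75 \approx 19\%$, which is in line with the measured 20%.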

### Acknowledgements

This work was supported by DARPA RF-FPGA contract HR0011-12-9-0013, in collaboration with Nokia and Boeing. The authors acknowledge the students, faculty, and sponsors of the Berkeley Wireless Research Center, Aritra Banerjee of Texas Instruments, chip fabrication donated by the TSMC University Shuttle Program, and Integrand Software.

#### References

[1] C. Huang et al., RFIC, pp. 214-217, 2016.

[2] R. Bhat, H. Krishnaswamy, RFIC, pp. 413-416, 2014.

[3] S. Yoo et al., ISSCC, pp. 428-429, 2011.

[4] H. Jin et al., ISSCC, pp. 168-169, 2015.

[5] N. Kuo et al., A-SSCC, pp. 345-348, 2014.

[6] A. Ba et al., RFIC, pp. 239-242, 2014.



Fig 4. CW HD3 cancellation, HD3 vs. frequency.

Fig 5. HD3 cancellation (a) and mixed-signal filtering (b) of 20MHz LTE data at 9dB PAPR.

Fig 8. Die Photo.

|              | [1]              | [6]                          | [2]                    | This Work             |
|--------------|------------------|------------------------------|------------------------|-----------------------|
| Architecture | HR SCPA          | Conduction Angle Calibration | Mixed-Signal Filtering | HR and Filtering SCPA |
| Technology   | 40 nm            | 40 nm                        | 65 nm                  | 65 nm                 |
| Peak Power   | 8.9 dBm          | 1.2 dBm                      | 29.9 dBm               | 25.6 dBm <sup>1</sup> |
| Frequency    | 0.9 GHz          | 2.4 GHz                      | 2.4 GHz                | 0.7–1.6 GHz           |
| Efficiency   | 43% <sup>2</sup> | 39% <sup>2</sup>             | 38.3% <sup>3</sup>     | 20% <sup>3</sup>      |
| HD Reduction | HD2: 48 dB       | HD2: 27 dB                   | N/A                    | HD3: 42 dB            |
| Notch Depth  | N/A              | N/A                          | 8 dB                   | 18 dB                 |

<sup>1</sup>100Ω differential load, <sup>2</sup>Drain efficiency, <sup>3</sup>Total efficiency