# Design Space Exploration of All-Digital Symbol Timing Adjustment Architectures

Serge Vernalde

Patrick Schaumont

Marc Engels

Ivo Bolsens

IMEC, Kapeldreef 75, B-3001 Leuven, Belgium

### Abstract

In this contribution, a design space exploration of the various possible schemes for all-digital symbol timing adjustment of QAM signals is made. The exploration is guided by both performance degradation and implementation cost considerations. The BER performance degradation is obtained using a quasi-analytic simulation approach, while the implementation cost is estimated by high-level digital circuit synthesis. The results show that a good performance/implementation tradeoff is obtained by using baseband interpolation with an oversampling factor of three and adequate compensation. This timing adjustment circuit is now being applied in the design of a digital downstream CATV QAM receiver.

## 1 Introduction

Currently, digital modems for broadband communication over coaxial or twisted pair access networks are of major interest. The high speed requirements together with a need for integration call for an all-digital solution, where all synchronization loops are implemented digitally on-chip. In an all-digital receiver, a fixed clock determines the samples that are taken. Since this clock is not synchronized to the transmitter clock, the original samples need to be recovered by means of interpolation on the received unsynchronized samples. Because perfect interpolation cannot be realised, a non-ideal polynomial interpolation of low order is performed. This introduces a performance degradation which has been calculated by theoretical analysis in the literature [1][2]. We wish to determine the performance degradation through simulation based on a quasi-analytic analysis. This approach provides a fast estimation of the bit error rate and is therefore well suited for the exploration of the system configuration. Special attention is paid to the relation between performance and implementation cost. Area and timing estimates are presented for an ASIC realisation of the different alternatives. The design space exploration is performed with respect to three design parameters of a QAM-16 modem. First, the position of the interpolator inside the digital QAM receiver architecture is considered at passband and baseband positions. Next, the effect of the symbol oversampling factor is considered. Third, the interpolator polynomial degree is varied. We also take the compensation filter mentioned in [3] into account, as this is required for any practical implementation.

## 2 All-Digital Symbol Timing Adjustment

In a digital receiver, the goal is to strobe the sampled signal at the top of the symbol, corresponding to the maximum eye opening. However, the sampled signal values in a digital receiver are shifted in time relative to the samples of the transmitter. This shift can be constant, or it can vary when a mismatch between the transmitter and receiver sampling clocks causes a difference in sample rate. The original samples can be recovered by adjusting the local clock phase of the receiver or by digital interpolation of the signal.

The first solution is the conventional synchronous sampling approach (figure 1(a)). The A/D sampling clock is adjusted so that the output samples correspond to the transmitter samples. This hybrid solution requires a Voltage Controlled Crystal Oscillator (VCXO), which is a large and expensive component.

(b) Nonsynchronous sampling

Figure 1: Symbol timing recovery

The second solution uses a fixed sampling clock and performs interpolation on the received signal samples to calculate the intermediate values. Since in this case the receiver sampling grid is not aligned to the transmitter sampling grid, nonsynchronous sampling results (figure 1(b)). The fundamentals of this digital symbol timing adjustment technique have been covered in [4]. The control value for the interpolation consists of an integer part $m$ (the basepoint index) and a fractional part $\mu$. The basepoint index determines for each symbol which sample passes to the output of the variable decimator. The fractional delay parameter is used in the interpolation filter. It determines the point between two samples at which the interpolated value must be calculated. The basepoint index is adapted by the control loop such that $\mu$ always corresponds to a point in the central interval of the interpolator filter. This guarantees the lowest interpolation error. There are several advantages to the digital symbol timing adjustment approach:

- A higher degree of integration is obtained. All timing adjustment circuitry is implemented on-chip, thereby reducing the component count.
- In a fully digital implementation, the design description of the complete loop can be done in the same language. No analog-digital interdomain modeling or simulation has to be performed. For instance, the modeling of the VCXO for the hybrid approach and the cosimulation with the digital part of the system are not trivial tasks. The single-domain description also makes the transfer to the implementation level much easier.

- A fixed crystal oscillator can be used instead of the expensive VCXO.
- The delay in the control loop is reduced compared to the hybrid approach. This improves the loop stability and enables faster updating of the control parameters.
- The approach provides greater flexibility. Different symbol rates can be supported by the same circuit, e.g. by using a digital clock synthesizer which is controllable within a broad frequency range. It can also be used to perform a resampling operation when the control loop is designed to track large differences in sample rates.
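The split of the interpolation control value into a basepoint index and a fractional part, described before the list above, can be illustrated with a minimal sketch. It assumes an open-loop setting in which the ratio between the interpolant period and the receiver sample period is known; in the actual receiver these values come from the control loop. The names `timing_control`, `linear_interpolate` and `ratio` are hypothetical and not taken from the paper.

```python
import math

def timing_control(k, ratio):
    """Split the k-th interpolation instant into (m, mu).

    ratio is the (assumed known) interpolant period expressed in receiver
    sample periods, so the k-th output lies at k * ratio samples.
    """
    position = k * ratio
    m = math.floor(position)   # basepoint index: sample just before the instant
    mu = position - m          # fractional interval, 0 <= mu < 1
    return m, mu

def linear_interpolate(x, m, mu):
    """Linear interpolation between x[m] and x[m + 1] at fraction mu."""
    return (1.0 - mu) * x[m] + mu * x[m + 1]

# Receiver samples of a ramp; a linear interpolator reproduces a ramp exactly.
samples = [0.5 * n for n in range(40)]
ratio = 3.02  # slightly more than 3 samples per output, i.e. a small clock offset

outputs = []
for k in range(10):
    m, mu = timing_control(k, ratio)
    outputs.append(linear_interpolate(samples, m, mu))
```

Because the ratio is not an integer, $\mu$ drifts from one output to the next and the basepoint index occasionally advances by one extra sample, which is the behaviour the variable decimator has to handle.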



(b) Bandpass interpolation

Figure 2: The receiver architecture with digital timing adjustment

## 3 Design Exploration Parameters

In this section, the design space is laid out. Since perfect interpolation with a sinc function would require an infinite impulse response, practical implementations use an approximate interpolation function. As a consequence, an error is introduced in the interpolated value, resulting in a performance (BER) degradation. The goal of our design exploration is to investigate the influence of the different parameters on this performance degradation. The parameters that will be explored are explained here, while the corresponding performance degradation and implementation cost are presented in section 5. The method used to calculate the performance degradation is discussed in section 4.

### 3.1 The receiver architecture

For the position of the digital interpolator in the receiver architecture, two alternatives are investigated. Figure 2 shows the receiver architecture. The digital input is a bandpass signal at a low IF. This signal is downconverted to complex baseband by a quadrature oscillator, matched filtered, decimated and sliced. For the data filtering, a root raised cosine profile is taken. The question that arises is where to perform the interpolation: at complex baseband (figure 2(a)) or at bandpass (figure 2(b)). At first sight, bandpass interpolation seems an attractive solution from the implementation point of view, since only one interpolation filter is required. However, since the bandpass signal contains higher frequency components than the complex baseband signal, the interpolation will probably have to be performed at a higher rate in order to obtain the same performance. The performance analysis will show this relation.
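The frequency argument can be made concrete with a rough calculation. The sketch below uses the rolloff factor (0.2) and IF carrier frequency (0.75 Fsym) given in section 5.1, and compares the highest signal frequency relative to the sample rate for both positions. As an illustration only, it also evaluates the gain of a linear interpolator at $\mu = 0.5$, which is $\cos(\pi f/F_s)$. The function names are hypothetical, and this droop figure is not the degradation measure used in the paper.

```python
import math

ROLLOFF = 0.2   # raised cosine rolloff factor (section 5.1)
F_IF = 0.75     # IF carrier frequency in units of Fsym (section 5.1)

def highest_frequency(bandpass):
    """Highest signal frequency in units of Fsym."""
    half_band = (1.0 + ROLLOFF) / 2.0
    return F_IF + half_band if bandpass else half_band

def linear_droop(f_over_fs):
    """Gain of a linear interpolator at mu = 0.5: |0.5 + 0.5 e^{jw}| = cos(w / 2)."""
    return math.cos(math.pi * f_over_fs)

for label, bandpass, k in [("baseband", False, 3),
                           ("bandpass", True, 4),
                           ("bandpass", True, 5)]:
    f_rel = highest_frequency(bandpass) / k
    print(f"{label} K={k}: f_max/Fs = {f_rel:.3f}, "
          f"band-edge gain = {linear_droop(f_rel):.3f}")
```

Even at a higher oversampling ratio, the band edge of the bandpass signal sits closer to the Nyquist frequency than that of the baseband signal, so the interpolator distorts it more.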

### 3.2 The symbol oversampling ratio

The oversampling ratio K = Fs/Fsym is an important design parameter, since it has a considerable effect on the hardware complexity. Therefore, this parameter will also be varied in order to observe the performance degradation due to the interpolation error.

### 3.3 The interpolator polynomial degree

For the interpolator filter, a polynomial interpolation is considered based on the Lagrange polynomials [5]. This type of interpolation function has an efficient underlying FIR implementation. The analysis is performed for linear, quadratic and cubic interpolation. The effect on the BER degradation and on the implementation cost will be investigated.
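As a minimal sketch of how Lagrange interpolation maps to FIR tap weights, the code below evaluates the Lagrange basis polynomials at the fractional offset $\mu$. The node placements (`LINEAR`, `QUADRATIC`, `CUBIC`) are assumptions chosen for illustration; the paper does not state which sample positions are used for each degree.

```python
def lagrange_taps(mu, nodes):
    """FIR tap weights of a Lagrange interpolator evaluated at offset mu.

    nodes are the sample positions (in sample periods) relative to the
    basepoint; the tap for node t_k is prod_{j != k} (mu - t_j) / (t_k - t_j).
    """
    taps = []
    for k, t_k in enumerate(nodes):
        w = 1.0
        for j, t_j in enumerate(nodes):
            if j != k:
                w *= (mu - t_j) / (t_k - t_j)
        taps.append(w)
    return taps

# Assumed node placements (not taken from the paper).
LINEAR = (0, 1)
QUADRATIC = (-1, 0, 1)
CUBIC = (-1, 0, 1, 2)

def interpolate(samples, basepoint, mu, nodes):
    """Weighted sum of the samples around the basepoint."""
    taps = lagrange_taps(mu, nodes)
    return sum(w * samples[basepoint + t] for w, t in zip(taps, nodes))

# A cubic interpolator reproduces a cubic polynomial exactly.
x = [n ** 3 - 2.0 * n for n in range(8)]
y = interpolate(x, 3, 0.4, CUBIC)   # value of the polynomial at n = 3.4
```

The tap weights always sum to one, and at $\mu = 0$ the filter passes the basepoint sample unchanged.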

### 3.4 The compensation filter

In figure 2, a compensation filter as mentioned in [3] is depicted after the interpolator filter. This filter reduces the interpolation error by means of a fixed compensation. Since the interpolator filter impulse response varies with  $\mu$ , the interpolation error is also dependent on  $\mu$ . We took the following approach in determining the compensation filter coefficients:

1. A discrete set of N values of $\mu$ is selected in the range [0, 1]. For each of these N values, a compensation filter is designed such that the overall response of the interpolation filter plus the compensation filter exhibits an all-pass behaviour for that particular value of $\mu$. For this $\mu$, no interpolation error occurs.
2. For each of the N compensation filters, the performance degradation is calculated as a function of $\mu$ and the worst case value is retained. We thus obtain a worst case degradation for each of the N compensation filters.
3. The compensation filter corresponding to the minimum worst case degradation is selected. The accompanying $\mu$ value, for which the all-pass behaviour is obtained, is 0.18 for the linear and cubic interpolation and 0.25 for the quadratic interpolation.

From the implementation point of view, the compensation filter impulse response can be incorporated in the data filter. This means that the hardware overhead of this compensation filter becomes negligible.
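The three-step minimax selection described above can be sketched as follows for the linear interpolator. This is a simplified illustration under several assumptions that are not from the paper: the compensation filter is taken to be an ideal magnitude equaliser $1/|H(\omega, \mu_0)|$, the worst in-band deviation from an ideal fractional delay is used as a stand-in for the $E_b/N_0$ degradation, and the band edge corresponds to baseband interpolation with K = 3 and a rolloff of 0.2. The sketch is not expected to reproduce the values 0.18 and 0.25 quoted above.

```python
import cmath
import math

K = 3            # oversampling ratio (assumed for this sketch)
ROLLOFF = 0.2
BAND_EDGE = math.pi * (1.0 + ROLLOFF) / K          # rad/sample
OMEGA = [BAND_EDGE * i / 63 for i in range(64)]    # in-band frequency grid

def interp_response(mu, w):
    """Frequency response of a linear interpolator with taps (1 - mu, mu)."""
    return (1.0 - mu) + mu * cmath.exp(1j * w)

def worst_case_error(mu0, mu_grid):
    """Step 2 (proxy): worst in-band deviation from an ideal delay over all mu."""
    worst = 0.0
    for w in OMEGA:
        comp = 1.0 / abs(interp_response(mu0, w))  # flat magnitude at mu = mu0
        for mu in mu_grid:
            overall = interp_response(mu, w) * comp
            ideal = cmath.exp(1j * w * mu)
            worst = max(worst, abs(overall - ideal))
    return worst

mu_grid = [i / 10 for i in range(11)]
# Step 1: candidate design points; mu0 and 1 - mu0 give the same magnitude filter.
candidates = [i / 50 for i in range(26)]
scores = [worst_case_error(mu0, mu_grid) for mu0 in candidates]
# Step 3: keep the candidate with the smallest worst-case error.
best_score = min(scores)
best_mu0 = candidates[scores.index(best_score)]
uncompensated = scores[0]   # mu0 = 0 gives comp = 1, i.e. no compensation
```

The candidate $\mu_0 = 0$ corresponds to no compensation, so comparing `best_score` with `uncompensated` shows how much a single fixed filter can reduce the worst-case error under this proxy.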

## 4 Performance Calculation through Quasi-Analytic Analysis

Several methods [6] exist to calculate or estimate the performance degradation in terms of the BER or the $E_b/N_0$ ratio, which in our case results from the interpolation error. They range from a fully analytic analysis [1][2] of the communication link to Monte Carlo simulation. In between, there are numerous approaches that make certain assumptions about the system in order to shorten the long execution times of the Monte Carlo simulation.

For our performance calculation, the quasi-analytic analysis (QA) approach was selected, which is also described in [6]. The QA method determines the ISI distribution caused by the nonideal interpolation through simulation. This is combined analytically with a Gaussian distribution to take the additive white Gaussian noise into account. Since only the ISI distribution has to be obtained through simulation, a fast estimation of the BER can be performed, making the method suitable for system design exploration. The conditions that the system architecture must meet for the QA method to be applicable are satisfied:

- The interpolation filter has a short memory (low order), so the simulation time required to produce a representative ISI distribution is small.
- The system is linear.
- The supposed channel model is Gaussian.

This makes the QA approach well suited to our performance evaluation. For a rectangular constellation like QAM-16, we assume that the I- and Q-channels are uncorrelated. This allows the error probability to be determined for each dimension separately using the QA approach, after which the two are combined. For each simulation, a complete BER versus $E_b/N_0$ curve is obtained.
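A minimal sketch of the quasi-analytic calculation is given below, assuming a QAM-16 constellation with levels ±1, ±3 on each axis and ISI samples that are independent of the transmitted level. The ISI samples would come from a noise-free simulation of the interpolator; here they are simply passed in as lists. The function names are hypothetical. The sketch returns a symbol error probability; the mapping to BER depends on the bit labelling, which the paper does not specify.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam4_error_prob(isi, ebn0_db):
    """Symbol error probability of one QAM-16 dimension (4-PAM, levels +-1, +-3).

    isi holds simulated ISI samples at the decision instant; the Gaussian
    noise is handled analytically by averaging Q over those samples.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Es = 10 for levels +-1, +-3 on both axes, Eb = Es / 4, sigma^2 = N0 / 2.
    sigma = math.sqrt(1.25 / ebn0)
    d = 1.0  # distance from a constellation level to the decision threshold
    total = 0.0
    for e in isi:
        total += q_function((d - e) / sigma) + q_function((d + e) / sigma)
    # Factor (1 - 1/M) with M = 4: the two outer levels have only one neighbour.
    return 0.75 * total / len(isi)

def qam16_error_prob(isi_i, isi_q, ebn0_db):
    """Combine the two dimensions, assumed uncorrelated."""
    p_i = pam4_error_prob(isi_i, ebn0_db)
    p_q = pam4_error_prob(isi_q, ebn0_db)
    return 1.0 - (1.0 - p_i) * (1.0 - p_q)
```

Sweeping `ebn0_db` over a range yields the complete error probability curve from a single ISI simulation, which is what makes the method fast.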

## 5 Design Exploration Results

In this section, the performance degradation due to the nonideal interpolation process is presented as well as the implementation complexity of the interpolator hardware. Different alternatives are explored by varying the parameters described in section 3.

### 5.1 $E_b/N_0$ degradation

For the performance loss, the QA approach of section 4 is applied, using a set of 5000 pseudo-random symbols of a QAM-16 constellation. The IF carrier frequency is located at 0.75\*Fsym and the raised cosine rolloff factor is 0.2, which corresponds to the C-profile of the DAVIC specification [7]. We assume that $\mu$ is always set to the correct value, so that the samples are interpolated at the optimal position. The results are shown in figure 3 for a BER of $10^{-6}$. The values represent the $E_b/N_0$ degradation, i.e. the increase in $E_b/N_0$ required to maintain the BER at the same value as for ideal interpolation. For each entry in the tables, the $E_b/N_0$ degradation is calculated for 10 values of $\mu \in [0,1]$ and the maximum degradation is listed.
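The definition of the $E_b/N_0$ degradation can be illustrated with a short sketch that finds, by bisection, the $E_b/N_0$ at which an error probability curve crosses the target, once for the ideal case and once for an impaired case. The impairment used here is a toy model, a fixed 5% eye closure, chosen only because its degradation is known in closed form; it is not the interpolation error of the paper. All function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def required_ebn0_db(error_prob, target, lo=0.0, hi=40.0):
    """Eb/N0 (dB) at which a monotonically decreasing error_prob hits target."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if error_prob(mid) > target:
            lo = mid      # still too many errors: more Eb/N0 is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ideal(ebn0_db):
    """One QAM-16 dimension with ideal interpolation (no ISI)."""
    return 1.5 * q_function(math.sqrt(0.8 * 10.0 ** (ebn0_db / 10.0)))

def impaired(ebn0_db, eye_closure=0.05):
    """Toy impairment: the eye opening shrinks by a fixed fraction."""
    arg = (1.0 - eye_closure) * math.sqrt(0.8 * 10.0 ** (ebn0_db / 10.0))
    return 1.5 * q_function(arg)

TARGET = 1e-6
degradation_db = required_ebn0_db(impaired, TARGET) - required_ebn0_db(ideal, TARGET)
```

For this toy model the result equals $-20\log_{10}(0.95) \approx 0.45$ dB, which gives a quick check that the search works before it is applied to curves obtained from the QA method.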

A first observation from the tables is that the compensation filter has a large impact on the degradation. Second, baseband interpolation gives much better performance than bandpass interpolation when using the same parameters, as was expected. With bandpass interpolation, high oversampling ratios in combination with high order interpolator polynomials are required to obtain an acceptable $E_b/N_0$ degradation. Since the implementation cost then becomes too high, it was decided for our application to perform the interpolation at baseband. The baseband interpolation results show highly acceptable degradations already with K = 3 and a linear or quadratic interpolator polynomial.

| K = Fs/Fsym | Linear, uncompensated | Linear, compensated | Quadratic, uncompensated | Quadratic, compensated | Cubic, uncompensated | Cubic, compensated |
|-------------|-----------------------|---------------------|--------------------------|------------------------|----------------------|--------------------|
| 3           | 1.5                   | 0.3                 | 0.15                     | 0.05                   | 0.05                 | 0.02               |
| 4           | 0.5                   | 0.1                 | 0.03                     | 0.02                   | 0.01                 | 0.01               |

(a) Baseband interpolation

| K = Fs/Fsym | Linear, uncompensated | Linear, compensated | Quadratic, uncompensated | Quadratic, compensated | Cubic, uncompensated | Cubic, compensated |
|-------------|-----------------------|---------------------|--------------------------|------------------------|----------------------|--------------------|
| 4           | (*)                   | 6.8                 | 8.6                      | 2.9                    | 3.8                  | 1.3                |
| 5           | 9.0                   | 2.0                 | 1.7                      | 0.7                    | 0.7                  | 0.3                |

(*) $P_e = 10^{-6}$ cannot be reached using these parameters.

(b) Bandpass interpolation

Figure 3:  $E_b/N_0$  degradation (dB) at  $P_e = 10^{-6}$  for worst case  $\mu$ 

### 5.2 Implementation

For the implementation cost of the timing adjustment circuitry, we focus on the cost of the interpolator filter since it is the largest component and since the compensation filter can be realised together with the data filter. An efficient structure for the implementation of the interpolator filter is offered by the Farrow structure [8]. In figure 4, this structure is depicted for the three interpolator polynomials. The implementation cost is estimated at different levels in the synthesis process.
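To illustrate the idea behind the Farrow structure, the sketch below implements a cubic Lagrange interpolator as a bank of fixed FIR branches whose outputs are combined by a polynomial in $\mu$ using Horner's rule. The coefficient matrix follows from expanding the cubic Lagrange basis polynomials for samples at positions -1, 0, 1, 2 relative to the basepoint; this node placement and the name `farrow_interpolate` are assumptions for illustration and may differ from the structure in figure 4.

```python
# Farrow coefficients of a cubic Lagrange interpolator.
# Row l holds the weights of samples x[m-1], x[m], x[m+1], x[m+2]
# that form the coefficient of mu**l.
FARROW_CUBIC = [
    [0.0,     1.0,    0.0,    0.0],      # mu^0
    [-1 / 3, -1 / 2,  1.0,   -1 / 6],    # mu^1
    [1 / 2,  -1.0,    1 / 2,  0.0],      # mu^2
    [-1 / 6,  1 / 2, -1 / 2,  1 / 6],    # mu^3
]

def farrow_interpolate(x, m, mu):
    """Interpolate between x[m] and x[m + 1] at fraction mu (0 <= mu <= 1)."""
    window = (x[m - 1], x[m], x[m + 1], x[m + 2])
    # Fixed FIR branches: one output per power of mu, independent of mu.
    v = [sum(c * s for c, s in zip(row, window)) for row in FARROW_CUBIC]
    # Horner evaluation: ((v3 * mu + v2) * mu + v1) * mu + v0.
    y = v[3]
    for branch in (v[2], v[1], v[0]):
        y = y * mu + branch
    return y

# A cubic polynomial is reproduced exactly.
x = [n ** 3 - 2.0 * n for n in range(8)]
y = farrow_interpolate(x, 3, 0.4)
```

Only the three Horner steps multiply by the variable $\mu$; all other multiplications are by fixed constants, which is what makes the structure attractive for hardware.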

Figure 4: Farrow structure for interpolator filter

At the highest level, an operation count is given for each type of operation (figure 5(a)). For a multiplication by a constant, the constant is transformed to a CSD (Canonical Signed Digit) [9] representation. The multiplication is then expanded into a sequence of shift operations and additions or subtractions. The shift operations are not listed since they do not contribute to the hardware cost, while the additions or subtractions are listed in the column *CSD-ops*.
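The CSD expansion can be sketched as follows for an integer constant (a fixed-point coefficient scaled by a power of two). The functions `to_csd` and `csd_add_sub_count` are hypothetical names; the operation counts in figure 5(a) come from the authors' tools and are not reproduced by this sketch.

```python
def to_csd(n):
    """Return the canonical signed digit form of integer n, LSB first.

    Each digit is -1, 0 or +1 and no two adjacent digits are nonzero.
    """
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)   # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

def csd_add_sub_count(n):
    """Additions/subtractions needed to multiply by n using shifts only."""
    nonzero = sum(1 for d in to_csd(n) if d != 0)
    return max(nonzero - 1, 0)

# 7 = 8 - 1, so multiplying by 7 costs one subtraction: (x << 3) - x.
digits_of_7 = to_csd(7)
```

Each nonzero digit corresponds to one shifted copy of the input, so a constant with few nonzero CSD digits is cheap to multiply by.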

The next level of cost estimates is obtained by performing behavioural and RT-level synthesis with the CATHEDRAL-2/3 silicon compiler [10]. A gate count is obtained in terms of equivalent 2-input NAND gates. An internal filter wordlength of 10 bits is used (figure 5(b)).

The last step is the mapping of the abstract gates to a Standard Cell library in order to obtain area and critical path delay figures. The results are shown for the Mietec $0.7\,\mu\mathrm{m}$ 3.3 V Standard Cell library.

The figures show that the area of the filter roughly doubles each time the interpolator polynomial degree is increased (0.35, 0.73 and 1.50 mm<sup>2</sup>). It is thus important to keep the interpolator order as low as possible.

The critical path delay allows the maximum symbol rate to be calculated. For instance, the quadratic interpolator filter works at a maximum clock frequency of 34.5 MHz. If an oversampling ratio of 3 is taken, an upper bound of 11.5 Mbaud is placed on the symbol rate. Faster implementations can be obtained through the addition of pipeline sections in the filter structure.
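The same arithmetic can be applied to all three interpolators. The sketch below uses the critical path delays from figure 5(b) and assumes, as in the example above, that one sample is processed per clock cycle and that no pipelining is added. The function name is hypothetical.

```python
CRITICAL_PATH_NS = {"linear": 20.5, "quadratic": 29.0, "cubic": 43.2}  # figure 5(b)

def max_symbol_rate_mbaud(order, oversampling):
    """Upper bound on the symbol rate (Mbaud) without pipelining."""
    f_clk_mhz = 1000.0 / CRITICAL_PATH_NS[order]   # 1 / ns = GHz, times 1000 = MHz
    return f_clk_mhz / oversampling

for order in CRITICAL_PATH_NS:
    rate = max_symbol_rate_mbaud(order, 3)
    print(f"{order}: {rate:.1f} Mbaud at K = 3")
```

For the quadratic filter this reproduces the 34.5 MHz clock and 11.5 Mbaud bound quoted above.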

| order                 | delays | CSD-ops | add/subtr. | mult. |
|-----------------------|--------|---------|------------|-------|
| linear                | 1      | 0       | 1          | 1     |
| quadratic             | 2      | 0       | 5          | 2     |
| cubic                 | 3      | 5       | 15         | 3     |

(a) operation complexity

| order     | NAND-2 gates | SC area             | critical path delay |
|-----------|--------------|---------------------|---------------------|
| linear    | 785          | 0.35 mm<sup>2</sup> | 20.5 ns             |
| quadratic | 1652         | 0.73 mm<sup>2</sup> | 29.0 ns             |
| cubic     | 3415         | 1.50 mm<sup>2</sup> | 43.2 ns             |

SC area: Standard Cell area (no intercell routing included) in 0.7 µm 3.3 V Mietec CMOS technology

(b) gate count and area/delay figures

Figure 5: Interpolator filter implementation results

## 6 Conclusion

A design exploration was performed to analyze the performance degradation and the implementation complexity of all-digital symbol timing adjustment schemes. The parameters that were varied are the symbol oversampling ratio, the interpolator polynomial degree and the position of the interpolator in the receiver architecture (baseband/bandpass). It was shown that the quasi-analytic analysis approach is applicable for the estimation of the performance degradation. Since this is a fast method, it is well suited for the design exploration. The results showed that bandpass interpolation results in a considerable performance degradation. It is only applicable when high order interpolation is performed in combination with high oversampling ratios, which has a drastic impact on the hardware complexity. For the design of the downstream CATV QAM receiver, we will therefore use baseband interpolation. With an oversampling ratio of 3, quadratic interpolation and the compensation filter, the worst case $E_b/N_0$ degradation is only 0.05 dB at a BER of $10^{-6}$. The implementation cost for the interpolator filter was estimated at different steps in the design process. The results show that the hardware cost roughly doubles each time the interpolator polynomial degree is increased.

## 7 Acknowledgements

The authors wish to thank Jan Maris for his help in the implementation of the interpolator filter structures. This work was carried out under the Flemish Government Impulse Program for Information Technology, in cooperation with Alcatel Telecom, Antwerpen, Belgium. Marc Engels is a senior research assistant of the Belgian National Fund for Scientific Research.

## References

- [1] Katrien Bucket, Marc Moeneclaey, "The effect of interpolation on the BER performance of narrowband BPSK and (O)QPSK on Rician-Fading Channels", *IEEE Trans Comm*, Vol. 42, No. 11, pp. 2929-2933, Nov 1994.
- [2] Katrien Bucket, Marc Moeneclaey, "The effect of non-ideal interpolation on Symbol Synchronizer Performance", *European Trans On Telecomm*, Vol. 6, No. 6, pp. 627-632, Nov 1995.
- [3] Lars Erup, Floyd Gardner, Robert Harris, "Interpolation in Digital Modems - Part II: Implementation and Performance", *IEEE Trans Comm*, Vol. 41, No. 6, pp. 998-1008, June 1993.
- [4] Floyd Gardner, "Interpolation in Digital Modems - Part I: Fundamentals", *IEEE Trans Comm*, Vol. 41, No. 3, pp. 501-507, March 1993.
- [5] M. Abramowitz and I. A. Stegun, Eds., Handbook of Mathematical Functions. Nat. Bur. Stds., Appl. Math. Series, Vol. 55, p. 878, June 1964.
- [6] W. H. Tranter, K. L. Kosbar, "Simulation of Communication Systems", *IEEE Communications Magazine*, pp. 26-35, July 1994.
- [7] Digital Audio-Visual Council, "Lower Layer Protocols And Physical Interfaces", DAVIC 1.0 Specification, Part 08, pp. 18-24, 1996.
- [8] C. W. Farrow, "A continuously variable digital delay element", Proc. IEEE Int. Symp. Circuits and Systems, Espoo, Finland, pp. 2641-2645, June 1988.
- [9] H. Samueli, "The Design of Multiplierless Digital Data Transmission Filters with Power-of-Two Coefficients", Proc. SBT/IEEE Int. Telecomm. Symp., pp. 425-429, September 1992.
- [10] Cathedral-2/3, The Cathedral-2/3 Silicon Compiler for Real Time Signal Processing, IMEC, Kapeldreef 75, B-3001 Leuven, Belgium, August 1993.