# **Towards Real-Time Implementation of Optical OFDM Transmission**

Qi Yang<sup>1,3</sup>, Noriaki Kaneda<sup>1</sup>, Xiang Liu<sup>2</sup>, S. Chandrasekhar<sup>2</sup>, William Shieh<sup>3</sup>, and Y. K. Chen<sup>1</sup>

1: Bell Laboratories, Alcatel-Lucent, 600-700 Mountain Avenue, Murray Hill, NJ 07974, USA

qi.yang@alcatel-lucent.com qy@ee.unimelb.edu.au

2: Bell Laboratories, Alcatel-Lucent, 791 Holmdel-Keyport Road, Holmdel, NJ 07733, USA

3: National ICT Australia and ARC Special Research Centre for Ultra-Broadband Information Networks, Department of Electrical and Electronic Engineering, The University of Melbourne, VIC 3010, Australia

**Abstract**: We review the progress made towards real-time implementation of optical OFDM, with emphasis on the required digital signal processing (DSP). Various OFDM procedures and algorithms are discussed in the context of DSP efficiency and resource requirements. ©2010 Optical Society of America

OCIS codes: (060.1660) Coherent communications; (060.2330) Fiber optics communication.

## 1. Introduction

Recently we have witnessed a dramatic surge of interest in orthogonal frequency-division multiplexing (OFDM) within the optical communication community [1-3]. Since the concept of coherent optical OFDM (CO-OFDM) was introduced in [1], the demonstrated line rate has rapidly advanced to 100 Gb/s [4-6] and most recently to 1 Tb/s [7-8] for transmission beyond hundreds of kilometers of standard single-mode fiber (SSMF). Owing to its high spectral efficiency, robustness against channel dispersion, low computational complexity, and adaptive bandwidth granularity, CO-OFDM is considered an attractive candidate for future long-haul transport. However, most published CO-OFDM experiments are based on off-line processing [4-8]. This lags behind the single-carrier counterpart, for which a real-time transceiver operating at 40 Gb/s based on a CMOS ASIC has been reported [11]. More importantly, OFDM is based on a symbol and frame structure, and the digital signal processing (DSP) required for OFDM procedures such as window synchronization and channel estimation remains a challenge for real-time implementation. Among the many demonstrated algorithms, only a few can be practically realized because of limitations in digital signal processor capability. It is thus essential to investigate efficient and realistic algorithms for real-time CO-OFDM implementation.

In the last few years, several real-time receivers for single-carrier transmission have been demonstrated [9-11]. Very recently, a few research groups have published real-time optical OFDM transmitter and receiver demonstrations at multi-gigabit-per-second data rates [12-16]. In this paper, we review the design of the key DSP modules needed for real-time reception of a CO-OFDM signal using high-speed analog-to-digital converters (ADCs) and a field-programmable gate array (FPGA). Additionally, we investigate some important factors for real-time coherent receiver implementation, such as bit resolution and DSP resource usage.



## 2. Real-time coherent optical OFDM receiver structure

Fig.1. Real-time polarization-diversity coherent receiver.

LO: local oscillator. SE PD&TIA: single-ended photodiodes with transimpedance amplifiers. FE/C: frequency estimation and compensation. CE/C: channel estimation and compensation. PE/C: phase estimation and compensation. SD: symbol decision.

### OMS6.pdf

Figure 1 shows a generic architecture for real-time reception of a CO-OFDM signal, incorporating a polarization-diversity coherent receiver front end followed by an FPGA-based back end. The polarization-division multiplexed (PDM) CO-OFDM signal is split into two branches by a polarization beam splitter. Each branch is then combined with a local oscillator (LO) laser in an optical 90-degree hybrid for I/Q separation, and the I/Q signals are detected by photodiodes. After sampling by the ADCs, the digitized I/Q signals are fed to an FPGA through well-matched cables using a high-speed signaling standard such as low-voltage differential signaling (LVDS). The digital signal processing unit (DSPU) then processes the CO-OFDM signal in seven major procedures: (1) OFDM window synchronization; (2) OFDM symbol re-arrangement; (3) frequency offset estimation; (4) fast Fourier transform (FFT); (5) channel estimation; (6) phase estimation; (7) constellation reconstruction and symbol decision. For direct-detection optical OFDM, (3) and (6) can be omitted [14].
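The ordering of the seven procedures, and the two that drop out for direct detection, can be written down as a small sketch. The names `Stage`, `COHERENT_ONLY`, and `receiver_stages` are hypothetical and only illustrate the sequence listed above; they do not come from any receiver implementation.

```python
from enum import Enum


class Stage(Enum):
    """The seven DSPU procedures, numbered as in the text."""
    WINDOW_SYNC = 1
    SYMBOL_REARRANGEMENT = 2
    FREQ_OFFSET_ESTIMATION = 3
    FFT = 4
    CHANNEL_ESTIMATION = 5
    PHASE_ESTIMATION = 6
    SYMBOL_DECISION = 7


# Procedures (3) and (6) are the ones the text says can be omitted
# for direct-detection optical OFDM.
COHERENT_ONLY = {Stage.FREQ_OFFSET_ESTIMATION, Stage.PHASE_ESTIMATION}


def receiver_stages(coherent=True):
    """Return the procedures in processing order for the chosen receiver type."""
    return [s for s in Stage if coherent or s not in COHERENT_ONLY]


print([s.name for s in receiver_stages(coherent=True)])
print([s.name for s in receiver_stages(coherent=False)])
```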

## 3. Real-time CO-OFDM receiver implementation



Fig.2. (a) Data processing for window synchronization, and (b) sample order re-arrangement from parallel to serial

In both single-carrier and multi-carrier (OFDM) real-time implementations, a signal with several gigahertz of bandwidth must first be de-multiplexed into multiple parallel channels, because the DSPU can only operate at a clock rate much lower than the sampling rate. The primary concern in real-time realization is therefore whether a particular algorithm can be implemented as a high-speed parallel process. We consider this the fundamental difference between real-time and off-line signal processing. Moreover, since the OFDM signal is organized in blocks in both time and frequency, its DSP must take an entirely different approach from that of a single-carrier signal. For a single carrier, the DSP can generally start at any clock cycle with non-data-aided clock recovery and equalization [9-10]. In contrast, the OFDM frame has to be synchronized, which leads to two different processing architectures: (a) processing the parallel data across N channels in each clock cycle; (b) processing serial data within each channel.

To locate the beginning of each OFDM frame, the sampled data across all N channels need to be used. Thus, architecture (a) is used to perform the window synchronization procedure (1). Window synchronization is usually performed by auto-correlation of two consecutive identical patterns using the Schmidl & Cox approach [17]. The length of each identical pattern is preferably chosen as an integer multiple *m* of the channel number N. In the real-time implementation, the received samples within the N channels are multiplied by the conjugated samples of the *m*-th next cycle, as illustrated in Fig.2 (a). More detailed discussions can be found in [12]. Since procedures (4) to (7) can be conveniently applied to individual symbols serially in each parallel channel, procedure (2) is an important step that re-arranges and distributes complete symbols onto each channel, as shown in Fig.2 (b). A large memory is needed for this procedure.
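A minimal sketch of the parallel auto-correlation described above, written in Python rather than hardware description code. It assumes the two identical patterns each span *m* clock cycles of N samples and start on a cycle boundary; it also uses the unnormalized correlation magnitude, whereas Schmidl & Cox additionally normalize by the received energy. The function names and the unit-modulus test stimulus are illustrative assumptions, not the design in [12].

```python
import cmath
import random


def cycle_products(cycles, m):
    """Per clock cycle c: sum over the N lanes of r[c][k] * conj(r[c + m][k])."""
    return [
        sum(a * b.conjugate() for a, b in zip(cycles[c], cycles[c + m]))
        for c in range(len(cycles) - m)
    ]


def sync_metric(cycles, m):
    """Magnitude of the moving sum of m consecutive per-cycle products."""
    prods = cycle_products(cycles, m)
    return [abs(sum(prods[c:c + m])) for c in range(len(prods) - m + 1)]


def make_test_cycles(n_lanes=4, m=2, lead_cycles=5, tail_cycles=5, seed=1):
    """Synthetic stimulus: unit-modulus samples with one repeated pattern."""
    rng = random.Random(seed)

    def sample():
        return cmath.rect(1.0, rng.uniform(-cmath.pi, cmath.pi))

    pattern = [sample() for _ in range(m * n_lanes)]
    stream = [sample() for _ in range(lead_cycles * n_lanes)]
    stream += pattern + pattern  # two identical patterns back to back
    stream += [sample() for _ in range(tail_cycles * n_lanes)]
    # De-multiplex the serial stream into cycles of n_lanes parallel samples.
    return [stream[i:i + n_lanes] for i in range(0, len(stream), n_lanes)]


M = 2
cycles = make_test_cycles(m=M)
metric = sync_metric(cycles, M)
start_cycle = max(range(len(metric)), key=lambda c: metric[c])
print("estimated start cycle:", start_cycle)
```

In this stimulus the first pattern begins at cycle 5, which is where the metric reaches its maximum of m × N. A pattern that does not start on a cycle boundary would need a finer search across lanes, which this sketch does not attempt.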



Fig.3. Channel estimation flow chart. TS: Training symbol. (A)CRF: (Averaged) channel response factor symbol. CS: Compensated symbol.

Fig.4. Phase estimation flow chart.

Figs.3 and 4 show the flow diagrams of channel and phase estimation. A precise time pointer into the processed sequence is essential, and can be implemented with embedded clock counters. Once the OFDM window is synchronized, a clock counter is started to distinguish the training symbols from the payload symbols, as well as the pilot subcarriers from the payload subcarriers. The real-time CO-OFDM channel estimation in Fig.3 involves two steps: channel response estimation and compensation. In the time slice for training symbols (TS), the received signal is multiplied with the locally stored transmitted symbols to estimate the channel response. Averaging the estimated channel response factors (CRF) reduces the random noise. The averaged CRFs are then periodically applied to the payload symbols to compensate the channel response. Similarly, the phase estimation procedure shown in Fig.4 can be divided into estimation and compensation parts. Pilot subcarriers within one symbol are selected by the same internal timer. These pilot subcarriers are then compared with the locally stored transmitted pattern to estimate the common phase error (CPE). The same symbol is delayed and then compensated with the estimated CPE. Some important computational operations, such as the FFT and phase calculation by the COordinate Rotation DIgital Computer (CORDIC) algorithm [13], are well documented and widely implemented in FPGA and ASIC designs. They are usually packaged as third-party intellectual property (IP) cores and can be conveniently assembled into a customized design.
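The two estimate-then-compensate steps can be sketched in floating point as follows. This is an illustration under stated assumptions, not the fixed-point design of [12]: the training and pilot symbols are taken to be unit-modulus QPSK, so that multiplying by the conjugate of the stored symbol yields the channel response directly, and the function names, channel values, and pilot positions are made up for the example.

```python
import cmath


def estimate_channel(training_rx, training_tx):
    """Average Y * conj(X) over the training symbols, per subcarrier (|X| = 1 assumed)."""
    n_sc = len(training_tx[0])
    acc = [0j] * n_sc
    for rx, tx in zip(training_rx, training_tx):
        for k in range(n_sc):
            acc[k] += rx[k] * tx[k].conjugate()
    return [a / len(training_rx) for a in acc]


def compensate_channel(symbol, crf):
    """Divide each subcarrier by its averaged channel response factor."""
    return [y / h for y, h in zip(symbol, crf)]


def estimate_cpe(symbol, pilots):
    """pilots maps subcarrier index -> known pilot value; returns the common phase."""
    acc = sum(symbol[k] * p.conjugate() for k, p in pilots.items())
    return cmath.phase(acc)


def compensate_cpe(symbol, cpe):
    """Rotate the whole symbol back by the estimated common phase error."""
    rot = cmath.exp(-1j * cpe)
    return [y * rot for y in symbol]


# Noise-free toy example with 4 subcarriers (all values are illustrative).
QPSK = [complex(i, q) / 2 ** 0.5 for i in (1, -1) for q in (1, -1)]
channel = [cmath.rect(0.8, 0.3), cmath.rect(1.1, -0.5),
           cmath.rect(0.9, 1.0), cmath.rect(1.2, -1.2)]
training_tx = [[QPSK[0], QPSK[1], QPSK[2], QPSK[3]],
               [QPSK[3], QPSK[2], QPSK[1], QPSK[0]]]
training_rx = [[h * x for h, x in zip(channel, ts)] for ts in training_tx]
crf = estimate_channel(training_rx, training_tx)

payload_tx = [QPSK[0], QPSK[2], QPSK[1], QPSK[0]]
pilots = {0: payload_tx[0], 3: payload_tx[3]}  # pilot positions are assumed
true_cpe = 0.2
payload_rx = [h * x * cmath.exp(1j * true_cpe) for h, x in zip(channel, payload_tx)]

equalized = compensate_channel(payload_rx, crf)
cpe = estimate_cpe(equalized, pilots)
recovered = compensate_cpe(equalized, cpe)
print("estimated CPE (rad):", round(cpe, 6))
```

In hardware the division in `compensate_channel` and the angle in `estimate_cpe` would typically be replaced by multiplier-based and CORDIC-based operations; the sketch keeps them in plain arithmetic for readability.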

In addition to the DSP algorithms that can be implemented, other important considerations for real-time optical OFDM are hardware limitations, resource usage, and FPGA/ASIC capacity. The sampling speed and bandwidth of the ADC are among the most critical limitations for real-time implementation. Another influential factor is the ADC bit resolution. Fig.5 shows a simulation result of the Q factor of a 10 Gb/s QPSK CO-OFDM receiver as a function of the bit resolution of the ADC outputs (curve with diamonds). A 5-bit resolution gives a Q penalty of less than 0.8 dB. The bit resolution of the complex multipliers is also investigated. In the simulation shown in Fig.5 (curve with squares), the ADC bit resolution is set to 5, and the bit resolution of the multipliers increases from 3 to 10 bits. The system sensitivity improves with increased resolution and then saturates beyond 6-bit resolution. The two commonly used multiplier resources in an FPGA are 9-bit and 18-bit multipliers. Thus, a single 18-bit multiplier can be split into two 9-bit multipliers. Table I lists the multiplier resource usage in each procedure for the experiment in [12]. Frequency estimation with acceptable DSP precision would consume around 200 multipliers. Because of the limited resources, frequency estimation was not included in [12], but it was implemented in a later experiment [13] in which a larger-capacity FPGA was used. Table II shows the compilation report on the resource usage and final estimated speed in an Altera Stratix II GX FPGA.
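The effect of limited bit resolution can be explored with a simple uniform quantizer. The sketch below is an assumption-laden illustration: it uses a mid-rise quantizer with clipping and a single sine tone, and it reports only the signal-to-quantization-noise ratio. It does not reproduce the Q-factor simulation of Fig.5, which involves the full CO-OFDM receiver chain.

```python
import math


def quantize(x, bits, full_scale=1.0):
    """Mid-rise uniform quantizer with 2**bits levels over [-full_scale, full_scale)."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    idx = math.floor(x / step)
    # Clip to the representable range of a signed 'bits'-wide code.
    idx = max(-(levels // 2), min(levels // 2 - 1, idx))
    return (idx + 0.5) * step


def sqnr_db(signal, bits, full_scale=1.0):
    """Signal-to-quantization-noise ratio in dB for a real-valued sequence."""
    sig_power = sum(s * s for s in signal)
    noise_power = sum((s - quantize(s, bits, full_scale)) ** 2 for s in signal)
    return 10.0 * math.log10(sig_power / noise_power)


# Illustrative test tone; amplitude and frequency are arbitrary choices.
tone = [0.9 * math.sin(2 * math.pi * 0.0123 * n) for n in range(4096)]
for bits in range(3, 9):
    print(bits, "bits:", round(sqnr_db(tone, bits), 1), "dB")
```

The same `quantize` function can be applied to multiplier inputs and outputs to mimic a reduced-width complex multiplier, which is one way to study the second curve discussed above.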



Fig.5. Q factor as a function of bit resolutions of ADC and complex multipliers. QPSK modulation is assumed. CM: complex multipliers (with 5-bit ADC resolution).

| Function           | Multipliers | Channel |
|--------------------|-------------|---------|
| Auto-correlation   | 20          | 1       |
| FFT                | 24          | 16      |
| Channel estimation | 3           | 16      |
| Phase estimation   | 3           | 16      |
| Total              | 500         |         |

Table I: FPGA 9-bit multipliers usage
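The total in Table I can be checked with a few lines of arithmetic. The sketch reads the "Multipliers" column as a per-channel count and the "Channel" column as the number of parallel instances; this reading is an assumption, although it does reproduce the stated total of 500. The 504 available multipliers and the figure of roughly 200 multipliers for frequency estimation are taken from the text and Table II.

```python
# (multipliers per channel, number of channels), as read from Table I
table_i = {
    "Auto-correlation": (20, 1),
    "FFT": (24, 16),
    "Channel estimation": (3, 16),
    "Phase estimation": (3, 16),
}

AVAILABLE_MULTIPLIERS = 504  # 9-bit multipliers available, from Table II
FREQ_ESTIMATION_COST = 200   # approximate figure quoted in the text

total = sum(per_channel * channels for per_channel, channels in table_i.values())
headroom = AVAILABLE_MULTIPLIERS - total
frequency_estimation_fits = FREQ_ESTIMATION_COST <= headroom

print("total multipliers:", total)
print("headroom:", headroom)
print("frequency estimation fits:", frequency_estimation_fits)
```

With only a handful of multipliers left over, the budget is consistent with the statement that frequency estimation could not be included on this device.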

| Resource          | Consumed / Available | Report |
|-------------------|----------------------|--------|
| Logic registers   | 77,797 / 106,032     | 88%    |
| Memory bits       | 626,671 / 6,747,840  | 9%     |
| 9-bit multipliers | 500 / 504            | 99%    |
| Estimated speed   | 152.21 MHz           |        |

Table II: FPGA compilation report

For optical OFDM FPGA implementation, multipliers are the most limited resource. Current FPGA technology can support over 2,500 multipliers. This implies that a single DSPU can ideally support dual polarization at a sampling rate of up to 5 GS/s. With such FPGAs, 10 Gb/s and beyond is achievable. With a CMOS ASIC design [11], the available number of multipliers and logic registers can be increased substantially. Recently, a 40-Gb/s coherent QPSK transceiver implemented in 90 nm CMOS ASICs has been demonstrated in real time, with four 6-bit 20 GS/s ADCs [11]. Since CO-OFDM has lower computational complexity, state-of-the-art silicon technology can potentially allow real-time reception of CO-OFDM at 40 Gb/s and beyond.

## 4. Conclusion

We have reviewed the design of key DSP modules for real-time implementation of CO-OFDM detection using an FPGA. Various OFDM procedures and algorithms have been discussed in the context of DSP efficiency and FPGA resource requirements. With real-time implementation of the CO-OFDM transmitter on the horizon [15-16], it is foreseeable that high-speed real-time CO-OFDM transmission could be realized in the near future.

## References

- [1] W. Shieh, et al., Electron. Lett. 42, 587-589, 2006.
- [2] A. J. Lowery, et al., OFC'2006, paper PDP39.
- [3] I. Djordjevic, et al., Opt. Express 14, 3767-3775, 2006.
- [4] Q. Yang, et al., OFC'2008, paper PDP7.
- [5] S.L. Jansen, et al., OFC'2008, paper PDP2.
- [6] E. Yamada, et al., OECC'2008, paper PDP6.
- [7] Y. Ma, et al., Opt. Express 17, 9421-9427, 2009.
- [8] R. Dischler, et al., OFC'2008, paper PDPC2.
- [9] A. Leven, et al., OFC'2008, paper OTuO2.
- [10] T. Pfau, et al., OFC'2009, paper OThJ4.
- [11] H. Sun, et al., Opt. Express 16, 873-879, 2008.
- [12] N. Kaneda, et al., JLT, accepted for publication.
- [13] S. Chen, et al., JLT 27, 3699-3704, 2009.
- [14] X.Q. Jin, et al., Opt. Express 17, 2009.
- [15] F. Buchali, et al., ECOC'09, post-deadline paper PD 2.1.
- [16] Y. Benlachtar, et al., Opt. Express 17, 17658-17668, 2009.
- [17] T. M. Schmidl, et al., IEEE Trans. Commun. 45, 1613, 1997.