Published Online September 2012 in MECS (http://www.mecs-press.org/)

DOI: 10.5815/ijigsp.2012.09.04



# Area Optimized High Throughput IDMWT/DMWT Processor for OFDM on Virtex-5 FPGA

Anitha.K Assistant Professor, Arunai Engineering College, Tiruvannamalai, Tamilnadu, India

Dharmistan.K.Varugheese Professor, Karpagam Engineering College, Coimbatore, Tamilnadu, India

anitha16ramesh@yahoo.co.in, Tel: +91-944-3811191

Abstract— OFDM is one of the most popular modulation techniques and is widely used in wireless and wired communication links. The OFDM architecture consists of a QAM modulator and an orthogonal frequency modulator. In this work we propose a DMWT-based orthogonal frequency modulator to achieve better BER performance. The IDMWT architecture is designed for N=4: the preprocessing unit extends each frame of N QAM samples to 2N samples, which are then modulated using the DMWT filters. The filtered output is transmitted over the channel, and at the receiver the N samples are recovered by DMWT demodulation followed by post-processing. The IDMWT and DMWT architectures are modified for FPGA implementation to reduce complexity and to improve area, power and speed. The modified DMWT architecture is implemented on a VirtexII pro FPGA, operates at 300MHz, occupies less than 1% of the device area and consumes less than 28mW. The proposed design is suitable for real-time and low-power applications.

Index Terms— OFDM, DWT, DMWT, FPGA

# I. INTRODUCTION

FFT-based OFDM uses complex exponential basis functions; these have been replaced with wavelets to reduce the interference caused by loss of orthogonality between the carriers and so improve performance [1] [2]. Multiwavelets preserve high-frequency components and offer better sensitivity than scalar wavelets [7]. Multiwavelets combine symmetry, orthogonality, finite support and smoothness [8]. Orthogonal symmetric prefilter banks have been designed for the discrete multiwavelet transform in image coding and communication applications; the resulting DMWT structure improves computational complexity, energy compaction ratio and compression performance when applied to a VQ-based image coding system [9][10]. A biorthogonal multiwavelet filter has many useful characteristics, such as symmetry, compact support, orthogonality and third-order vanishing moments [11].

Fourier-based OFDM (FFT-OFDM) uses complex exponential basis functions, which can be replaced by orthonormal wavelets in order to reduce the level of interference. OFDM based on Haar orthonormal wavelets (DWT-OFDM) has been found capable of reducing the inter-symbol interference (ISI) and inter-carrier interference (ICI) that are caused by the loss of orthogonality between the carriers [1] [2].

To further improve the performance, a new transform based on multifilters, called multiwavelets, is used (DMWT-OFDM). These filters offer properties that are not achievable with the other transforms (Fourier and wavelet) [3].

An important multiwavelet filter is the GHM filter proposed by Geronimo, Hardin, and Massopust. The multiwavelet filter coefficients are $2 \times 2$ matrices, and during the transformation step they multiply vectors instead of scalars. The multifilter bank therefore requires two input rows. To start the analysis algorithm and to reduce noise effects, the preprocessing step maps the given scalar input signal of length N to a sequence of length-2 vectors [4] [5].

In this paper the block diagram of the DMWT-OFDM system is discussed in Section II. Section III discusses the BER performance of DMWT-OFDM in an AWGN channel. Section IV covers the design and FPGA implementation of the DMWT/IDMWT architecture. Section V explains the design of the proposed modified DMWT/IDMWT architecture. Finally, Section VI presents the results and conclusion of the FPGA implementation of DMWT/IDMWT.

# II. PROPOSED SYSTEM FOR DMWT- OFDM

The block diagram of the proposed system for OFDM is depicted in figure (1).



Figure 1: Block Diagram of DMWT-OFDM System

The OFDM modulator and demodulator of DMWT-based OFDM are shown in figure (2).





(b) DMWT-OFDM demodulator

Figure 2: DMWT-OFDM modem system

The S/P converter, the signal demapper and the insertion of the training sequence are the same as in DWT-OFDM. The IDMWT of the 1-D signal is then computed using an over-sampled preprocessing scheme (repeated row). Because of this, the IDMWT matrix is doubled in dimension compared with the input, which is a square matrix of $N \times N$, where N is a power of 2; the transformation matrix dimension equals the input signal dimension after preprocessing. The steps to compute a single-level 1-D discrete multiwavelet transform are:

- 1. Checking input dimensions: the input vector must be of length N, where N is a power of 2.
- 2. Constructing a transformation matrix W as in (3), using the GHM low-pass and high-pass filter matrices given in (1) and (2). After substituting the GHM matrix filter coefficient values, a $2N \times 2N$ transformation matrix results.

$$H_{0} = \begin{bmatrix} \frac{3}{5\sqrt{2}} & \frac{4}{3} \\ -\frac{1}{20} & -\frac{3}{10\sqrt{2}} \end{bmatrix} \qquad H_{1} = \begin{bmatrix} \frac{3}{5\sqrt{2}} & 0 \\ \frac{9}{20} & \frac{1}{\sqrt{2}} \end{bmatrix}$$

$$H_{2} = \begin{bmatrix} 0 & 0 \\ \frac{9}{20} & -\frac{3}{10\sqrt{2}} \end{bmatrix} \qquad H_{3} = \begin{bmatrix} 0 & 0 \\ -\frac{1}{20} & 0 \end{bmatrix} \tag{1}$$

$$G_{0} = \begin{bmatrix} -\frac{1}{20} & -\frac{3}{10\sqrt{2}} \\ \frac{1}{10\sqrt{2}} & \frac{3}{10} \end{bmatrix} \qquad G_{1} = \begin{bmatrix} \frac{9}{20} & -\frac{1}{\sqrt{2}} \\ \frac{9}{10\sqrt{2}} & 0 \end{bmatrix}$$

$$G_{2} = \begin{bmatrix} \frac{9}{20} & -\frac{3}{10\sqrt{2}} \\ \frac{9}{10\sqrt{2}} & \frac{3}{10} \end{bmatrix} \qquad G_{3} = \begin{bmatrix} -\frac{1}{20} & 0 \\ -\frac{1}{10\sqrt{2}} & 0 \end{bmatrix} \tag{2}$$

- 3. Preprocessing the input signal by repeating the input stream, with the repeated copy multiplied by a constant $\alpha$; for the GHM system functions, $\alpha=1/\sqrt{2}$.
- 4. Transforming the input vector by multiplying the $2N \times 2N$ transformation matrix by the $2N \times 1$ preprocessed input vector.
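Steps 1, 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `is_power_of_two`, `preprocess_repeated_row` and `apply_transform` are hypothetical, and the ordering of the preprocessed vector (the N original samples followed by the N scaled copies) is an assumption taken from the input vector layout shown in Section V.

```python
import math

ALPHA = 1 / math.sqrt(2)  # repeated-row constant for the GHM system (step 3)


def is_power_of_two(n):
    """Step 1: the input length N must be a power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def preprocess_repeated_row(x, alpha=ALPHA):
    """Step 3: extend N samples to 2N by adding a copy scaled by alpha.

    Assumed ordering: original samples first, scaled copies second.
    """
    if not is_power_of_two(len(x)):
        raise ValueError("input length must be a power of 2")
    return list(x) + [alpha * v for v in x]


def apply_transform(w, v):
    """Step 4: multiply a 2N x 2N matrix by the 2N x 1 preprocessed vector."""
    if any(len(row) != len(v) for row in w):
        raise ValueError("matrix and vector dimensions do not match")
    return [sum(c * s for c, s in zip(row, v)) for row in w]


# Usage with N = 4. An identity matrix stands in for W, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
v = preprocess_repeated_row(x)
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
y = apply_transform(identity, v)
```

In a real run the identity matrix would be replaced by the $2N \times 2N$ matrix built from the GHM blocks in step 2.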

# III. PERFORMANCE OF DMWT-OFDM IN AWGN CHANNEL

In this section, the simulation results for the proposed DMWT-OFDM system are presented. Figure (3) gives the BER performance of DMWT-OFDM in the AWGN channel. It shows clearly that DMWT-OFDM performs much better than the two earlier systems, FFT-OFDM and DWT-OFDM. This reflects the fact that the orthogonal bases of the multiwavelets are more significant than the orthogonal bases used in FFT-OFDM and DWT-OFDM.



Figure 3: BER performance of DMWT-OFDM in AWGN channel model.

# IV. DESIGN OF DMWT/IDMWT ARCHITECTURE



In this work, the design and FPGA implementation of a hardware-efficient DMWT architecture is carried out. The QAM-modulated data, which forms the I and Q channel signals, is preprocessed and modulated using the IDMWT; the OFDM-modulated data is transmitted through the AWGN channel and demodulated using the DMWT, and the baseband signal is extracted by QAM demodulation. Figure 2 shows the detailed block diagram of OFDM modulation and demodulation. The input is taken to be a 1MHz signal sampled at 64Msps, the QAM modulator carrier frequency is chosen to be 64 MHz, and the QAM symbols are obtained at 512Msps. The OFDM modulator therefore has to process the modulated data at 512Msps. As discussed above, the input samples must be scaled and extended to a $2N \times 1$ vector before OFDM modulation, which is the requirement for the GHM-based IDMWT. The preprocessing unit performs this scaling and extension, so samples arriving at 512Msps leave the preprocessing unit as $2N \times 1$ vectors at 1024Msps. The IDMWT that processes the preprocessed data therefore has to operate at a rate greater than 1024Msps.

# DESIGN OF IDMWT

In this work we select N=4, so the QAM symbols are grouped into frames of 4 samples and preprocessed. With N=4, the preprocessing unit extends each frame to 8 samples with scaling. The scaled samples are processed in the IDMWT with a GHM matrix of size $2N \times 2N$; with 2N=8, the GHM filter size is $8 \times 8$. The GHM filter for N=4 is given in equation (3):

$$W= \begin{bmatrix} H_0 & H_1 & H_2 & H_3 \\ H_2 & H_3 & H_0 & H_1 \\ G_0 & G_1 & G_2 & G_3 \\ G_2 & G_3 & G_0 & G_1 \end{bmatrix} \tag{3}$$

For the inverse transform (IDMWT), the GHM filter blocks are arranged as:

$$W_T = \begin{bmatrix} H_0 & H_2 & G_0 & G_2 \\ H_1 & H_3 & G_1 & G_3 \\ H_2 & H_0 & G_2 & G_0 \\ H_3 & H_1 & G_3 & G_1 \end{bmatrix}$$

Using the above matrix, the preprocessed data is modulated to generate the OFDM signal. The OFDM signal using the GHM filter can be represented mathematically as:

$$\underset{2N \times 1}{[Y]} = \underset{2N \times 2N}{[W_T]}\;\underset{2N \times 1}{[X]}$$
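As an illustration of the matrix product above, the following Python sketch assembles the $8 \times 8$ IDMWT matrix from the $2 \times 2$ blocks and applies it to one preprocessed frame. It is a sketch under stated assumptions rather than the authors' implementation: the block values are transcribed from equations (1) and (2) as printed, the third matrix listed in equation (2) is taken to be $G_2$, and the names `assemble`, `W_T` and `idmwt_frame` are hypothetical.

```python
import math

R2 = math.sqrt(2)

# 2x2 GHM blocks transcribed from equations (1) and (2) as printed.
H = [
    [[3 / (5 * R2), 4 / 3], [-1 / 20, -3 / (10 * R2)]],    # H0
    [[3 / (5 * R2), 0.0], [9 / 20, 1 / R2]],               # H1
    [[0.0, 0.0], [9 / 20, -3 / (10 * R2)]],                # H2
    [[0.0, 0.0], [-1 / 20, 0.0]],                          # H3
]
G = [
    [[-1 / 20, -3 / (10 * R2)], [1 / (10 * R2), 3 / 10]],  # G0
    [[9 / 20, -1 / R2], [9 / (10 * R2), 0.0]],             # G1
    [[9 / 20, -3 / (10 * R2)], [9 / (10 * R2), 3 / 10]],   # G2 (assumed label)
    [[-1 / 20, 0.0], [-1 / (10 * R2), 0.0]],               # G3
]


def assemble(layout):
    """Expand a 4x4 grid of 2x2 blocks into an 8x8 matrix (list of rows)."""
    rows = []
    for block_row in layout:
        for r in range(2):
            rows.append([value for block in block_row for value in block[r]])
    return rows


# Block arrangement given in the text for the inverse transform.
W_T = assemble([
    [H[0], H[2], G[0], G[2]],
    [H[1], H[3], G[1], G[3]],
    [H[2], H[0], G[2], G[0]],
    [H[3], H[1], G[3], G[1]],
])


def idmwt_frame(x8):
    """Compute Y = W_T X for one preprocessed 8-sample frame."""
    if len(x8) != 8:
        raise ValueError("expected a frame of 2N = 8 samples")
    return [sum(c * s for c, s in zip(row, x8)) for row in W_T]


# A unit impulse in the first position picks out the first column of W_T.
y = idmwt_frame([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Each output sample is one row of multiply-accumulate operations, which is the operation the FPGA multipliers perform.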

The above equation is implemented on the FPGA. The input matrix is first stored in a memory of size $M \times 8$, where M is 1024. The input memory is loaded from the preprocessing unit. The controller reads the data from the input memory into an intermediate memory of size $8 \times 8$, and also reads the corresponding GHM coefficients from memory. The input is multiplied and accumulated using dedicated multipliers on the FPGA to compute the output samples. The figure below shows the top-level block diagram of the IDMWT logic for the I channel; the Q channel logic is similar.

# A. Computational Complexity of IDMWT

As the input is of size $8 \times 1$ with 8 bits per sample, every input frame is multiplied by the 2N rows of GHM filter coefficients. This requires 2N\*2N multiplications and 2N(2N-1) additions. Computing the output requires 2N clock cycles to write data into the intermediate memory, 2N clock cycles to read data from the intermediate memory, 1 clock cycle for multiplication, 2N-1 clock cycles for addition and another 2N clock cycles for the write operation, so every output computation requires 8N clock cycles. The latency is 8N clock cycles and the throughput is 8N-1 clock cycles. To improve throughput and latency, the IDMWT architecture must be modified. In this work we propose a high-speed DMWT and IDMWT architecture that is implemented on FPGA.
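The cycle and operation counts above can be checked by summing the individual stages and substituting N=4, the frame size used in this design:

$$\underbrace{2N}_{\text{write}} + \underbrace{2N}_{\text{read}} + \underbrace{1}_{\text{multiply}} + \underbrace{(2N-1)}_{\text{add}} + \underbrace{2N}_{\text{write back}} = 8N = 32 \text{ clock cycles}$$

$$\text{multiplications} = 2N \cdot 2N = 64, \qquad \text{additions} = 2N(2N-1) = 56$$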

# V. MODIFIED DMWT ARCHITECTURE

With the BER performance analyzed in Section III, the GHM matrix coefficients are now calculated and substituted in equations 1, 2 and 3, and equations 4 to 11 are derived to design the multiwavelet transform. The coefficients are scaled with a scaling factor of 128. The table below shows the coefficients before and after scaling.

Table 1: Unscaled and scaled coefficients

| Before scaling | After scaling |
|----------------|---------------|
| 3/5√2          | 54            |
| 4/3            | 170           |
| 0.26819        | 34            |
| 0.1707         | 22            |
| 0.4145         | 53            |
| 0.7070         | 90            |
| 3/10           | 38            |
| 2/3            | 84            |
| 1/2            | 64            |
| 0.5207         | 66            |
| 0.08787        | 11            |
| 0.5207         | 68            |
| 0.362          | 46            |
| 0.6864         | 88            |
| 0.3793         | 48            |
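The scaling in Table 1 can be sketched as a multiplication by 128 followed by conversion to an integer. The rounding mode is not stated in the text; truncation toward zero is assumed here because it reproduces entries such as 54, 170, 38, 64 and 11, although it does not reproduce every row of the table (0.6864, for example, is listed as 88). The function name `scale_coefficient` is hypothetical.

```python
import math

SCALE = 128  # scaling factor stated in the text (2**7)


def scale_coefficient(c, scale=SCALE):
    """Convert a fractional filter coefficient to an integer.

    Assumption: truncation toward zero; the source does not state the rounding mode.
    """
    return int(c * scale)


# A few of the unscaled coefficients from Table 1.
examples = {
    "3/(5*sqrt(2))": 3 / (5 * math.sqrt(2)),
    "4/3": 4 / 3,
    "3/10": 3 / 10,
    "1/2": 1 / 2,
    "0.08787": 0.08787,
}
scaled = {name: scale_coefficient(value) for name, value in examples.items()}
```

Because 128 is a power of two, integer results computed with the scaled coefficients can be brought back to the original range with a 7-bit right shift; whether the design does this is not stated in the text.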

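$$\begin{bmatrix} Y_0 \\ Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \end{bmatrix} = \begin{bmatrix}
\frac{3}{5\sqrt{2}} & \frac{4}{3} & \frac{3}{5\sqrt{2}} & 0 & 0 & 0 & 0 & 0 \\
-\frac{1}{20} & -\frac{3}{10\sqrt{2}} & \frac{9}{20} & \frac{1}{\sqrt{2}} & \frac{9}{20} & -\frac{3}{10\sqrt{2}} & -\frac{1}{20} & 0 \\
0 & 0 & 0 & 0 & \frac{3}{5\sqrt{2}} & \frac{4}{3} & \frac{3}{5\sqrt{2}} & 0 \\
\frac{9}{20} & -\frac{3}{10\sqrt{2}} & -\frac{1}{20} & 0 & -\frac{1}{20} & -\frac{3}{10\sqrt{2}} & \frac{9}{20} & \frac{1}{\sqrt{2}} \\
-\frac{1}{20} & -\frac{3}{10\sqrt{2}} & \frac{9}{20} & -\frac{1}{\sqrt{2}} & \frac{9}{20} & -\frac{3}{10\sqrt{2}} & -\frac{1}{20} & 0 \\
-\frac{1}{10\sqrt{2}} & \frac{3}{10} & \frac{9}{10\sqrt{2}} & 0 & \frac{9}{10\sqrt{2}} & -\frac{3}{10} & -\frac{1}{10\sqrt{2}} & 0 \\
\frac{9}{20} & -\frac{3}{10\sqrt{2}} & -\frac{1}{20} & 0 & -\frac{1}{20} & -\frac{3}{10\sqrt{2}} & \frac{9}{20} & -\frac{1}{\sqrt{2}} \\
\frac{9}{10\sqrt{2}} & -\frac{3}{10} & -\frac{1}{10\sqrt{2}} & 0 & \frac{1}{10\sqrt{2}} & -\frac{3}{10} & \frac{9}{10\sqrt{2}} & 0
\end{bmatrix} \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \\ X_0/\sqrt{2} \\ X_1/\sqrt{2} \\ X_2/\sqrt{2} \\ X_3/\sqrt{2} \end{bmatrix}$$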

$$y_0 = 54x_0 + 170x_1 + 54x_2 \tag{4}$$

$$y_1 = 34x_0 - 22x_1 + 53x_2 + 90x_3 \tag{5}$$

$$y_2 = 38x_0 + 84x_1 + 38x_2 \tag{6}$$

$$y_3 = 53x_0 - 22x_1 + 34x_2 + 64x_3 \tag{7}$$

$$y_4 = 34x_0 - 22x_1 + 53x_2 - 90x_3 \tag{8}$$

$$y_5 = 66x_0 + 11x_1 + 68x_2 \tag{9}$$

$$y_6 = 53x_0 - 46x_1 + 34x_2 - 64x_3 \tag{10}$$
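Equations (4) to (10) map directly onto integer multiply-accumulate operations. The sketch below evaluates them for one frame of four samples. The function name `modified_dmwt_frame` is hypothetical, and only the seven outputs whose equations are listed above are computed.

```python
def modified_dmwt_frame(x):
    """Evaluate equations (4)-(10) for one frame x = [x0, x1, x2, x3]."""
    if len(x) != 4:
        raise ValueError("expected a frame of N = 4 samples")
    x0, x1, x2, x3 = x
    return [
        54 * x0 + 170 * x1 + 54 * x2,            # (4)  y0
        34 * x0 - 22 * x1 + 53 * x2 + 90 * x3,   # (5)  y1
        38 * x0 + 84 * x1 + 38 * x2,             # (6)  y2
        53 * x0 - 22 * x1 + 34 * x2 + 64 * x3,   # (7)  y3
        34 * x0 - 22 * x1 + 53 * x2 - 90 * x3,   # (8)  y4
        66 * x0 + 11 * x1 + 68 * x2,             # (9)  y5
        53 * x0 - 46 * x1 + 34 * x2 - 64 * x3,   # (10) y6
    ]


# Usage: a constant frame of ones.
y = modified_dmwt_frame([1, 1, 1, 1])
```

A software model of this kind can serve as the reference against which HDL simulation outputs are compared.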

The equations above are derived from the modified GHM filter coefficients. They show that computing an output sample requires a minimum of 3 multiplications and 2 additions. Thus, for N=4, a total of 28 multiplications and 20 additions are required, a reduction of more than 50% in both. This reduction optimizes the design in terms of area and power. The latency of the design is still 8N clock cycles, but the throughput is 7N clock cycles, which is faster than the 8N-1 clock cycles of the existing design. The latency and throughput can be improved further with a parallel and pipelined architecture.
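The reduction quoted above follows from comparing these counts with the $2N \cdot 2N = 64$ multiplications and $2N(2N-1) = 56$ additions of the direct matrix product in Section IV for N=4:

$$1 - \frac{28}{64} \approx 56\%, \qquad 1 - \frac{20}{56} \approx 64\%$$

$$7N = 28 \text{ clock cycles} < 8N - 1 = 31 \text{ clock cycles}$$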

# VI. FPGA IMPLEMENTATION OF MODIFIED DMWT/IDMWT

The HDL model for the modified GHM filter equations is developed and simulated using ModelSim. Multiple test cases are chosen to test the functionality of the modified equations, and the results are verified against the software reference model. The functionally correct HDL code is synthesized using Xilinx ISE 10.1 targeting the VirtexII pro FPGA. The results of the FPGA implementation are discussed below.

The pre-simulation and post place and route simulation results match, confirming that the design is correctly mapped onto the FPGA and meets the required design specifications.

The HDL co-simulation of the design is performed using MATLAB and is shown in Figure 5 below.



Figure 4: Post Place and Route Simulation



Figure 5: HDL Co-Simulation

Figure 6 below shows the RTL schematic of the proposed design with the interconnects between the various blocks. It is a technology-independent schematic.



Figure 6: RTL Schematic

#### TABLE 2: DEVICE UTILIZATION SUMMARY

| Logic Utilization         | Used | Available | Utilization |
|---------------------------|------|-----------|-------------|
| Number of 4 input LUTs    | 126  | 27,392    | 1%          |
| Number of occupied Slices | 72   | 13,696    | 1%          |
| Number of bonded IOBs     | 97   | 556       | 17%         |
| IOB Flip Flops            | 64   |           |             |
| Number of MULT18X18s      | 18   | 136       | 13%         |
| Number of BUFGMUXs        | 1    | 16        | 6%          |

#### A. Synthesis Report

Target Device: xc2vp30-7-ff896

Minimum input arrival time before clock: 8.362ns

Maximum output required time after clock: 3.293ns

Total memory usage is 225952 kilobytes

# B. Conclusion

In this work, we propose a modified GHM filter architecture for OFDM modulation and demodulation. A software reference model for DMWT-based OFDM is developed and simulated to find the BER performance for various SNRs. The simulation results show that the DMWT-OFDM model outperforms the FFT-based and DWT-based OFDM models. The DMWT coefficients, which are fractions, are converted to integers and modified to reduce the number of multiplications and additions. The reduced GHM filter coefficients are used to process the QAM-modulated data, reducing the computational complexity and making the design suitable for FPGA implementation. The modified equations are modeled in HDL and implemented on a VirtexII pro FPGA. The design operates at a maximum frequency of 300MHz, consumes less than 1% of the resources and is thus suitable for real-time applications. The latency and throughput of the design can be improved further by designing a parallel and pipelined architecture for DMWT/IDMWT.

# REFERENCES

- [1] Zhang H. et al, "Research of DFT-OFDM and DWT-OFDM on Different Transmission Scenarios.", Proceedings of the 2<sup>nd</sup> International Conference on Information Technology for Application (ICITA), 2004.
- [2] Negash B.G. "Wavelet Based Multicarrier Transmission over Wireless Multipath Channels", MS.c Thesis, Delft University of Technology, Aug 2000.
- [3] Cotronei M., et al, "Multiwavelet Analysis and Signal Processing", IEEE Transaction on Circuits and Systems II.
- [4] V. Strela, G. Strang et al, "The Application of Multiwavelet Filter Banks to Image Processing" IEEE Transaction on Image Processing, 1993.
- [5] V. Strela, "Multiwavelets: Theory and Application", Ph.D Thesis, MIT, June 1996.
- [6] Biglieri E., Proakis J. and Shamai S. "Fading Channels: Information-Theoretic and Communications Aspects", IEEE Transactions on Information Theory, Vol. 44, No. 6, October 1998.
- [7] Ragupathy, U.S.; Kumar, A. Senthil, "Investigation on mammographic image compression and analysis using multiwavelets and neural networks", International conference (ICoBE), 2012, Page(s): 17 – 21.
- [8] Liu Wei, "An image coding method based on multiwavelet transform", Image and Signal Processing (CISP), 4<sup>th</sup> International Congress on, Volume: 2, 2011, Page(s): 607 – 610.
- [9] Tai-Chiu Hsung, Lun, D.P.-K., Ho, K.C, "Orthogonal symmetric prefilter banks for discrete wavelet transforms", Signal Processing Letters, IEEE, Vol.13, 2006.
- [10] Tai-Chiu Hsung; Lun, D.P.-K.; Yu-Hing Shum; Ho, K.C., "Generalised Discrete Multiwavelet Transform with Embedded Orthogonal Symmetric Prefilter Bank", Signal Processing, IEEE transaction, Vol.55, 2007.
- [11] Li Yongjun, Xu Xiaorong, "A Fractal Multi-Wavelet Filter Design And Application", Information Technology, computer engineering and management sciences (ICM), International Conference, Volume: 2, 2011, Page(s): 313 – 316.

# **AUTHORS PROFILE**

**Anitha.K.** received her B. Tech in Electronics & Communication Engineering from Bangalore University, Bangalore in 1999 and M. Tech degree in Applied Electronics from Anna University, Chennai in 2006.

She is currently working toward her Ph. D degree at Anna University, Coimbatore and is also working as Asst. Prof in the Dept. of Electronics and Communication Engineering, Arunai Engineering College, Tiruvannamalai, TamilNadu, India.

Her areas of interest are Hardware Software Co-Design, Signal Processing, Digital System Design and Wireless communication.