Long Term Evolution of Turbo Encoder and Decoder Architectures using Viterbi Algorithm

G.VENKATA KALYAN, Mr.K.BALA

ABSTRACT:
Long-term evolution (LTE) aims to achieve peak data rates in excess of 300 Mb/s for next-generation wireless communication systems. Turbo codes, the channel coding scheme specified in LTE, suffer from low decoding throughput because of their iterative decoding algorithm. One efficient approach to achieving a promising throughput is to use multiple maximum a posteriori (MAP) cores in parallel, which results in a large area overhead. The two computationally challenging units in a MAP core are the $\alpha$ and $\beta$ recursion units. Although several methods have been proposed to shorten the critical path of these recursion units, an area-efficient architecture with minimum silicon area is still missing. In this brief, a novel relation between the $\alpha$ and $\beta$ metrics is introduced, leading to a novel add–compare–select (ACS) architecture. The proposed technique can be applied to both the precise approximation of log-MAP and max-log-MAP ACS architectures. The proposed ACS design, which is implemented in a 0.13-$\mu$m CMOS technology and customized for the LTE standard, occupies up to 18.1% less area than the designs reported to date while maintaining the same throughput level.

I. INTRODUCTION

MANY advanced wireless communication standards have adopted turbo codes as the channel coding scheme because of their near-Shannon-limit error-correcting performance [1]. Decoding proceeds in two alternating half iterations, in which the reliability of the received bits is computed in the form of extrinsic values using interleavers and soft-input–soft-output (SISO) decoders in an iterative way. On even half iterations, the decoding process is performed on the noninterleaved data and parity, whereas on odd half iterations, the interleaved data are decoded. The extrinsic values, which represent the reliability of the information bits, are passed to the next half iteration through the interleaver/deinterleaver unit until an acceptable error level is achieved.
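The half-iteration schedule described above can be sketched in Python as follows. This is only a data-flow illustration under stated assumptions: `siso_stub` is a hypothetical placeholder and not a MAP decoder, the permutation `perm` stands in for the interleaver, and a fixed number of half iterations replaces the error-level stopping rule. None of these names come from the brief.

```python
def interleave(values, perm):
    """Reorder values so that output[i] = values[perm[i]]."""
    return [values[p] for p in perm]


def deinterleave(values, perm):
    """Invert interleave(): put values[i] back at position perm[i]."""
    out = [0.0] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def siso_stub(systematic, parity, apriori):
    """Hypothetical placeholder for a SISO decoder (NOT a MAP decoder).

    It only combines its inputs so that the data flow can be exercised.
    """
    return [0.5 * (s + a) + 0.25 * p
            for s, p, a in zip(systematic, parity, apriori)]


def turbo_schedule(sys_llr, par1_llr, par2_llr, perm, half_iterations):
    """Alternate even/odd half iterations and return the final extrinsic
    values in natural (noninterleaved) order."""
    extrinsic = [0.0] * len(sys_llr)
    sys_interleaved = interleave(sys_llr, perm)
    for h in range(half_iterations):
        if h % 2 == 0:
            # Even half iteration: noninterleaved data and first parity.
            extrinsic = siso_stub(sys_llr, par1_llr, extrinsic)
        else:
            # Odd half iteration: interleaved data and second parity.
            apriori = interleave(extrinsic, perm)
            ext_interleaved = siso_stub(sys_interleaved, par2_llr, apriori)
            extrinsic = deinterleave(ext_interleaved, perm)
    return extrinsic


if __name__ == "__main__":
    perm = [2, 0, 3, 1]                      # toy interleaver (assumed)
    sys_llr = [1.0, -2.0, 0.5, -0.5]         # toy channel values (assumed)
    par1 = [0.2, -0.4, 0.6, -0.8]
    par2 = [-0.1, 0.3, -0.5, 0.7]
    print(turbo_schedule(sys_llr, par1, par2, perm, half_iterations=4))
```

The point of the sketch is the ordering: extrinsic values are interleaved before an odd half iteration and deinterleaved after it, so both constituent decoders always see their inputs in the order they expect.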
Recently, LTE-Advanced has emerged as the next-generation wireless communication standard, targeting peak data rates close to 3 Gb/s [2]. The turbo decoder specified in LTE proves to be a limiting block toward this goal because of its iterative decoding nature, high latency, and significant silicon area consumption. The decoding procedure is performed using the algorithm presented in [3]. Since implementing the actual maximum a posteriori (MAP) algorithm incurs very high computational complexity, two modified forms of the MAP algorithm, i.e., the max-log-MAP and log-MAP algorithms [4], [5], are commonly realized instead. In these two alternative methods, the MAP core consists of log-likelihood ratio (LLR) units, as well as the core units that compute $\alpha$, $\beta$, and $\gamma$, i.e., the forward, backward, and branch metrics, respectively. The $\alpha$ and $\beta$ units, because of their recursive computation, are the most challenging units to implement and occupy almost 40% of the whole MAP core area [6]. The $\gamma$ unit, on the other hand, is a trivial part of the turbo decoder, consisting of only a few additions. Therefore, an area-efficient architecture for $\alpha$ and $\beta$ metric computation is highly desirable and has remained a challenge in the literature. To address this challenge, this brief introduces a new relation between the $\alpha$ and $\beta$ metrics; based on this relation, a novel add–compare–select (ACS) unit for forward and backward computation is proposed. The proposed scheme results in up to an 18.1% reduction in silicon area compared with the designs reported to date.
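As an illustration of how the two simplified algorithms differ, the following Python sketch compares the Jacobian logarithm used in log-MAP, $\max^*(a,b) = \max(a,b) + \log(1 + e^{-|a-b|})$, with the plain maximum used in max-log-MAP. The function names and the sample values are assumptions chosen for illustration; they are not taken from the designs discussed in this brief.

```python
import math


def max_star(a, b):
    """Jacobian logarithm: equals log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """max-log-MAP approximation: the correction term is dropped."""
    return max(a, b)


if __name__ == "__main__":
    # Toy metric pairs (assumed values).
    for a, b in [(0.0, 0.0), (1.0, 0.5), (4.0, -2.0)]:
        exact = max_star(a, b)
        approx = max_log(a, b)
        print(f"a={a:5.1f}  b={b:5.1f}  max*={exact:.4f}  "
              f"max={approx:.4f}  gap={exact - approx:.4f}")
```

The gap between the two is largest, $\log 2$, when the two inputs are equal and shrinks toward zero as they move apart, which is why the correction term can be stored in a small look-up table.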

ACS ARCHITECTURE

The common approach to implementing the recursion unit is to use the ACS architecture. In this case, the radix-2 recursion unit is implemented with an adder, a comparator unit, and a selector unit dictated by (12), as shown in Fig. 2(a), where common approximations of the logarithmic term $\log(1 + e^{-|z-t|})$ are used for implementing the LUT. A few designs, such as the one in [10], have been proposed to reduce the latency of this architecture, all in radix-2. However, in recent wireless communication systems with a clear demand for high throughput, a radix-4 architecture is a common approach [11] and should be efficiently designed. The main advantage of a radix-4 architecture is its concurrent computation of two-bit recursion metrics, which leads to a higher throughput. However, compared with its radix-2 counterpart, it has a higher latency and silicon area overhead. Because the recursive computation tightly restricts the clock frequency, achieving a high throughput is far more challenging in a radix-4 framework. Although several designs have been proposed so far to shorten the latency of radix-4 architectures, a large silicon area overhead is their unwanted byproduct [10]. Therefore, the goal of this brief is to alleviate the area overhead of radix-4 architectures.
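The relation between a radix-2 and a radix-4 ACS step can be illustrated with a small functional model in Python. This is a sketch under stated assumptions, not the architecture of Fig. 2 or Fig. 3: the 8-entry table, its step size `LUT_STEP`, all function names, and the toy metric values are hypothetical. It shows that one radix-4 step over four two-stage paths yields the same metric as two consecutive radix-2 steps.

```python
import math

LUT_STEP = 0.5  # hypothetical quantization step for |z - t|
# Hypothetical 8-entry table sampling log(1 + e^{-d}) at d = 0, 0.5, ..., 3.5.
LUT = [math.log1p(math.exp(-i * LUT_STEP)) for i in range(8)]


def correction(z, t, use_lut):
    """Correction term log(1 + e^{-|z-t|}), exact or table-based."""
    d = abs(z - t)
    if not use_lut:
        return math.log1p(math.exp(-d))
    idx = int(d / LUT_STEP)
    return LUT[idx] if idx < len(LUT) else 0.0  # zero beyond the table


def compare_select(z, t, use_lut=True):
    """Compare-select with correction: max(z, t) plus the correction term."""
    return max(z, t) + correction(z, t, use_lut)


def acs_radix2(m0, g0, m1, g1, use_lut=True):
    """Radix-2 ACS: two additions followed by one compare-select."""
    return compare_select(m0 + g0, m1 + g1, use_lut)


def acs_radix4(metrics, gammas, use_lut=True):
    """Functional radix-4 ACS: four additions, then a compare-select tree.

    metrics: four state metrics from two trellis stages back.
    gammas:  four two-stage branch-metric sums, one per candidate path.
    """
    c = [m + g for m, g in zip(metrics, gammas)]
    return compare_select(compare_select(c[0], c[1], use_lut),
                          compare_select(c[2], c[3], use_lut), use_lut)


if __name__ == "__main__":
    # Toy numbers (assumed, not from the brief).
    alpha = [0.0, -1.2, 0.7, -0.3]   # metrics two stages back
    g1 = [0.4, -0.9, 1.1, 0.2]       # first-stage branch metrics
    g2 = [0.5, -0.6]                 # second-stage branch metrics
    # Two sequential radix-2 steps with the exact correction term.
    mid0 = acs_radix2(alpha[0], g1[0], alpha[1], g1[1], use_lut=False)
    mid1 = acs_radix2(alpha[2], g1[2], alpha[3], g1[3], use_lut=False)
    two_step = acs_radix2(mid0, g2[0], mid1, g2[1], use_lut=False)
    # One radix-4 step over the same four paths.
    sums = [g1[0] + g2[0], g1[1] + g2[0], g1[2] + g2[1], g1[3] + g2[1]]
    one_step = acs_radix4(alpha, sums, use_lut=False)
    print(two_step, one_step)
```

In this model the radix-4 step needs four additions and three compare-select operations per output metric, compared with two additions and one compare-select for a radix-2 step, which is consistent with the area overhead noted above.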

IMPLEMENTATION RESULTS

The proposed MSR is introduced to efficiently reduce the silicon area overhead caused by the radix-4 implementation platform. The proposed MSR technique can be easily applied to any other architecture existing to date for the LTE turbo code. In order to observe the impact of the MSR method, this scheme is applied to recent architectures. The synthesis results of the original ACS architectures presented in the literature and of their MSR-based versions are shown in Table II. For a fair comparison, all architectures were synthesized using the Synopsys Design Compiler in a 0.13-$\mu$m CMOS technology, and the results are given in terms of the equivalent gate count. Furthermore, the results in Table II are for computing all forward and backward metrics that must be computed for the LTE turbo decoder. The proposed MSR technique provides a 15% reduction in area when it is applied to the conventional architecture [see Fig. 2(b)], which implements the precise log-MAP algorithm. This reduction is a result of the omission of two Absolute Look-Up-Table (ALUT) units and two subtraction units for each two metrics out of eight possible metrics. Moreover, the MSR technique can also be applied to a few recent architectures that are devised to reduce the critical path of the conventional ACS architecture, such as the designs in [9], [10], and [12]. By applying the MSR method to these schemes, up to an 18.1% reduction in hardware complexity is achieved (see Table II). Note that the architecture presented in [9] is used for the max-log-MAP turbo decoder.

Fig. 3. (a) Conventional radix-4 ACS architecture for two concurrent metrics computation. (b) Proposed radix-4 ACS architecture based on conventional architecture for two concurrent metrics computation.

In order to alleviate the performance loss of using the [...] which is called MSR. By applying the proposed method to the previous ACS architectures, an area-efficient architecture for recursive computations was achieved. According to the implementation results, the proposed architectures achieve up to an 18.1% reduction in complexity, which significantly reduces the complexity of the whole MAP core of the turbo decoder. Furthermore, the proposed method can also be used for higher radix designs to reduce complexity.

REFERENCES


Author’s Biography

G VENKATA KALYAN received his B.Tech degree from the Department of ECE, Srinivasa Institute of Technology and Science (affiliated to JNTU Anantapuram). He is pursuing his M.Tech at Srinivasa Institute of Technology and Science, Ukkayapalli, Kadapa, AP.

Mr. K. BALA is currently working as an associate professor in the ECE Department, Srinivasa Institute of Technology and Science, Ukkayapalli, Kadapa, AP. He received his M.Tech from Sri Kottam Tulasi Reddy Memorial College of Engineering, Kondair, Mahaboobnagar, AP.