DIFFERENT achieve the capacity for binary-input symmetric memoryless channels

capacity-approaching codes, such as Turbo codes and low-density parity-check
(LDPC) codes , have been designed and used for some time in applications, such
as wireless communication and data storage, to achieve very high data rates.
Recently, the first provable capacity-achieving codes, called polar codes, were
invented by Arikan. Being the first family of codes known to achieve the channel
capacity with explicit construction, polar codes have attracted a lot of
attention since their invention. Polar codes have been proved to achieve the
capacity for binary-input symmetric memoryless channels as well as discrete and
continuous memoryless channels. Moreover, an explicit construction method for
polar codes has been provided, and it is shown that they can be efficiently
encoded and decoded with complexity O(n log n), where n is the code length.


A number
of decoding methods have been proposed for polar codes, and among these,
successive cancellation decoding (SCD) and belief propagation decoding (BPD)
are the two most popular methods. Due to the serial nature of the algorithm,
SCD suffers from long latency, although it requires less computation as
compared with BPD. Several methods have been proposed to reduce the latency of
SC decoders to achieve a high throughput. Moreover, list decoding and stack
decoding, which are based on SCD, have been proposed to improve the
error-correcting performance for polar codes with short code lengths.

We Will Write a Custom Essay Specifically
For You For Only $13.90/page!

order now


On the
other hand, polar BP decoders have the intrinsic advantage of parallel
processing. Therefore, compared with their SC counterparts, polar BP decoders
are more attractive for low-latency applications. However, due to their
iterative nature, the required latency and energy dissipation of BP decoders
increase linearly with the number of iterations. The requirement of a large
number of iterations results in high computation complexity, and hence makes
BPD less attractive than its SC counterpart.


In this
paper, a modified BP decoder is proposed to speed up the decoding process. In
the proposed BP decoder, the adjacent two stages of basic computation elements
are combined together to generate new decoding factor graph with bigger basic
decoding element which has four nodes on either left side or right side. Then
based on the new computation element, a corresponding path-based belief
computation method is presented. Simulation results show that the proposed
algorithm performs as well as the traditional BP decoding.


The new
stage-combined BP decoder has advantages in several aspects. The number of
stages is reduced. Therefore, the decoding latency is reduced to log2N clock
cycles per iteration instead of 2log(2N). In addition, the message storage and
read/write operations will be reduced to Nlog(2N) as well, which is also about
half of the conventional BP decoder.




a )
Polar codes


codes are linear block codes based on the phenomenon of channel polarization,
in which individual channels are recursively combined and split, such that
their mutual information tends toward either 1 or 0. In other words, some of
these channels become completely noise-free, while the others become completely
noisy. Furthermore, the fraction of noiseless channels tends toward the
capacity of the underlying binary symmetric channels.






















Fig 3.1.
Encoding signal flow graph of (8, 4) polar code.


Polar codes are specified by a generator matrix Gn = F?m,
where n = 2m is the code length and F?m is the mth Kronecker power of F =1 01 1 .An (n, k) polar
code can be generated in two steps. First, an n-bit message u is constructed by
assigning the k reliable and (n?k) unreliable positions as information bits and
frozen bits, respectively. The (n ? k) frozen bits are forced to 0 and form the
frozen set AC. Then, the n-bit u is multiplied with the generator matrix G = F?m
to generate an n-bit transmitted code-word x. Fig. 1 shows the encoding signal
flow graph for n = 8 polar codes, where the “?” sign
represents the XOR operation.



b ) Belief propagation decoding



The BP decoding for polar codes is
based on the factor graph representation of the codes. In the decoder, each
node is associated with two types of messages: left-to-right messages L and
right-to-left messages R. Fig.2 shows the four nodes involved in one basic
computation element BE2. Firstly, the belief (indicated as log likelihood
ratio, LLR) of each node is initiated. The left most source vector nodes in Fig.3.1
are initiated with zero or positive infinity as Eq.(4) and the right most code
word nodes are initiated with channel output LLRs as Eq.(5). Other nodes are
initiated with zero.






















Fig 3.2. Factor graph of the original BP polar coding.











Fig 3.3. Factor graph of basic computation element BE2 in BP
polar decoders.



In BP decoding, LLRs are passed
iteratively from left to right and then from right to left through basic
computation elements BE2s to compute the likelihood of information bits. The
computations in each BE2 are expressed as equation (3)


Li,j = g( Li+1,2j-1, Li+1,2j+


Li,j+N/2 = g( Ri,j, Li+1,2j-1)+


Ri+1,2j-1 = g( Ri,j, Li+1,2j+


Ri+1,2j = g( Ri,j, Li+1,2j-1)+




g(x,y) = log(cosh((x+y)/2))-log(cosh((x-y)/2))


it can be simplified as :

g(x,y) ? 0.9.sign(x)sign(y).min(|x|,|y|)


After t iterations, the decision can
be made based on the final LLR results of source vector nodes as



c) Scheduling schemes for bpd



For a BP
decoder, scheduling refers to the method of information dissemination and
updating from one end of the factor graph to the other. It has a significant
effect on the convergence speed of the BP decoding on the factor graph. In this
section, we introduce the scheduling schemes used in the existing BP decoders.
In Early stopping criteria for energy-efficient low-latency belief-propagation
polar code decoders, a one-way scheduling scheme is used, where stages are
activated sequentially from left to right and one iteration is completed when
the activation reaches the rightmost stage, as shown in Fig. 3.2. When a stage
is activated, all the PEs in that stage are activated and both left-hand and
right-hand propagation messages for that stage are calculated and updated. In
this one-way scheduling, one iteration takes m clock cycles, where m
is the number of stages in the factor graph. Careful analysis of the one-way
scheduling reveals that, in one iteration, information at theleft-hand side of the factor graph has traversed
the whole factor graph to reach the right-hand side, while the information at
the right-hand side has only traversed one stage. This uneven updating of the
information will lead to a slow convergence of decoding on the factor graph, as
will be shown in the later sections. In A 4.68Gb/s belief propagation polar
decoder with bit-splitting register file,”, another scheduling scheme, named
round-trip scheduling, is employed. Round-trip scheduling guarantees even
dissemination of information. In on iteration, right-hand side information
traverses all the way to the left-hand side, followed by left-hand side
information traversing all the way to the right-hand side, as shown in Fig.
3(b). One iteration in round-trip scheduling takes 2m?1 cycles. However, since the dissemination and updating of the
message are more even, it requires less iteration for the convergence, and
hence, the overall latency is smaller, as will be shown in the later sections.



d) Connected subfactor graph


In this
section, the notion of the CSFG is introduced. The CSFG will be used to
illustrate our proposed decoding scheme for low-complexity BPD. Fig. 3.4 shows
the scheduling tree of the SCD of an n
= 8 polar code , and Fig. 3.5 shows the equivalent BPD factor graph of the same
n = 8 polar code. At each stage, the
SCD scheduling tree is split into a number of subtrees, each of which is
responsible for decoding a corresponding constituent code. The sizes of the
subtrees vary at each level and the size is reduced by half when moving down from
one stage to another stage. A CSFG is defined as a subfactor-graph that has the
same number of inputs and outputs, where the output nodes are at the stage m + 1 and each input is connected to
each output through some PEs in the subfactor-graph. Fig. 4(b) shows two
examples of a CSFG. It can be seen that each CSFG has a corresponding subtree
in the decoding scheduling tree of SCD. Fig. 3.4 and 3.5 shows the examples of
the decoding subtrees and the corresponding CSFGs of the n = 8 polar code. The number of CSFGs at each stage is given by 2 j, where j is the stage number. For the n
= 8 polar code, the numbers of CSFGs at stages 1 and 2 are two and four,



















Fig 3.4.Subtrees in SCD Scheduling tree



























Fig 3.5. CSFG’s in BPD Factor graph


to the subtree of the SCD scheduling tree, each CSFG of the BPD factor graph is
responsible for the decoding of its corresponding constituent code using its
respective input messages. At each iteration t, the nodes at the stage j
in the BPD factor graph output left-to-right LLR-based propagating messages Rt1 :n,
j+1, and these are the inputs to the 2 j
CSFGs at the stage j .


 e) Proposed



In this
section, a modified BP decoder with stage combination for polar codes is
introduced. Combining adjacent stages in the factor graph can be applied to
reduce the number of stages.




software is used to verify the functionality of my design. The VLSI
architecture is developed using Verilog HDL. Fig 5.1 shows simulation result of
8 bit polar decoder. Fig 5.2 shows the RTL Schematics of belief propagation
polar code decoder.














Fig 5.1 Simulation result of Polar decoder







































Fig 5.2
RTL Schematic of belief propagation polar code decoder




this paper, a novel early stopping method based on combined belief propagation
polar code decoder is proposed for BP polar code decoders. This early stopping
scheme not only reduces the average complexity of the BPD, hence reducing the
energy consumption during decoding, but also helps to make the decoding
converge faster. This reduces the average number of iterations and further
reduces the energy consumption and average latency. A new quarter-way
scheduling scheme is proposed, and the corresponding VLSI hardware architecture
is developed for the proposed BPD.