Implementation on the SC140 Core
• exp_2_bin_extended. Exponential-to-binary table, 511 bytes long. Indices are the exponents and
entries are the corresponding Galois numbers in binary form. The last entry is equal to 0.
• exp_table_for_syndrome. Power matrix for polynomial evaluation, 16 × 256 words in size. It has the
typical form presented in Section 4.1, Polynomial Evaluation Over GF(256), on page 13. M is at most
2T and is thus chosen to be 16.
4.4 Lowest Cycle Count Limit for Polynomial Evaluation
The most general polynomial evaluation has the form presented in Section 4.1, Polynomial Evaluation Over
GF(256), on page 13. We assume that the entries of the input vector are represented as binary and the power matrix
is stored in exponential form. For a vector of length D+1 and field points α, α2, α3,...., αM, the C-code for
polynomial evaluation is then given by the following example:
Example 1. C Code for Matrix Multiplication
for (i=0; i<M; i++)
{
acc = 0;
for (j=0; j<=D; j++)
{
x_power = bin_2_exp[vector[j]];
y_power = exp_table_for_syndrome[i][j];
power = MIN((x_power + y_power),2*N+1);
acc ^= exp_2_bin_extended[power];
}
result[i] = acc;
}
For each field element, two table look-ups are needed for each term. The first table look-up converts the
polynomial from binary to exponential form. To save cycles, this conversion is performed separately, prior to
entering the polynomial evaluation routine. This binary-to-exponential conversion contributes a small overhead to
the total cycle count of the routine. It is implemented in the following steps:
1. Get the current vector element in binary form.
This requires one MOVEU.B (rx) instruction, where rx denotes a general AGU register.
2. Add this byte to the table basic address and transfer the resulting address into an AGU register.
This requires a MOVE.L command.
3. Read the table entry via a MOVE.W (rx) command.
A maximum of two move instructions can execute in one cycle. Thus, assuming full parallelization, for every
polynomial term, a minimum of two cycles is needed in the conversion routine.
If the degree of the polynomial is D and the number of field points is M, a gross estimate for the number of cycles,
denoted as Cconversion required, is given by Cconversion ≈ 2(D + 1) . The code shown in Example 1 involves the
import of data and table look-ups, which are both AGU-based operations. The execution sets therefore are filled by
AGU instructions rather than by DALU instructions. Thus, as in the binary-to-exponential conversion routine, the
number of AGU operations is the critical factor in determining the cycle count.
The two methods of choice to implement polynomial evaluation in assembly code are split-summation and multi-
sampling. In the split-summation method, one term in the result vector is calculated in every iteration. The inner
product of each row with the input vector is divided into four partial sums by loading four matrix and four vector
Reed Solomon Encoder/Decoder on the StarCore™ SC140/SC1400 Cores, With Extended Examples, Rev. 1
Freescale Semiconductor
15