Algebraic Structure

$GF(p^m)$ : Galois field of order $p^m$ , $p$ is prime.

伽罗瓦域是一个包含加法和乘法的代数结构，满足加法和乘法的交换律、结合律、分配律，每个元素都有逆元。

$\Z_n$ : Integers modulo $n\ge1$ .

$M_{m,n}(R)$ : $m$ by $n$ matrices over a ring $R$ .

这个玩意是一个矩阵，只是其中每一个元素都来自环。乘法通常不满足交换律。

$M_n(R)$ : $M_{n,n}(R)$ .

The Discrete Fourier Transform

~~数字信号处理~~

The key to fast multiplication of integers and polynomials is the discrete Fourier transform.

Roots of Unity

e^{\frac{2\pi}{n}i}=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n}\\ i=\sqrt{-1}

A complex number $\alpha\in\C$ is a $n$ -th root of unity if $\alpha^n=1$ .

It is a primitive $n$ -th root of unity if $\alpha^m\neq1$ for all $m=1,\cdots,n-1$ .

There are exactly $\varphi(n)$ primitive $n$ -th roots of unity where $\varphi(n)$ is the number of positive integers less than or equal to $n$ that are relatively prime to $n$ .

🤔Lemma of cancellation property. Let $\omega$ be any primitive $n$ -th root of unity.

\sum_{j=1}^{n-1}\omega^{js}=\begin{cases} 0\text{ if }s\not\equiv0\mod n\\ n\text{ if }s\equiv0\mod n \end{cases}

If $s\equiv0\mod n$ , $\omega^{js}=1$ . Thus the lemma holds.
If $s\not\equiv0\mod n$ , $x^n-1=(x-1)\sum_{j=0}^{n-1}x^j$ .
Substituting $x=\omega^s$ makes LHS $0$ . Thus RHS equals $0$ , since $\omega^s\neq1$ , the lemma holds.

Let $F(\omega)=F_n(\omega)$ denote the matrix

\left[ \begin{array}{ccccc} 1&1&1&\cdots&1\\ 1&\omega&\omega^2&\cdots&\omega^{n-1}\\ 1&\omega^2&\omega^4&\cdots&\omega^{2(n-1)}\\ \vdots&&&&\vdots\\ 1&\omega^{n-1}&\omega^{2(n-1)}&\cdots&\omega^{(n-1)^2} \end{array} \right]

📖Definition of DFT and its inverse. Let $\mathbf a=(a_0,\cdots,a_{n-1})^\top\in\C^n$ . The discret Fourier transform of $\mathbf a$ is

DFT_n(\mathbf a)=\mathbf A=(A_0,\cdots,A_{n-1})^\top\\ A_i=\sum_{j=0}^{n-1}a_j\omega^{ij}\forall i=0,\cdots,n-1

i.e.,

DFT_n(\mathbf a)= \frac{1}{n} \cdot \left[ \begin{array}{ccccc} 1&1&1&\cdots&1\\ 1&\omega^{-1}&\omega^{-2}&\cdots&\omega^{-n+1}\\ \vdots&&&&\vdots\\ 1&\omega^{-n+1}&\omega^{-2(n-1)}&\cdots&\omega^{-(n-1)^2} \end{array} \right] \cdot \left[ \begin{array}{c} A_0\\A_1\\ \vdots\\ A_{n-1} \end{array} \right].

The inverse discrete Fourier transform of $\mathbf A=(A_0,\cdots,A_{n-1})^\top$ is $DFT_n^{-1}(\mathbf A)=\frac{1}{n}F(\omega^{-1})\cdot \mathbf A$ .

DFT_n^{-1}(\mathbf a)=F(\omega)\cdot\left[ \begin{array}{c} a_0\\a_1\\ \vdots\\ a_{n-1} \end{array} \right] =\left[ \begin{array}{c} A_0\\A_1\\ \vdots\\ A_{n-1} \end{array} \right].

🤔Lemma of reverse multiplication. $F(\omega^{-1})\cdot F(\omega)=F(\omega)\cdot F(\omega^{-1})=nI_n$ where $I_n$ is the identity matrix.

Proof by Lemma of cancellation property.

Connection to polynomial evaluation and interpolation. Let $\mathbf a$ be the coefficient vector of the polynomial $P(X)=\sum_{i=1}^{n-1}a_iX^i$ . Then computing $DFT(\mathbf a)$ amounts to evaluating the polynomial $P(X)$ at all the $n$ -th roots of unity, at

X=1,\omega,\omega^2,\cdots,\omega^{n-1}.

Similarly, computing $DFT^{-1}(\mathbf A)$ amounts to recovering the polynomial $P(X)$ from its values $(A_0,\cdots,A_{n-1})$ at the same $n$ points.

The Fast Fourier Transform

Naive algorithm for $DFT$ and $DFT^{-1}$ : $O(n^2)$ .

Fast Fourier Transform: $O(n\log n)$ .

FFT algorithm: divide-and-conquer.

Let $n$ be a power of $2$ . Computing $DFT(\mathbf a)$ amounts to computing the $n$ values

P(1),P(\omega),\cdots,P(\omega^{n-1}).

Actually, $P(X)$ has even part and odd part.

P(X)=P_e(X^2)+X\cdot P_o(X^2).

Where $P_e$ and $P_o$ are polynomials of degrees at most $n/2$ and $(n-1)/2$ respectively.

Pseudo code of FFT

Input: A polynomial $P(X)$ with coefficients given by an $n$ -vector $\mathbf a$ , $\omega$ , a primitive $n$ -th root of unity.

Output: $DFT_n(\mathbf a)$

Evaluate $P_e(X^2)$ and $P_o(X^2)$ at $X^2=1,\omega^2,\omega^4,\cdots,\omega^{2n-2}$ .
Multiply $P_o(\omega^{2j})$ by $\omega^j$ for $j=0,\cdots,n-1$ .
Add $P_e(\omega^{2j})$ to $\omega^jP_o(\omega^{2j})$ for $j=0,\cdots,n-1$ .

Analysis of FFT:

In step 1, since $\omega^n=1,\omega^{n+2}=\omega^2,\cdots,\omega^{2n-2}=\omega^{n-2}$ . So it suffices to evaluate $P_e$ and $P_o$ at only $n/2$ values.
i.e., at all the $(n/2)$ -th roots of unity. This is equivalent to the problem of computing $DFT_{n/2}(P_e)$ and $DFT_{n/2}(P_o)$ .
Step 2 and 3 take $n$ multiplications and $n$ additions respectively.

Thus, $T(n)=2T(n/2)+2n\Rarr T(n)=2n\log n$ for $n$ is a power of $2$ .

🎯Theorem of FFT’s complexity. Assuming the availability of a primitive $n$ -th root of unity, the discrete Fourier transform $DFT_n$ and its inverse can be computed in $O(n\log n)$ complex arithmetic operations.

Polynomial Multiplication

Convolution and polynomial multiplication

Assume $n\ge2$ . The convolution of two $n$ -vectors $\mathbf a,\mathbf b$ is the $n$ -vector

\mathbf c=\mathbf a*\mathbf b=(c_0,\cdots,c_{n-1})^\top

where $c_i=\sum_{j=0}^ia_jb_{i-j}$ .

Let $P(X),Q(X)$ be polynomials of degrees less than $n/2$ . Then $R(X)=P(X)Q(X)$ is a polynomial of degree less than $n-1$ .

Let $\mathbf a$ and $\mathbf b$ denote the coefficient vectors of $P$ and $Q$ . Then it is not hard to see $\mathbf a*\mathbf b$ gives the coefficient vector of $R(X)$ .

Thus convolution is essentially polynomial multiplication.

🎯Theorem of convolution. Let $\mathbf a,\mathbf b$ be $n$ -vectors whose initial $\lfloor n/2\rfloor$ entries are zeros. Then

DFT^{-1}(DFT(\mathbf a)*DFT(\mathbf b))=\mathbf a*\mathbf b.

This reduces the problem of convolution to two $DFT$ and $DFT^{-1}$ computations.

🎯Theorem of polynomial computation’s complexity. Assuming the availability of a primitive $n$ -th root of unity, we can compute the product $PQ$ of two polynomials $P,Q\in\C[X]$ of degree less than $n$ in $O(n\log n)$ complex operations.

Modular FFT

Define $\Z_M$ where $M=2^L+1$ .

An element $x\in\Z_M$ is a zero-divisor if there exist $y\neq0$ s.t. $x\cdot y=0$ .

The inverse element of $x$ is $y$ s.t. $xy=1$ .

🤔An element $x\in\Z_M$ has a multiplicative inverse $x^{-1}$ iff $x$ is not a zero-divisor.

Representation and basic operations modulo M

Let $2^L\equiv-1\mod M$ be denoted with special symbol $\bar 1$ . Each element of $\Z_M\diagdown\{\bar 1\}$ as binary string $(b_{L-1},\cdots,b_0)$ of length $L$ ; the element $\bar1$ is given a special representation.

For example, $M=17,L=4$ . Then $13$ is represented by $(1,1,0,1)$ .

It’s easy to add and subtract in $\Z_M$ under this representation in $O(L)$ time.

Multiplying $X$ by $2^j$ amounts to left-shifting the string $X$ by $j$ positions.

If $(b_{L-1},\cdots,b_0)$ is multiplied by $2^j(0\lt j\lt L)$ , the results is given as

(b_{L-j-1},\cdots,b_0,0,\cdots,0)-(0,\cdots,0,b_{L-1},\cdots,b_{L-j})

Primitive roots of unity modulo M

Let $K=2^k$ and $K$ divides $L$ . We define $\omega=2^{L/K}$ .

🤔Lemma: In $\Z_M$ , $\omega$ is a primitive $(2K)$ -th root of unity.

🤔The cancellation property holds:

\sum_{j=0}^{2K-1}\omega^{js}\equiv \begin{cases} 0\mod M\text{ if }s\not\equiv0\mod 2K\\ 2K\mod M\text{ if }s\equiv0\mod 2k \end{cases}

🎯Theorem. The transforms $DFT_{2K}(\mathbf a)$ and $DFT^{-1}_{2K}(\mathbf A)$ for $(2K)$ -vectors $\mathbf a,\mathbf A\in(\Z_M)^{2K}$ can be computed using the FFT method, taking $O(KL\log K)$ bit operations.

$T(2K)=2T(K)+O(KL)$

Fast Integer Multiplication

🎯Theorem of integer multiplication’s complexity. Given two integers $u,v$ of sizes at most $n$ bits, we can form their product $uv$ in $O(n\log n\log\log n)$ bit operations.

Here we prove a weaker bound $n\log^{2.6}n$ .

A simplified Schönhage-Strassen algorithm

Goal: Product $W=UV$ . Assume that $U,V$ are $N$ -bit binary numbers where $N=2^n$ .

Choose $K=2^k,L=3\cdot2^\ell$ where

k=\lfloor\frac{n}{2}\rfloor,\ell=\lceil n-k\rceil.

$k,\ell$ are integers, $n$ is not necessarily be integer.

$k+\ell\ge n$ , we may view $U$ as $2^{k+\ell}$ -bit numbers, padding with zeros. Break up $U$ into $K$ pieces, each of bit size $2^\ell$ .

Similar padding to $V$ , we have $(2K)$ -vector:

\bar U=(0,\cdots,0,U_{K-1},\cdots,U_0)\\ \bar V=(0,\cdots,0,V_{K-1},\cdots,V_0)

where each component has $2^\ell$ bits.

We can regard $\bar U,\bar V$ as coefficient vectors of the polynomials $P(X)=\sum_{j=0}^{K-1}U_jX^j$ and $Q(X)=\sum_{j=0}^{K-1}V_jX_j$ . Let

\bar W=(W_{2K-1},\cdots,W_0)

be the convolution of $\bar U,\bar V$ . Each $W_i$ in $\bar W$ satisfies the inequality

0\le W_i\le K\cdot2^{2\cdot2^\ell}

since it is the sum of at most $K$ products of the form $U_jV_{i-j}$ . Hence

0\le W_i\lt 2^{3\cdot2^\ell}\le M

where $M=2^L+1$ as usual.

$\bar W$ is the coefficient vector of the product $R(X)=P(X)Q(X)$ . Since $P(2^{2^\ell})=U$ and $Q(2^{2^\ell})=V$ , it follows that $R(2^{2^\ell})=UV=W$ . Hence

W=\sum_{j=0}^{2K-1}2^{2^\ell j}W_j.

We can easily obtain each summation in this sum from $\bar W$ by multiplying each $W_j$ with $2^{2^\ell j}$ . As each $W_j$ has $k+2\cdot2^\ell\lt L$ non-zero bits.

Each bit of $W$ is obtained by summing at most $3$ bits plus at most $2$ carry bits.

🤔Lemma: The product $W$ can be obtained from $\bar W$ in $O(N)$ bit operations.

$\bar W=DFT^{-1}(DFT(\bar U)\cdot DFT(\bar V))$ .
These transforms take $O(KL\log K)=O(n\log n)$ bit operations.
The inner product requires $2K$ multiplications of $L$ -bit numbers, which is recursively accomplished. Let $T(N)$ denote the bit complexity of algorithm:
$T(N)=O(N\log N)+2K\cdot T(L).$
Write $t(n)=T(N)/N$ where $N=2^n$ . The recurrence becomes
$t(n)=O(n)+2\frac{K}{N}T(L)=O(n)+2\frac{3}{L}T(L)=O(n)+6t(n/2+c),$
for some constant $c$ . Define $s(n)=t(n+2c)$ . Then
$s(n)=O(n+2c)+6t((n/2)+2c)=O(n)+6s(n/2).$
This resolves to $s(n)=O(n^{\lg 6})$ . Back-substituting, we obtain
$T(N)=O(N\log^\alpha N),\alpha=\lg6\lt2.5848$

The choice of $L=3\cdot2^\ell$ is sub-optimal.

The method implies

T(N)=O(N\log^{2+\epsilon}N)

for any $\epsilon\gt0$ .

Improvement: compute each $W_i$ in two parts:

Let $M'=2^{2\cdot2^\ell}+1$ and $M''=K$ .
Since $M',M''$ are relatively prime and $W_i\lt M' M''$ , it follows that if we have computed $W_i'=W_i\mod M'$ and $W_i''=W_i\mod M''$ , then we can recover $W_i$ using the Chinese remainder theorem.
Computing all the $W_i''$ 's and the reconstruction of $W_i$ from $W_i',W_i''$ can be accomplished in linear time.

Thus the new recurrence is

t(n)=n+4t(n/2),

which has the solution $t(n)=O(n^2)$ or $T(N)=O(N\log^2N)$ .

In order to get ultimate result, we need improve recurrence $t(n)=n+2t(n/2)$ by using a variant convolution called negative wrapped convolution and $DFT_K$ instead of $DFT_{2K}$ .