This course is finally over. It was honestly pretty painful 😭

Lecture notes by Rothvoss, University of Washington

Lattices

A lattice is the set of all integer combinations of linearly independent vectors:

$$\left\{\sum_{i=1}^k\lambda_i\mathbf b_i \;\middle\vert\; \lambda_1,\cdots,\lambda_k\in\Z\right\}$$

where $\mathbf b_1,\cdots,\mathbf b_k\in\R^n$ are linearly independent vectors.

Another equivalent definition: a lattice is a discrete subgroup of $\R^n$.

If $k=n$, the lattice has full rank. Unless specified otherwise, we consider full-rank lattices.

Let $\mathbf B\in\R^{n\times n}$ be the matrix that has the basis vectors $\mathbf b_1,\cdots,\mathbf b_n$ as columns; then the lattice is

$$\Lambda(\mathbf B)=\left\{\sum_{i=1}^n\lambda_i\mathbf b_i \;\middle\vert\; \lambda_1,\cdots,\lambda_n\in\Z\right\}.$$

The matrix $\mathbf B$ is called a basis of the lattice $\Lambda(\mathbf B)$.

⚠️A lattice has more than one basis.

Unimodular matrices

Unimodular matrices. What on earth are those? Never heard of them before.

📖Definition 1 (unimodular matrices). An $n\times n$ matrix $\mathbf U$ is called unimodular if $\mathbf U\in\Z^{n\times n}$ and $\det(\mathbf U)\in\{\pm1\}$.

In one sentence: a square integer matrix whose determinant is $\pm1$.

🤔Lemma 1. If $\mathbf U$ is unimodular, then $\mathbf U^{-1}$ is also unimodular.

🤔Lemma 2. Let $\mathbf B_1,\mathbf B_2\in\R^{n\times n}$ be non-singular. Then $\Lambda(\mathbf B_1)=\Lambda(\mathbf B_2)$ iff there is a unimodular matrix $\mathbf U$ with $\mathbf B_2=\mathbf B_1\mathbf U$.
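
As a quick sanity check of Lemma 2, here is a minimal Python/numpy sketch (the matrices `B1`, `U` and the function names are mine, made-up examples) that tests whether two bases generate the same lattice by checking whether $\mathbf B_1^{-1}\mathbf B_2$ is unimodular:

```python
import numpy as np

def is_unimodular(U, tol=1e-9):
    """Integer entries and determinant +-1 (Definition 1), up to numerical tolerance."""
    return np.allclose(U, np.round(U), atol=tol) and np.isclose(abs(np.linalg.det(U)), 1.0)

def same_lattice(B1, B2):
    """Lemma 2: Lambda(B1) = Lambda(B2) iff B1^{-1} B2 is unimodular."""
    return is_unimodular(np.linalg.solve(B1, B2))   # solve gives U = B1^{-1} B2

B1 = np.array([[2.0, 1.0],
               [0.0, 3.0]])
U = np.array([[1.0, 4.0],
              [0.0, 1.0]])                 # integer matrix with det = 1
B2 = B1 @ U                                # same lattice, different basis
print(same_lattice(B1, B2))                # True
print(same_lattice(B1, np.eye(2)))         # False: Z^2 is a different lattice
```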

The fundamental parallelepiped

In Chinese: 基平行六面体.

The fundamental parallelepiped of the lattice $\Lambda(\mathbf B)$ is the polytope

$$\mathcal P(\mathbf B)=\left\{\sum_{i=1}^n\lambda_i\mathbf b_i \;\middle\vert\; 0\le\lambda_i\lt1\ \forall i\in[n]\right\}.$$

Simply put, it is the region covered by all linear combinations of the basis vectors with coefficients between 0 and 1. In two dimensions it is a parallelogram; in three dimensions it is a parallelepiped.

For every vector $\mathbf x\in\R^n$, there is a unique coefficient vector $\lambda\in\R^n$ so that $\mathbf x=\sum_{i=1}^n\lambda_i\mathbf b_i$. (High-school math.)

So $\mathbf x$ can be written as

$$\mathbf x=\sum_{i=1}^n\lfloor\lambda_i\rfloor\mathbf b_i+\sum_{i=1}^n(\lambda_i-\lfloor\lambda_i\rfloor)\mathbf b_i.$$

The first sum is an element of $\Lambda(\mathbf B)$, and the second sum is an element of $\mathcal P(\mathbf B)$.
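
Here is a minimal numpy sketch of this decomposition (the basis and the point are arbitrary examples): compute $\lambda=\mathbf B^{-1}\mathbf x$, split it into integer and fractional parts, and recover a lattice point plus a remainder inside $\mathcal P(\mathbf B)$.

```python
import numpy as np

def decompose(B, x):
    """Split x into a lattice point of Lambda(B) plus a remainder in P(B)."""
    lam = np.linalg.solve(B, x)                # unique coefficients with x = B @ lam
    lattice_part = B @ np.floor(lam)           # the first sum: an element of Lambda(B)
    remainder = B @ (lam - np.floor(lam))      # the second sum: coefficients in [0,1)
    return lattice_part, remainder

B = np.array([[2.0, 1.0],                      # columns are the basis vectors
              [0.0, 3.0]])
x = np.array([5.3, -2.7])
v, r = decompose(B, x)
print(v, r, np.allclose(v + r, x))             # the two parts sum back to x
```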

Thus the translates of the parallelepiped placed at the lattice points exactly partition $\R^n$. This is called a tiling of $\R^n$.

In Chinese: 平铺.

The fundamental parallelepiped can be rewritten as

$$\mathcal P(\mathbf B)=\{\mathbf B\mathbf x : \mathbf x\in[0,1)^n\},$$

i.e. it is the image of the half-open hypercube $[0,1)^n$ under the linear map given by $\mathbf B$. By the transformation formula,

$$vol(\mathcal P(\mathbf B))=vol([0,1)^n)\cdot|\det(\mathbf B)|=|\det(\mathbf B)|.$$

The fundamental parallelepiped itself depends on the choice of basis, but its volume does not.

🤔Lemma 3. Let $\mathbf B\in\R^{n\times n}$ and let $\Lambda=\Lambda(\mathbf B)$ be the generated lattice. Then the determinant of the lattice $\det(\Lambda)=|\det(\mathbf B)|$ is independent of the chosen basis. Moreover, $\det(\Lambda)=vol(\mathcal P(\mathbf B))$.

Minkowski’s Theorem

In Chinese: 闵可夫斯基格点定理 (Minkowski's lattice point theorem).

A set $K$ is convex if for all $x,y\in K$ and $0\le\lambda\le1$, $\lambda x+(1-\lambda)y\in K$.

A set $K$ is centrally symmetric if $x\in K$ iff $-x\in K$.

For a convex symmetric set $K$, we define the Minkowski norm:

$$\|x\|_K=\min\{\lambda\ge0 : x\in\lambda K\},\qquad \lambda K=\{\lambda x\mid x\in K\}.$$

That is, $\|x\|_K$ gives the scaling factor that one needs until the scaled copy of $K$ includes $x$.

Define $y+K=\{x+y\mid x\in K\}$ as the translate of $K$ by $y$.
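
For intuition, here is a small Python sketch of the Minkowski norm, assuming $K$ is given by a membership oracle (bisection on the scaling factor; the box-shaped $K$ below is just an illustrative choice, for which $\|x\|_K$ coincides with $\|x\|_\infty$):

```python
import numpy as np

def minkowski_norm(in_K, x, hi=1e6, iters=60):
    """||x||_K = min{lambda >= 0 : x in lambda*K}, found by bisection.
    Assumes K is convex, symmetric, bounded, contains 0 in its interior,
    and that the answer is below the initial upper bound `hi`."""
    if np.allclose(x, 0):
        return 0.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_K(x / mid):          # x in mid*K  <=>  x/mid in K
            hi = mid
        else:
            lo = mid
    return hi

# Illustrative K: the box [-1,1]^n, for which ||x||_K is just the infinity norm.
in_box = lambda y: np.max(np.abs(y)) <= 1.0
x = np.array([3.0, -0.5])
print(minkowski_norm(in_box, x), np.max(np.abs(x)))   # both approximately 3.0
```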

Minkowski’s theorem says that every large enough symmetric convex set must contain a non-zero lattice point.

🎯Theorem 4 (Minkowski). Let $K\subseteq\R^n$ be a bounded symmetric convex set with $vol(K)\gt2^n$. Then $K\cap(\Z^n\setminus\{\mathbf 0\})\neq\emptyset$.

🎯Theorem 5 (Blichfeldt). Let $S\subseteq\R^n$ be a measurable set with $vol(S)\gt1$. Then there are two distinct points $\mathbf s_1,\mathbf s_2\in S$ with $\mathbf s_1-\mathbf s_2\in\Z^n$.

Minkowski’s Theorem for General Lattices

🎯Theorem 6 (Minkowski’s first theorem). Let $\Lambda$ be a lattice and $K$ be a symmetric convex set with $vol(K)\gt2^n\det(\Lambda)$. Then $K\cap(\Lambda\setminus\{\mathbf 0\})\neq\emptyset$.

Shortest Vector

The length of a shortest non-zero lattice vector w.r.t. the $\ell_2$-norm is

$$SVP(\Lambda)=\min\{\|\mathbf x\|_2 \mid \mathbf x\in\Lambda\setminus\{\mathbf 0\}\}.$$

🤔In fact, finding the shortest vector is an $NP$-hard problem.

We can get some estimates on it.

🎯Theorem 7. For any lattice $\Lambda\subseteq\R^n$, one has $SVP(\Lambda)\le\sqrt n\cdot\det(\Lambda)^{1/n}$.

🎯Theorem 8 (Minkowski’s second theorem). For any full-rank lattice $\Lambda\subseteq\R^n$, one has

$$\left(\prod_{i=1}^n\lambda_i(\Lambda)\right)^{1/n}\le\sqrt n\cdot\det(\Lambda)^{1/n},$$

where $\lambda_i(\Lambda)$ is the $i$-th successive minimum, i.e. the smallest $r\ge0$ such that $\Lambda$ contains $i$ linearly independent vectors of length at most $r$; in particular $\lambda_1(\Lambda)=SVP(\Lambda)$.

Dirichlet’s Theorem

🎯Theorem 9. Let $\alpha\in[0,1]^n$ be a vector of real numbers and let $Q\in\Z_{\ge1}$. Then there are numbers $p_1,\cdots,p_n\in\Z_{\ge0}$ and $q\in\{1,\cdots,Q\}$ so that

$$\max_{i=1,\cdots,n}\left\vert\frac{p_i}{q}-\alpha_i\right\vert\le\frac{1}{Q^{1/n}\,q}.$$
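
The statement can be checked numerically by brute force for small $Q$: for each candidate denominator $q\le Q$, rounding $q\alpha_i$ gives the best numerators. The Python sketch below (names and the example $\alpha$ are mine) just searches for a $q$ that meets the bound; this only illustrates the statement, it is not the lattice-based proof.

```python
import numpy as np

def dirichlet_approx(alpha, Q):
    """Find q in {1,...,Q} and integers p_i >= 0 with
    max_i |p_i/q - alpha_i| <= 1/(Q^(1/n) * q); Theorem 9 guarantees one exists."""
    alpha = np.asarray(alpha, dtype=float)
    n = len(alpha)
    for q in range(1, Q + 1):
        p = np.round(q * alpha)                  # best numerators for this q
        err = np.max(np.abs(p / q - alpha))
        if err <= 1 / (Q ** (1 / n) * q):
            return p.astype(int), q, err
    return None                                  # unreachable if the theorem holds

alpha = [0.3183, 0.7071]                         # arbitrary example in [0,1]^2
p, q, err = dirichlet_approx(alpha, Q=50)
print(p, q, err)                                 # here p/q = [1/3, 2/3]
```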

Gram Schmidt Orthogonalisation

The goal is now to find an approximate shortest vector.

Given a lattice $\Lambda(\mathbf B)$, we want to find in polynomial time a non-zero lattice vector $\mathbf x$ of length $\|\mathbf x\|_2\le\alpha\cdot SVP(\Lambda(\mathbf B))$. Here $\alpha=\alpha(n)\ge1$ is called the approximation factor (and should be as small as possible).

The Gram Schmidt orthogonalisation takes independent vectors $\mathbf b_1,\cdots,\mathbf b_n\in\R^n$ as input and computes an orthogonal basis $\mathbf b^*_1,\cdots,\mathbf b^*_n$ so that $\text{span}(\mathbf b_1,\cdots,\mathbf b_k)=\text{span}(\mathbf b^*_1,\cdots,\mathbf b^*_k)$ for all $k=1,\cdots,n$.

Pseudo-code

Input: $\mathbf b_1,\cdots,\mathbf b_n\in\R^n$

Output: Orthogonal basis $\mathbf b^*_1,\cdots,\mathbf b^*_n$

  1. $\mathbf b_1^*\gets\mathbf b_1$
  2. $\mathbf b_2^*\gets\mathbf b_2-\mu_{1,2}\mathbf b_1^*$ with $\mu_{1,2}\gets\frac{\langle\mathbf b_2,\mathbf b_1^*\rangle}{\|\mathbf b_1^*\|_2^2}$
  3. For $j=3,\cdots,n$: $\mathbf b_j^*\gets\mathbf b_j-\sum_{i\lt j}\mu_{i,j}\mathbf b_i^*$ with $\mu_{i,j}\gets\frac{\langle\mathbf b_j,\mathbf b_i^*\rangle}{\|\mathbf b_i^*\|_2^2}$
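
A direct Python/numpy translation of this pseudo-code (the function name and the example basis are mine), returning both the $\mathbf b_i^*$ and the coefficients $\mu_{i,j}$; the last line checks the determinant identity discussed below.

```python
import numpy as np

def gram_schmidt(B):
    """Gram Schmidt orthogonalisation of the columns of B.
    Returns (Bstar, mu) with Bstar[:, j] = b_j^* and mu[i, j] = <b_j, b_i^*> / ||b_i^*||_2^2."""
    n = B.shape[1]
    Bstar = B.astype(float).copy()
    mu = np.zeros((n, n))
    for j in range(n):
        for i in range(j):
            mu[i, j] = (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i])
            Bstar[:, j] -= mu[i, j] * Bstar[:, i]
    return Bstar, mu

# Arbitrary example basis (columns are b_1, b_2).
B = np.array([[3.0, 1.0],
              [1.0, 2.0]])
Bstar, mu = gram_schmidt(B)
print(Bstar[:, 0] @ Bstar[:, 1])   # ~0: the b_i^* are orthogonal
# |det(B)| equals the product of the ||b_i^*||_2 (see the identity below).
print(np.isclose(abs(np.linalg.det(B)), np.prod(np.linalg.norm(Bstar, axis=0))))
```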

Gram Schmidt orthogonalisation, basically revision of the linear algebra from the first semester of freshman year.

By Cavalieri’s principle (aka 祖暅原理), replacing each $\mathbf b_j$ by $\mathbf b_j^*$ only shifts slices parallel to the span of the earlier vectors and does not change the volume of the fundamental parallelepiped. Hence,

$$\det(\Lambda(\mathbf B))=vol(\mathcal P(\mathbf B))=\prod_{i=1}^n\|\mathbf b_i^*\|_2.$$

🎯Theorem 10. Let $\mathbf B$ be a basis and $\mathbf B^*$ be its Gram Schmidt orthogonalisation. Then $SVP(\Lambda(\mathbf B))\ge\min_{i=1,\cdots,n}\|\mathbf b_i^*\|_2$.
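
A quick numerical illustration of Theorem 10 on a small made-up basis: enumerate lattice vectors with small coefficients (which is enough to find the true shortest vector in this tiny example) and compare against $\min_i\|\mathbf b_i^*\|_2$.

```python
import numpy as np
from itertools import product

B = np.array([[3.0, 1.0],         # arbitrary small example basis (columns)
              [1.0, 2.0]])

# Gram Schmidt norms, computed as in the pseudo-code above.
Bstar = B.copy()
for j in range(B.shape[1]):
    for i in range(j):
        Bstar[:, j] -= (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i]) * Bstar[:, i]
gs_min = np.linalg.norm(Bstar, axis=0).min()

# Brute-force SVP over coefficients in {-5,...,5}^2 (enough for this tiny lattice).
svp = min(np.linalg.norm(B @ np.array(c))
          for c in product(range(-5, 6), repeat=2) if any(c))

print(svp, ">=", gs_min, svp >= gs_min)          # Theorem 10 in action
```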

The first orthogonalised vector always equals the first basis vector. The order of the basis vectors therefore matters a lot, which foreshadows the LLL algorithm.

The Lenstra-Lenstra-Lovász Algorithm

🎯Theorem 11. Given a regular matrix $\mathbf B\in\Q^{n\times n}$ one can compute a vector $\mathbf x\in\Lambda(\mathbf B)\setminus\{\mathbf 0\}$ of length $\|\mathbf x\|_2\le2^{n/2}\cdot SVP(\Lambda(\mathbf B))$ in polynomial time.

The running time is actually $O(n^6\log^3(n\|\mathbf B\|_\infty))$.

Consider a basis $\mathbf b_1,\cdots,\mathbf b_n$. The algorithm relies on two operations:

  • Subtracting vectors from each other. For $n=2$, we can always subtract a multiple of $\mathbf b_1$ from $\mathbf b_2$ so that $|\mu_{1,2}|\le1/2$. In higher dimensions, we can achieve $|\mu_{i,j}|\le1/2$ for every $i\lt j$.
  • Switching the order. It might be that $\mathbf b_1$ is much longer than $\mathbf b_2$. In this case, we may swap the order of $\mathbf b_1$ and $\mathbf b_2$. In higher dimensions, we swap $\mathbf b_i$ and $\mathbf b_{i+1}$ if $\|\mathbf b_i\|_2\gg\|\mathbf b_{i+1}\|_2$.

Coefficient Reduction

A basis is coefficient-reduced if $\vert\mu_{i,j}\vert\le1/2$ for all $i\lt j$.

🤔Lemma 12. Given any basis $\mathbf B$ one can compute a coefficient-reduced basis $\tilde{\mathbf B}$ in polynomial time so that $\Lambda(\tilde{\mathbf B})=\Lambda(\mathbf B)$ and the Gram Schmidt orthogonalisations are identical.
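
A sketch of the coefficient reduction behind Lemma 12 in Python, following the standard size-reduction recipe (subtract rounded multiples of earlier basis vectors, processing the indices from high to low); variable names are mine and the example basis is arbitrary.

```python
import numpy as np

def coefficient_reduce(B):
    """Make all Gram Schmidt coefficients satisfy |mu_{i,j}| <= 1/2 (Lemma 12).
    Only integer multiples of basis vectors are subtracted, so the lattice and
    the Gram Schmidt vectors b_i^* stay the same."""
    B = B.astype(float).copy()
    n = B.shape[1]
    # Gram Schmidt vectors of the original basis (unchanged by the reduction).
    Bstar = B.copy()
    for j in range(n):
        for i in range(j):
            Bstar[:, j] -= (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i]) * Bstar[:, i]
    # Process i = j-1 down to 0: fixing mu_{i,j} does not disturb coefficients already fixed for larger i.
    for j in range(n):
        for i in range(j - 1, -1, -1):
            mu = (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i])
            B[:, j] -= round(mu) * B[:, i]       # integer multiple: lattice unchanged
    return B

# Arbitrary example: the second column gets much smaller coefficients.
B = np.array([[3.0, 11.0],
              [1.0,  5.0]])
print(coefficient_reduce(B))
```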

Main Procedure

📖Definition of LLL-reduced: Let $\mathbf B\in\R^{n\times n}$ be a lattice basis and let $\mu_{i,j}$ be the coefficients from the Gram Schmidt orthogonalisation. The basis is called LLL-reduced if the following is satisfied:

  • Coefficient reduced: $\vert\mu_{i,j}\vert\le1/2$ for all $i\lt j$.
  • Lovász condition: $\|\mathbf b_i^*\|_2^2\le2\|\mathbf b_{i+1}^*\|_2^2$ for $i=1,\cdots,n-1$.

🤔Lemma 13. Let $\mathbf B$ be an LLL-reduced basis. Then $\|\mathbf b_1\|_2\le2^{n/2}\cdot SVP(\Lambda(\mathbf B))$.

By the Lovász condition, $\|\mathbf b_1\|_2^2=\|\mathbf b_1^*\|_2^2\le2\|\mathbf b_2^*\|_2^2\le\cdots\le2^{i-1}\|\mathbf b_i^*\|_2^2$ for every $i$.

Hence $\|\mathbf b_i^*\|_2^2\ge2^{-(i-1)}\|\mathbf b_1\|_2^2\ge2^{-n}\|\mathbf b_1\|_2^2$ for every $i$, and by Theorem 10, $SVP(\Lambda(\mathbf B))^2\ge\min_i\|\mathbf b_i^*\|_2^2\ge2^{-n}\cdot\|\mathbf b_1\|_2^2$.

Taking square roots then gives the claim.

Pseudo-code

Input: A lattice basis $\mathbf B\in\R^{n\times n}$

Output: An LLL-reduced basis $\tilde{\mathbf B}$

  1. Compute a Gram Schmidt orthogonalisation $\mathbf b_1^*,\cdots,\mathbf b_n^*$ with coefficients $\mu_{i,j}$, and update it whenever we change the order of the basis.
  2. While $\mathbf B$ is not LLL-reduced, do
    • Apply coefficient reduction so that $\vert\mu_{i,j}\vert\le1/2$.
    • If there is an index $i$ with $\|\mathbf b_i^*\|_2^2\gt2\|\mathbf b_{i+1}^*\|_2^2$, then swap $\mathbf b_i$ and $\mathbf b_{i+1}$ in the ordering.
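
Putting the pieces together, here is a compact, self-contained Python sketch of this procedure (names are mine; it recomputes the Gram Schmidt data from scratch in each iteration, which is less efficient than the incremental updates in the pseudo-code but easier to read; the example basis is arbitrary):

```python
import numpy as np

def gram_schmidt(B):
    """Classical Gram Schmidt of the columns of B; returns (Bstar, mu)."""
    n = B.shape[1]
    Bstar = B.astype(float).copy()
    mu = np.zeros((n, n))
    for j in range(n):
        for i in range(j):
            mu[i, j] = (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i])
            Bstar[:, j] -= mu[i, j] * Bstar[:, i]
    return Bstar, mu

def lll(B):
    """LLL reduction with the Lovasz condition ||b_i^*||^2 <= 2 ||b_{i+1}^*||^2."""
    B = B.astype(float).copy()
    n = B.shape[1]
    while True:
        # Coefficient reduction: make every |mu_{i,j}| <= 1/2 (Lemma 12).
        Bstar, _ = gram_schmidt(B)
        for j in range(n):
            for i in range(j - 1, -1, -1):
                m = (B[:, j] @ Bstar[:, i]) / (Bstar[:, i] @ Bstar[:, i])
                B[:, j] -= round(m) * B[:, i]
        # Swap step: look for an index violating the Lovasz condition.
        Bstar, _ = gram_schmidt(B)
        norms2 = np.sum(Bstar ** 2, axis=0)
        swapped = False
        for i in range(n - 1):
            if norms2[i] > 2 * norms2[i + 1]:
                B[:, [i, i + 1]] = B[:, [i + 1, i]]
                swapped = True
                break
        if not swapped:
            return B   # LLL reduced; B[:, 0] is a 2^{n/2}-approximate shortest vector

# Arbitrary example: the first column of the reduced basis is a short lattice vector.
B = np.array([[47.0, 95.0],
              [ 1.0,  3.0]])
print(lll(B)[:, 0])
```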

🤔Lemma 14. Suppose we have vectors $\mathbf a_1,\mathbf a_2$ with Gram Schmidt orthogonalisation $\mathbf a_1^*,\mathbf a_2^*$ so that $\|\mathbf a_1^*\|_2^2\ge2\|\mathbf a_2^*\|_2^2$, and let the coefficient $\mu=\mu_{1,2}$ satisfy $|\mu|\le1/2$. Let $\mathbf a_2^{**},\mathbf a_1^{**}$ be the Gram Schmidt orthogonalisation for the reverse order $\mathbf a_2,\mathbf a_1$. Then $\|\mathbf a_2^{**}\|_2\le\sqrt{3/4}\cdot\|\mathbf a_1^*\|_2$.
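
The computation behind the lemma is short: after the swap the new first Gram Schmidt vector is $\mathbf a_2^{**}=\mathbf a_2=\mathbf a_2^*+\mu\,\mathbf a_1^*$, and since $\mathbf a_1^*\perp\mathbf a_2^*$,

$$\|\mathbf a_2^{**}\|_2^2=\|\mathbf a_2^*\|_2^2+\mu^2\|\mathbf a_1^*\|_2^2\le\frac12\|\mathbf a_1^*\|_2^2+\frac14\|\mathbf a_1^*\|_2^2=\frac34\|\mathbf a_1^*\|_2^2.$$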