Lattice basis reduction is a concept which at a glance has no right to have as many applications as it does. It seems like a very esoteric concept and is often explained quite vaguely, and at first I had a very tough time wrapping my head around what it actually means. In this article I will try to explain lattice basis reduction, show how it can be used, and hopefully give you a better intuition for what “reducing a lattice basis” actually means. I will assume basic linear algebra knowledge.
What is a lattice?
A lattice is a set of points in a vector space which are all integer linear combinations of a set of basis vectors. Let’s unpack that.
A basis of a vector space is a set of linearly independent vectors which spans the whole space. This should be familiar from basic linear algebra. It means that you can reach any point in the vector space via a linear combination of the basis vectors.
If we for example have two vectors $\vec{u}$ and $\vec{v}$ in $\mathbb{R}^2$ (the 2D plane) which constitute a basis then any other point $\vec{p}$ can be expressed as $\vec{p} = a\vec{u} + b\vec{v}$ where $a, b \in \mathbb{R}$.
A lattice is the set of points created by integer linear combinations of a set of basis vectors. This means that $a$ and $b$ are instead in $\mathbb{Z}$, so instead of spanning the whole plane we span a discrete set of points, resulting in a grid-like structure.
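To make this concrete, here is a minimal Sage sketch (the basis vectors are just ones I picked for illustration) that enumerates a small patch of such a lattice:
sage: u = vector([2, 1])  # first basis vector
sage: v = vector([1, 3])  # second basis vector
sage: points = [a*u + b*v for a in range(-3, 4) for b in range(-3, 4)]
sage: # plotting these, e.g. with point2d(points), shows the grid-like structure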
Reduction
Now, remember that a lattice is defined only by which points it contains; it does not care which basis vectors generate it.
Reducing a lattice basis means finding another basis for the same lattice which is smaller than the original basis. Smaller here refers to the Euclidean norm, i.e. the length of the basis vectors. The smaller basis will also be more orthogonal, meaning the basis vectors will be at a greater angle to one another, in the best case at a $90^\circ$ angle from each other.
Reducing a lattice basis can be used to try to find the shortest vector in a lattice. There exist algorithms like LLL which will find a small basis in polynomial time, but the result is only approximate: there is no guarantee that it contains the shortest vector of the lattice, because finding that for an arbitrary lattice is in fact known to be NP-hard (under randomized reductions). Nevertheless, these approximate reductions have been shown to still be of great use.
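As a quick taste of what this looks like in practice, here is a small Sage sketch. The matrix is just one I made up: a short basis deliberately skewed by a unimodular transformation, which LLL then shrinks back down. Comparing Hermite normal forms confirms that both matrices generate the same lattice.
sage: M = matrix([[16, -3], [35, -7]])  # a deliberately skewed basis
sage: R = M.LLL()                       # a reduced basis with much shorter rows
sage: assert M.hermite_form() == R.hermite_form()  # same lattice, different basis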
Below is an interactive visualisation of a lattice generated by 2 basis vectors; the blue and red lines. The pink and green lines are a reduced basis for the same lattice. Try dragging them around and observe how the resulting lattice changes (I’ve not got this to work on mobile, sadly).
Emulating GCD
The greatest common divisor of two numbers $a, b \in \mathbb{Z}$ is the greatest integer which divides both $a$ and $b$, and is denoted $\gcd(a, b)$.
$e = \gcd(a, b)$ is actually equivalent to the following statement: $e$ is the smallest positive integer that can be written as an integer linear combination of $a$ and $b$. (proof)
So the expression $ax + by$ will be at its smallest when it equals $\gcd(a, b)$, if we exclude it being $0$ or negative. We can actually loosen that to just being $\ne 0$ if we consider $e$ and $-e$ to be the same number with the same “size”; we only care about their magnitude. Consider it the 1D Euclidean norm, if you will.
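A quick brute-force sanity check of that claim in Sage (the numbers are arbitrary): over a small range of coefficients, the smallest nonzero magnitude of $ax + by$ is exactly the gcd.
sage: a, b = 12, 42
sage: vals = {abs(a*x + b*y) for x in range(-20, 21) for y in range(-20, 21)}
sage: assert min(vals - {0}) == gcd(a, b)  # smallest nonzero |ax + by| is the gcd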
$$ \begin{bmatrix} a \\ b \\ \end{bmatrix} $$
If you bear with me, we can consider the column vector above to be a basis for a 1D lattice, with $a$ being the first basis vector and $b$ the second. Thus any vector in this lattice will be of the form:
$$ \begin{bmatrix} x & y \end{bmatrix} \cdot \begin{bmatrix} a \\ b \\ \end{bmatrix} = ax + by $$
It should now not be a great stretch to conclude that $\gcd(a, b)$ will be a short vector in this lattice. In fact, it will be the shortest non-zero vector. So by running a basis reduction algorithm on $\begin{bmatrix} a & b \end{bmatrix}^T$ we should hopefully get a basis which contains $\gcd(a, b)$.
Here I compute the LLL reduction of the lattice using Sage, which is essentially Python but with more math and less stability.
sage: a, b = 7*67, 7*37
sage: A = matrix([[a], [b]])
sage: A
[469]
[259]
sage: A.LLL()
[ 0]
[-7]
sage: gcd(a, b)
7
We get $-7$ as a basis vector, which is the exact same as $7$ as far as LLL is concerned, since it only cares about the Euclidean norm.
Here it’s also important to understand why LLL does not just return zero vectors for the whole basis. The zero vector is, after all, part of this lattice, and it’s certainly the smallest vector we can get. But if LLL did that, the result would no longer be a basis for the original lattice.
Looking at the result, we can also gain some insights into the original lattice. The new basis has $0$ as a basis vector, which doesn’t do anything. The only other vector is $7$ which means that all numbers of the form $ax + by$ (when $a=7 \cdot 67$ and $b=7 \cdot 37$) must be a multiple of $7$, since they can be equivalently written as $0z+7w = 7w$.
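We can sanity-check that observation over a small range of coefficients:
sage: a, b = 7*67, 7*37
sage: assert all((a*x + b*y) % 7 == 0 for x in range(-10, 11) for y in range(-10, 11))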
Emulating the Extended Euclidean Algorithm
To arrive at the vector $7$ before, LLL must have gotten it by computing $ax + by$ for some unknown $x$ and $y$, since it’s part of the lattice. Suppose we now want to know which $x$ and which $y$ it used. This is normally the job of the extended Euclidean algorithm, but we can do it using lattice reduction as well, by forcing LLL to include that information in the reduced basis. Consider the following matrix, where each row is a vector in the basis of a lattice:
$$ \begin{bmatrix} a & 1 & 0 \\ b & 0 & 1 \\ \end{bmatrix} $$
A point in this lattice will be of the form:
$$ \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & 1 & 0 \\ b & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} ax + by & x & y \end{bmatrix} $$
So if we find the vector which has the smallest first entry, i.e. $ax+by=\gcd(a,b)$, then the coefficients used to arrive at it will be sitting in the second and third entries of the vector.
We do however have a small problem. If the coefficients $x$ and $y$ are too big, significantly greater than $\gcd(a, b)$, then LLL will prioritize minimizing $x$ and $y$ instead of $ax+by$, so the answer we’re after won’t be in the reduced basis.
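To see the problem concretely, we can run LLL on the unweighted matrix with the numbers from before. Here LLL prefers vectors with small $x$ and $y$ over vectors whose first entry is $\pm 7$, so the gcd does not show up in the first column (a small sketch; try printing the reduced matrix yourself):
sage: A = matrix([[7*67, 1, 0], [7*37, 0, 1]])
sage: assert gcd(7*67, 7*37) not in [abs(r[0]) for r in A.LLL()]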
We can fix this by multiplying the whole first column of the matrix by some large number $S$, like this:
$$ \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} Sa & 1 & 0 \\ Sb & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} S(ax + by) & x & y \end{bmatrix} $$
This means that a small change in $ax+by$ will now have a much greater effect on the size of the whole vector. LLL will thus turn its attention to the first entry again and do its absolute best to minimize it, hopefully leaving us with the minimal value, $\gcd(a, b)$. The coefficients $x$ and $y$ will contribute a negligible amount to the size of the vector, so they will simply hop along for the ride, and we can extract them at the end.
This process of multiplying a column to prioritize/deprioritize it is such a common procedure that it can be nice to put it in a separate weight matrix $W$, a diagonal matrix where each entry represents how much to scale the corresponding column by. After we have done the reduction we can simply multiply the result by $W^{-1}$ to remove the scaling, so that the first entry will be $\gcd(a, b)$ instead of $S\gcd(a, b)$.
$$ LW=\begin{bmatrix} a & 1 & 0 \\ b & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} S & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} Sa & 1 & 0 \\ Sb & 0 & 1 \\ \end{bmatrix} $$
And with that we are ready to compute:
sage: a, b = 7*67, 7*37
sage: A = matrix([
....: [a, 1, 0],
....: [b, 0, 1]
....: ])
sage: A
[469 1 0]
[259 0 1]
sage: W = diagonal_matrix([2**32, 1, 1])
sage: (A*W).LLL() / W
[ 0 37 -67]
[ -7 16 -29]
sage: xgcd(a, b)
(7, -16, 29)
As you can see we got the correct answer, just negated, which is easily accounted for. Note that LLL still cares about the size of $x$ and $y$, so it will not find values which are unnecessarily large (at least, ideally).
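If you want to pull the answer out programmatically, here is a small follow-up sketch, continuing the session above: pick the row with a nonzero first entry, normalize its sign, and check it against Sage’s xgcd.
sage: B = (A*W).LLL() / W
sage: row = next(r for r in B if r[0] != 0)  # the row carrying ±gcd(a, b)
sage: row *= sign(row[0])                    # flip the sign if necessary
sage: g, x, y = row
sage: assert g == gcd(a, b) and a*x + b*y == g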
A slightly harder problem
Now we’re going to look at a problem where lattice reduction can actually have a real advantage over existing algorithms. We are going to solve the following equation:
$$ax + by \equiv c \pmod{n}$$
for some $a, b, c \in \mathbb{Z}_n$, where $\mathbb{Z}_n$ denotes the set of integers modulo $n$. Equivalently, we want to find $x$ and $y$ such that:
$$c - ax - by \equiv 0 \pmod{n}$$
which is also equivalent to
$$c - ax - by + kn = 0$$
for some arbitrary $k \in \mathbb{Z}$ (we swapped $-k$ for $k$; it doesn’t matter). So in essence we have $3$ different unknowns: $x$, $y$ and $k$. I’m now going to throw a lattice on the screen and we’ll walk through what it tries to achieve.
$$ L = \begin{bmatrix} c & 1 & 0 & 0 \\ -a & 0 & 1 & 0 \\ -b & 0 & 0 & 1 \\ n & 0 & 0 & 0 \\ \end{bmatrix} $$
A vector of this lattice will be of the form:
$$ \begin{bmatrix} w & x & y & k \end{bmatrix} L = \begin{bmatrix} cw - ax - by + kn & w & x & y \end{bmatrix} $$
Looking at the first element of the vector, $cw - ax - by + kn$, we can see that solving the above equation is equivalent to finding values for $x$, $y$ and $k$ such that that first element is $0$, assuming $w = 1$: with $w = 1$ it reads $c - ax - by + kn$, and setting that to $0$ is exactly the equation we derived above.
Another important thing to understand here is why adding $n$ as the last row is functionally the same as working modulo $n$. The coefficient that the last row is multiplied by ($k$ in the above example) does not appear in any of the other entries, so the size of $k$ itself will not affect the size of the resulting vector, which is good. This means that LLL can freely keep adding or subtracting $n$ from the first entry until the result is $<\frac{n}{2}$ and $>-\frac{n}{2}$. This balanced remainder is the smallest representative modulo $n$, and working with it is effectively the same as working with the smallest positive remainder modulo $n$, which is what we’re used to.
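As a small plain-Python illustration of what that balanced remainder means (the helper name is my own):
sage: def balanced_mod(r, n):
....:     r %= n                             # first reduce into [0, n)
....:     return r - n if r > n // 2 else r  # then shift into (-n/2, n/2]
sage: assert balanced_mod(95, 100) == -5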
If we were to run LLL on this lattice and get lucky it might be the case that it manages to minimize the first entry all the way down to $0$, and if we’re really lucky then maybe $w$ also equals $1$. If that is the case then we can just pick out $x$ and $y$ from the second and third element of the vector and we have our solution!
As it stands that is sadly very unlikely, but we can once again utilize weighting to encourage LLL to find the kind of vectors we want. Consider $W = \operatorname{diag}(S, S, 1, 1)$:
$$ LW = \begin{bmatrix} Sc & S & 0 & 0 \\ -Sa & 0 & 1 & 0 \\ -Sb & 0 & 0 & 1 \\ Sn & 0 & 0 & 0 \\ \end{bmatrix} $$
where $S$ is, as previously, a sufficiently large number. This will once again prioritize the first column, to ensure that we find a vector where the first entry is as small as possible, ideally $0$. Additionally, we want to discourage LLL from using the first row of the lattice more than once, since that would mean it’s not solving our target equation. We do this by putting a very large weight on that row, as seen in the second column. LLL will be forced to use the row at least once in the new basis, because it is the only vector which has a non-zero entry in the second column; were it not included, the span of the basis would not be able to cover that “subspace”, so to speak.
Still, LLL will be very reluctant to use it more than once, since every additional use would increase the size of the vector dramatically. It is thus very likely that it will use it exactly once, and if that occurs at the same time as the first entry being $0$ then we have found our solution!
Let’s do this in practice:
sage: n = randrange(2**32)
sage: a, b, c = [randrange(n) for _ in range(3)]
sage: L = matrix([
....: [c, 1, 0, 0],
....: [-a, 0, 1, 0],
....: [-b, 0, 0, 1],
....: [n, 0, 0, 0],
....: ])
sage: W = diagonal_matrix([2**64, 2**64, 1, 1])
sage: B = (L*W).LLL() / W
sage: B
[ 0 0 -41310 2689]
[ 0 0 12343 -98097]
[ -1 0 -1816 29012]
[ 0 -1 -20274 -12635]
sage: v = next(v for v in B if v[0]==0 and abs(v[1])==1)
sage: v *= sign(v[1])
sage: v
(0, 1, 20274, 12635)
sage: x, y = v[2:]
sage: (a*x + b*y - c) % n
0
Seems to be working! The line v *= sign(v[1]) is there to normalize the solution, since the target vector may have had $w=-1$ just as well, which was the case here, as can be seen in the basis we got.
The size of $S$ here is arbitrarily chosen. You can choose a much smaller $S$, which will make LLL run faster, but in this case that’s not needed. If $S$ is too small then you may not find a solution, but having it too big doesn’t matter in this setup, so we chose $2^{64}$, which is more than enough.
An important thing to notice is that the solutions we find here will be small solutions. LLL still wants to minimize the vector after all. There might be more solutions which this technique won’t find, but in many cases that doesn’t matter.
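To tie everything together, here is the whole recipe wrapped up as one function (a sketch under this section’s assumptions; the name solve_linear_mod is my own, and it returns None if no suitable vector shows up in the reduced basis):
sage: def solve_linear_mod(a, b, c, n, S=2**64):
....:     L = matrix([[c, 1, 0, 0], [-a, 0, 1, 0], [-b, 0, 0, 1], [n, 0, 0, 0]])
....:     W = diagonal_matrix([S, S, 1, 1])     # prioritize the first two columns
....:     B = (L*W).LLL() / W
....:     for v in B:
....:         if v[0] == 0 and abs(v[1]) == 1:  # cw - ax - by + kn = 0 with w = ±1
....:             v *= sign(v[1])               # normalize to w = 1
....:             return v[2], v[3]             # these are x and y
....:     return None
sage: x, y = solve_linear_mod(a, b, c, n)
sage: assert (a*x + b*y - c) % n == 0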
…
This post is a work-in-progress, I might add more examples or expand on the previous explanations more.