Chinese Remainder Theorem for Programmers

This is a quick post about the Chinese Remainder Theorem. Specifically, how to use it to solve a system of system of simple modular equations.

This came up today in an Advent of Code problem, and enough people were asking questions about the theorem to warrant a little blog post, in my opinion.

In brief, the CRT lets us solve a system of equations:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr \vdots \cr x &\equiv a_k \mod n_k \end{aligned} $$

by efficiently finding the $x$ that satisfies all of these constraints, provided that each of the $n_i$ are coprime, i.e. share no common divisors.

The Modulo Operation

The behavior of mod should be pretty well understood, at least for positive numbers. mod x m means dividing x with m, and looking at what’s left, so we have:

mod 220 2 == 0
mod 221 2 == 1
mod 67 16 == 3

etc.

Another way of defining this is to say that $mod(x, m)$ is the difference between $x$, and the largest multiple of $m$ less than $x$, i.e.

$$ \text{mod}(x, m) := x - k \cdot m $$

where $k \cdot m \leq x$, and $\forall k' \cdot m \leq x. \quad k' \leq k$.

The behavior of mod for negative numbers varies depending on the programming language. In haskell, for example, mod (-x) m is -(mod x m). But according to the mathematical definition we’ve just given, this wouldn’t be the case.

For example $\text{mod}(-1, 3) = 2$, because the closest multiple of $3$, below $-1$, would be $-3$, and there’s a difference of $2$ in order to reach $-1$ from there. Similarly, we would have $\text{mod}(-5, 3) = 1$, because the closest multiple of $3$ here would be $-6$.

We can define a function to emulate this behavior, in languages where mod (-x) m = -(mod x m):

fullMod :: Integer -> Integer -> Integer
fullMod x m = mod (mod x m + m) m

If x is positive, then we have mod x m + m, and we take that modulo m. But, adding a multiple of m, doesn’t change the modulus. This is because $\text{mod}(a + b, m) = \text{mod}(\text{mod}(a, m) + \text{mod}(b, m), m)$, and $\text{mod}(m, m) = 0$.

If x is negative, then we end up with -(mod x m) + m, i.e. m - mod x m after the first step. This is the correct behavior, since if we have something like $\text{mod}(-5, 3)$, we want to end up with $1$, but instead we get $-2$ first. By adding this to $3$, we end up with the $1$ we wanted. We don’t need to do another reduction at this point, so the extra mod _ m changes nothing.

Modular Arithmetic

If we look at all of the integers, modulo some m, we end up with the set of numbers $\{0, 1, \ldots, m - 1\}$. Since these are still integers, we can carry out addition and multiplication as usual, but remembering to take the modulus with m afterwards. This gives us a ring $\mathbb{Z}/m\mathbb{Z}$, or $C_m$, which is quite a bit shorter.

For example, in $C_3$, we have $1 + 2 = 0$, $2 \cdot 2 = 1$, and other possibilities.

For each integer $x \in \mathbb{Z}$, we have a corresponding modulus in $C_m$. Many integers will have the same modulus. To express that two integers have the same modulus, we say:

$$ x \equiv y \mod m $$

Usually, $y$ will be reduced completely, and so we this means $\text{mod}(x, m) = y$. The subtle difference here is that $\text{mod}$ outputs an integer in the range $\{0, \ldots, m - 1\}$, but in the equivalence equation, $x$ and $y$ don’t necessarily have to be in this range.

As further examples, we have:

$$ \begin{aligned} 6 &\equiv 0 &\mod 3 \cr 5 &\equiv -1 &\mod 3 \cr -5 &\equiv 2 &\mod 3 \end{aligned} $$

One useful property is that if:

$$ x \equiv y \mod m $$

Then:

$$ x - y \equiv 0 \mod m $$

i.e. $x - y$ is a multiple of $m$.

Bézout’s Algorithm

We just need one more tool before we can tackle the CRT: Bézout’s Algorithm. This algorithm states that if we have two integers $a, b \in \mathbb{Z}$, and their greatest common divisor is $\gcd(a, b)$, then we can find two integers $x, y \in \mathbb{Z}$, such that:

$$ x \cdot a + y \cdot b = \gcd(a, b) $$

This is basically a suped up version of Euclid’s algorithm, finding not only the greatest common divisor, but also the necessary integer factors to get the equation above.

Euclid’s Algorithm

Remember that the $gcd$ of two numbers should return the largest number dividing both of them.

Let’s go over the idea behind Euclid’s algorithm first.

Note that $\text{gcd}(a, 0) = a$. Anything divides $0$, so we just need a divisor of $a$. The largest one is just $a$ itself.

Now, let’s say we can decompose $a$:

$$ a = qb + r $$

This uses euclidean division to find a quotient and a remainder, with $r < b$.

We can show that any divisor of $a$ and $b$ is a divisor of $b$ and $r$, and vice versa.

First, let’s show $d | b, d | r \implies d | a$.

Note that if $d | b$, then $b = k_b d$ for some factor $k_b$.

With this in hand, it’s clear that if $d | b, d | r$, then we can write $a = qb + r$ as

$$ a = q k_b d + k_r d = (q k_b + k_r) d $$

This means that $d | a$.

Now for $d | a, d | b \implies d | r$.

We can write $a - qb = r$, and get $$ k_a d - q k_b d = r $$

and thus $d | r$.

Since the common divisors of $a$ and $b$ are nothing more than the common divisors of $b$ and $r$, $\gcd(a, b)$ must be $\gcd(b, r)$.

This gives us a recursive algorithm to calculate the $\gcd$:

gcd :: Integer -> Integer -> Integer
gcd a b | a < b = gcd b a
gcd a 0 = a
gcd a b =
  let r = mod a b
  in gcd b r

The first clause is the swapping rule, the second the base case, with $\text{gcd}(a, 0)$, and the third implements the recursive rule we just went over.

Extending the Algorithm

Bézout’s algorithm extends this to find not only the $\gcd$, but also two factors $x, y$ such that $xa + yb = \gcd(a, b)$

We have a similar base case:

With $\gcd(a, 0) = a$, it’s obvious that:

$$ 1 \cdot a + 0 \cdot 0 = a $$

Now, for the general case, we decompose $a$ into $qb + r$, like before.

Let’s say we’ve found $x, y$ such that:

$$ xb + yr = \gcd(b, r) = \gcd(a, b) $$

But, we can write $r$ as:

$$ r = a - qb $$

Which gives us:

$$ \gcd(a, b) = xb + yr = xb + y(a - qb) = ya + (x - qy)b $$

So $(x, y)$ becomes $(y, (x - q))$

We can write out a similar algorithm pretty easily:

bezout :: Integer -> Integer -> (Integer, Integer)
bezout a b | a < b = let (x, y) = bezout b a in (y, x)
bezout a 0 = (1, 0)
bezout a b =
  let q = div a b
      r = mod a b
      (x, y) = bezout b r
  in (y, x - q * y)

The Chinese Remainder Theorem

We now have all the tools we need to tackle the CRT itself!

To set the stage again, our starting point is a system of $k$ equations:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \vdots \cr x &\equiv a_k \mod n_k \end{aligned} $$

and our goal here, is to find some integer $x \in \mathbb{Z}$ satisfying each of these equations.

The CRT applies in the case where $n_1, n_2, \ldots, n_k$ are all coprime.

This means that:

$$ \gcd(n_i, n_j) = 1 $$

for any pair $i, j$.

Two Equations

Let’s first look at the case where we have just two equations:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv b_2 \mod n_2 \end{aligned} $$

At first this might seem tricky, but remember that we assumed that $n_1$ and $n_2$ are coprime. This means that $\gcd(n_1, n_2) = 1$. Let’s try using Bézout’s algorithm!

This gives us $m_1, m_2$ such that:

$$ m_1 n_1 + m_2 n_2 = 1 $$

Notice if that if we look at this equation modulo $n_1$, we get:

$$ m_2 n_2 \equiv 1 \mod n_1 $$

This is because modulo $n_1$, the extra multiple $m_1 n_1$ simply disappears

But multiplying anything by $1$ gives us back that same number, even modulo $n_1$.

This means that:

$$ a_1 m_2 n_2 \equiv a_1 \mod n_1 $$

But hey, that’s one of the things we wanted to be true!

We can apply a similar argument to show that:

$$ a_2 m_1 n_1 \equiv a_2 \mod n_2 $$

So, we have a solution for each part, but how can we combine them to get a solution for both parts at once?

We just add them together!

$$ a_1 m_2 n_2 + a_2 m_1 n_1 $$

This works, because modulo $n_1$, the right part disappears, and we get $a_1 m_2 n_2$, which is $a_1$, as we saw before.

Similarly, modulo $n_2$, the left part disappears, and we get $a_2 m_1 n_1$, which also works out to $a_2$, as we saw before.

This gives us a solution for the case of two equations.

Generalizing

Let’s say we have $k$ equations, instead of just $2$:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \vdots \cr x &\equiv a_N \mod n_N \end{aligned} $$

What we’re going to be doing is solving the first two equations, and then using that to produce a new system of $N - 1$ equations.

Let $a_{1,2}$ be a solution to the first $2$ equations, i.e.

$$ \begin{aligned} a_{1,2} &\equiv a_1 \mod n_1 \cr a_{1,2} &\equiv a_2 \mod n_2 \cr \end{aligned} $$

What we can prove is:

$$ x \equiv a_{1,2} \mod n_1 n_2 $$

if and only if

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \end{aligned} $$

The first direction is simple. If $x \equiv a_{1, 2} \mod n_1 n_2$, then $x - a_{1, 2}$ is a multiple of $n_1 n_2$, i.e. it is $k n_1 n_2$ for some $k$. It’s then clear that $x - a_{1, 2}$ is also a multiple of $n_1$, as well as $n_2$.

This means that:

$$ \begin{aligned} x &\equiv a_{1, 2} \mod n_1 \cr x &\equiv a_{1, 2} \mod n_2 \cr \end{aligned} $$

But, remember that $a_{1, 2}$ is a solution to the first two equations, giving us:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \end{aligned} $$

$\square$

For the other direction, we assume that:

$$ \begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \end{aligned} $$

Using the fact that $a_{1, 2}$ is a solution, once more we get:

$$ \begin{aligned} x &\equiv a_{1, 2} \mod n_1 \cr x &\equiv a_{1, 2} \mod n_2 \cr \end{aligned} $$

This means that $(x - a_{1, 2})$ is a multiple of both $n_1$ and $n_2$,i.e.

$$ x - a_{1, 2} = k_1 n_1 = k_2 n_2 $$

But, since $\gcd(n_1, n_2) = 1$ we can show that $k_1 = k n_2$:

We can write $k_1 n_1$ as

$$ k_1 (1 - m_2 n_2) = k_2 n_2 $$

using Bézout, but then we get:

$$ k_1 = k_1 m_2 n_2 + k_2 n_2 = (k_1 m_2 n_2 + k_2) n_2 $$

showing that $k_1$ is indeed a multiple of $n_2$.

Thus we have:

$$ x - a_{1, 2} = k n_2 n_1 $$

but this means that:

$$ x \equiv a_{1, 2} \mod n_1 n_2 $$

as desired $\square$.

Since $x \equiv a_{1, 2} \mod n_1 n_2$ is equivalent to solving both of the first two equations, we can simply replace both of those equations, to get:

$$ \begin{aligned} x &\equiv a_{1, 2} \mod n_1 n_2 \cr x &\equiv a_3 \mod n_3 \cr \vdots \cr x &\equiv a_N \mod n_N \end{aligned} $$

We now have one less equation, so this works as a recursive rule.

As a base case, if we end up with a single equation:

$$ x \equiv a \mod n $$

we simply choose $a$ as our solution.

Concretely

Alrighty, that was a bit of a mouthful, but hopefully it’ll be a bit more understandable if we write this out in code.

We have:

solve :: [(Integer, Integer)] -> Integer

as our signature. This function will return the smallest positive integer solving a list of $(a, n)$ congruence pairs.

For zero equations, anything is a solution. So we just return $0$

solve [] = 0

For a single equation, the solution is obvious:

solve [(a, n)] = fullMod a n

Note that we use the fullMod function we defined earlier, so that we get the smallest positive solution. a itself would be a solution, but we normalize it here to get the “nicest” version of it.

Now for two or more equations:

solve ((a1, n1) : (a2, n2) : rest) =
  let (m1, m2) = bezout n1 n2
      a12 = a1 * m2 * n2 + a2 * m1 * n1
  in solve ((a12, n1 * n2) : rest)

So, we solve the first two equations, using Bézout, to get:

$$ a_{1,2} = a_1 m_2 n_2 + a_2 m_1 n_1 $$

And then we produce an equivalent equation $x \equiv a_{1, 2} \mod n_1 n_2$, and continue solving, having replaced those two equations with this single one.

Conclusion

Hopefully this was a clear enough explanation. The wikipedia article is actually pretty good for this subject, so I’d recommend that as a good reference.

If you like this kind of thing I’d recommend picking up an Algebra book, like “Contemporary Abstract Algebra”, or “Algebra Chapter 0”.