Chinese Remainder Theorem for Programmers

This is a quick post about the Chinese Remainder Theorem. Specifically, how to use it to solve a system of simple modular equations.

This came up today in an Advent of Code problem, and enough people were asking questions about the theorem to warrant a little blog post, in my opinion.

In brief, the CRT lets us solve a system of equations:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr \vdots \cr x &\equiv a_k \mod n_k \end{aligned}$$

by efficiently finding an $x$ that satisfies all of these constraints, provided that the $n_i$ are pairwise coprime, i.e. share no common divisors besides $1$.

The Modulo Operation

The behavior of mod should be pretty well understood, at least for positive numbers. mod x m means dividing x by m, and looking at what's left, so we have:

```haskell
mod 220 2 == 0
mod 221 2 == 1
mod 67 16 == 3
```

etc.

Another way of defining this is to say that $\text{mod}(x, m)$ is the difference between $x$ and the largest multiple of $m$ not exceeding $x$, i.e.

$$\text{mod}(x, m) := x - k \cdot m$$

where $k \cdot m \leq x$, and $k' \leq k$ for every $k'$ with $k' \cdot m \leq x$.
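To check this against a concrete case: for $x = 7$ and $m = 3$, the largest $k$ with $k \cdot 3 \leq 7$ is $k = 2$, so

$$\text{mod}(7, 3) = 7 - 2 \cdot 3 = 1$$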

The behavior of mod for negative numbers varies depending on the programming language. In C or Java, for example, (-x) % m is -(x % m), and Haskell's rem function behaves the same way (Haskell's mod, on the other hand, already rounds toward negative infinity). But according to the mathematical definition we've just given, this wouldn't be the case.

For example $\text{mod}(-1, 3) = 2$, because the closest multiple of $3$, below $-1$, would be $-3$, and there's a difference of $2$ in order to reach $-1$ from there. Similarly, we would have $\text{mod}(-5, 3) = 1$, because the closest multiple of $3$ here would be $-6$.

We can define a function to emulate this behavior, in languages where mod (-x) m = -(mod x m) (written in Haskell notation here):

```haskell
fullMod :: Integer -> Integer -> Integer
fullMod x m = mod (mod x m + m) m
```

If x is positive, then we have mod x m + m, and we take that modulo m. But adding a multiple of m doesn't change the residue. This is because $\text{mod}(a + b, m) = \text{mod}(\text{mod}(a, m) + \text{mod}(b, m), m)$, and $\text{mod}(m, m) = 0$.

If x is negative, then after the first step we end up with -(mod x m) + m, i.e. m - mod x m. This is the correct behavior: if we have something like $\text{mod}(-5, 3)$, we want to end up with $1$, but instead we get $-2$ first. By adding this to $3$, we end up with the $1$ we wanted. We don't need to do another reduction at this point, so the extra mod _ m changes nothing.
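As a quick sanity check, here's the fullMod definition from above applied to the examples we've discussed:

```haskell
fullMod :: Integer -> Integer -> Integer
fullMod x m = mod (mod x m + m) m

main :: IO ()
main = do
  print (fullMod (-1) 3)  -- 2
  print (fullMod (-5) 3)  -- 1
  print (fullMod 5 3)     -- 2
```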

Modular Arithmetic

If we look at all of the integers modulo some m, we end up with the set of numbers $\{0, 1, \ldots, m - 1\}$. Since these are still integers, we can carry out addition and multiplication as usual, remembering to take the result modulo m afterwards. This gives us a ring $\mathbb{Z}/m\mathbb{Z}$, or $C_m$, which is quite a bit shorter.

For example, in $C_3$, we have $1 + 2 = 0$, $2 \cdot 2 = 1$, and other possibilities.
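A tiny sketch checking these in code (the helper names here are mine, just for illustration):

```haskell
-- Arithmetic in C_3: compute in the integers, then reduce modulo 3.
addC3, mulC3 :: Integer -> Integer -> Integer
addC3 x y = mod (x + y) 3
mulC3 x y = mod (x * y) 3

main :: IO ()
main = do
  print (addC3 1 2)  -- 0
  print (mulC3 2 2)  -- 1
```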

[!note] Note: To be super pedantic, the elements of $C_m$ are not $\{0, \ldots, m - 1\}$, but rather $\{[0], [1], \ldots\}$, where $[x]$ denotes the set of integers congruent to $x$ modulo $m$. In practice, these work out the same, and we won't need this level of detail.

For each integer $x \in \mathbb{Z}$, we have a corresponding residue in $C_m$. Many integers will have the same residue. To express that two integers have the same residue, we say:

$$x \equiv y \mod m$$

Usually, $y$ will be reduced completely, in which case this means $\text{mod}(x, m) = y$. The subtle difference here is that $\text{mod}$ outputs an integer in the range $\{0, \ldots, m - 1\}$, but in the equivalence equation, $x$ and $y$ don't necessarily have to be in this range.

As further examples, we have:

$$\begin{aligned} 6 &\equiv 0 &\mod 3 \cr 5 &\equiv -1 &\mod 3 \cr -5 &\equiv 1 &\mod 3 \end{aligned}$$

One useful property is that if:

$$x \equiv y \mod m$$

Then:

$$x - y \equiv 0 \mod m$$

i.e. $x - y$ is a multiple of $m$.
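For instance, $5 \equiv -1 \mod 3$ from above, and indeed

$$5 - (-1) = 6 = 2 \cdot 3$$

is a multiple of $3$.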

Bézout's Algorithm

We just need one more tool before we can tackle the CRT: Bézout's algorithm, which computes the coefficients in what's known as Bézout's identity. The identity states that if we have two integers $a, b \in \mathbb{Z}$, with greatest common divisor $\gcd(a, b)$, then we can find two integers $x, y \in \mathbb{Z}$, such that:

$$x \cdot a + y \cdot b = \gcd(a, b)$$

This is basically a souped-up version of Euclid's algorithm, finding not only the greatest common divisor, but also the necessary integer factors to get the equation above.

Euclid's Algorithm

Remember that the $\gcd$ of two numbers should return the largest number dividing both of them.

Let's go over the idea behind Euclid's algorithm first.

Note that $\gcd(a, 0) = a$. Anything divides $0$, so we just need a divisor of $a$. The largest one is just $a$ itself.

Now, let's say we can decompose aa:

$$a = qb + r$$

This uses Euclidean division to find a quotient and a remainder, with $0 \leq r < b$.

We can show that any common divisor of $a$ and $b$ is also a common divisor of $b$ and $r$, and vice versa.

First, let's show $d \mid b, d \mid r \implies d \mid a$.

Note that if $d \mid b$, then $b = k_b d$ for some factor $k_b$, and similarly $r = k_r d$.

With this in hand, it's clear that if $d \mid b, d \mid r$, then we can write $a = qb + r$ as

$$a = q k_b d + k_r d = (q k_b + k_r) d$$

This means that $d \mid a$.

Now for $d \mid a, d \mid b \implies d \mid r$.

We can write $a - qb = r$, and get

$$k_a d - q k_b d = r$$

and thus $d \mid r$.

Since the common divisors of $a$ and $b$ are nothing more than the common divisors of $b$ and $r$, $\gcd(a, b)$ must be $\gcd(b, r)$.

This gives us a recursive algorithm to calculate the $\gcd$:

```haskell
import Prelude hiding (gcd)

gcd :: Integer -> Integer -> Integer
gcd a b | a < b = gcd b a
gcd a 0 = a
gcd a b =
  let r = mod a b
  in gcd b r
```

The first clause is the swapping rule, the second the base case, with $\gcd(a, 0) = a$, and the third implements the recursive rule we just went over.
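For example, gcd 270 192 walks through the pairs $(270, 192) \to (192, 78) \to (78, 36) \to (36, 6) \to (6, 0)$, returning $6$. Repeating the definition so the snippet stands alone:

```haskell
import Prelude hiding (gcd)

-- Euclid's algorithm, as defined above.
gcd :: Integer -> Integer -> Integer
gcd a b | a < b = gcd b a
gcd a 0 = a
gcd a b =
  let r = mod a b
  in gcd b r

main :: IO ()
main = print (gcd 270 192)  -- 6
```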

Extending the Algorithm

Bézout's algorithm extends this to find not only the $\gcd$, but also two factors $x, y$ such that $xa + yb = \gcd(a, b)$.

We have a similar base case:

With $\gcd(a, 0) = a$, it's obvious that:

$$1 \cdot a + 0 \cdot 0 = a$$

Now, for the general case, we decompose $a$ into $qb + r$, like before.

Let's say we've found $x, y$ such that:

$$xb + yr = \gcd(b, r) = \gcd(a, b)$$

But, we can write $r$ as:

$$r = a - qb$$

Which gives us:

$$\gcd(a, b) = xb + yr = xb + y(a - qb) = ya + (x - qy)b$$

So $(x, y)$ becomes $(y, x - qy)$.

We can write out a similar algorithm pretty easily:

```haskell
bezout :: Integer -> Integer -> (Integer, Integer)
bezout a b | a < b = let (x, y) = bezout b a in (y, x)
bezout a 0 = (1, 0)
bezout a b =
  let q = div a b
      r = mod a b
      (x, y) = bezout b r
  in (y, x - q * y)
```
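For example, bezout 5 3 gives (-1, 2), and indeed $(-1) \cdot 5 + 2 \cdot 3 = 1 = \gcd(5, 3)$. Repeating the definition so the snippet runs on its own:

```haskell
bezout :: Integer -> Integer -> (Integer, Integer)
bezout a b | a < b = let (x, y) = bezout b a in (y, x)
bezout a 0 = (1, 0)
bezout a b =
  let q = div a b
      r = mod a b
      (x, y) = bezout b r
  in (y, x - q * y)

main :: IO ()
main = do
  let (x, y) = bezout 5 3
  print (x, y)           -- (-1,2)
  print (x * 5 + y * 3)  -- 1
```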

The Chinese Remainder Theorem

We now have all the tools we need to tackle the CRT itself!

To set the stage again, our starting point is a system of $k$ equations:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \vdots \cr x &\equiv a_k \mod n_k \end{aligned}$$

and our goal is to find some integer $x \in \mathbb{Z}$ satisfying each of these equations.

The CRT applies in the case where $n_1, n_2, \ldots, n_k$ are all pairwise coprime.

This means that:

$$\gcd(n_i, n_j) = 1$$

for any pair $i \neq j$.

Two Equations

Let's first look at the case where we have just two equations:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \end{aligned}$$

At first this might seem tricky, but remember that we assumed that $n_1$ and $n_2$ are coprime. This means that $\gcd(n_1, n_2) = 1$. Let's try using Bézout's algorithm!

This gives us $m_1, m_2$ such that:

$$m_1 n_1 + m_2 n_2 = 1$$

Notice that if we look at this equation modulo $n_1$, we get:

$$m_2 n_2 \equiv 1 \mod n_1$$

This is because modulo $n_1$, the extra multiple $m_1 n_1$ simply disappears.

But multiplying anything by $1$ gives us back that same number, even modulo $n_1$.

This means that:

$$a_1 m_2 n_2 \equiv a_1 \mod n_1$$

But hey, that's one of the things we wanted to be true!

We can apply a similar argument to show that:

$$a_2 m_1 n_1 \equiv a_2 \mod n_2$$

So, we have a solution for each part, but how can we combine them to get a solution for both parts at once?

We just add them together!

$$a_1 m_2 n_2 + a_2 m_1 n_1$$

This works, because modulo $n_1$, the right part disappears, and we get $a_1 m_2 n_2$, which is $a_1$, as we saw before.

Similarly, modulo $n_2$, the left part disappears, and we get $a_2 m_1 n_1$, which also works out to $a_2$, as we saw before.

This gives us a solution for the case of two equations.
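As a concrete example, take $x \equiv 2 \mod 3$ and $x \equiv 3 \mod 5$. Bézout gives $2 \cdot 3 + (-1) \cdot 5 = 1$, i.e. $m_1 = 2$ and $m_2 = -1$, so our solution is

$$a_1 m_2 n_2 + a_2 m_1 n_1 = 2 \cdot (-1) \cdot 5 + 3 \cdot 2 \cdot 3 = 8$$

and indeed $8 \equiv 2 \mod 3$ and $8 \equiv 3 \mod 5$.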

Generalizing

Let's say we have $N$ equations, instead of just $2$:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \cr \vdots \cr x &\equiv a_N \mod n_N \end{aligned}$$

What we're going to do is solve the first two equations, and then use that to produce a new system of $N - 1$ equations.

Let $a_{1,2}$ be a solution to the first two equations, i.e.

$$\begin{aligned} a_{1,2} &\equiv a_1 \mod n_1 \cr a_{1,2} &\equiv a_2 \mod n_2 \end{aligned}$$

What we can prove is:

$$x \equiv a_{1,2} \mod n_1 n_2$$

if and only if

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \end{aligned}$$

The first direction is simple. If $x \equiv a_{1, 2} \mod n_1 n_2$, then $x - a_{1, 2}$ is a multiple of $n_1 n_2$, i.e. it is $k n_1 n_2$ for some $k$. It's then clear that $x - a_{1, 2}$ is also a multiple of $n_1$, as well as $n_2$.

This means that:

$$\begin{aligned} x &\equiv a_{1, 2} \mod n_1 \cr x &\equiv a_{1, 2} \mod n_2 \end{aligned}$$

But, remember that $a_{1, 2}$ is a solution to the first two equations, giving us:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \end{aligned}$$

$\square$

For the other direction, we assume that:

$$\begin{aligned} x &\equiv a_1 \mod n_1 \cr x &\equiv a_2 \mod n_2 \end{aligned}$$

Using the fact that $a_{1, 2}$ is a solution, once more we get:

$$\begin{aligned} x &\equiv a_{1, 2} \mod n_1 \cr x &\equiv a_{1, 2} \mod n_2 \end{aligned}$$

This means that $(x - a_{1, 2})$ is a multiple of both $n_1$ and $n_2$, i.e.

$$x - a_{1, 2} = k_1 n_1 = k_2 n_2$$

But, since $\gcd(n_1, n_2) = 1$, we can show that $k_1 = k n_2$ for some $k$:

Multiplying both sides of $k_1 n_1 = k_2 n_2$ by $m_1$, and using Bézout to write $m_1 n_1$ as $1 - m_2 n_2$, we can write $k_1 m_1 n_1$ as

$$k_1 (1 - m_2 n_2) = m_1 k_2 n_2$$

but then we get:

$$k_1 = k_1 m_2 n_2 + m_1 k_2 n_2 = (k_1 m_2 + m_1 k_2) n_2$$

showing that $k_1$ is indeed a multiple of $n_2$.

Thus we have:

$$x - a_{1, 2} = k n_2 n_1$$

but this means that:

$$x \equiv a_{1, 2} \mod n_1 n_2$$

as desired. $\square$

Since $x \equiv a_{1, 2} \mod n_1 n_2$ is equivalent to solving both of the first two equations, we can simply replace those two equations, to get:

$$\begin{aligned} x &\equiv a_{1, 2} \mod n_1 n_2 \cr x &\equiv a_3 \mod n_3 \cr \vdots \cr x &\equiv a_N \mod n_N \end{aligned}$$

We now have one less equation, so this works as a recursive rule.

As a base case, if we end up with a single equation:

$$x \equiv a \mod n$$

we simply choose $a$ as our solution.

Concretely

Alrighty, that was a bit of a mouthful, but hopefully it'll be a bit more understandable if we write this out in code.

We have:

```haskell
solve :: [(Integer, Integer)] -> Integer
```

as our signature. This function will return the smallest non-negative integer solving a list of $(a, n)$ congruence pairs.

For zero equations, anything is a solution, so we just return $0$:

```haskell
solve [] = 0
```

For a single equation, the solution is obvious:

```haskell
solve [(a, n)] = fullMod a n
```

Note that we use the fullMod function we defined earlier, so that we get the smallest non-negative solution. a itself would be a solution, but we normalize it here to get the "nicest" version of it.

Now for two or more equations:

```haskell
solve ((a1, n1) : (a2, n2) : rest) =
  let (m1, m2) = bezout n1 n2
      a12 = a1 * m2 * n2 + a2 * m1 * n1
  in solve ((a12, n1 * n2) : rest)
```

So, we solve the first two equations, using Bézout, to get:

$$a_{1,2} = a_1 m_2 n_2 + a_2 m_1 n_1$$

And then we produce the equivalent equation $x \equiv a_{1, 2} \mod n_1 n_2$, and continue solving, having replaced those two equations with this single one.
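Gathering fullMod, bezout, and solve into one runnable snippet, we can check the classic example $x \equiv 2 \mod 3$, $x \equiv 3 \mod 5$, $x \equiv 2 \mod 7$, whose smallest non-negative solution is $23$:

```haskell
fullMod :: Integer -> Integer -> Integer
fullMod x m = mod (mod x m + m) m

bezout :: Integer -> Integer -> (Integer, Integer)
bezout a b | a < b = let (x, y) = bezout b a in (y, x)
bezout a 0 = (1, 0)
bezout a b =
  let q = div a b
      r = mod a b
      (x, y) = bezout b r
  in (y, x - q * y)

solve :: [(Integer, Integer)] -> Integer
solve [] = 0
solve [(a, n)] = fullMod a n
solve ((a1, n1) : (a2, n2) : rest) =
  let (m1, m2) = bezout n1 n2
      a12 = a1 * m2 * n2 + a2 * m1 * n1
  in solve ((a12, n1 * n2) : rest)

main :: IO ()
main = print (solve [(2, 3), (3, 5), (2, 7)])  -- 23
```

Note that the intermediate combined values can be negative or large; the final fullMod in the single-equation case is what normalizes the answer into the expected range.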

Conclusion

Hopefully this was a clear enough explanation. The Wikipedia article is actually pretty good for this subject, so I'd recommend that as a good reference.

If you like this kind of thing I'd recommend picking up an Algebra book, like "Contemporary Abstract Algebra", or "Algebra Chapter 0".