This is a quick post about the Chinese Remainder Theorem. Specifically, how to use it to solve a system of simple modular equations.
This came up today in an Advent of Code problem, and enough people were asking questions about the theorem to warrant a little blog post, in my opinion.
In brief, the CRT lets us solve a system of equations:

$$x \equiv a_1 \pmod{n_1}$$
$$x \equiv a_2 \pmod{n_2}$$
$$\vdots$$
$$x \equiv a_k \pmod{n_k}$$

by efficiently finding the $x$ that satisfies all of these constraints, provided that the $n_i$ are pairwise coprime, i.e. share no common divisors.
The Modulo Operation
The behavior of mod should be pretty well understood, at least for positive numbers. mod x m means dividing x by m, and looking at what’s left, so we have:

mod 220 2 == 0
mod 221 2 == 1
mod 67 16 == 3

etc.
Another way of defining this is to say that mod x m is the difference between $x$ and the largest multiple of $m$ no greater than $x$, i.e.

$$x = qm + r$$

where $q = \lfloor x / m \rfloor$, and $0 \leq r < m$.
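To connect this with the examples above, for mod 67 16 the decomposition is:

$$67 = 4 \cdot 16 + 3$$

so the remainder is $3$.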
The behavior of mod for negative numbers varies depending on the programming language. In C, for example, (-x) % m is -(x % m) (Haskell’s rem behaves the same way, although its mod does not). But according to the mathematical definition we’ve just given, this wouldn’t be the case.

For example $\mathrm{mod}(-3, 5) = 2$, because the closest multiple of $5$ below $-3$ would be $-5$, and there’s a difference of $2$ in order to reach $-3$ from there. Similarly, we would have $\mathrm{mod}(-7, 5) = 3$, because the closest multiple of $5$ here would be $-10$.
We can define a function to emulate the mathematical behavior, for languages whose built-in operator satisfies (-x) % m = -(x % m) (we’ll write it in Haskell for consistency with the rest of the post, even though Haskell’s own mod is already mathematical):

-- reduce x modulo m, always landing in the range [0, m)
fullMod :: Integer -> Integer -> Integer
fullMod x m = mod (mod x m + m) m
If x is positive, then we have mod x m + m, and we take that modulo m. But, adding a multiple of m doesn’t change the remainder. This is because $\mathrm{mod}(m, m) = 0$, and so $\mathrm{mod}(a + m, m) = \mathrm{mod}(a, m)$.
If x is negative, then we end up with -(mod x m) + m, i.e. m - mod x m after the first step. This is the correct behavior, since if we have something like $\mathrm{mod}(-3, 5)$, we want to end up with $2$, but instead we get $-3$ first. By adding this to $5$, we end up with the $2$ we wanted. We don’t need to do another reduction at this point, so the extra mod _ m changes nothing. (The one exception is when x is an exact multiple of m: then we get $0 + m = m$ after the first step, and the outer mod brings this back down to $0$.)
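Just to sanity check this, here are a few values it produces (written as GHCi-style equations):

fullMod 7 5     == 2
fullMod (-3) 5  == 2
fullMod (-10) 5 == 0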
Modular Arithmetic
If we look at all of the integers, modulo some m, we end up with the set of numbers $\{0, 1, \ldots, m - 1\}$. Since these are still integers, we can carry out addition and multiplication as usual, but remembering to take the modulus with m afterwards. This gives us a ring, denoted $\mathbb{Z}/m\mathbb{Z}$, or $\mathbb{Z}_m$, which is quite a bit shorter.

For example, in $\mathbb{Z}_{12}$, we have $7 + 8 = 3$, $4 \cdot 5 = 8$, and other possibilities.
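As a quick sketch of what this looks like in code, we can build addition and multiplication in $\mathbb{Z}_m$ on top of our fullMod helper (the names addMod and mulMod are just ones I’m making up for this post):

-- addition in Z_m: add, then reduce back into [0, m)
addMod :: Integer -> Integer -> Integer -> Integer
addMod m a b = fullMod (a + b) m

-- multiplication in Z_m: multiply, then reduce back into [0, m)
mulMod :: Integer -> Integer -> Integer -> Integer
mulMod m a b = fullMod (a * b) m

With these, addMod 12 7 8 == 3 and mulMod 12 4 5 == 8, matching the examples above.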
Note:

To be super pedantic, the elements of $\mathbb{Z}_m$ are not $0, 1, \ldots, m - 1$, but rather $\bar{0}, \bar{1}, \ldots, \overline{m - 1}$, where $\bar{a}$ denotes the set of integers sharing a given remainder $a$ modulo $m$. In practice, these work out the same, and we won’t need this level of detail.
For each integer $a$, we have a corresponding remainder in $\mathbb{Z}_m$. Many integers will have the same remainder. To express that two integers have the same remainder, we say:

$$a \equiv b \pmod{m}$$

Usually, $b$ will be reduced completely, and so this means $b = \mathrm{mod}(a, m)$. The subtle difference here is that mod outputs an integer in the range $[0, m)$, but in the equivalence equation, $a$ and $b$ don’t necessarily have to be in this range.
As further examples, we have:

$$13 \equiv 1 \pmod{12}$$
$$-3 \equiv 2 \pmod{5}$$
$$22 \equiv 222 \pmod{10}$$
One useful property is that if:

$$a \equiv b \pmod{m}$$

then:

$$a - b \equiv 0 \pmod{m}$$

i.e. $a - b$ is a multiple of $m$.
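For instance, $14 \equiv 2 \pmod{12}$, and indeed $14 - 2 = 12$ is a multiple of $12$.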
Bézout’s Algorithm
We just need one more tool before we can tackle the CRT: Bézout’s algorithm. The identity behind it states that if we have two integers $a$ and $b$, and their greatest common divisor is $d$, then we can find two integers $x$ and $y$ such that:

$$ax + by = d$$

This is basically a souped-up version of Euclid’s algorithm, finding not only the greatest common divisor, but also the necessary integer factors to get the equation above.
Euclid’s Algorithm
Remember that the gcd of two numbers should return the largest number dividing both of them.

Let’s go over the idea behind Euclid’s algorithm first.

Note that $\gcd(a, 0) = a$. Anything divides $0$, so we just need the largest divisor of $a$. That’s just $a$ itself.

Now, let’s say we can decompose $a$ as:

$$a = qb + r$$

This uses Euclidean division to find a quotient $q$ and a remainder $r$, with $0 \leq r < b$.
We can show that any divisor of $a$ and $b$ is a divisor of $b$ and $r$, and vice versa.

First, let’s show $d \mid a, b \implies d \mid b, r$.

Note that if $d \mid a$, then $a = d a'$ for some factor $a'$ (and similarly $b = d b'$).

With this in hand, it’s clear that if $d \mid a, b$, then we can write $r$ as

$$r = a - qb = d a' - q d b' = d (a' - q b')$$

This means that $d \mid r$.

Now for $d \mid b, r \implies d \mid a, b$.

We can write $a = qb + r$, and get

$$a = q d b' + d r' = d (q b' + r')$$

and thus $d \mid a$.

Since the common divisors of $a$ and $b$ are nothing more than the common divisors of $b$ and $r$, $\gcd(a, b)$ must be equal to $\gcd(b, r)$.
This gives us a recursive algorithm to calculate the $\gcd$:
gcd :: Integer -> Integer -> Integer
gcd a b | a < b = gcd b a -- make sure a >= b
gcd a 0 = a               -- base case: gcd a 0 = a
gcd a b =
  let r = mod a b
  in gcd b r              -- gcd a b = gcd b r
The first clause is the swapping rule, the second the base case, with $\gcd(a, 0) = a$, and the third implements the recursive rule we just went over.
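For example (after import Prelude hiding (gcd), so our version doesn’t clash with the built-in one):

gcd 12 18 == 6
gcd 67 16 == 1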
Extending the Algorithm
Bézout’s algorithm extends this to find not only the $\gcd$, but also two factors $x$ and $y$ such that

$$ax + by = \gcd(a, b)$$

We have a similar base case:

$$\gcd(a, 0) = a$$

With $b = 0$, it’s obvious that:

$$a \cdot 1 + 0 \cdot 0 = a$$

Now, for the general case, we decompose $a$ into $qb + r$, like before.

Let’s say we’ve found $x'$ and $y'$ such that:

$$b x' + r y' = \gcd(b, r) = \gcd(a, b)$$

But, we can write $r$ as:

$$r = a - qb$$

Which gives us:

$$b x' + (a - qb) y' = a y' + b (x' - q y')$$

So $(x, y)$ becomes $(y', x' - q y')$.
We can write out a similar algorithm pretty easily:
bezout :: Integer -> Integer -> (Integer, Integer)
bezout a b | a < b = let (x, y) = bezout b a in (y, x) -- swapping inputs swaps the coefficients
bezout a 0 = (1, 0)                                    -- since a * 1 + 0 * 0 = a
bezout a b =
  let q = div a b
      r = mod a b
      (x, y) = bezout b r
  in (y, x - q * y)
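As a quick check:

bezout 12 5 == (-2, 5)

and indeed $12 \cdot (-2) + 5 \cdot 5 = 1$, which is $\gcd(12, 5)$.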
The Chinese Remainder Theorem
We now have all the tools we need to tackle the CRT itself!
To set the stage again, our starting point is a system of equations:

$$x \equiv a_1 \pmod{n_1}$$
$$x \equiv a_2 \pmod{n_2}$$
$$\vdots$$
$$x \equiv a_k \pmod{n_k}$$

and our goal here is to find some integer $x$ satisfying each of these equations.

The CRT applies in the case where $n_1, \ldots, n_k$ are all pairwise coprime.

This means that:

$$\gcd(n_i, n_j) = 1$$

for any pair $i \neq j$.
Two Equations
Let’s first look at the case where we have just two equations:

$$x \equiv a_1 \pmod{n_1}$$
$$x \equiv a_2 \pmod{n_2}$$

At first this might seem tricky, but remember that we assumed that $n_1$ and $n_2$ are coprime. This means that $\gcd(n_1, n_2) = 1$. Let’s try using Bézout’s algorithm!

This gives us $m_1$ and $m_2$ such that:

$$m_1 n_1 + m_2 n_2 = 1$$

Notice that if we look at this equation modulo $n_1$, we get:

$$m_2 n_2 \equiv 1 \pmod{n_1}$$

This is because modulo $n_1$, the extra multiple $m_1 n_1$ simply disappears.

But multiplying anything by $m_2 n_2$ gives us back that same number, at least modulo $n_1$.

This means that:

$$a_1 m_2 n_2 \equiv a_1 \pmod{n_1}$$
But hey, that’s one of the things we wanted to be true!
We can apply a similar argument to show that:

$$a_2 m_1 n_1 \equiv a_2 \pmod{n_2}$$

So, we have a solution for each part, but how can we combine them to get a solution for both parts at once?

We just add them together!

$$x = a_1 m_2 n_2 + a_2 m_1 n_1$$

This works, because modulo $n_1$, the right part disappears, and we get $a_1 m_2 n_2$, which is $a_1$, as we saw before.

Similarly, modulo $n_2$, the left part disappears, and we get $a_2 m_1 n_1$, which also works out to $a_2$, as we saw before.

This gives us a solution for the case of two equations.
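To make this concrete, take $x \equiv 2 \pmod{3}$ and $x \equiv 3 \pmod{5}$ (numbers chosen just for illustration). Bézout gives us $2 \cdot 3 + (-1) \cdot 5 = 1$, i.e. $m_1 = 2$ and $m_2 = -1$, so:

$$x = a_1 m_2 n_2 + a_2 m_1 n_1 = 2 \cdot (-1) \cdot 5 + 3 \cdot 2 \cdot 3 = 8$$

And sure enough, $8 \equiv 2 \pmod{3}$ and $8 \equiv 3 \pmod{5}$.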
Generalizing
Let’s say we have $k$ equations, instead of just $2$:

$$x \equiv a_1 \pmod{n_1}$$
$$x \equiv a_2 \pmod{n_2}$$
$$\vdots$$
$$x \equiv a_k \pmod{n_k}$$

What we’re going to be doing is solving the first two equations, and then using that to produce a new system of equations.

Let $a_{12}$ be a solution to the first two equations, i.e.

$$a_{12} \equiv a_1 \pmod{n_1} \qquad a_{12} \equiv a_2 \pmod{n_2}$$
What we can prove is:

$$x \equiv a_{12} \pmod{n_1 n_2}$$

if and only if

$$x \equiv a_1 \pmod{n_1} \qquad x \equiv a_2 \pmod{n_2}$$
The first direction is simple. If $x \equiv a_{12} \pmod{n_1 n_2}$, then $x - a_{12}$ is a multiple of $n_1 n_2$, i.e. it is $c \cdot n_1 n_2$ for some $c$. It’s then clear that $x - a_{12}$ is also a multiple of $n_1$, as well as of $n_2$.

This means that:

$$x \equiv a_{12} \pmod{n_1} \qquad x \equiv a_{12} \pmod{n_2}$$

But, remember that $a_{12}$ is a solution to the first two equations, giving us:

$$x \equiv a_1 \pmod{n_1} \qquad x \equiv a_2 \pmod{n_2}$$
For the other direction, we assume that:

$$x \equiv a_1 \pmod{n_1} \qquad x \equiv a_2 \pmod{n_2}$$

Using the fact that $a_{12}$ is a solution, once more we get:

$$x \equiv a_{12} \pmod{n_1} \qquad x \equiv a_{12} \pmod{n_2}$$

This means that $x - a_{12}$ is a multiple of both $n_1$ and $n_2$, i.e.

$$x - a_{12} = k_1 n_1 = k_2 n_2$$

But, since $\gcd(n_1, n_2) = 1$, we can show that $x - a_{12}$ is a multiple of $n_1 n_2$ as well:

We can write $1$ as

$$m_1 n_1 + m_2 n_2 = 1$$

using Bézout, but then we get:

$$x - a_{12} = (x - a_{12}) m_1 n_1 + (x - a_{12}) m_2 n_2 = k_2 n_2 m_1 n_1 + k_1 n_1 m_2 n_2 = (k_2 m_1 + k_1 m_2) n_1 n_2$$

showing that $x - a_{12}$ is indeed a multiple of $n_1 n_2$.

Thus we have:

$$x - a_{12} \equiv 0 \pmod{n_1 n_2}$$

but this means that:

$$x \equiv a_{12} \pmod{n_1 n_2}$$

as desired.
Since $x \equiv a_{12} \pmod{n_1 n_2}$ is equivalent to solving both of the first two equations, we can simply replace both of those equations, to get:

$$x \equiv a_{12} \pmod{n_1 n_2}$$
$$x \equiv a_3 \pmod{n_3}$$
$$\vdots$$
$$x \equiv a_k \pmod{n_k}$$

(Note that the moduli here are still pairwise coprime: the only prime factors of $n_1 n_2$ come from $n_1$ and $n_2$, so it shares no common divisor with any of the remaining $n_i$.)

We now have one less equation, so this works as a recursive rule.

As a base case, if we end up with a single equation:

$$x \equiv a \pmod{n}$$

we simply choose $a$ as our solution.
Concretely
Alrighty, that was a bit of a mouthful, but hopefully it’ll be a bit more understandable if we write this out in code.
We have:
solve :: [(Integer, Integer)] -> Integer
as our signature. This function takes a list of pairs $(a_i, n_i)$, representing the equations, and returns the smallest non-negative integer satisfying all of them.
For zero equations, anything is a solution, so we just return 0:

solve [] = 0
For a single equation, the solution is obvious:
solve [(a, n)] = fullMod a n
Note that we use the fullMod function we defined earlier, so that we get the smallest non-negative solution. a itself would be a solution, but we normalize it here to get the “nicest” version of it.
Now for two or more equations:
solve ((a1, n1) : (a2, n2) : rest) =
  let (m1, m2) = bezout n1 n2
      -- a12 solves both of the first two equations
      a12 = a1 * m2 * n2 + a2 * m1 * n1
  in solve ((a12, n1 * n2) : rest)
So, we solve the first two equations, using Bézout, to get:

$$a_{12} = a_1 m_2 n_2 + a_2 m_1 n_1$$

And then we produce an equivalent equation $x \equiv a_{12} \pmod{n_1 n_2}$, and continue solving, having replaced those two equations with this single one.
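As a final check, we can run this on the classic puzzle asking for a number with remainder 2 mod 3, remainder 3 mod 5, and remainder 2 mod 7:

solve [(2, 3), (3, 5), (2, 7)] == 23

and 23 does indeed satisfy all three equations.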
Conclusion
Hopefully this was a clear enough explanation. The Wikipedia article is actually pretty good for this subject, so I’d recommend it as a reference.
If you like this kind of thing I’d recommend picking up an Algebra book, like “Contemporary Abstract Algebra”, or “Algebra Chapter 0”.