On September 16, 2024, I was invited to give a talk at Google headquarters in Mountain View, California. One of the topics that I spoke about, was computer-checked proofs for papers on symmetric-key cryptography. (It’s really difficult to get these proofs right. Even the simplest proofs can contain errors as explained on page 7 of this paper.)

We’re usually taught to check the math using a computer algebra system such as SageMath, Maple, or Mathematica. (The latter is integrated in WolframAlpha.) But what if the math is out of reach for these tools?

Consider the following inequality that appears in the Chaskey paper on page 11:

(*N* – *D*_{1})! (*N* – *D*_{2})! (*N* – *D*_{3})! (*N* – *T*)! ≤ (*N* – *T* – *D*_{1} – *D*_{2} – *D*_{3})! (*N*!)^{3}

Here, the factorial *x*! is defined as *x* (*x* – 1) (*x* – 2) ··· 1. We don’t have to worry about negative integers, as the Chaskey paper clarifies that *N* ≥ *T* + *D*_{1} + *D*_{2} + *D*_{3}.

Is this formula correct? Does it hold for all values of *N*, *T*, *D*_{1}, *D*_{2}, *D*_{3}? And more importantly, how can we get a computer to check that it’s correct?

The Chaskey paper doesn’t explain how the formula was derived, even though it’s not immediately obvious. Now might be a good time to stop reading, and try to prove that this formula is correct.

Okay. Let’s see how we can approach this. First, note that the inequality becomes easy to prove if we remove two factors:

(*N* – *D*_{1})! (*N* – *D*_{2})! (*N* – *D*_{3})! ~~(~~ ≤ *N* – *T*)!~~(~~ (*N* – *T* – *D*_{1} – *D*_{2} – *D*_{3})!*N*!)^{3}

This is because we have three factors on each side of the inequality, and each factor on the left side is less than or equal to a factor on the right side. For completeness, here is how to prove this in Lean (click on the proof to open Lean 4 Web which checks that it’s correct):

```
example {D₁ D₂ D₃ N : ℕ} : (N - D₁).factorial * (N - D₂).factorial *
(N - D₃).factorial ≤ (N.factorial)^3 := by
rw [pow_three' _]
apply Nat.mul_le_mul
· apply Nat.mul_le_mul
· exact Nat.factorial_le (by omega)
· exact Nat.factorial_le (by omega)
· exact Nat.factorial_le (by omega)
```

But removing these two factors doesn’t help us at all, as (*N* – *T*)! ≥ (*N* – *T* – *D*_{1} – *D*_{2} – *D*_{3})! So, we’ll have to try a different approach.

It turns out that it’s a good idea to rewrite the inequality using the falling factorial (*x*)* _{n}*. It is defined as (

*x*)

_{n}=

*x*(

*x*– 1) (

*x*– 2) ··· (

*x*–

*n*+ 1), where the product consists of

*n*factors.

It follows that *N*! / (*N* – *D*_{1})! = (*N*)_{D₁}. If we rewrite every pair of factors in this way, we can turn

(*N* – *D*_{1})! (*N* – *D*_{2})! (*N* – *D*_{3})! (*N* – *T*)! ≤ (*N* – *T* – *D*_{1} – *D*_{2} – *D*_{3})! (*N*!)^{3}

into

(*N* – *T*)_{D₁ + D₂ + D₃} ≤ (*N*)_{D₁} (*N*)_{D₂} (*N*)_{D₃}

We can again use Lean to check that this is correct:

```
lemma MyLemma {D₁ D₂ D₃ T N : ℕ} :
(N - T).descFactorial (D₁ + D₂ + D₃) ≤
N.descFactorial D₁ * N.descFactorial D₂ * N.descFactorial D₃ := by
sorry
example {D₁ D₂ D₃ T N : ℕ} (hn : N ≥ D₁ + D₂ + D₃ + T) :
(N - D₁).factorial * (N - D₂).factorial * (N - D₃).factorial *
(N - T).factorial ≤ (N - T - D₁ - D₂ - D₃).factorial * (N.factorial)^3 := by
rw [pow_three' _]
rw [← Nat.factorial_mul_descFactorial (show D₁ + D₂ + D₃ ≤ N - T by omega)]
rw [show N - T - (D₁ + D₂ + D₃) = N - T - D₁ - D₂ - D₃ by omega]
nth_rewrite 1 [← Nat.factorial_mul_descFactorial (show D₁ ≤ N by omega)]
nth_rewrite 1 [← Nat.factorial_mul_descFactorial (show D₂ ≤ N by omega)]
nth_rewrite 1 [← Nat.factorial_mul_descFactorial (show D₃ ≤ N by omega)]
ring_nf
rw [mul_assoc _ _ (N.descFactorial D₃)]
rw [mul_assoc _ _ (N.descFactorial D₂ * N.descFactorial D₃)]
rw [← mul_assoc _ _ (N.descFactorial D₃)]
exact Nat.mul_le_mul_left _ MyLemma
```

Now we’re getting close. To prove the last part, we’ll need two theorems about falling factorials that Lean didn’t know about. So, I made a pull request and they’re now part of Lean’s mathematical library.

The first theorem is `descFactorial_mul_descFactorial`

, and it states that (*n*)* _{k}* (

*n*–

*k*)

_{m – k}= (

*n*)

*. We can use it as follows:*

_{m}(*N* – *T*)_{D₁ + D₂ + D₃} = (*N* – *T*)_{D₁} (*N* – *T* – *D*_{1})_{D₂} (*N* – *T* – *D*_{1} – *D*_{2})_{ D₃}

The second theorem is `descFactorial_le`

, and states that if *k* ≤ *m* then (*k*)* _{n}* ≤ (

*m*)

*. It completes the proof as follows:*

_{n}(*N* – *T*)_{D₁} (*N* – *T* – *D*_{1})_{D₂} (*N* – *T* – *D*_{1} – *D*_{2})_{ D₃} ≤ (*N*)_{D₁} (*N*)_{D₂} (*N*)_{D₃}

In Lean, we have:

```
lemma MyLemma {D₁ D₂ D₃ T N : ℕ} :
(N - T).descFactorial (D₁ + D₂ + D₃) ≤
N.descFactorial D₁ * N.descFactorial D₂ * N.descFactorial D₃ := by
rw [← Nat.descFactorial_mul_descFactorial (show D₁ ≤ D₁ + D₂ + D₃ by omega)]
rw [show D₁ + D₂ + D₃ - D₁ = D₂ + D₃ by omega]
rw [mul_comm]
rw [mul_assoc]
apply Nat.mul_le_mul
· exact Nat.descFactorial_le D₁ (show N - T ≤ N by omega)
· rw [← Nat.descFactorial_mul_descFactorial (show D₂ ≤ D₂ + D₃ by omega)]
rw [mul_comm]
rw [show D₂ + D₃ - D₂ = D₃ by omega]
apply Nat.mul_le_mul
· exact Nat.descFactorial_le D₂ (show N - T - D₁ ≤ N by omega)
· exact Nat.descFactorial_le D₃ (show N - T - D₁ - D₂ ≤ N by omega)
```

There’s another inequality on page 10 of the Chaskey paper. This one can be proven without any effort:

```
example {D₁ D₂ D₃ D T : ℕ} (hd : D₁ + D₂ + D₃ = D) :
2*D₁*D₂ + 2*D₁*D₃ + 2*D₂*D₃ + 2*D₁*T + 2*D₂*T + 2*D₃*T
≤ D^2 + 2*D*T := by nlinarith
```

These two inequalities complete the proof using Patarin’s H-coefficient technique, a common method in symmetric-key cryptography.

If we want to use Lean and other theorem provers, they will need to know about all the mathematical theorems that are necessary to prove a result. Freek Wiedijk has compiled a list with computer-checked proofs for the “top 100” of mathematical theorems. Only the proof of Fermat’s Last Theorem is missing, but there is an ongoing effort to prove this in Lean.

Lastly, I wouldn’t be surprised if the syntax of Lean looks confusing, and if it’s unclear which commands to use. In Lean, we can use `exact?`

and `apply?`

to get some hints. (To get started with Lean, I recommend the Natural Number Game as a starting point. It’s a highly addictive game to learn Lean that takes about an hour to complete.)

But before I get complaints that I don’t mention AI anywhere in this post, I should mention that Google DeepMind reached the silver-medal standard solving International Mathematical Olympiad problems. Their approach uses Lean as a backend.

For comments or questions (and suggestions on how to write the proofs more cleanly!), feel free to send me an email: