Secure and Verified: Post-Quantum Cryptography with F*
March 12, 2021
By Zan Xu, BlackHorse Solutions, Inc.
Cryptographic software is in a league of its own when it comes to the level of confidence needed. The algorithms involved are relatively well-specified and follow strict coding guidelines. These attributes make it a good candidate for using formal methods to guarantee code quality. As NIST nears standardizing algorithms for post-quantum security, we have been working on applying new formal methods work to post-quantum algorithms. Along the way, we've implemented the lattice-based digital signature scheme, Dilithium, in F*, a specialized language for formal verification.
F* is a functional programming language based on OCaml and F#. Its development is led jointly by INRIA and Microsoft Research. It enables us to couple proofs tightly to code by treating propositions as types in the language and theorems as functions. This style may be familiar to many as the Curry Howard isomorphism and is heavily influenced by similar theorem provers such as Coq.
The goal of F* is to produce efficient code in a user-friendly way. It produces OCaml and C code that is both human-readable enough to be manually checked while allowing us enough expressive power to produce code that is no less performant than hand-written code. Furthermore, F* is capable of automated deduction backed by the powerful Z3 SMT solver, significantly reducing the tedium of writing detailed proofs.
With F*, we can guarantee any property of our code that can be expressed in predicate logic. For example, we can require that all array accesses be provably in-bounds, giving us safe arrays without needing runtime checks. We could also prove that code never branches on secrets, or that a function always terminates, or anything else. But the most important property is the ability to match an implementation to a specification. Given two functions, we can guarantee that for all inputs, the output of both functions is the same. Nonetheless, formal methods are not a panacea -- if our spec is incorrect, our implementation will also be incorrect. What this does allow us to do is reduce the attack surface of our implementation to the attack surface of the spec, thus guaranteeing that any changes made between our implementation and spec do not change the result.
For example, we can map a low-level implementation of an algorithm to a high-level implementation. F* supports both, and we can take advantage of this to write high-level formal specs that still look natural while also producing extremely efficient low-level implementations. We can also introduce optimizations over the original algorithm, substituting a slow function but simple subroutine for an efficient but convoluted one. In this case, we get the best of both worlds -- we get the confidence of the simple subroutine, where we know it will do what we want, along with the runtime efficiency of the fast but more complicated routine.
Formal verification is obviously very hard -- adding proofs to guarantee a property will always take much more manpower than either just not checking those properties or doing runtime checks. We found that certain routines lent themselves to the prover much more easily. Z3 is very strong, so proofs than can be described as simple SMT problems (checking whether a given equation is true) can almost always proved automatically given enough resources, hence very simple numerical proofs are not a problem. But anything requiring complex or nested data structures becomes very annoying very quickly since F* must unwrap and unroll them to inspect their contents before the SMT solver can get to work on them. This means F* works somewhat better on cryptographic code than others. We did find that the logical complexity of Dilithium posed issues. For example, Dilithium has failure modes that don't exist in all other cryptographic algorithms, convoluting the proofs and adding a significant level of difficulty to the project.
There is also the issue of performance. Iterating on code requires proofs to be re-checked, which adds considerable overhead to the already tedious task of writing proofs. To deal with this, the language keeps proof hints to speed up the process of proof-checking, and unchanged proofs do not need to be re-checked. Even so, proof checking adds a very large overhead to the development process that must be planned around.
So far, all F* post-quantum implementations are lattice-based, including our own implementation of the lattice-based signature scheme, Dilithium. This appears to mostly be because lattice-based algorithms are more popular in general. Other schemes are also under consideration for standardization include code-based encryption schemes, multivariate digital signature, and supersingular isogeny-based key exchange.
- Code-based schemes, or at least the most developed implementation McEliece, requires randomization not only in key generation but also in encryption, which makes the encryption step an impure function. This can certainly be worked around and might not end up being much more than an annoyance for future McEliece F* implementers, but it will likely be awkward to deal with.
- Multivariate schemes, such as Rainbow, might very well be more suitable for formalization than Dilithium, since it appears to avoid a lot of the problems that plagued our implementation and is certainly worth looking into, so long as the underlying cryptography remains secure.
- Supersingular isogeny-based scheme is also a potential candidate for formalization, especially considering it shares similar primitives to existing elliptic curve methods which are already formalized in F*.
Despite all this, the tools we have for formalizing code are very refined and suited to the problems at hand. The level of assurance obtained by using verified code is much appreciated, hence verified code is highly preferred over unverified cryptographic code. Yet verified code has seen limited adoption even though it has been successfully incorporated in critical parts of web browsers. There is still lots of room for both the technology and adoption to grow so that verified code could be deployed to many more applications than it is currently in. We believe the utility of formal methods is indisputable and look forward to increased adoption of formal methods in crypto.
We are contributing our implementation to Hacl* (hacl-star.github.io), a cryptography library written in F*. Our code can be found here: https://github.com/BlackHorseSolutions/hacl-star.
BlackHorse is a rapidly growing technology company in the national security space with leading capabilities in cyber, electronic warfare, digital operations, artificial intelligence/machine learning, information operations, readiness support, and full spectrum operations. Headquartered in Herndon, VA, and with offices in Tampa, FL, Fayetteville, NC, and Denver, CO, we support our nation’s most pressing national security needs through contracts with the Department of Defense, Intelligence Community, and other Government agencies. BlackHorse has openings in some of the most exciting areas of National Security, which can be found at https://www.blackhorsesolutions.com/. We can also be found on LinkedIn, Twitter, Facebook, and Instagram; follow us!