CakeML projects

Projects

As the CakeML project has grown, it has become harder to navigate all of its parts. This page offers an overview of CakeML related projects and indicates which are the most active ones currently.

Most active projects

Verified Checkers

We have built (and are continuing to build) checker programs that can analyze the correctness of proof logs for automated reasoning tools in various theories (TACAS'21,AAAI'24).

All of these checkers have been proved correct down to the level of machine code that runs them.

We see these checkers as an excellent use case for CakeML, and are continuing to build CakeML tooling for verified checkers.

Do you have a proof log format you'd like a verified checker for? Get in touch on Zulip or email!

PureCake Compiler

PureCake is a verified compiler for a Haskell-inspired, lazy, purely functional programming language with monadic effects.

The PureCake compiler is built on top of the CakeML compiler. It demonstrates that the CakeML source language can be used (in its untyped interpretation) as a compilation target for other compilers.

We are actively adding features to the source language and improving the compilation of thunks, including adding thunk-related optimisations to CakeML's verified garbage collector (in the style of OCaml).

Pancake Compiler

Pancake is a new 'thin' language constructed with the aims of (1) ease of source-level verification, (2) user-controlled small memory footprint, and (3) simple transportation of correctness results from source to executable binary code. The initial intended use of the Pancake language is to write verified device drivers.

The Pancake compiler builds on the lower parts of the CakeML compiler backend.

CakeML Compiler Backend

The CakeML compiler (depicted in the diagram on the front page) is a central part of the CakeML project.

We are always looking for ways to make it generate more performant code. Currently, the focus is on:

improving WordLang and LabLang,
speeding up the verified bignum library,
implementing new optimisations (e.g. this one).

Contributing a new optimisation has been a popular project topic for interns and MSc students.

Less active projects

Candle Theorem Prover

Candle is a fully verified interactive theorem prover for higher-order logic, more specifically: a fully verified clone of HOL Light running on top of CakeML.

The Candle theorem prover goes beyond the original HOL Light by including a verified kernel primitive for fast computation (in a first-order subset of the logic).

There is also work on adding support for ad hoc overloading of constants in the style of Isabelle/HOL.

A verified checker for OpenTheory article files came out as a side effect of this work on Candle.

Compiler Bootstrapping

The CakeML compiler can bootstrap itself inside the logic of HOL4. This is one of the novel aspects of the CakeML compiler, which we have underlined since the beginning of the CakeML project.

The original paper on CakeML's bootstrap (POPL'14) was not as clear as it could have been, so a separate description (CPP'21) has also been written.

Recently, we are working on speeding up the in-logic evaluation of the compiler when applied to itself.

Currently idle projects

A Verified Hardware-Software Stack

We have extended the CakeML project also in the direction of verified hardware. The hardware target is Silver, a verified proof-of-concept processor. The result is an approach to producing verified stacks that scales to proving correctness, at the hardware level, of the execution of realistic software including compilers and proof checkers.

Choreographies

Choreographies are an abstraction for globally describing deadlock-free communicating systems. The CakeML source language has been used as a compilation target for a verified compiler, called Kalas, that compiles choreographic language to one CakeML program for each endpoint in the network. The end-to-end correctness result ensures all generated endpoints adhere to the system description, preserving the top-level communication guarantees.

Verified Space-Cost Semantics

The CakeML source language is a high-level programming language with a runtime that include a (verified) garbage collector that provides a layer of abstraction that makes memory seem unbounded. In order to answer questions such as Will this program resort to an out-of-heap-space or out-of-stack-space error? we have developed a space cost semantics that we have proved sound w.r.t. to the CakeML compiler.

Floating-Point Numbers and CakeML

There has been work, mostly by Heiko Becker, on floating-point numbers in CakeML. He developed a verified certificate checker for floating-point error bounds (FMCAD'18), and made it possible for the CakeML compiler to perform fast-math style optimisations during compilation of nummerical code (ECOOP'22, CAV'19).

Proof-Producing Synthesis of Code

The most effective way of producing verified CakeML code is to start by writing the functions as shallow embedded functions in HOL and then apply a proof-producing code synthesis tool, called the translator, to the HOL functions. The translator generates CakeML code (as a deep embedding) which it proves implements the given shallow embeddings.

Even though the CakeML translator (JFP, JAR'20) has not been further developed recently, it is an important much-used cornerstone of the CakeML project.

Characteristic Formulae for CakeML

For some applications, e.g. the verified checkers mentioned above, we want very precise control over what CakeML code is being generated and verified. In these scenarios, users are encouraged to use CakeML's version of CFML, which is a separation logic-style program logic for reasoning about ML programs. CakeML's version of CF is proved sound w.r.t. the CakeML source semantics. It supports reasoning about mutable state, exceptions and I/O.

Read more (ESOP'17) — see also ITP'19 and VSTTE'18

Compiler Explorer

We have always wanted ways to visualise the how programs get compiled by the CakeML compiler backend. There was an ambitious student project in this direction (TFP'17) some time ago. The current CakeML compiler no longer supports the full features of this student project, but it will print out each intermediate representation of a program it compiles if given --explore as an option on the command line.

Up-coming new projects

Dafny-to-CakeML Backend for Dafny

We have a small amount of funding for implementing a CakeML-generating backend for the Dafny compiler. This project is still in the starting blocks.

[ Your suggestion here! ]

Let us know (via Zulip or email) if you'd like to take CakeML in some new direction. We can potentially collaborate to make that a reality!

Join us!

We welcome students / collaborators to join us on these projects. We also have a number of project suggestions that are either subprojects of those listed above, or more ambitious standalone projects. Get in contact if you are interested!