 Verified Code Generation from Isabelle/HOL

Doctoral Dissertation, 2019
In this thesis, I develop a verified compilation toolchain from executable specifications in Isabelle/HOL to CakeML abstract syntax trees. This improves over the stateoftheart in Isabelle by providing a trustworthy...
In this thesis, I develop a verified compilation toolchain from executable specifications in Isabelle/HOL to CakeML abstract syntax trees.
This improves over the stateoftheart in Isabelle by providing a trustworthy procedure for code generation.
The work consists of three major contributions.
First, I have implemented a certifying routine to eliminate type classes and instances in Isabelle specifications.
Based on defining equations of constants, it derives new definitions that do not use type classes.
This can be used to bypass an unverified step in the current code generator.
Second, I formalized an algebra for higherorder λterms that generalizes the notions of free variables, matching, and substitution.
Terms can be thought of as consisting of a generic (free variables, constants, application) and a specific part (abstraction, bound variables).
With this algebra, it becomes possible to reason abstractly over a variety of different types.
These two parts are independent from each other and can also be used for other purposes.
For example, I have successfully instantiated the term algebra for other term types in the Isabelle universe.
Third, a compiler that works similarly to the existing code generator, but produces a CakeML abstract syntax tree together with a correctness theorem.
More precisely, I have combined a simple proof producing translation of recursion equations in Isabelle into a deeply embedded term language with a fully verified compilation chain to the target language CakeML.
 A Verified Code Generator from Isabelle/HOL to CakeML

Archive of Formal Proofs, 2019
This entry contains the formalization that accompanies my PhD thesis. I develop a verified compilation toolchain from executable specifications in Isabelle/HOL to CakeML abstract syntax trees. This improves over the...
This entry contains the formalization that accompanies my PhD thesis. I develop a verified compilation toolchain from executable specifications in Isabelle/HOL to CakeML abstract syntax trees. This improves over the stateoftheart in Isabelle by providing a trustworthy procedure for code generation.
 An Algebra for HigherOrder Terms

Archive of Formal Proofs, 2019
In this formalization, I introduce a higherorder term algebra, generalizing the notions of free variables, matching, and substitution. The need arose from the work on a verified compiler from Isabelle...
In this formalization, I introduce a higherorder term algebra, generalizing the notions of free variables, matching, and substitution. The need arose from the work on a verified compiler from Isabelle to CakeML. Terms can be thought of as consisting of a generic (free variables, constants, application) and a specific part. As example applications, this entry provides instantiations for deBruijn terms, terms with named variables, and Blanchette’s λfree higherorder terms. Furthermore, I implement translation functions between deBruijn terms and named terms and prove their correctness.
 Certifying Dictionary Construction in Isabelle/HOL

Fundamenta Informaticae, 2019
Type classes are a wellknown extensions to various type systems. Classes usually participate in type inference; that is, the type checker will automatically deduce class constraints and select appropriate instances....
Type classes are a wellknown extensions to various type systems. Classes usually participate in type inference; that is, the type checker will automatically deduce class constraints and select appropriate instances. Compilers for such languages face the challenge that concrete instances are generally not directly mentioned in the source text. In the runtime, type class operations need to be packaged into dictionaries that are passed around as pointers. This article presents the most common approach for compilation of type classes – the dictionary construction – carried out in a trustworthy fashion in Isabelle/HOL, a proof assistant.
 Verified iptables Firewall Analysis and Verification

Journal of Automated Reasoning (Open Access), 2018
This article summarizes our efforts around the formally verified static analysis of iptables rulesets using Isabelle/HOL. We build our work around a formal semantics of the behavior of iptables firewalls....
This article summarizes our efforts around the formally verified static analysis of
iptables rulesets using Isabelle/HOL. We build our work around a formal semantics of the
behavior of iptables firewalls. This semantics is tailored to the specifics of the filter table and
supports arbitrary match expressions, even new ones that may be added in the future. Around
that, we organize a set of simplification procedures and their correctness proofs: we include
procedures that can unfold calls to userdefined chains, simplify match expressions, and
construct approximations removing unknown or unwanted match expressions. For analysis
purposes, we describe a simplified model of firewalls that only supports a single list of rules
with limited expressiveness. We provide and verify procedures that translate from the complex
iptables language into this simple model. Based on that, we implement the verified generation
of IP space partitions and minimal service matrices. An evaluation of our work on a large
set of realworld firewall rulesets shows that our framework provides interesting results in
many situations, and can both help and outcompete other static analysis frameworks found
in related work.
 libisabelle

2018
A Scala library which talks to Isabelle. It works with multiple Isabelle versions.
A Scala library which talks to Isabelle. It works with multiple Isabelle versions.
 Isabelle/CakeML

Archive of Formal Proofs, 2018
CakeML is a functional programming language with a provencorrect compiler and runtime system. This entry contains an unofficial version of the CakeML semantics that has been exported from the Lem...
CakeML is a functional programming language with a provencorrect compiler and runtime system. This entry contains an unofficial version of the CakeML semantics that has been exported from the Lem specifications to Isabelle. Additionally, there are some handwritten theory files that adapt the exported code to Isabelle and port proofs from the HOL4 formalization, e.g. termination and equivalence proofs.
 A Verified Compiler from Isabelle/HOL to CakeML

European Symposium on Programming (ESOP, Open Access), 2018
Many theorem provers can generate functional programs from definitions or proofs. However, this code generation needs to be trusted. Except for the HOL4 system, which has a proof producing code...
Many theorem provers can generate functional programs from definitions
or proofs. However, this code generation needs to be trusted. Except for
the HOL4 system, which has a proof producing code generator for a subset of ML.
We go one step further and provide a verified compiler from Isabelle/HOL to CakeML. More precisely we combine a simple proof producing translation of recursion equations in Isabelle/HOL into a deeply embedded term language with a fully verified compilation chain to the target language CakeML.
 Dictionary Construction

Archive of Formal Proofs, 2017
Isabelle's code generator natively supports type classes. For targets that do not have language support for classes and instances, it performs the wellknown dictionary translation, as described by Haftmann and...
Isabelle's code generator natively supports type classes. For targets that do not have language support for classes and instances, it performs the wellknown dictionary translation, as described by Haftmann and Nipkow. This translation happens outside the logic, i.e., there is no guarantee that it is correct, besides the penandpaper proof. This work implements a certified dictionary translation that produces new classfree constants and derives equality theorems.
 Constructor Functions

Archive of Formal Proofs, 2017
Isabelle's code generator performs various adaptations for target languages. Among others, constructor applications have to be fully saturated. That means that for constructor calls occuring as arguments to higherorder functions,...
Isabelle's code generator performs various adaptations for target languages. Among others, constructor applications have to be fully saturated. That means that for constructor calls occuring as arguments to higherorder functions, synthetic lambdas have to be inserted. This entry provides tooling to avoid this construction altogether by introducing constructor functions.
 Lazifying case constants

Archive of Formal Proofs, 2017
Isabelle's code generator performs various adaptations for target languages. Among others, case statements are printed as match expressions. Internally, this is a sophisticated procedure, because in HOL, case statements are...
Isabelle's code generator performs various adaptations for target languages. Among others, case statements are printed as match expressions. Internally, this is a sophisticated procedure, because in HOL, case statements are represented as nested calls to the case combinators as generated by the datatype package. Furthermore, the procedure relies on laziness of match expressions in the target language, i.e., that branches guarded by patterns that fail to match are not evaluated. Similarly, ifthenelse is printed to the corresponding construct in the target language. This entry provides tooling to replace these special cases in the code generator by ignoring these target language features, instead printing case expressions and ifthenelse as functions.
 Translating Scala Programs to Isabelle/HOL

International Joint Conference on Automated Reasoning (IJCAR), 2016
We present a trustworthy connection between the Leon verification system and the Isabelle proof assistant. Leon is a system for verifying functional Scala programs. It uses a variety of automated...
We present a trustworthy connection between the Leon verification
system and the Isabelle proof assistant. Leon is a system for verifying functional
Scala programs. It uses a variety of automated theorem provers (ATPs) to check
verification conditions (VCs) stemming from the input program. Isabelle, on the
other hand, is an interactive theorem prover used to verify mathematical specifications
using its own input language Isabelle/Isar. Users specify (inductive)
definitions and write proofs about them manually, albeit with the help of semiautomated
tactics. The integration of these two systems allows us to exploit Isabelle’s
rich standard library and give greater confidence guarantees in the correctness
of analysed programs.
 Algorithms for Reduced Ordered Binary Decision Diagrams

Archive of Formal Proofs, 2016
We present a verified and executable implementation of ROBDDs in Isabelle/HOL. Our implementation relates pointerbased computation in the Heap monad to operations on an abstract definition of boolean functions. Internally,...
We present a verified and executable implementation of ROBDDs in Isabelle/HOL. Our implementation relates pointerbased computation in the Heap monad to operations on an abstract definition of boolean functions. Internally, we implemented the ifthenelse combinator in a recursive fashion, following the Shannon decomposition of the argument functions. The implementation mixes and adapts known techniques and is built with efficiency in mind.
 Clone Detection in Isabelle Theories

Isabelle Workshop, 2016
Duplicated code fragments within software projects complicate maintenance and require refactoring. Clone detection frameworks, such as ConQAT, offer wellengineered clone detection functionalities for a number of different programming languages. In...
Duplicated code fragments within software projects complicate maintenance
and require refactoring. Clone detection frameworks, such as ConQAT,
offer wellengineered clone detection functionalities for a number of different
programming languages. In this work, we developed a tool to search Isabelle theory
sources for clones. This analysis takes the rich structure of Isabelle theories
into account by extracting semantic information from document markup. After
extraction, clone detection is performed using ConQAT’s builtin facilities.
 Iptables Semantics

Archive of Formal Proofs, 2016
We present a big step semantics of the filtering behavior of the Linux/netfilter iptables firewall. We provide algorithms to simplify complex iptables rulests to a simple firewall model and to...
We present a big step semantics of the filtering behavior of the Linux/netfilter iptables firewall. We provide algorithms to simplify complex iptables rulests to a simple firewall model and to verify spoofing protection of a ruleset. Internally, we embed our semantics into ternary logic, ultimately supporting every iptables match condition by abstracting over unknowns. Using this AFP entry and all entries it depends on, we created an easytouse, standalone Haskell tool called fffuu. The tool does not require any input — except for the iptablessave dump of the analyzed firewall — and presents interesting results about the user's ruleset. RealWord firewall errors have been uncovered, and the correctness of rulesets has been proved, with the help of our tool.
 IP Addresses

Archive of Formal Proofs, 2016
This entry contains a definition of IP addresses and a library to work with them. Generic IP addresses are modeled as machine words of arbitrary length. Derived from this generic...
This entry contains a definition of IP addresses and a library to work with them. Generic IP addresses are modeled as machine words of arbitrary length. Derived from this generic definition, IPv4 addresses are 32bit machine words, IPv6 addresses are 128bit words. Additionally, IPv4 addresses can be represented in dotdecimal notation and IPv6 addresses in (compressed) colonseparated notation. We support toString functions and parsers for both notations. Sets of IP addresses can be represented with a netmask (e.g. 192.168.0.0/255.255.0.0) or in CIDR notation (e.g. 192.168.0.0/16). To provide executable code for set operations on IP address ranges, the library includes a datatype to work on arbitrary intervals of machine words.
 SemanticsPreserving Simplification of RealWorld Firewall Rule Sets

Formal Methods (FM), 2015
The security provided by a firewall for a computer network almost completely depends on the rules it enforces. For over a decade, it has been a wellknown and unsolved problem...
The security provided by a firewall for a computer network
almost completely depends on the rules it enforces. For over a decade, it
has been a wellknown and unsolved problem that the quality of many
firewall rule sets is insufficient. Therefore, there are many tools to analyze
them. However, we found that none of the available tools could
handle typical, realworld iptables rulesets. This is due to the complex
chain model used by iptables, but also to the vast amount of possible
match conditions that occur in realworld firewalls, many of which are
not understood by academic and open source tools.
In this paper, we provide algorithms to transform firewall rulesets. We
reduce the execution model to a simple list model and use ternary logic
to abstract over all unknown match conditions. These transformations
enable existing tools to understand realworld firewall rules, which we
demonstrate on four decentlysized rulesets. Using the Isabelle theorem
prover, we formally show that all our algorithms preserve the firewall’s
filtering behavior.
 Directed Security Policies: A Stateful Network Implementation

Third International Workshop on Engineering Safety and Security Systems (ESSS), 2014
Large systems are commonly internetworked. A security policy describes the communication relationship between the networked entities. The security policy defines rules, for example that A can connect to B, which...
Large systems are commonly internetworked. A security policy describes the communication relationship
between the networked entities. The security policy defines rules, for example that A can
connect to B, which results in a directed graph. However, this policy is often implemented in the
network, for example by firewalls, such that A can establish a connection to B and all packets belonging
to established connections are allowed. This stateful implementation is usually required for the
network’s functionality, but it introduces the backflow from B to A, which might contradict the security
policy. We derive compliance criteria for a policy and its stateful implementation. In particular,
we provide a criterion to verify the lack of side effects in linear time. Algorithms to automatically
construct a stateful implementation of security policy rules are presented, which narrows the gap
between formalization and realworld implementation. The solution scales to large networks, which
is confirmed by a large realworld case study. Its correctness is guaranteed by the Isabelle/HOL
theorem prover
 The Next 1100 Haskell Programmers

Haskell Symposium, 2014
We report on our experience teaching a Haskellbased functional programming course to over 1100 students for two winter terms. The syllabus was organized around selected material from various sources. Throughout...
We report on our experience teaching a Haskellbased functional
programming course to over 1100 students for two winter terms.
The syllabus was organized around selected material from various
sources. Throughout the terms, we emphasized correctness through
QuickCheck tests and proofs by induction. The submission architecture
was coupled with automatic testing, giving students the possibility
to correct mistakes before the deadline. To motivate the students,
we complemented the weekly assignments with an informal
competition and gave away trophies in a award ceremony.
 Interactive Simplifier Tracing and Debugging in Isabelle

Intelligent Computer Mathematics (CICM), 2014
The Isabelle proof assistant comes equipped with a very powerful tactic for term simplification. While tremendously useful, the results of simplifying a term do not always match the user’s expectation:...
The Isabelle proof assistant comes equipped with a very powerful
tactic for term simplification. While tremendously useful, the results
of simplifying a term do not always match the user’s expectation: sometimes,
the resulting term is not in the form the user expected, or the
simplifier fails to apply a rule. We describe a new, interactive tracing
facility which offers insight into the hierarchical structure of the simplification
with userdefined filtering, memoization and search. The new
simplifier trace is integrated into the Isabelle/jEdit Prover IDE.
 Properties of Random Graphs – Subgraph Containment

Archive of Formal Proofs, 2014
Random graphs are graphs with a fixed number of vertices, where each edge is present with a fixed probability. We are interested in the probability that a random graph contains...
Random graphs are graphs with a fixed number of vertices, where each edge is present with a fixed probability. We are interested in the probability that a random graph contains a certain pattern, for example a cycle or a clique. A very high edge probability gives rise to perhaps too many edges (which degrades performance for many algorithms), whereas a low edge probability might result in a disconnected graph. We prove a theorem about a threshold probability such that a higher edge probability will asymptotically almost surely produce a random graph with the desired subgraph.
 A Visualization Toolkit for Simplifier Traces in Isabelle/jEdit

Master's Thesis in Informatics, Technische Universität München, 2013
The Isabelle proof assistant comes equipped with some very powerful tactics to discharge goals automatically, or to at least simplify them significantly. One of these tactics is a rewriting engine,...
The Isabelle proof assistant comes equipped with some very powerful tactics to discharge goals automatically, or to at least simplify them significantly.
One of these tactics is a rewriting engine, called the simplifier.
It repeatedly applies rules to a term by replacing the lefthand side of an equation by the righthand side.
While tremendously useful, the results of simplifying a term not always match the user's expectation:
sometimes, the resulting term is not small enough, or the simplifier even failed to apply any rule.
For these cases, the simplifier offers a trace which logs all steps which have been made.
However, these traces can be huge, especially because the library of Isabelle/HOL offers many predefined rewriting rules.
It is often very difficult for the user to find the necessary piece of information about why and what exactly failed.
Furthermore, there is no way to inspect or even influence the system while the simplification is still running.
Hence, a simple, linear trace is not sufficient in these situations.
In this thesis, a new tracing facility is developed, which offers structure, interactivity and a high amount of configurability.
It combines successful approaches from other logic languages and adapts them to the Isabelle setup.
Furthermore, it fits neatly into the canonical IDE for Isabelle and is thus easy to use.
 Development of an associative file system

Bachelor's Thesis in Informatics, Technische Universität München, 2011
Organizing multimedia data, e. g. pictures, music or videos is a rather common use case for modern file systems. There are quite a number of applications which try to expose...
Organizing multimedia data, e. g. pictures, music or videos is a rather common use
case for modern file systems. There are quite a number of applications which try to
expose an userfriendly interface for dealing with tagging, sorting and editing these
files. This becomes necessary because sets of such files do not have an intrinsic
hierarchic structure. For example, pictures taken with a digital camera carry EXIF
metadata which can be used to retrieve a picture by date, time or location instead
of an artificial folder structure.
However, the major problem shared by all of those multimedia applications is
that the files are actually stored in folders on a traditional file system. As such,
any operation done by an user outside of the application leads to inconsistencies
inside of the application. Also, metadata produced by one application cannot be
consumed by another one because of proprietary formats.
In this thesis, a file system which uses the established RDF standard for storing
metadata is developed which imposes only little structural requirements on the
data. The system features both an API which enables highlevel operations on file
contents and metadata and a CLI which resembles ideas from versioning systems
like Git.
Also, a formalization of the most important operations is given, including a
concept of transactions, which has been adapted from relational database systems
to fit in the environment of a file system.