[PRL] Torture Chamber Wednesday 3:00 PM, WVH 366

Tue Sep 4 11:25:14 EDT 2012

Hello everyone,
  I'll be doing a practice run of my HOPE talk "Verifying an Open Compiler from ML to Assembly" tomorrow at 3:00 in 366.  An abstract is below.  Please come and torture me!
 --Jamie

There has been remarkable progress in recent years on proving
correctness of compilers for increasingly realistic languages. Most of
this work, however, proves compiler correctness under a closed-world
assumption, that is, assuming that the verified compiler will always
compile whole programs.  This is an unrealistic assumption since the
"whole" programs that we run are invariably comprised of code linked
together from various sources (including low-level libraries, code
compiled using other compilers, and possibly code written in different
source languages).  To address this problem, Benton and Hur have
proposed that one provide an extensional specification (via a logical
relation) of when a low-level program correctly realizes a high-level
one, and then prove that the compiler respects this high-low
relationship.  While a big step forward, this approach has serious
practical limitations: it does not support linking with non-trivial
target code produced from a different source language (only with code
produced by different compilers for the same source language), and it
seems nontrivial to scale to multi-pass compilers.

We propose a new approach to verifying an open compiler: we give an
operational semantics for interoperability between the compiler's
source and target languages (S and T) by embedding theme into a single
language using boundaries in the style of Matthews and Findler.
Compiler correctness can then be stated as contextual equivalence in
the combined language: we prove that if e compiles to e', then e is
equivalent to ST(e') -- where ST(e') represents the target code e'
wrapped in a boundary so that it may be used as a source term.  Our
approach permits reasoning about linking with arbitrary target code
(no matter its source).  For multi-pass compilers, we simply formalize
interoperability using boundaries between each pair of adjacent
languages in the compiler; we can stack these boundaries to allow
interoperability between the compiler's source and target languages.
Since the desired compiler correctness property is a contextual
equivalence, we can reason about each compiler pass in turn, then
easily link the results together.  This talk will discuss the
high-level implications of our approach and the challenges involved in
formalizing it for source languages that include features such as
parametric polymorphism and mutable references, and for low-level
target languages.