[PRL] Fwd: Whatever happened to partial evaluation?

Aaron Turon turon at ccs.neu.edu
Mon Nov 28 13:56:45 EST 2011


---------- Forwarded message ----------
From: Alexey Radul <axch at mit.edu>
To: prl at lists.ccs.neu.edu
Date: Mon, 28 Nov 2011 10:54:11 -0500
Subject: Whatever happened to partial evaluation?

Hello PRL,

A question for your collective wisdom:

Whatever happened to partial evaluation?  It seems like such a
promising compilation technique -- the ability to write embedded
interpreters with no performance cost could be called the holy grail
of the Lisp Way -- but research interest appears to have fallen off a
cliff at the turn of the century.  partial-eval.org is dead.  The
partial evaluation section on read-scheme.org just stops.  I may be
misreading the paper titles and abstracts, but PEPM (the workshop on
Partial Evaluation and Program Manipulation) seems to have drifted off
to be pretty much just Program Manipulation.

My question is, what happened?  Did the community decide that partial
evaluation is done, and it's up to industry to adopt it?  If so, am I
right that industry largely did not?  Also, is there any good summary
of the research consensus about the "right way" to do partial
evaluation, or the tradeoffs that it is up to industry to make?

If not, did the community decide that partial evaluation doesn't work
and give up?  If they did, is there any summary anywhere of what the
insurmountable problem turned out to be?  And what ways to surmount it
were attempted and why they didn't work?  I have a hunch that the
killer is the intermediate-expression bulge: huge numbers of
specialized variants generated during compilation.  But I would really
appreciate reading a cogent discussion of attempts to mitigate this
problem and how well or poorly they worked.
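(For concreteness, here is a minimal sketch of the textbook example --
specializing power(x, n) on a static exponent n.  This is my own
illustration in Python, not anyone's actual partial evaluator; the
point is that every distinct static input yields its own residual
program, which is where the bulge of specialized variants comes from.)

```python
def power(x, n):
    # General, interpreter-style definition: n is dynamic.
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    # "Partially evaluate" power with respect to the now-static n,
    # unfolding the loop into a residual program.  Each n produces
    # a separate specialized variant.
    body = " * ".join(["x"] * n) if n > 0 else "1"
    src = f"def power_{n}(x):\n    return {body}\n"
    namespace = {}
    exec(src, namespace)
    return namespace[f"power_{n}"], src

power_3, src = specialize_power(3)
print(src)         # def power_3(x): return x * x * x
print(power_3(5))  # 125, agreeing with power(5, 3)
```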

Alternately, is research activity still proceeding under a different
name?  What are "supercompilation" and "distillation", and how are
they related?

The reason I ask is that my group is trying to get automatic
differentiation to be both first-class and efficient.  (Why?  Because
AD can generate gradients of machine learning models without manual
input -- if made practical, this could revolutionize machine
learning.)  Efficient but not first-class, and in particular poorly
nestable, automatic differentiation can be had from the
source-to-source transformers at autodiff.org; first-class but
inefficient automatic differentiation can be had by operator
overloading.  We were eyeing partial evaluation as a way to eliminate
the allocation overhead of operator overloading and thereby get a
first-class and efficient implementation.
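(To illustrate where that overhead comes from, here is a minimal
sketch -- my own, not our group's code -- of first-class forward-mode
AD via operator overloading with dual numbers.  Every arithmetic
operation allocates a fresh Dual; once the function being
differentiated is known, that is exactly the interpretive overhead one
would hope to partially evaluate away.)

```python
class Dual:
    """A dual number: primal value plus tangent (derivative) value."""
    __slots__ = ("val", "dot")

    def __init__(self, val, dot=0.0):
        self.val = val  # primal value
        self.dot = dot  # tangent value

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Each operation allocates a new Dual -- the overhead in question.
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule on the tangent component.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    # Seed the tangent with 1.0 and read the derivative back out.
    return f(Dual(x, 1.0)).dot

# d/dx (x*x + 3*x) at x = 2 is 2x + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # 7.0
```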

Does this sound like a plausible research agenda?  What is the
canonical reference for how to implement a partial evaluator for this
kind of purpose?  (Yes, I read Jones, Gomard, and Sestoft 1993
"Partial Evaluation and Automatic Program Generation"; is there anything more
modern?)  Or is there an argument that partial evaluation will not
serve this purpose?

Thank you,
~Alexey Radul
