[PRL] At long last, my ICFP notes [long]
Mitchell Wand
wand at ccs.neu.edu
Sat Oct 8 12:07:28 EDT 2005
Notes from ICFP 2005 and Workshops
================================================================
Sat Sep 24
Scheme Workshop
[I counted 24 people in the room]
================================================================
Type Classes without Types
Ron Garcia (IU)
Adds to Scheme a system of "predicate classes" like Haskell type
classes. Think of this as a system for generic functions and predicate
dispatch.
Interface:
define-class let-class
define-instance let-instance
define-qualified define-open-qualified ; for defining generic functions
(define-class (Eq a)  ; a is a predicate variable
  ((== a a) (lambda (l r) (not (/= l r))))
  ((/= a a) (lambda (l r) (not (== l r)))))
;; predicate variables determine dispatch
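[For comparison, the Haskell class being mimicked looks roughly like
this -- a standard Prelude-style sketch of mine, not code from the talk:

class Eq a where
  (==), (/=) :: a -> a -> Bool
  x == y = not (x /= y)   -- default methods defined in terms of each
  x /= y = not (x == y)   -- other, just like the Scheme version above

An instance then only has to supply one of the two methods.]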
Interesting wrinkle: generic functions must use the instances that
are visible at the point of call, not at the point of definition.
Uses identifier macros for classes and generic functions.
Uses non-hygienic macros to capture the correct method table inside a
let-instance.
Does not support subclassing. Not clear whether this is inherent in
the design or not.
================================================================
Eager Comprehensions: the design of SRFI 42.
Sebastian Egner
Sample: (list-ec (:range k n) k)
Three notions of comprehension:
1. Mathematical {f(x) | x in S, P(x)}
2. Lazy comprehension: the lazy list obtained from lazy list s by
filtering with the predicate p and mapping with function f
3. Eager: (do-ec (: x s) (if (p x)) (f x))
"Run variable x through sequence of bindings specified by s, for each
value thest if predicate p is satisfied, and if so call procedure f
for side-effects"
Must be able to add application-specific set-builders (comprehensions)
and enumerators (generators), eg multiset-ec and :edges in
(multiset-ec (:edges e g) (color e))
Implemented using nice hygienic macros in continuation-passing style.
Basis is imperative (but see below): do-ec, but then can build
functional-looking things on top of this. Don't want to always build
a list first.
Nice discussion of design dimensions.
Basis is not really imperative; uses rebinding loop, but side-effects
are possible.
Interesting example:
{(n,k) | n in [1,p], k in [1,n]}
Here scoping is static but scopes are non-contiguous!
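[In Haskell notation (my gloss), that example is the triangle
comprehension:

triangle :: Int -> [(Int, Int)]
triangle p = [ (n, k) | n <- [1 .. p], k <- [1 .. n] ]

n is bound in the first qualifier but must scope over the second
qualifier *and* the head (n, k) at the far left -- hence static but
non-contiguous scopes.]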
================================================================
Abstraction and Performance from Explicit Monadic Reflection
Jonathan Sobel
If you implement your monad ops as macros, then you can get some
decent performance. The talk was much better than the paper. There
may be something interesting here, but I'm still not sure what it is.
================================================================
An Operational Semantics for R5RS Scheme
Jacob Matthews
Uses multilevel contexts to manage non-deterministic order of
evaluation correctly.
Also does multiple-value return better.
Implemented using PLT Redex.
Martin Gasbichler says he's formalized the Scheme macro system (both
syntax-rules and syntax-case) using PLT Redex. He says he uses
explicit substitutions. (Dave, you should contact him).
================================================================
Ubiquitous Email
Manuel Serrano
An IMAP synchronizer in Bigloo: synchronize your email on 2 different
IMAP servers.
People seemed excited by this app; alas, I was still recovering from
lunch.
================================================================
A Replacement for Bibtex in Scheme
Jean-Michel Hufflen
System called MlBibtex. (Multi-lingual Bibtex)
Bibtex: widely used, stable, but: old (1991); uses an old-fashioned
language for bibliography styles; advanced features are put into action
by inserting LaTeX commands in the output files (not so good if you
want to output xml...)
Other successors to bibtex:
Bibtex8, Bibtex++ (interprets styles in Java), Bibulus (uses Perl and
xml)
Parses .bib files to an xml tree (using SXML). Bibtex entries can have
attributes for multilingual stuff.
.bst files replaced by "nbst": something close to XSLT (but not
identical). This is nice and declarative and "seems to be suitable
for style designers".
First implementation was in C, but this was not extensible or
modular. eg: i18n is nasty: need to worry about different alphabet
orderings, etc. Need to be able to do this in an extensible way....
Also want to be able to translate tex annotations flexibly:
\textsf{Scheme} workshop ==> <emph ..> ...</emph> Workshop
\pgname{Scheme} workshop ==> <...>...</...> Workshop
Implemented in MIT Scheme. Does not use graphics or parser generator
(but does use side effects in an allegedly disciplined way).
Ported to Bigloo and PLT Scheme (almost) w/o difficulty. Promises
public release "by October".
He says he misses an interface for looking for files, eg
\bibliography{my-entries}.
I wonder how hard it would be to do this in a tasteful way (eg with a
parser-generator and SXT?)
================================================================
I cut over to CUFP for
Jonathan Sobel:
The OMG's MDA: a marketing TLA for FPLers
In industry, people are resistant to new ideas in any language.
People seem to think compilers, etc., even for little languages, take
years and years to build.
One way to battle this is to coopt things that have lots of support,
eg MDA. (MDA = UML + making some code that doesn't break wrt the
UML). Their slogan: design at a high level and then generate the
code. Every system has a design, even if it exists only in the
programmer's head at the time he writes the code (and is then
immediately lost!). So we want to allow the programmer to write
high-level /executable/ artifacts.
The OMG has produced some very nice documents that make this argument,
and argue that this is only an evolutionary step from Fortran, Java,
etc.
Don't tell them you're giving them Erlang, tell them you're giving
them a high-level modelling tool (!).
Phil Wadler was quoted as saying: "Why should we care about XML? Ans:
People are afraid of parentheses, but for some reason they're not
afraid of XML."
Suggestion from audience: go from your Erlang code, and generate UML
diagrams or use-case diagrams with nice little men, etc.
================================================================
Then I went over to TFP for
Kenichi Asai
Logical Relations for Delimited Continuations
Extension of [Wand 93].
Start w/ offline specializer for CBN from Wand 93.
Get offline specializer for CBV-- using higher-order program in the static
computation, using something close to CPS. Get CBVness by using
let-insertion on applications. Prove correctness using logical
relations at continuation types, also.
Question: can you write this in DS? Ans: yes, but you need to use
shift/reset.
Next question: how do you prove this correct? Ans: use logical
relations again. But at function type, need to wrap the operator in a
reset (both for ordinary function types and continuation types).
Next Question: Can we add shift/reset to the base language? Yes,
prove by LRs using Danvy-Filinski 89 type system, keeping track of target
types of the different continuations.
================================================================
Sun Sep 25
FDPE
================================================================
How to Design Class Hierarchies
Matthew Flatt (for Matthias)
Matthew gave an excellent account of the HtDC methodology. He started
off with the best version of the Design Recipe that I've heard yet.
For me, at least, this was the first time I really understood the
Template step. He used the rhetoric of "Template (Inventory)"-- at
this step you make a list of the information you have available,
including things like (ant-weight anAnt), etc. This was new to me,
and also fits into how I like to attack proofs, etc.
================================================================
Snuck down to the Erlang workshop for
TypEr: A Type Annotator for Erlang
Lindahl
Erlang is like Scheme: it is both "dynamically-typed" and
"type-safe". Goal is to annotate Erlang pgms with types, much as you
would with Scheme programs. He showed a lattice of types in Erlang.
Apparently Erlang operators respect this lattice better than Scheme's
do.
Types by constraints. Seemed pretty simple-minded, and took some
criticism from the audience: poor recursive types, no polymorphism, no
process types.
================================================================
Back to FDPE for
Laziness w/o all the Hard Work
Eli Barzilay
Students consider the defined language a toy; they don't use a feature
in the defined language enough to really understand it. Better to
exercise it in both defined and defining language.
So what to do about teaching laziness? Ans: make the implementing
language lazy-- create a "Lazy Scheme" language level.
Code turns from red to green when it's executed-- so they see the
laziness demonstrated.
================================================================
Teaching Image Synthesis in Functional Style
Jerzy Karczmarczuk
Nifty pictures from small specs, using a functional subset of Python
to script POV-Ray. I wonder if this fits in our curriculum
somewhere. Or is our course more about how to build tools like
POV-Ray?
================================================================
MinCaml: A simple and efficient compiler for a minimal functional
language
Eijiro Sumii
Compiler written in 2000 lines of OCaml (including comments and blank
lines(!)) But produces code comparable to OCamlOpt and GCC for some
serious applications (eg ray tracing, Huffman encoding, etc.)
Asai told me that the compiler code is really clean.
MinCaml is also the name of the source language-- it is strict, typed, HO,
functional, and impure (it has imperative arrays). But no gc,
datatypes/patterns, polymorphism, exceptions, objects. Some
students added these, just for fun.
Gasbichler said that Mike Sperber had written something similar.
================================================================
Then back to the Erlang workshop for a talk, from which I learned the
following: When writing your slides, do not put the key formulas in
light green on a white background.
================================================================
Mon Sep 26
ICFP Invited talk
Francois Pottier
Where is ML type inference headed?
1. STLC type inference -- by reduction to constraints. Constraint
solving is just unification.
Can move type environments from source lang to lang of constraints by
writing things like
[x:t] = (x=t) {??}
[\x.e:t] = (Exists a1 a2).(def x:a1 in [e:a2], a1->a2 = t)
2. Can do similar thing to generate constraints for Hindley-Milner
polymorphism, but need to allow type schemes to carry constraints.
s ::= (all a C).t
Interpret constraints so that
def x: s in C == [s/x]C
(all a,C.t) << t' == E a.(C & t = t')
Now say
[x:t] = x << t
[let x = e1 in e2 : t] = def x: (all a [[e1:a]]).a in [e2:t]
The constrained type scheme (all a [[e1:a]]).a is principal for e1.
After this all you have left is constraint solving. But that's ok,
since the semantics of constraints is clear.
This is good because constraint generation is linear; generation and
solving are separate [important!]; and the constraint language
remains simple as the programming language grows.
// I really like the no-mixing property!!//
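A minimal sketch of that separation in Haskell -- my reconstruction of
the STLC rules above, with my own names (Ty, Tm, gen): one linear walk
over the term emits equations; unification is a completely separate pass.

import Control.Monad.State

data Ty = TVar Int | Arrow Ty Ty deriving (Eq, Show)
data Tm = Var String | Lam String Tm | App Tm Tm

type Constraint = (Ty, Ty)             -- an equation t1 = t2

fresh :: State Int Ty
fresh = do
  n <- get
  put (n + 1)
  pure (TVar n)

-- gen env e t emits the constraints forcing e to have type t under env.
gen :: [(String, Ty)] -> Tm -> Ty -> State Int [Constraint]
gen env (Var x) t = case lookup x env of
  Just tx -> pure [(tx, t)]            -- [x:t] = (x = t)
  Nothing -> error ("unbound " ++ x)
gen env (Lam x e) t = do
  a1 <- fresh                          -- Exists a1 a2 ...
  a2 <- fresh
  cs <- gen ((x, a1) : env) e a2       -- def x:a1 in [e:a2]
  pure ((Arrow a1 a2, t) : cs)         -- ... and a1 -> a2 = t
gen env (App e1 e2) t = do
  a <- fresh
  cs1 <- gen env e1 (Arrow a t)
  cs2 <- gen env e2 a
  pure (cs1 ++ cs2)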
Part II. Generalized Algebraic Data types (GADTs)
eg define a type "term" s.t.
datatype (term a) where
Lit :: int-> (term int)
..
Pair :: A a,b. term a -> term b -> term (a * b)
//note: in ML, you can't write term (a * b).
..
Then can write a pgm defining eval :: (All a).(term a -> a), easily
enough.
Seems easy to generate constraints for things like eval. But this
requires introducing implication into the constraint language. So now
you've got the first-order theory of equality of trees, which is
decidable but intractable. But constraints no longer have mgu's
(oops!), eg: many type schemes for the "cast" function, but none is
principal. Could have a principal _constrained_ type scheme, but
that's not an acceptable soln.
Solution: principal types are recovered by means of mandatory type
annotations, and a local shape inference layer is added so as to
allow omitting some of these annotations.
(* I missed a section here *)
//damn, slides going by too fast now...//
//Ahh, the Talk is on the web, at his web page...//
Eliminate coercion annotations because you can compute "shapes" of type
variables locally (by ordinary unification...)
Book chapter, w/ Didier Remy, is in the works.
Martin Sulzmann says this is all constraint logic programming...
================================================================
Anatomy of a Loop
Olin Shivers
letrec/tail-recursion is insufficiently structured/modular.
A loop clause throws bits of the clause into different parts of the
complete loop. So clause order != scope or control order.
Essence of lexical scope:
1. Programmer can locally choose variable names
2. Definitions control-dominate uses (ie, bindings control-dominate
references) -- can't get to var reference w/o going through its
binder:
(lambda (x) ... x ...)
control can't get to x w/o going through the binding.
Important for humans and computers. (No refs to unbound vars).
[cf example at end of Egner's talk!]
Design idea: Let's take feature 2 as our definition of lexical scope.
System consists of 4500 LOC, w/ 83 macros, of which about 70 are cps
macros ("cps macros are a way of writing Scheme macros that translate
from one language to another, neither of which are Scheme").
Bob Harper: where do you put the invariant?
Greg Morrisett: the labels are not first-class values, so you have a
much better chance of formulating the correctness properties.
================================================================
Acute: High Level Programming for Distributed Programming
Peter Sewell
Problem 1: A distributed PL shouldn't have a built-in notion of network
communication. But want type-safe marshalling. Then you could build
comm primitives as libraries.
Problem 2: Version change. Also have pervasive distributed
development and deployment. Must interoperate among different
versions, execution domains, etc.
Problems of typesafe marshalling:
send(marshall (fun x -> printint(x+1)))
Want printint to be rebound to local resource at unmarshal-time. What
else should be dynamically bound?
Solution: mark a region of the sending pgm; refs to variables above
the line are dynamically rebound at unmarshal-time.
What abt marshalling abstract types? Receiver must respect sender's
invariants. eg Balanced Trees
Soln: construct globally-meaningful runtime type names. Need to hash
both modules and runtime instantiations (for effectful modules). Use
160-bit nums for this. Or let pgmer force compile-time generation,
for linking against a shared module... Type system based on singleton
kinds...
Also need shared typed expression-level names, eg handles, channel
names, etc. 4 different scenarios...
Problem: versioning. Use version hashes..
But these interact in nasty ways.
================================================================
An Expressive Language of Signatures
Norman Ramsey
Let's _program_ with interfaces, not just edit them w Emacs.
ML interfaces are called signatures; a signature is the type of a
module.
Signatures support data abstraction: a type in a signature can be
abstract. An abstract type can be refined later (fibration).
Several ways to program w interfaces
1. combine properties from distinct interfaces (mixins)
2. Adapter pattern on modules
3. Extending modules
Nifty language for writing specifications by inheritance, renaming,
etc. IOP folks should read this.
================================================================
Combining Programming with Theorem Proving
Hongwei Xi
This looked interesting, but I couldn't follow it in real time. The
key feature of this system is that types are said to be static and do
not depend on programs.
================================================================
Modelling Substructural State
Matthew Fluet
"Uniqueness types" for "taming" state (& allowing strong update).
Unique objects are too restricted; need to have "mixed objects" with
subtle differences in the type systems.
Presents model of core language with mutable references, type-varying
mutation, explicit deallocation, etc.
Conventional type systems satisfy structural properties: exchange,
contraction (copy), weakening (drop).
Substructural type systems don't. We'll keep exchange but drop the
others.
Involves 4 type qualifiers: Unrestricted, Affine, Relevant, Linear
These qualifiers form a neat lattice.
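My sketch of that lattice, reading each qualifier by the structural
rules it admits (the paper's own orientation may differ):

            Unrestricted   (weakening + contraction)
             /          \
        Affine          Relevant
      (weakening)     (contraction)
             \          /
               Linear    (neither -- exchange only)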
Paper is scary, but the slides were good. Now if I could only stay
awake...
================================================================
Principled construction of an operating system in Haskell
Andrew Tolmach
This talk was definitely a highlight of the meeting. Andrew got up to
the podium and explained how he had been in kind of a rush and had
forgotten to boot his machine. He brought up the boot menu and said
"Well, what OS shall I boot? I could boot Linux, but it would take
forever. And Windows, well, when I'm done all I would have is
Windows. Wait! What's this other option? 'House'? Let's try it."
And the machine boots successfully in under 10 secs. Big round of
applause.
House is a kernel (scheduler, resource mgt) written in Haskell. Does
privileged hardware ops (raw IO, page table manipulation) directly,
not thru C library. Still requires runtime support in C (GC, etc.)
Haskell is safe & pure, so should be good for high-assurance
development. Memory safety rules out many kinds of bugs.
But: the IO monad hides many sins these days. So what level of
assurance can we have about a program that uses the IO monad?
Approach: Use the H (Hardware) monad-- a constrained subset of IO
monad, including only primitives for privileged IA32 ops.
The operations are partially specified by assertions in a logic called
"P-logic" from the Programatica project. Slogan: "Programming as if
properties matter."
Looks very nifty!
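The shape of the idea, as I understand it -- a sketch in Haskell with
illustrative primitive names, not the real House API:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Word (Word8, Word16)

newtype H a = H (IO a)                    -- abstract to clients:
  deriving (Functor, Applicative, Monad)  -- you get H, never raw IO

type Port = Word16

inB :: Port -> H Word8       -- only privileged hardware ops are
inB _ = H (pure 0)           -- exported (stubbed here)

outB :: Port -> Word8 -> H ()
outB _ _ = H (pure ())

A kernel written against H can only perform what H exports, which is
what makes "uses H" a stronger claim than "uses IO".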
================================================================
Tue Sep 27
Brendan Eich
Javascript at 10 years
History: Opportunity to create a scripting lang before the Mozilla/Java
deal. Prototype hacked together in a week.
Design goals: make it easy to copy/paste snippets of code (for viral
marketing!). Tolerate minor errors. Simplified onclick, etc., event
handling, inspired by hypercard. Pick a few hardworking powerful
primitives. First class functions for procedural abstraction.
Objects everywhere, prototype-based. Leave all else out.
Event handling: functions as event handlers must be first class.
Netscape management fretted: why two pgmg langs? Answer: division of
labor. Java for high-priced components/widgets. "Mocha" for mass
market web designers. Object-based, if not object oriented. Name
"Mocha" changed to Javascript in late 1995, mostly a marketing scam.
Engineering pressure to be like Java did cause us to follow Java into
some dead ends, leading to confusion ever since.
JS in two slides:
Objects map strings to values (properties)
Functions are first class objects
Methods are function-valued properties
Permissiveness throughout. Oops. [sic]
--some examples of surprising behavior:
grob = obj.frob;
grob(6); ==> undefined+6 = NaN
prop = "Hello"
grob(6) ==> "Hello6"
-- now millions of web pages depend on these errors.
JS is "self"ish:
All functions can construct
All functions have a prototype property
Powerful when combined with closures.
AJAX rests on XMLHttpRequest, which MS added to IE when they gave Java
the boot -- ironic! (Zimbra: nifty AJAX webmail app (60K lines of JS))
Good hackers discovered JS's FP and Self-ish OOP features, leading to
viral marketing. These features included easy-to-extend user-defined
objects, eg ruby.js for Ruby generic method emulation. Or built-in
objects. eg "Prototype" library.
Would another winner-take-all lang have done as well in this space?
No: it's hard to work around lack of closures (eg Java anonymous inner
classes (yuck)).
Edition 4 of ECMA262 spec: "Waldemar" (a person so famous he was
identified only by his first name) wrote out a semantics in typed LC
and from that auto-generated the spec. ECMA couldn't deal with that,
so it never got adopted as a standard.
Next ECMA spec will include property attributes (type annotations).
Conclusion: In a rush, target your audience and simplify. Pick the
right primitives, support extensions. The right primitives for event
handling include first-class functions and closures. (proof: C#
starts w Java and grows delegates, lambdas). Don't let Marketing name
your language.
"Lambda is the Ultimate". I never did tail-recursion and it isn't in
the standard (yet).
Jonathan Sobel: What about macros? Ans: I've thought about it, but
it doesn't seem like a good investment for the small segment of the
community that would use it. The C/Java heritage makes macros ugly at
best. Boo is an example: Python + Macros
If I'd been tracking Python at the time, I might have started with
Python instead...
Making all functions constructors seems like a mistake in retrospect.
================================================================
Didier Remy
Simple, partial type inference for System F based on type containment
Second-order types are good for lots of things, but how to support
type inference?
MLF (impredicative)
start w/ Peyton Jones 2005a. (predicative).
Type inference: translate from a partially typed program to an
explicitly typed program.
Or use elaboration: an algorithmic specification of how to translate
from a partially typed program to an explicitly typed program.
Or: use elaboration to go from one partially typed pgm to another one,
and then use type inference. ("Stratified type inference").
================================================================
Principal typings for mixin modules
Henning Makholm (w/ Joe Wells)
Presented a calculus for linking "mixin" modules. Linking is very
similar to PLT modules, except linking is symmetric (like linking in
Unix??) [interesting design decision!]
Symmetric linking ==> recursion may arise "spontaneously" (eg even
linked w/ odd).
This gives rise to very interesting typing problems. For example, consider
\x\y.(Link(x,y)).f + 42
x and y may need to exchange values before one of them can compute a
value for f. Exactly one of them must export f. Soln: use
constraints.
System "Martini" which has tractable principal typings.
Paper claims the first written-down efficient implementation of
row-unification. Will need to check this against mine.
================================================================
High-Level Views of Low-Level Data
Iavor Diatchki (w/ Mark Jones et al)
Want to work w/ bit data as if it were ordinary algebraic data.
New keyword: bitdata, a type for bitvectors
bitdata Timeout
  = Period (e :: Bit 5, m :: Bit 10)
    as B0 # e # m
  | ...
================================================================
Scrap Your Nameplate
James Cheney
Nameplate == Boilerplate code for alpha-equivalence testing, CAS
(capture-avoiding substitution) etc.
Using FreshML approach ("nominal abstract syntax") all this is much
better behaved.
Much of this functionality can be provided within Haskell using a class
library.
Then use Lammel/Peyton Jones "scrap your boilerplate" generic
programming techniques to provide the instances "for free".
================================================================
Scrap Your Boilerplate with Class
Simon PJ and Ralf Lammel
This seems to be the Haskell version of the extension problem. Much
better idea: PARAMETERISE OVER THE SUPERCLASS: so the data dictionary
contains a context dictionary.
class (cxt a) => Data cxt a where ...
cxt has kind * -> pred
[This all made some sense at the time, but it's pretty opaque to me
now, sorry.]
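My after-the-fact reconstruction of the key move, as it would be
written in today's GHC (the paper's actual encoding goes through an
auxiliary Sat class; treat this as a sketch):

{-# LANGUAGE MultiParamTypeClasses, ConstraintKinds, RankNTypes,
             KindSignatures, FlexibleInstances, FlexibleContexts,
             UndecidableSuperClasses #-}
import Data.Kind (Constraint, Type)

data Proxy (cxt :: Type -> Constraint) = Proxy

class cxt a => Data cxt a where      -- the superclass is a parameter
  gmapQ :: Proxy cxt -> (forall b. Data cxt b => b -> r) -> a -> [r]

instance Data Show Int where
  gmapQ _ _ _ = []                   -- no substructure

instance (Show a, Data Show a) => Data Show [a] where
  gmapQ _ _ []       = []
  gmapQ _ f (x : xs) = [f x, f xs]   -- one answer per immediate child

A generic function can now demand any context it needs at every node,
eg gsize x = 1 + sum (gmapQ (Proxy :: Proxy Show) gsize x).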
================================================================
Wed Sep 28 09:02:00 2005
Mechanizing Language Definitions
Bob Harper
Language definitions are good: Most successfully, type systems and
operational semantics.
But a lang definition is also a burden. It's like a codebase:
somebody has to maintain it, and it's not easy to make changes.
Defs can be mistaken, too. They can be internally incoherent,
difficult or impossible to implement.
Need a body of metatheory to ensure that the op sem is type sound,
typechecking is decidable, etc.
Developing and maintaining the metatheory is onerous.
Can we mechanize this stuff? Can we do this at scale?
This talk: about using twelf for this. Others are using Coq,
Isabelle, others for similar purposes. Harper says he thinks it's too
early to say what's "best"
Self-summary: Twelf is a convenient and effective tool for mechanized
metatheory. It has a natural pattern-matching style of presentation.
It is easy to state and verify simple but informative invariants. And
simple sanity checks are powerful! This is analogous to a "type
system" for language definitions-- it checks very simple but still
very useful properties.
But this is no waterfall model. The definitional technique is
influenced by the demands of mechanization. Mechanization uncovers
mistakes, etc. In Bob's opinion, LF tends to enforce good hygiene,
rather than require contortions.
LF/Twelf is not the last word: it is robust, likely to remain useful,
but there is a clear path to improvement, eg linearity, structural
congruences.
====
LF methodology in a nutshell: establish a compositional bijection
between objects of each syntactic category of the object language and
canonical forms ("syntactic values") of the associated type of the LF
calculus.
Here "compositional" means "commutes with substitution", giving meaning to
variables.
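In symbols (my gloss), writing enc(-) for the encoding:

    enc(e[e'/x]) = enc(e)[enc(e')/x]

ie, substituting in the object language and then encoding agrees with
encoding and then substituting in LF.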
[mw thinks: why not "commutes with alpha, giving meaning to binding"?]
Syntactic categories typically include type derivations and evaluation
derivations, leading to "judgements as types" slogan.
Important point: "Adequacy theorem" says objects of a syntactic
category are isomorphic to the canonical forms of some type. But
these c.f's are in general open, and we have to watch out for the
contexts (types of the free vbls).
Metareasoning: adequacy ensures that we can reason about the object
lang by just thinking about LF terms. Could do this on paper if you
want, not just by machine.
Twelf supports checking of proofs in Pi1 (ie, AE sentences). This is
enough for preservation, etc. The truth of an AE sentence comes down
to showing the totality of some relation defined in LF.
Substitution, weakening, contraction are all provided "free" by the
framework, and life is really good.
But life's not always so easy. Still, you can get surprisingly far with
a bit of experience and ingenuity.
Sample hard case: adding a store. This *almost* works. But can't
prove type preservation. Problem is that you need linearity: need to
show typing is preserved by allocation. Must consider a coherent
family of type systems, not just one at a time (ie if it works for
store typing L and L \subset L', then it works for L'.) If you add the
obvious rule then you lose adequacy of the encoding.
Trick: remove the location typing from assumptions. This sidesteps the
mismatch, but is substitution still valid? This illustrates a recurring
technique of isolating variables for special treatment, without
abandoning HOAS as a principle.
Penalty: must now check that substitution preserves typing. But this
is easily verified using Twelf.
This pattern recurs, eg in the POPLmark challenge. Need permutation, but
what if permutation fails? eg changing the bound in a subtype
assumption. <<some discussion of technicalities, which I didn't
follow>>. This was the hardest problem in the POPLmark challenge. The
rest was handled easily using standard methods with no serious
complications. We did it in about a week.
Our students don't dare write a POPL paper without checking the defs
in twelf.
Can this scale up to the Defn of SML? Unlikely to work directly.
Here's a possible indirect route: formalize the elaboration of SML
into HSIL (Harper-Stone Intermediate Language, a la "A Type-Theoretic
Definition of SML"). This elaboration takes care of all the
"conveniences" of SML. Then formalize HSIL as a conventional type system
w/ SOS rules. Then need to show type safety for HSIL, and that
elaborated programs are well-typed in HSIL.
We've overcome a few obstacles: explicit store and label management,
handling variables w/ restricted permutation. Found some bugs in
HSIL.
More difficulties: need to worry harder about type equality.
Note: can't do logical relations arguments: the quantifier complexity goes
up with types (!), but twelf is "resolutely pi-1".
================================================================
Associated type synonyms
Gabriele Keller w/ Chakravarty and Peyton Jones
In Haskell, sometimes want to have _partial fcns_ type -> type, not
just type classes.
This is currently done w/ functional dependencies (also ML modules,
etc.).
These are all pretty heavyweight; ours is a lightweight extension of
Haskell.
Can type sprintf in our framework, using singleton types to reflect the
formatting info.
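The flavor of the proposal, in what later became GHC's type-families
syntax (the canonical collection example, not the sprintf one from the
talk):

{-# LANGUAGE TypeFamilies #-}

class Collection c where
  type Elem c                  -- associated type synonym: a partial
                               -- function type -> type, given per instance
  empty  :: c
  insert :: Elem c -> c -> c

instance Collection [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)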
================================================================
Aspectual Caml: an Aspect Oriented functional Language
Hidehiko Masuhara
FPs also have crosscutting concerns -- aren't you bothered by logging
code?
Advanced FP features are great, but don't always help.
A'Caml = O'Caml + (AspectJ - Java)
variant types
pointcut and advice
intertype declarations.
Objects and module system not considered here.
Can add fields to a constructor or extra constructors to a type
Prototype: translator to Ocaml, 2KLOC
================================================================
Thu Sep 29 08:56:22 2005
This was the first ML Workshop in 7 years (!!)
The Next ML? (ZML?)
Greg Morrisett
The whole OO push has been bad. Wrong defaults everywhere: everything
is a reference, can be null, is mutable. Models are complicated and
non-orthogonal: "recursive vivisection" (Dave MacQueen's term for
method overriding), "narcissistic programming" (Andrew Appel's term for
the preoccupation with "self").
What we do ("Functional Programming") is better described as
"Value-Oriented Programing"
Every language design needs a focus. Some people think we should pick
a hot or emerging application domain. Personally, I think we should
focus on the domains of TEACHING & RESEARCH. This is where we've had
tremendous impact: ML is the lingua franca of PL research. Most of
the design research starts in the functional setting and then
eventually migrates to real languages. By that time, it's boring and
incremental.
We should steal from:
PLT Scheme: fantastic IDE. Language levels, algebraic steppers.
GHC (& Hugs): Expressive type system. Effective DSLs and language
encoding. Isolation of effects.
Some topics to consider: sexy types, effects separated and reflected
in types. Dependency and refinement. Rethinking the idea of code:
beyond ascii ("separate representation and presentation of code?")
Sexy types: ML core type system problems: value restriction? (need to
eta-expand the combinators). Polymorphic recursion. Existentials!
Higher-order type constructors. Pointer equality (viz CAML).
Annoying to have to write down explicit datatype declarations in order
to do type inference. At a minimum I want F-omega.
I also want some form of qualified types & certainly GADTs.
-- solves record & other overloading problems
-- reduced pressure for subtyping
-- see MLF, GHC, Omega
-- but I shouldn't need Oleg to be effective [sic]
Someday: dependency, refinements etc. eg: Omega, Epigram, Cayenne,
ATS. But the jury's still out on these. Maybe for SML_{N+2}.
With existentials, I can get at the essence of interfaces
(classes/records) and heterogeneous encapsulation (existentials) that
are the real wins of OO languages.
class Widget a
scene: (list (exists a. Widget a => a))
operational model is dirt simple-- an evidence-passing translation:
print : Show a => a -> string
becomes
print : (a -> string) -> a -> string
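Both halves in runnable Haskell (standard idiom; my sketch, not
Morrisett's code):

{-# LANGUAGE ExistentialQuantification #-}

class Widget a where
  draw :: a -> String

data AnyWidget = forall a. Widget a => AnyWidget a  -- exists a. Widget a => a

data Button = Button
instance Widget Button where draw _ = "button"

data Slider = Slider
instance Widget Slider where draw _ = "slider"

scene :: [AnyWidget]                      -- heterogeneous list
scene = [AnyWidget Button, AnyWidget Slider]

render :: [AnyWidget] -> [String]
render ws = [ draw w | AnyWidget w <- ws ]

Operationally, AnyWidget packs the draw dictionary with the value --
exactly the evidence-passing translation above.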
I love the fact that Haskell uses the type system to separate
effectful computations from pure computations. Then can isolate
equational reasoning, enable small & large optimizations (eg CSE &
deforestation, respectively). "It's just the right thing to do"
(S. P-J).
Why not GHC? a. Laziness b. the way effects are treated.
I like laziness, but it shouldn't be the default for value-oriented
programming. Reasons:
1. The compiler gets to be much simpler, vs. "heroic" GHC w/
strictness analysis to get most Haskell code to run eagerly. Also, in
Haskell, everything's a shared mutable reference, so this loses on
write barriers.
2. For SML it's easy to argue that length x terminates because the
list datatype is inductive and variables range over values. In
Haskell, can't use the type system to distinguish between values and
computations, nor inductive vs coinductive datatypes. This starts to
matter when you take dependency and program logic seriously. We need
coinductive data types, but we need inductive data types, too, which
don't exist in Haskell.
3. Better models: can do both a simple extensional model
(substitution) and big-O; allocation semantics for ML is easy and
accurate. [Not perfect: traditional big-O models typically don't
account for GC-- you can't really trade space for time.]
Effects != Monads. Monads are a useful abstraction, but the IO monad
is awkward at best-- you get 2 languages with only 1-way communication.
Can't encapsulate effects (eg exceptions). Need a serious refinement
of this notion!
eg ordinary map vs
mapM :: Monad m => (a -> m b) -> list a -> m (list b)
You wind up with two versions of everything-- a direct one and a
monadic one.
Does this overdeterminize the code? That's ok: I like the idea of
deterministic computations. Leave it to the compiler to determine
when it's safe to reorder things.
"Naive" fix: Have a lattice of IO monads: index monads by the effects that
are used, and then allow polymorphism oer the indexes. Hmm, we just
reinvented FX. (a -> T_phi b) === ( a -phi-> b) But maybe that's not
so terrible: st -{throws FileOpen}-> st
But the two-maps problem still shows up.
Real issue: what's the language of indices (effects)? Typical soln:
sets of primitive effects (w/ algebra w/ union, restriction, etc.) Has
nice inference properties (sets as rows, handled nicely by Haskell
qualified types). Lots of experience in FX. Tofte-Talpin, Cyclone,
and analysis community.
But dependency quickly creeps in:
throw : {x:exn} -{throws x}-> void
Worse: to support restriction (phi\A) in the algebra of effects, need
to reason about type inequality, which is harder... especially in the
presence of generative exceptions.
Potential solutions: nominal logic; substructural logics??
Other issues: termination reflected in effects? Sets of effects vs
sequences vs honest logics? Maybe should use comonads instead of
monads (may work better for exceptions). But this makes inference and
teaching harder.
Summary: Haskell has right intentions: sexy types, separation of
Church and state. But wrong defaults-- should be CBV, etc.
====
Pattern matching is good for human cognitive models. But ML
pattern-matching syntax is confusing... Patterns only cover concrete
representations. Discourages abstractions (see Fluet & Pucella's paper).
Views haven't succeeded because of effects. Proposal: patterns should
be libraries with domain specific decision procedures, eg regexps,
xml, boolean constraints, for exhaustiveness checks.
====
Rethinking Code: maybe we should separate the internal representation
(xml??) from what's presented to the programmer. Then you could keep
the types down in the xml. Can use a style sheet to determine how the
code should be presented, eg the style sheet could say things like
"display the types of the toplevel code" or "use my indentation
style". Then could have more meta-tools (eg macros [sic])
====
"Tools are contravariant in the language" So extending the language
doesn't help the tools. Tools, teaching, and mechanism provide
backpressure on language extension.
Leroy is apparently working on a compiler that produces both code and
proof of correctness.
Dave MacQueen: Problem w/ type classes and qualified types is that
they lead to "action at a distance" in the type system. GM: "I
disagree. I think it works very well. Explicit instantiation of
hooks might help."
Leroy: the problem w/ sexy types is that they don't compose well.
================================================================
An Overview of alphaCaml
Francois Pottier
"Another language in the family of FreshML, FreshOCaml, FreshLib"
Motivation: Why does FreshOCaml have to be a compiler patch?
Three facets: specification language, implementation technique, and
translation from one to the other.
Prior art: abstraction usually takes the form <a>e or <a1,..,an>e (or
as in Fresh Objective Caml <e1>e2). Abstractions are always binary.
If you encode a multiple-declaration let as
term list * <var list>term
you have junk in the representation, because the representation
doesn't require the two lists to be of the same length.
Root problem: this assumes that physical and binding structures
coincide. Approach: decouple them by using "so-called patterns" w/
"inner" and "outer"
Atoms are represented as a pair of an integer and a string. The string
serves as a hint for display.
Sets of renamings are encoded as patricia trees.
Renamings are suspended and composed, so you get things happening in
linear time.
Translation: atoms and abstractions are abstract types. Boilerplate
for sets of free or bound atoms, applying renamings, etc., are
generated.
Distribution comes w/ 2 demos: a naive typechecker and evaluator for
Fsub, and a naive evaluator for a calculus of mixins from Hirschowitz
et al.
Limitations: one must go through "open" functions to examine
abstractions. Deep pattern matching is impossible. (FreshOCaml, as a
compiler hack, can do this). Clients can write meaningless code, such
as a function that pretends to collect the bound atoms in an
expression.
Towards alpha-{your-favorite-prover}? How about translating something
like alphaCaml into theorems and proofs (Pitts, Urban and Tasson,
Norrish)?
[Note: both Pottier and Bob Harper pronounced "deBruijn" as
"deBrown"]
================================================================
Mark Shinwell
Fresh OCaml: Nominal Logic for the masses
[Shinwell looks just like Matthew Flatt...]
Emphasis in talk will be on pragmatics.
Good points: Lightweight syntax, but includes deep matching.
What we've got:
--A family of types 'a name
--The "fresh" expression for creating new atoms.
--Expressions <<exp>>exp for representing object-level binding
operations. These are called abstraction expressions. Can pull
these apart using pattern-matching, and this is the only way of looking
inside an abstraction.
Syntax becomes clumsy when describing complicated binding structures.
Some binding structures are impossible to describe. No ordering or
hashing on atoms. Incorrect treatment of mutable state. Crashes when
dealing with cyclic structures.
New version in progress: more efficient, more expressive bindings.
================================================================
ML Module Mania: A type-safe, separately compiled, extensible
interpreter
Norman Ramsey
People seem not to get the point of the module system.
eg: NR says: A distinguished colleague whom I admired said, "I never saw an
object or a functor that I liked". NR says that he agreed about the
object part :)
Big programs accumulate barnacles: command-line options,
environment variables, X resources, configuration files, registry
entries, etc., etc.
Solution, due to Ousterhout 1990: a reusable scripting language. A
wonderful idea, instantiated with a truly terrible language: Tcl.
In the C-- project, we use a scripting language for regression testing,
can compare front-ends, etc.
Often want impromptu scripting. Problem: to debug the compiler, print out
the flow graph. Want to insert the print phase anywhere in the
pipeline. Here the compiler phases are controlled by a scripting
language. So don't need to mess with the ML mess.
We use Lua as the scripting language.
Want to integrate this extensible lang as type-safe, separately
compilable modules.
Use two ideas:
old: two-level types
new: extensions (or "library" in Lua talk) as ML modules
[[Norman proceeded to explain both of these in detail, and how
functorization (and fibration!) solved the problems along the way.
But you should read the paper for that...]]
Use embed & project functions to merge different libraries in a single
userdata type. ("project" serves as downcast, can raise "bad arg"
exception).
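A sketch of the embed/project idea (names are mine, not the paper's):

data Value = VNil | VNum Double | VStr String  -- the merged userdata type

embedNum :: Double -> Value
embedNum = VNum

projectNum :: Value -> Double                  -- the downcast
projectNum (VNum d) = d
projectNum _        = error "bad argument"     -- the "bad arg" failure

Each library contributes its constructors to the userdata type (in the
paper, via functors) together with its own embed/project pair.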
================================================================
Practical datatype specializations
Matthew Fluet w/ Riccardo Pucella
Problem: want to statically enforce program invariants
Soln: datatype specialization. This is a programming technique for
Standard ML. Not a language extension, but could be baked in later.
Not a complete soln, but useful.
Example: Imagine we want to represent boolean formulas + core
functionality (eg randomFmla, toString) + key invariants (eg DNF):
val toDnf : fmla -> fmla
val dnfSAT: fmla -> bool
But we don't have a static way of specifying that the fmla in question
must be in DNF.
Possible soln: introduce a datatype of DNF. But this hurts core
functionality: can no longer apply toString to a dnf. Various
obvious workarounds, none very nice.
Tension: core functionality available to all clients vs key
invariant to be enforced for a particular client.
Introduces some notation for a refinement type.
Claim: we show a technique that is available NOW to all SML
programmers. Strictly weaker than a language extension, eg refinement
types (Freeman 91).
How to do this: Use phantom types to encode & enforce subtyping
relationships. Also recursion schemes to recover pattern matching
idioms.
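A minimal sketch of the phantom-type half, in Haskell rather than SML
(in ML, the module's signature does the hiding; names are mine):

data Formula = Atom String | Not Formula
             | And Formula Formula | Or Formula Formula

data Any   -- phantom tags: uninhabited, compile-time only
data DNF

newtype Fmla tag = Fmla Formula   -- one representation for every tag

toString :: Fmla tag -> String    -- core functionality: works at any tag
toString (Fmla f) = go f
  where go (Atom s)  = s
        go (Not g)   = "~" ++ go g
        go (And g h) = "(" ++ go g ++ " & " ++ go h ++ ")"
        go (Or  g h) = "(" ++ go g ++ " | " ++ go h ++ ")"

toDnf :: Fmla tag -> Fmla DNF     -- the only way to mint the DNF tag
toDnf (Fmla f) = Fmla f           -- (real normalization elided)

dnfSAT :: Fmla DNF -> Bool        -- statically demands the invariant
dnfSAT (Fmla _) = True            -- (stub)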
================================================================
A type-safe embedding of XDuce into ML
Martin Sulzmann
the very very very very end (amen!)
================================================================
Things we were served for morning coffee break:
Sausage roll
Minipizzas with olives
Puff pastry w/ salmon
Puff pastry w/ rice
================================================================
Photos:
http://www.flickr.com/photos/mwand
http://www.brics.dk/~danvy/Pictures/ICFP05/
http://www.brics.dk/~danvy/Pictures/dfried/ICFP05/
http://www.brics.dk/~danvy/Pictures/shivers/ICFP05/
http://www.brics.dk/~danvy/Pictures/fp/ICFP05/