[PRL] basic terminology: abstraction
David Herman
dherman at ccs.neu.edu
Thu Aug 21 12:37:24 EDT 2003
This is a little embarrassing, but I'm not sure I understand some of the
most basic vocabulary we use in CS. It seems that the term "abstraction" is
badly abused: sometimes people use it when they really mean modularity,
sometimes when they mean encapsulation. I think encapsulation is pretty
clear and I'm not as much interested in that at the moment. But there are
three other concepts I can think of, and I'm not sure how to map them to
the correct terms.
(1) generalization -- I think this is the real meaning of abstraction; it's
the process you go through when you pull out the common parts of multiple
specific parts of a program (computation, system, idea, etc.) into a more
general version, which is parameterized on the parts that vary across the
specific instances. E.g., in FP, functions are the primary mechanism for
generalization: first-class functions generalize common calculations, and
higher-order functions are generalized calculations parameterized on other
calculations.
(2) modularity -- Separating mutually independent parts of a program so
that (a) the code becomes more legible (you are only ever looking at one
conceptually cohesive part of the program at once) and (b) changing the
program is easier because changes are localized to the module to which they
relate.
(3) layering -- This is the one I don't quite know how to classify.
Sometimes you hear "levels of abstraction" and "abstraction boundaries,"
but I'm not sure that abstraction is the right term (I'm not sure it's not,
either). Typical examples are the view of systems as layers of languages or
virtual machines (e.g., gates -> microcode -> assembly -> OS -> JVM -> Java
application) or the OSI network protocol stack. In a sense this is
modularity, because it's separating a system into conceptually disjoint
pieces, but it's vertical rather than horizontal, so I think it's worth
distinguishing from modularity. I don't, however, see how this relates to
the idea of generalization.
I'm most unclear on the third term. I've always felt uncomfortable with the
term "abstraction boundaries" -- I have an intuitive sense of when a smelly
program crosses these boundaries, but I'm not comfortable with the label.
A recent example of where using the word "abstraction" for these two
different ideas leads to confusion is Guido van Rossum's addled blog entry
about loops and recursion:
I'd say that a loop is a higher-level concept than recursion; recursion
is more powerful, but also more low-level, like assembly language. ...
I had my epiphany about loops as a higher level of abstraction than
recursion ...
<http://www.artima.com/weblogs/viewpost.jsp?forum=106&thread=4550>
If loops are higher-level than recursion, then they are a *specialization*
of recursion. Calling them a "higher level of abstraction" makes it sound
like they are *more* abstract, when in fact he's saying the opposite. (The
result is that anyone who's heard Matthias's lecture on "Scheme's 241
Loops" choked on their breakfast upon reading Guido's blog.)
Thanks,
Dave
More information about the PRL
mailing list