Compiling Standard ML for Efficient Execution on Modern Machines (Thesis)
Abstract:
Many language theoreticians have taken great efforts in designing
higher-level programming languages that are more elegant and more
expressive than conventional languages. However, few of these new
languages have been implemented very efficiently. The result is that
most software engineers still prefer to use conventional languages,
even though the new higher-level languages offer a better and simpler
programming model.
This dissertation concentrates on improving the performance of
programs written in Standard ML (SML)---a statically typed functional
language---on today's RISC machines. SML poses tough challenges to
efficient implementations: very frequent function calls, polymorphic
types, recursive data structures, higher-order functions, and
first-class continuations. This dissertation presents the design and
evaluation of several new compilation techniques that meet these
challenges by taking advantage of some of the higher-level language
features in SML.
Type-directed compilation exploits the use of compile-time type
information to optimize data representations and function calling
conventions. By inserting coercions at each type instantiation and
abstraction site, data objects in SML can use the same unboxed
representations as in C, even with the presence of polymorphic functions.
Measurements show that a simple set of type-based optimizations improves
the performance of the non-type-based compiler by about 19% on a
DECstation 5000.
Space-efficient closure representations utilizes the compile-time
control and data flow information to optimize closure representations.
By extensive closure sharing and allocating as many closures in registers
as possible, the new closure conversion algorithm achieves very good
asymptotic space usage, and improves the performance of the old compiler
by about 14% on a DECstation 5000, even without using a stack.
Further empirical and analytic studies show that the execution cost of
stack-allocated and heap-allocated activation records is similar, but heap
allocation is simpler to implement and allows very efficient first-class
continuations.
Unrolling lists takes advantage of the higher-level language
abstraction in SML to support more efficient representations for lists.
By representing each cons cell using multiple car fields
and one cdr field, the unrolled list reduces the memory used
for links and significantly shortens the length of control-dependence
and data-dependence chains in operations on lists.