Power
Acknowledgement: This note was created by Pramod Subramanyan and David Walker.
Polymorphism and Higher-Order Programming
Good programmers are lazy: they never write the same piece of code twice. Instead, they strive to factor out the common bits into meaningful, reusable components. Write a component once, find and fix all bugs, and use it many times. That is the path to becoming an efficient programmer. If you need to update the component, perhaps for performance or to fix an error, it only needs to be updated in one place.
In OCaml, higher-order and polymorphic functions are powerful tools for code reuse. Higher-order functions are those functions that take other functions as arguments or return functions as results, while polymorphic functions are functions that act over values with many different types. Together, they enable a great deal of code reuse. In this lecture, we will look specifically at how to use higher-order and polymorphic functions to represent complex, recursive, control-flow patterns.
Higher-Order Programming
Consider the following two functions, which (a) increment all the elements in a list and (b) square all the elements in a list.
let rec inc_all (xs:int list) : int list =
  match xs with
  | [] -> []
  | hd::tl -> (hd+1)::(inc_all tl)

let rec square_all (xs:int list) : int list =
  match xs with
  | [] -> []
  | hd::tl -> (hd*hd)::(square_all tl)
The only difference between inc_all and square_all is in the expressions hd+1 and hd*hd; the other parts of these functions are exactly the same. OCaml's higher-order functions make it easy to extract these commonalities into a reusable component. Below, we present the map function, which applies its argument f to all elements of a list.
let rec map (f:int->int) (xs:int list) : int list =
  match xs with
  | [] -> []
  | hd::tl -> (f hd)::(map f tl);;
map is one of the most ubiquitous OCaml functions; you should get used to reading and writing programs that use it. With map, recursive functions like inc_all and square_all become simple, non-recursive one-liners, as shown below.
let inc x = x+1;;
let inc_all xs = map inc xs;;

let square y = y*y;;
let square_all xs = map square xs;;
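As a quick sanity check, the sketch below restates the int-only map together with the one-liners and evaluates them on a small list. The sample inputs are illustrative assumptions, not part of the original note.

```ocaml
(* The int-only map from above: apply f to every element of xs. *)
let rec map (f : int -> int) (xs : int list) : int list =
  match xs with
  | [] -> []
  | hd :: tl -> f hd :: map f tl

let inc x = x + 1
let square y = y * y

(* The recursion now lives only in map; these are plain one-liners. *)
let inc_all xs = map inc xs
let square_all xs = map square xs

let incremented = inc_all [1; 2; 3]     (* [2; 3; 4] *)
let squared = square_all [1; 2; 3]      (* [1; 4; 9] *)
```

Because the recursion is written once inside map, a fix or optimization there benefits both inc_all and square_all.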
Anonymous Functions
When programming with higher-order functions like map, one tends to need many little functions like inc and square, which are then often used only once. Rather than defining a named function to be used just once, we can define it without a name and use it in place. For instance, we would usually write inc_all and square_all as follows.
let inc_all xs = map (fun x -> x + 1) xs;;
let square_all xs = map (fun y -> y * y) xs;;
The expression fun x -> x + 1 is an anonymous function that takes one argument (x) as input and returns x+1 as a result. One can also define multi-argument functions using the syntax fun x y z -> x + y * z. However, one cannot define recursive functions this way: a recursive function needs a name by which it can refer to itself.
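A small sketch of both points follows: an anonymous multi-argument function can be applied in place, while a recursive function has to be introduced with a name via let rec. The names seven and fact are hypothetical, chosen only for this illustration.

```ocaml
(* A three-argument anonymous function applied in place.
   Multiplication binds tighter than addition: 1 + (2 * 3). *)
let seven = (fun x y z -> x + y * z) 1 2 3

(* A recursive function must be named so that its body can call it.
   fact is a hypothetical example, not part of the original note. *)
let rec fact (n : int) : int =
  if n <= 0 then 1 else n * fact (n - 1)

let fact5 = fact 5   (* 120 *)
```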
Conceptually, anonymous functions are no more complicated than anonymous numbers (like 3 or 4), anonymous strings ("hello") or any other anonymous value. It would certainly be annoying if instead of writing:
print_string ("hello" ^ " " ^ "world")
one had to explicitly bind names to each of the strings first:
let hello = "hello" in
let space = " " in
let world = "world" in
print_string (hello ^ space ^ world)
Why should function values be treated differently from other values like integers or strings? They shouldn't!
Non-anonymous (Conspicuous?) Functions as Anonymous Functions
It turns out that the function definitions we have been using so far are actually abbreviations. The following code:
let square x = x*x ;;
let add x y = x+y ;;
is just syntactic sugar for:
let square = (fun x -> x*x) ;;
let add = (fun x y -> x+y) ;;
With this in mind, it is easy to see that several of the functions we have written earlier are equivalent:
let square x = x*x in map square xs
==
let square = fun x -> x*x in map square xs
==
map (fun x -> x*x) xs
The third expression is derived from the second by substituting fun x -> x*x for the variable square.
A comment on style: One must be somewhat careful with anonymous functions. They are great when one needs to define a small function (like square or inc) that is used once. However, if one must define a larger function, it is typically better to give it a name, because it will be easier for a colleague or teammate to read. Use your judgement and, for more tips, see our style guide.
Polymorphic Functions
map seems like a pretty great function until we stumble across div_all:
let rec div_all (xs:float list) : float list =
  match xs with
  | [] -> []
  | hd::tl -> (hd /. 2.0)::(div_all tl)
Once again, the code of div_all is almost identical to the code of square_all or inc_all, and it seems like we should be able to implement it using map, but we can't: map operates over integer lists, whereas we need a function that operates over floating-point lists. Fortunately, we can redefine map to make it more general. There is no reason to constrain map to operate over integers alone; we can define it to work over lists with elements of any type 'a and transform them into lists with elements of any type 'b (which may or may not be the same as 'a).
let rec map (f:'a -> 'b) (xs:'a list) : 'b list =
  match xs with
  | [] -> []
  | hd::tl -> (f hd)::(map f tl);;
In general, in OCaml, whenever a type name is preceded by an apostrophe (as in 'a and 'b), it is a type variable that may stand for any type. If one were to write out the full type of map, it would be the following:
map : ('a -> 'b) -> 'a list -> 'b list
We would read the type as saying "for all types 'a and 'b, map takes a first argument with type 'a -> 'b, a second argument with type 'a list, and produces a result with type 'b list."
To understand how we might use a polymorphic value like map, we can substitute any concrete type we like for the type variables that appear in map's type. For instance, if we substitute int for 'a and bool for 'b in the type of map, we wind up with a type like this:
(int -> bool) -> int list -> bool list
Consequently, we could use map at that type in the following expression:
let pos : int -> bool = fun n -> n > 0 in
map pos [1;2;3;-1;-2;-3]
Alternatively, if we substitute float for 'a and float for 'b in the type of map, we wind up with a type like this:
(float -> float) -> float list -> float list
We can use map at that type to implement div_all:
map (fun x -> x /. 2.0) [5.0; 7.0; -3.2]
Finally, there is nothing stopping us from substituting arbitrarily complex types like list types and option types and tuple types and other function types for the type variables 'a and 'b. For instance, below, we substitute the type int list for 'a and also for 'b.
map (map (fun x -> x + 1)) [[2]; [4;5]] ;;
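To make the substitutions concrete, here is a minimal sketch that uses the one polymorphic map at each of the types discussed above. The binding names (signs, halves, nested, strings) are hypothetical, and the last example, using the standard library function string_of_int, is an added illustration rather than something from the original note.

```ocaml
(* The polymorphic map from above. *)
let rec map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  match xs with
  | [] -> []
  | hd :: tl -> f hd :: map f tl

(* 'a = int, 'b = bool *)
let signs = map (fun n -> n > 0) [1; 2; 3; -1; -2; -3]
(* [true; true; true; false; false; false] *)

(* 'a = float, 'b = float *)
let halves = map (fun x -> x /. 2.0) [5.0; 7.0; -3.2]
(* [2.5; 3.5; -1.6] *)

(* 'a = int list, 'b = int list: the inner map is the function passed
   to the outer map. *)
let nested = map (map (fun x -> x + 1)) [[2]; [4; 5]]
(* [[3]; [5; 6]] *)

(* 'a = int, 'b = string *)
let strings = map string_of_int [1; 2; 3]
(* ["1"; "2"; "3"] *)
```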
A Generic Reducer
The higher-order function map implements one very common recursion pattern over lists, but there are more.
Consider the following two functions. What do they have in common?
let rec sum (xs:float list) : float =
  match xs with
  | [] -> 0.0
  | hd::tl -> hd +. (sum tl) ;;

let rec all_pos (xs:int list) : bool =
  match xs with
  | [] -> true
  | hd::tl -> (hd > 0) && (all_pos tl) ;;
Both functions are defined using two cases -- one base case for the empty list, and one
recursive case for a non-empty list. The base case returns a specific, pre-determined value.
The recursive case makes a recursive call over the tail of the list and uses the result
of that recursive call, together with the head of the list in a computation that produces
the final result for that case. To capture this recursion pattern, we will define a function reduce that has the following property.
reduce f u [x1; x2; x3; ...; xn] == f x1 (f x2 (... (f xn u)))
For instance:
reduce (+.) 0.0 [1.0; 2.0; 3.0] == 1.0 +. (2.0 +. (3.0 +. 0.0))
or
let pos (x:int) (b:bool) : bool = (x > 0) && b ;;
reduce pos true [1; 2; 3] == pos 1 (pos 2 (pos 3 true))
Here is our definition.
let rec reduce (f:'a -> 'b -> 'b) (u:'b) (xs:'a list) : 'b =
  match xs with
  | [] -> u
  | hd::tl -> f hd (reduce f u tl);;
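With reduce in hand, sum and all_pos can be rewritten as non-recursive one-liners, much as inc_all and square_all were with map. The sketch below is one way to do so. The length function is a hypothetical extra example, added only to show that the result type 'b need not match the element type 'a.

```ocaml
(* reduce as defined above: fold the list from the right, starting at u. *)
let rec reduce (f : 'a -> 'b -> 'b) (u : 'b) (xs : 'a list) : 'b =
  match xs with
  | [] -> u
  | hd :: tl -> f hd (reduce f u tl)

(* Base case 0.0; combine the head with the recursive result using +. *)
let sum (xs : float list) : float = reduce (+.) 0.0 xs

(* Base case true; combine using "head is positive and the rest is too". *)
let all_pos (xs : int list) : bool =
  reduce (fun x b -> x > 0 && b) true xs

(* Hypothetical extra example: ignore each element and count it. *)
let length xs = reduce (fun _ n -> n + 1) 0 xs

let total = sum [1.0; 2.0; 3.0]     (* 6.0 *)
let ok = all_pos [1; 2; 3]          (* true *)
let not_ok = all_pos [1; -2; 3]     (* false *)
```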
A Note on MapReduce
It is worth noting that functions very similar to the map and reduce we've defined above are the basis of Google's MapReduce framework. If you're interested in learning more, this paper from OSDI 2004 and a related paper from HPCA 2007 are good places to start reading.
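As a toy, single-machine illustration only (this is not the MapReduce framework or its API), the map and reduce from this note can be chained: map transforms each element independently, and reduce combines the transformed elements into one result. The name sum_of_squares is hypothetical.

```ocaml
let rec map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  match xs with
  | [] -> []
  | hd :: tl -> f hd :: map f tl

let rec reduce (f : 'a -> 'b -> 'b) (u : 'b) (xs : 'a list) : 'b =
  match xs with
  | [] -> u
  | hd :: tl -> f hd (reduce f u tl)

(* "Map" step: square each element. "Reduce" step: add up the squares. *)
let sum_of_squares (xs : int list) : int =
  reduce (+) 0 (map (fun x -> x * x) xs)

let result = sum_of_squares [1; 2; 3]   (* 1 + 4 + 9 = 14 *)
```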
Curried Functions and Partial Application
It turns out that all functions in OCaml are unary (1-argument) functions! But how can this be? Didn't we just write functions with 2 and 3 arguments while writing map and reduce?
The following declaration of a function that seemingly accepts two arguments:
let add = (fun x y -> x+y)
is actually shorthand for the following.
let add = (fun x -> (fun y -> x+y))
Let's parse the complicated definition of add. We will start from the inside and work our way out. The innermost expression fun y -> x + y declares a function that takes as argument an integer y and returns the integer x+y.
Where does x come from? We see that x is bound by the definition of the outer function (fun x -> (fun y -> x+y)). So the way to understand this is that this expression creates a function that takes a single argument x and returns the function fun y -> x + y. In other words, add is itself a single-argument function, and when applied to an argument x, it returns another single-argument function that adds x to the argument (y) supplied to the latter function.
This is a slightly subtle concept, so let's look at an example OCaml session that might help explain it.
# let add = (fun x -> fun y-> x + y);;
val add : int -> int -> int = <fun>
# add;;
- : int -> int -> int = <fun>
The session shows that add has type int -> int -> int, which we now realize means int -> (int -> int). Equivalently, add takes an argument of type integer and returns a function from integer to integer.
# let add2 = add 2;;
val add2 : int -> int = <fun>
We've defined add2 by applying the function add to the argument 2. As we'd expect, the type of add2 is a function from integer to integer.
And what does add2 do?
# add2 3;;
- : int = 5
# add2 10;;
- : int = 12
# add2 100;;
- : int = 102
It simply adds the integer 2 to its argument.
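The session above can also be checked in a source file. The sketch below relies on the fact that function application associates to the left, so add 2 3 is parsed as (add 2) 3. The name add_sugar is hypothetical and stands for the abbreviated two-argument form.

```ocaml
(* The explicit, nested single-argument form. *)
let add = fun x -> fun y -> x + y

(* The abbreviated form; it behaves the same way. *)
let add_sugar x y = x + y

(* Supplying only the first argument yields a function of type int -> int. *)
let add2 : int -> int = add 2

let five = add 2 3          (* parsed as (add 2) 3 *)
let five' = (add 2) 3
let twelve = add2 10
```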
Partial Application
This process of applying an n-argument function to fewer than n arguments is called partial application. The function add2 was defined by partially applying the function add to the argument 2.
Another example of partial application is the following:
let inc = add 1;;
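Partial application combines naturally with higher-order functions such as map. The sketch below, which reuses the map and add from this note, passes a partially applied add directly to map and also partially applies map itself. The names add_ten_all and inc_all' are hypothetical.

```ocaml
let rec map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  match xs with
  | [] -> []
  | hd :: tl -> f hd :: map f tl

let add = fun x -> fun y -> x + y
let inc = add 1

(* A partially applied add used as the function argument of map. *)
let add_ten_all xs = map (add 10) xs

(* map is curried too: giving it only f yields a list-to-list function. *)
let inc_all' : int list -> int list = map inc

let a = add_ten_all [1; 2; 3]   (* [11; 12; 13] *)
let b = inc_all' [1; 2; 3]      (* [2; 3; 4] *)
```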