O'Caml Programming Basics
These notes introduce a few of the most basic concepts you need to know in order to begin programming in a functional language like O'Caml. However, we focus on central concepts rather than on giving any kind of complete description of the language. You will want to use these notes in conjunction with other resources such as the O'Caml manual, the O'Caml standard library, and our programming style guide. When you finish with these notes, you should also definitely glance at the Pervasive Library -- it is always open and contains many of the most highly used operations and functions. If this tutorial doesn't do it for you, consider looking at Scott Smith's introduction to O'Caml, which has a few more details.
Types, Expressions, Values, Declarations
Basic O'Caml programs are made up of five different kinds of things: types, values, expressions, declarations and comments. Comments are easy: they begin with (* and end with *). For example:
    (* I am a comment. (* Did you know comments can be nested? *) Cool! *)
Some examples of types, values, expressions and declarations are given below.
- Types: types describe a set of values and operations on those values. Examples:
int
float
char
bool
string
int -> bool (the type of functions with integer arguments and boolean results)
int -> bool -> string (the type of functions with two arguments: first an integer and then a boolean; the function result has type string)
- Values: values are the data that result from executing a computation. Examples:
  - integer values: -44, 0, 1, ...
  - float values: -3.0, 0., 0.0, 2.5e3, 2.5E3, 2500., ...
    - note that 0, 1, -2 are integer values but not float values.
    - float constants must contain a "." (or an exponent, as in 1e3) to distinguish them from int constants
  - character values: 'a', 'b', '\n'
  - boolean values: true, false
  - string value: "hello, my name is Dave\n"
  - function value: fun x -> x-3
    - A function value with argument x returning a result three less than x.
    - And yes, I'm glad you asked: A function is a value! It is just as much a value as 2 or "hello". It can be a result of a computation.
- Expressions: expressions are the basis of all computation in a functional language. Each expression has a type and, if it terminates, produces a value. Examples:
  - integer expression: (2 + (-3)*3*5 - 1) / 32
  - float expression: (2.5 +. (-3.) *. 7.2 -. 1.) /. 32.
    - operations such as + and - are integer operations
    - operations such as +. and -. are float operations
    - float operations end with "." to distinguish them from int operations; mixing the two kinds (for example, multiplying floats with * or dividing a float by the int constant 32) is a type error
  - character expression: char_of_int 88
    - The function char_of_int computes the character with the given ASCII code. (In this case, the character with code 88 is 'X'.)
    - There are a number of other conversions between types in the pervasives library. They are typically called type2_of_type1, where type1 is the type of the argument and type2 is the type of the result.
  - boolean expression: not ('a' <= 'b') || (true && 2 != 1)
  - string expression: "hello," ^ " my name" ^ " is Dave\n"
    - The operator ^ concatenates two strings
    - It is right-associative, unlike the integer subtraction operation, which is left-associative.
  - function expression: compose (fun x -> x+2) (fun x -> x-1)
    - compose is a function that takes two other functions as arguments. (It is an easy function to write in O'Caml.)
- Declarations: declarations introduce new variables (i.e., names) to stand for types and values. Here are some example declarations:
    (* school is a new type name; it is equal to the type string *)
    type school = string;;

    (* age is a new type name; it is equal to the type int *)
    type age = int;;

    (* compare is a new type name;
     * it is equal to age -> age -> bool
     * and also equal to int -> int -> bool *)
    type compare = age -> age -> bool;;

    (* dylan_age and my_age are new value names;
     * dylan_age is equal to 3; my_age is equal to 39
     * both dylan_age and my_age have type age (which is the same as type int)
     * by the way, "value names" are usually called "variables";
     * we will call them that from now on *)
    let dylan_age : age = 3;;
    let my_age : age = dylan_age * 10 + 9;;
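The pieces above can be tried together in a small file. The sketch below is only an illustration: the notes do not show a definition of compose, so the one here is an assumption, and the names add_one, c, n, half_more and s are made up for the example.

```ocaml
(* One possible definition of compose (an assumption; the notes do not
   show one): compose f g is the function that applies g, then f. *)
let compose (f : int -> int) (g : int -> int) : int -> int =
  fun x -> f (g x)
;;

(* The function expression from the notes, bound to a hypothetical name. *)
let add_one : int -> int = compose (fun x -> x + 2) (fun x -> x - 1);;

(* Conversions follow the type2_of_type1 naming pattern. *)
let c : char = char_of_int 88;;                     (* 'X' *)
let n : int = int_of_char 'X';;                     (* 88 *)
let half_more : float = float_of_int 3 +. 0.5;;     (* 3.5 *)

(* String concatenation combined with a conversion. *)
let s : string = "answer: " ^ string_of_int (add_one 41);;

print_endline s;;
```

Because a function is a value, add_one is declared with an ordinary let, exactly like the integer and string variables around it.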
O'Caml Toplevel
To begin to play with O'Caml, you can start the O'Caml toplevel interpreter (or just the "toplevel" for short). To do so, simply type ocaml in your shell. If $ is your unix prompt, you should see the following:
    $ ocaml
            Objective Caml version 3.12.0

    #
Now, you are in the O'Caml toplevel environment and you can use it like a sophisticated calculator. Type in any O'Caml expression you want, terminated with a double semi-colon. When you press return, the toplevel will determine the type of the expression and evaluate the expression, producing a value. For example, if you type:
    # 2 + 3;;
then O'Caml responds with:
    - : int = 5
    #
Here is a complete session where the user typed in several expressions as well as some declarations. The session was terminated when the user typed #quit;;. Directives to the O'Caml toplevel always begin with a # symbol.
    $ ocaml
            Objective Caml version 3.12.0

    # 2+3;;
    - : int = 5
    # "hello" ^ " " ^ "world";;
    - : string = "hello world"
    # 2.0 +. 3.8;;
    - : float = 5.8
    # max_int;;
    - : int = 1073741823
    # not true;;
    - : bool = false
    # let x = 2 + 3;;
    val x : int = 5
    # x + x + 3;;
    - : int = 13
    # type age = int;;
    type age = int
    # let my_age : age = 3;;
    val my_age : age = 3
    # #quit;;
    $
O'Caml Compiler
That was fun, but you need a way to save your work. As in any ordinary programming language, you can enter declarations and expressions into a file and compile the file. For instance, you can create a file hello.ml and include the following expression inside it:
    print_endline "hello world";;
print_endline is a function in the pervasive (i.e., always open) library that prints a string followed by an end-of-line character. (By the way, you should take a glance through the pervasive library. You will use it a lot. It contains all of the most common functions on base types like integers, booleans, floats, strings and a few other things.)
Ok, now let's compile this file and create an executable named "hello" by typing the following.
    ocamlc -o hello hello.ml
If O'Caml is properly installed, you should be able to run your executable by typing this:
    ./hello
After typing the above and pressing enter, you should see "hello world" printed out:
    hello world
Let's try creating a slightly more sophisticated file -- one containing a function. In O'Caml, you can create a function simply by using a let declaration. The file cube.ml is shown below.
    (* n^3 *)
    let cube (n:int) : int =
        n * n * n
    ;;

    (* print n and its cube *)
    let message (n:int) : string =
        "The cube of " ^ string_of_int n ^ " is " ^ string_of_int (cube n)
    ;;

    let arg1 : int = 2;;
    let arg2 : int = 3;;

    (* the main expression *)
    print_endline (message arg1);;
The code above contains two functions, cube and message. The basic syntax for writing a function is as follows.
    let function-name (argument-name : argument-type) : result-type =
        expression-that-computes-result-of-function
If desired, you can create multi-argument functions simply by adding more argument names with their types. For instance, here is a function called crazy with integer arguments x, y and z.
    let crazy (x : int) (y : int) (z : int) : int =
        (x + y) / z
The types of arguments and results are optional, but it is good style to include them (particularly on top-level functions). To call a function, you simply write down the function name and its argument. For instance, we called the string_of_int function on an argument n as follows:
    string_of_int n
Some languages require parentheses around arguments, but O'Caml does not -- simply place a space between the function name and the argument. However, sometimes we call one function and then use its result as an argument to the next function. We did that when we called string_of_int on the result of calling cube on n:
    string_of_int (cube n)
If we had left off the parentheses like this:
    string_of_int cube n
O'Caml would have assumed that string_of_int was a 2-argument function that accepted cube as its first argument and n as its second argument. If we then compiled the file, we would see the following error message:
    $ ocamlc -o cube cube.ml
    File "cube.ml", line 8, characters 48-61:
    Error: This function is applied to too many arguments;
    maybe you forgot a `;'
The main point here is that while parentheses around a function argument are not a necessary part of the function call syntax (like they are in C or Java), they do serve a purpose for defining the extent of an expression. You have all done this before when writing mathematical expressions such as 3*(2+1). The parentheses serve to indicate that the operation 2+1 should be computed first and its result should be multiplied by 3.
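To make the role of parentheses concrete, here is a small sketch that reuses cube and crazy from above. The variable names a, b, s and d are made up for the illustration. It relies on the fact that function application binds more tightly than arithmetic operators such as +.

```ocaml
let cube (n:int) : int =
  n * n * n
;;

let crazy (x : int) (y : int) (z : int) : int =
  (x + y) / z
;;

(* Application binds tighter than +, so this is (cube 2) + (cube 3). *)
let a : int = cube 2 + cube 3;;              (* 8 + 27 = 35 *)

(* Parentheses make the whole sum the argument of cube. *)
let b : int = cube (2 + 3);;                 (* 5 * 5 * 5 = 125 *)

(* The result of one call used as the argument of another. *)
let s : string = string_of_int (cube 2);;    (* "8" *)

(* A multi-argument call: arguments are separated by spaces. *)
let d : int = crazy 4 6 5;;                  (* (4 + 6) / 5 = 2 *)
```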
Debugging with the Toplevel
The toplevel environment can be very useful when debugging code that you have written down in files. To load the declarations defined in a file (and run the expressions in that file), simply type
    #use "filename";;
at the prompt. When you do that, all of the declarations defined in the file become available to you. If you are using emacs, and have Tuareg mode set up, you can also just type C-c C-b when inside a file with a .ml extension. This will open the O'Caml interpreter and read in the file automatically.
As an example, here is an O'Caml session that uses and explores the cube.ml file. It assumes that cube.ml is in the directory from which you launched the ocaml interpreter. (If cube.ml is not in the current directory, you can move from one directory to another with toplevel directives such as #cd.)
    $ ocaml
            Objective Caml version 3.12.0

    # #use "cube.ml";;
    val cube : int -> int = <fun>
    val message : int -> string = <fun>
    val arg1 : int = 2
    val arg2 : int = 3
    The cube of 2 is 8
    - : unit = ()
    # cube 0;;
    - : int = 0
    # cube 2;;
    - : int = 8
    # cube 3;;
    - : int = 27
    # cube arg2;;
    - : int = 27
    # cube 2 + cube 3;;
    - : int = 35
    # message 2;;
    - : string = "The cube of 2 is 8"
    # #quit;;
You will notice that at the top, after we used cube.ml, the system printed out a list of variables, the values associated with them (where appropriate -- the code for a function value is not printed), and the types of the variables. In this file, the function values include cube and message. The integer values include arg1 and arg2.
This line:
    The cube of 2 is 8
shows the result of executing the main expression. (There does not have to be just 1 main expression; there can be any number of expressions embedded in a file. The O'Caml toplevel will evaluate all of them.) After using the file, we are free to play with the definitions by typing in further expressions (or declarations) that use them. This is a good way to start debugging declarations and expressions you have just written.
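As a sketch of a file with more than one main expression, consider the following. The file name squares.ml and the function square are hypothetical, chosen only for this illustration.

```ocaml
(* Contents of a hypothetical file squares.ml *)
let square (n:int) : int =
  n * n
;;

(* first main expression *)
print_endline ("square 3 = " ^ string_of_int (square 3));;

(* second main expression; it is evaluated after the first *)
print_endline ("square 4 = " ^ string_of_int (square 4));;
```

Loading such a file with #use "squares.ml";; should report the declaration of square and then evaluate the two main expressions in the order in which they appear.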
Here is one more simple example function.
    let silly (x:int) (y:int) : int =
        (x+y) * (x+y) - (x+y)
    ;;
That code is silly for a number of reasons. One of the reasons is that good programmers don't write an expression like x+y multiple times -- they write down the expression once and associate it with a variable. In O'Caml, a let expression serves this purpose. Here is an example.
    let less_silly (x:int) (y:int) : int =
        let z = x+y in
        z * z - z
    ;;
The way to think about a let expression is that the variable introduced (z in this case) is exactly equal to the value computed (x+y in this case) and that variable may be used in the expression following the in keyword. In general, a let expression has the following form:
    let variable-name = expression1 in expression2
The variable-name may be used in expression2, but unlike a let declaration (which does not use the keyword "in"), the variable-name may not be used more widely. It is hidden from outsiders. This is a great thing! Information hiding is the basis for modular computation. If you were to type the code for less_silly into the toplevel, O'Caml would tell you that you had created one declaration:
    val less_silly : int -> int -> int = <fun>
It would not mention the inner variable z.
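Let expressions can also be nested, with each variable visible only in the expression that follows its own in. The sketch below repeats less_silly and adds a second function; the name sum_times_diff and its local variables are made up for this illustration.

```ocaml
let less_silly (x:int) (y:int) : int =
  let z = x + y in
  z * z - z
;;

(* Two nested let expressions: sum is visible in everything after the
   first "in", diff only in the final product. *)
let sum_times_diff (x:int) (y:int) : int =
  let sum = x + y in
  let diff = x - y in
  sum * diff
;;

(* Neither z, sum nor diff can be referred to here; only the two
   functions are declared at top level. *)
let result1 : int = less_silly 2 3;;        (* 5 * 5 - 5 = 20 *)
let result2 : int = sum_times_diff 5 3;;    (* 8 * 2 = 16 *)
```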
Now, there is no reason to treat integer variables like z any differently than variables with function type. Consequently, local let expressions defining functions are also allowed. Consider the function double_square, which contains within it another function square.
    let double_square (x:int) (y:int) : int =
        let square (n:int) : int = n * n in
        square x + square y
    ;;
Just like z was a local integer variable, square is a local function variable. There is little difference.
In terms of style, all top-level functions like double_square should be annotated with their types. However, it is acceptable to leave the types off of local functions like square. Hence I might rewrite the above as follows.
    let double_square (x:int) (y:int) : int =
        let square n = n * n in
        square x + square y
    ;;
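Here is one more sketch in the same style, where an unannotated local function is used twice inside an annotated top-level function. The function distance_squared is hypothetical and only serves as an illustration.

```ocaml
(* Squared distance between the points (x1, y1) and (x2, y2).
   The top-level function is annotated; the local helper is not,
   and O'Caml infers that square has type int -> int. *)
let distance_squared (x1:int) (y1:int) (x2:int) (y2:int) : int =
  let square n = n * n in
  square (x2 - x1) + square (y2 - y1)
;;

let d : int = distance_squared 0 0 3 4;;   (* 9 + 16 = 25 *)
```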
Using Other Modules
We will discuss how to define our own modules later. For now, it suffices to know how to use values defined in other modules in the standard libraries. If the module is named Mod and it defines a value x then one uses dot notation to refer to it: Mod.x.
For example, the Char library contains a number of functions for operating over characters. Here, we use them within the toplevel.
    $ ocaml
            Objective Caml version 3.12.0

    # let lower_x = Char.lowercase 'X';;
    val lower_x : char = 'x'
    # let upper_x = Char.uppercase lower_x;;
    val upper_x : char = 'X'
    # #quit;;
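The same dot notation works for other standard library modules. The sketch below is written for a recent O'Caml release, which is an assumption: there, Char.lowercase and Char.uppercase have been replaced by Char.lowercase_ascii and Char.uppercase_ascii, which do not exist in version 3.12.0. The variable names are made up for the illustration.

```ocaml
(* Values from the String and Char modules, referred to with dot notation. *)
let len : int = String.length "hello";;             (* 5 *)
let code : int = Char.code 'X';;                    (* 88 *)
let from_code : char = Char.chr 88;;                (* 'X' *)

(* Recent releases only (assumption: O'Caml 4.03 or later). *)
let lower_x : char = Char.lowercase_ascii 'X';;     (* 'x' *)
let upper_x : char = Char.uppercase_ascii lower_x;; (* 'X' *)
```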
Summary
We have covered a good chunk of the core syntax of O'Caml so far. Here are the expression forms we have seen:
- values
- variables
- variable defined by another module: ModuleName.var
- expression with a type annotation: (exp : type)
- binary operator op: exp1 op exp2
- function f applied to arguments: f exp1 exp2 ... expk
- let expression declaring a local variable: let var = exp1 in exp2
- let expression with a type annotation: let var : type = exp1 in exp2
- let expression declaring a local function: let f var1 var2 ... vark = exp1 in exp2
- local function with type annotations: let f (var1:type1) ... (vark:typek) : type_result = exp1 in exp2
Here are the declarations we have seen:
- type declaration: type type_name = type;;
- let declaration: let var : type = exp1;;
- function declaration: let f (var1:type1) ... (vark:typek) : type_result = exp1;;