
OCaml Programming Basics

These notes introduce a few of the most basic concepts you need to know in order to begin programming in a functional language like OCaml. However, we focus on central concepts rather than on giving any kind of complete description of the language. You will want to use these notes in conjunction with other resources such as the OCaml manual, the OCaml standard library, and our programming style guide. When you finish with these notes, you should also definitely glance at the Stdlib module -- it is always open and contains many of the most highly used operations and functions.

Types, Expressions, Values, Declarations

Basic OCaml programs are made up of five different kinds of things: types, values, expressions, declarations, and comments. Comments are easy: they begin with (* and end with *). For example:
(* I am a comment.  

   (* Did you know comments can be nested? *)

   Cool!  This is sometimes useful for debugging,
   when you want to comment out a section of code
   that itself already has comments in it!

*)
Some examples of types, values, expressions and declarations are given below.

  • Types: types describe a set of values and operations on those values. Examples:
    • int
    • float
    • char
    • bool
    • string
    • int -> bool (the type of functions with integer arguments and boolean results)
    • int -> bool -> string (the type of functions with two arguments: first an integer and then a boolean; the function result has type string)
  • Values: values are the data that results from executing a computation. Examples:
    • integer values: -44, 0, 1, ...
    • float values: -3.0, 0., 0.0, 2.5e3, 2.5E3, 2500., ...
      • note that 0, 1, -2 are integer values but not float values.
      • float constants must contain a "." or an exponent (as in 2.5e3) to distinguish them from int constants
    • character values: 'a', 'b', '\n'
    • boolean values: true, false
    • string value: "hello, my name is Dave\n"
    • function value: fun x -> x-3
      • A function value with argument x returning a result three less than x.
      • And yes, I'm glad you asked: A function is a value! It is just as much a value as 2 or "hello". It can be a result of a computation.
  • Expressions: expressions are the basis of all computation in a functional language. Each expression has a type and, if it terminates, produces a value. Examples:
    • integer expression: (2 + (-3)*3*5 - 1) / 32
    • float expression: (2.5 +. (-3.) *. 7.2 -. 1.) /. 32.
      • operations such as + and - are integer operations
      • operations such as +. and -. are float operations
      • float operations end with "." to distinguish them from int operations
    • character expression: char_of_int 88
      • The function char_of_int computes the character with the given ASCII code. (In this case, the character with code 88 is 'X'.)
      • There are a number of other conversions between types in the Stdlib module (called Pervasives in older OCaml releases). They are typically called type2_of_type1 where type1 is the type of the argument and type2 is the type of the result.
    • boolean expression: not ('a' <= 'b') || (true && 2 <> 1)
    • string expression: "hello," ^ " my name" ^ " is Dave\n"
      • The operator ^ concatenates two strings
      • It is right-associative, unlike the integer subtraction operation, which is left-associative.
    • function expression: compose (fun x -> x+2) (fun x -> x-1)
      • compose is a function that takes two other functions as arguments. (It is an easy function to write in OCaml.)
  • Declarations: declarations introduce new variables (i.e., names) to stand for types and values. Here are some example declarations:
    (* school is a new type name; it is equal to the type string *)
    type school = string
    
    (* age is a new type name; it is equal to the type int *)
    type age = int
    
    (* compare is a new type name; 
     * it is equal to    age -> age -> bool 
     * and also equal to int -> int -> bool *)
    type compare = age -> age -> bool
    
    (* dylan_age and my_age are new value names;
     * dylan_age is equal to 3; my_age is equal to 39
     * both dylan_age and my_age have type age (which is the same as type int)
     * by the way, "value names" are usually called "variables"; 
     * we will call them that from now on *)
    let dylan_age : age = 3
    
    let my_age : age = dylan_age * 10 + 9
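
To see the pieces above working together, here is a small sketch that can be pasted into a file or the toplevel. The names int_result, float_result, ch, b, s, add_one and older are ours, chosen only for illustration, and the definition of compose is just one plausible way to write it (these notes do not give one).

```ocaml
(* One possible definition of compose (an assumption; it is not part of
   Stdlib): apply g first, then apply f to the result. *)
let compose (f : int -> int) (g : int -> int) : int -> int =
  fun x -> f (g x)

(* Integer arithmetic uses +, -, *, / and integer literals. *)
let int_result : int = (2 + (-3) * 3 * 5 - 1) / 32

(* Float arithmetic uses +., -., *., /. and float literals throughout. *)
let float_result : float = (2.5 +. (-3.) *. 7.2 -. 1.) /. 32.

(* Conversion functions follow the type2_of_type1 naming pattern. *)
let ch : char = char_of_int 88

(* <> is structural inequality. *)
let b : bool = not ('a' <= 'b') || (true && 2 <> 1)

(* ^ concatenates strings. *)
let s : string = "hello," ^ " my name" ^ " is Dave\n"

(* A function built from two other functions is itself a value. *)
let add_one : int -> int = compose (fun x -> x + 2) (fun x -> x - 1)

(* Type abbreviations are interchangeable with the types they name. *)
type age = int
type compare = age -> age -> bool

let older : compare = fun a b -> a > b
let dylan_age : age = 3
let my_age : age = dylan_age * 10 + 9
```

Integer division truncates, so int_result is -1 rather than a fraction, and add_one 5 evaluates to 6.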
    
    

OCaml Compiler

That was fun, but you need a way to save your work. As in any ordinary programming language, you can enter declarations and expressions into a file and compile the file. For instance, you can create a file hello.ml and include the following expression inside it:

print_endline "hello world"
print_endline is a function in the Stdlib module (the always-open library, called Pervasives in older OCaml releases) that prints a string followed by an end-of-line character. (By the way, you should take a glance through Stdlib. You will use it a lot. It contains all of the most common functions on base types like integers, booleans, floats, strings and a few other things.) Ok, now let's compile this file and create an executable named "hello.byte" by typing the following.
ocamlbuild hello.byte
If OCaml is properly installed, you should notice that a file named hello.byte has appeared in the current directory, as well as a directory called _build. You can take a peek inside _build if you'd like -- you should see a bunch of different kinds of files in there. You don't need to worry about the different kinds of files at this point though. You should be able to run the executable you created in your current directory by typing this:
./hello.byte
After typing the above and pressing enter, you should see "hello world" printed out:
hello world

Let's try creating a slightly more sophisticated file -- one containing a function. In OCaml, you can create a function simply by using a let declaration. The file cube.ml is shown below.
(* n^3 *) 
let cube (n:int) : int =
  n * n * n

(* print n and its cube *)
let message (n:int) : string = 
  "The cube of " ^ string_of_int n ^ " is " ^ string_of_int (cube n)

(* declare two integer constants named "arg1" and "arg2" *)
let arg1 : int = 2
let arg2 : int = 3

(* Instead of declaring a new named value, we just want to execute some code.
   let _ = ... will execute the right-hand side and throw away the result.
   The underscore binding "_" is the "don't care" name.   *)
let _ = print_endline (message arg1)

The code above contains two functions, cube and message. The basic syntax for writing a function is as follows.

let function-name (argument-name : argument-type) : result-type =
  expression-that-computes-result-of-function
If desired, you can create multi-argument functions simply by adding more argument names with their types. For instance, here is a function called crazy with integer arguments x, y and z.
let crazy (x : int) (y : int) (z : int) : int =
  (x + y) / z
The types of arguments and results are optional, but it is good style to include them (particularly on top-level functions). To call a function, you simply write down the function name and its argument. For instance, we called the string_of_int function on an argument n as follows:
string_of_int n
Some languages require parentheses around arguments, but OCaml does not -- simply place a space between the function name and the argument. However, sometimes we call one function and then use its result as an argument to the next function. We did that when we called string_of_int on the result of calling cube on n:
string_of_int (cube n)
If we had left off the parentheses like this:
string_of_int cube n
OCaml would have assumed that string_of_int was a 2-argument function that accepted cube as its first argument and n as its second argument. If we then compiled the file, we would see the following error message:
$ ocamlbuild cube.byte
File "cube.ml", line 8, characters 48-61:
Error: This function is applied to too many arguments;
maybe you forgot a `;'

The main point here is that while parentheses around a function argument are not a necessary part of the function call syntax (like they are in C or Java), they do serve a purpose for defining the extent of an expression. You have all done this before when writing mathematical expressions such as 3*(2+1). The parentheses serve to indicate that the operation 2+1 should be computed first and its result should be multiplied by 3.
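
The same rule applies when function calls are mixed with operators: function application binds more tightly than any binary operator. The sketch below repeats cube and crazy from above; the names a, b and c are hypothetical, introduced only for illustration.

```ocaml
let cube (n:int) : int =
  n * n * n

let crazy (x : int) (y : int) (z : int) : int =
  (x + y) / z

(* Parsed as (cube 2) + 1, because application binds tighter than +. *)
let a : int = cube 2 + 1

(* Parentheses make the whole expression 2 + 1 the argument. *)
let b : int = cube (2 + 1)

(* A multi-argument call: arguments are separated by spaces, and an
   argument that is itself a function call needs its own parentheses. *)
let c : int = crazy 1 (cube 2) 3
```

Here a is 9, b is 27, and c is (1 + 8) / 3, which is 3.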

Here is one more simple example function.

let silly (x:int) (y:int) : int =
  (x+y) * (x+y) - (x+y)
That code is silly for a number of reasons. One of the reasons is that good programmers don't write an expression like x+y multiple times -- they write down the expression once and associate it with a variable. In OCaml, a let expression serves this purpose. Here is an example.
let less_silly (x:int) (y:int) : int =
  let z = x+y in
  z * z - z
The way to think about a let expression is that the variable introduced (z in this case) is exactly equal to the value computed (x+y in this case) and that variable may be used in the expression following the in keyword. In general, a let expression has the following form:
let variable-name = expression1 in expression2
The variable-name may be used in expression2, but unlike a let declaration (which does not use the keyword "in"), the variable-name may not be used more widely. It is hidden from outsiders. This is a great thing! Information hiding is the basis for modular computation. If you were to type the code for less_silly into the toplevel, OCaml would tell you that you had created one declaration:
val less_silly : int -> int -> int = <fun>
It would not mention the inner variable z.
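
A short sketch of this scoping rule, assuming a top-level variable z and a helper sum_of_squares of our own invention: the z bound inside less_silly is a separate, local variable, and the top-level z is untouched by it.

```ocaml
(* A top-level declaration: visible in everything that follows. *)
let z : int = 100

let less_silly (x:int) (y:int) : int =
  (* This z is local: it is visible only in the expression after "in". *)
  let z = x + y in
  z * z - z

(* let expressions can be chained; each variable is in scope in the
   expression that follows its own "in". *)
let sum_of_squares (a:int) (b:int) : int =
  let a2 = a * a in
  let b2 = b * b in
  a2 + b2

let result : int = less_silly 2 3
```

After these declarations, result is 20 (that is, 5 * 5 - 5), while the top-level z is still 100.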

Now, there is no reason to treat integer variables like z any differently than variables with function type. Consequently, local let expressions defining functions are also allowed. Consider the function double_square, which contains within it another function square.

let double_square (x:int) (y:int) : int =
  let square (n:int) : int = n * n in
  square x + square y
Just like z was a local integer variable, square is a local function variable. There is little difference.

In terms of style, all top-level functions like double_square should be annotated with their types. However, it is acceptable to leave the types off of local functions like square. Hence I might rewrite the above as follows.

let double_square (x:int) (y:int) : int =
  let square n = n * n in
  square x + square y
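
As a quick check that dropping the local annotations changes nothing, the sketch below defines both versions side by side. The suffixes _annotated and _inferred are ours, added only so that the two definitions can coexist in one file.

```ocaml
(* Local function with explicit types. *)
let double_square_annotated (x:int) (y:int) : int =
  let square (n:int) : int = n * n in
  square x + square y

(* Local function without types: OCaml infers int -> int for square,
   because * is an integer operation. *)
let double_square_inferred (x:int) (y:int) : int =
  let square n = n * n in
  square x + square y

(* Both versions compute x*x + y*y. *)
let same : bool =
  double_square_annotated 3 4 = double_square_inferred 3 4
```

In neither version is square visible outside the body of the enclosing function.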

Using Other Modules

We will discuss how to define our own modules later. For now, it suffices to know how to use values defined in other modules in the standard libraries. If the module is named Mod and it defines a value x then one can use dot notation to refer to it: Mod.x. For example, the Char library contains a number of functions for operating over characters. Here, we define our own function lower, which does nothing more than call the standard library function Char.lowercase_ascii:

let lower (c:char) = Char.lowercase_ascii c

(Very minor note: older code uses Char.lowercase, which has been deprecated since OCaml 4.03.0; use Char.lowercase_ascii instead.)
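
Dot notation works the same way for every standard library module. Here is a small sketch using a few functions from the Char and String modules; the names shout, len and code are hypothetical, chosen only for this example.

```ocaml
(* Char.lowercase_ascii converts an ASCII letter to lower case. *)
let lower (c:char) : char = Char.lowercase_ascii c

(* String.uppercase_ascii converts every ASCII letter in a string. *)
let shout (s:string) : string = String.uppercase_ascii s

(* String.length returns the number of characters in a string. *)
let len : int = String.length "hello"

(* Char.code returns the ASCII code of a character. *)
let code : int = Char.code 'X'
```

With these definitions, lower 'A' is 'a', shout "hi" is "HI", len is 5 and code is 88.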

Summary

We have covered a good chunk of the core syntax of OCaml so far. Here are the expression forms we have seen:

  • values
  • variables
  • variable defined by another module: ModuleName.var
  • expression with a type annotation: (exp : type)
  • binary operator op: exp1 op exp2
  • function f applied to arguments: f exp1 exp2 ... expk
  • let expression declaring a local variable: let var = exp1 in exp2
  • let expression with a type annotation: let var : type = exp1 in exp2
  • let expression declaring a local function: let f var1 var2 ... vark = exp1 in exp2
  • local function with type annotations:
    • let f (var1:type1) ... (vark:typek) : type_result = exp1 in exp2

Here are the declarations we have seen:

  • type declaration: type type_name = type;;
  • let declaration: let var : type = exp1;;
  • function declaration:
    • let f (var1:type1) ... (vark:typek) : type_result = exp1;;
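
As a recap, the sketch below uses each of these forms once. The names celsius, freezing, to_fahrenheit and label are invented for this example and do not come from any library.

```ocaml
(* type declaration *)
type celsius = float

(* let declaration *)
let freezing : celsius = 0.0

(* function declaration *)
let to_fahrenheit (c : celsius) : float =
  (* let expression with a type annotation *)
  let scaled : float = c *. 9.0 /. 5.0 in
  (* let expression declaring a local function *)
  let shift t = t +. 32.0 in
  (* function applied to an argument *)
  shift scaled

(* binary operator, variable defined by another module (String.length),
   and an expression with a type annotation *)
let label : string =
  "len=" ^ string_of_int (String.length "abc" : int)
```

Here to_fahrenheit freezing evaluates to 32.0 and label is the string "len=3".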