Caml
Power

Problem Set 4: Interpreters and Proofs

Quick Links:

Objectives

In this assignment, you will develop an environment-based interpreter to replace the substitution-based interpreter defined in class. You will also practice proving your functional programs are correct. You will do this assignment individually.

Note: You can do Parts 1 and 2 in either order (or switch back and forth). You do not need to complete Part 1 before beginning Part 2.

Getting Started

Download the code here. Unzip and untar it:

$ tar xfz a4.tgz
You should see:

  • A collection of .ml and .mli files
  • a Makefile
  • proofs.txt

A few important things to remember before you start:

  • This assignment must be done individually.
  • As in the previous assignment, all of your programs must compile. Programs that do not compile will receive an automatic zero. Make sure that the functions you are asked to write have the correct names and the number and type of arguments specified in the assignment.
  • In this problem set, it is important to use good style (style will factor in to your grade). Style is naturally subjective, but the COS 326 style guide provides some good rules of thumb. As always, think first; write second; test and revise third. Aim for elegance.

Part 1: Environment-based Evaluator

In class, we developed a substitution-based interpreter in OCaml. We also discussed the fact that while substitution makes for a good theoretical model of how OCaml programs evaluate, practical implementations use an environment-based model of execution. Fortunately, the two models coincide, so you can safely use substitution to reason about your programs while your compiler uses environments to implement them.

In this part of the assignment, you will implement an environment-based interpreter for a simplified OCaml-like language. In your tar file, you should find the following files:

  • Syntax.ml: The datatype which defines the abstract syntax of the language you will be evaluating. The language contains integers, pairs, lists, and recursive functions. This file also contains a definition of environments and a few small functions you must complete.
  • EvalSubst.ml: A substitution-based interpreter. We have partially implemented a substitution-based interpreter for you. This is to give you a (partial) reference implementation that you can look at and to ensure that you understand the semantics of recursive functions, variables, let expressions, etc. YOU DO NOT HAVE TO IMPLEMENT ANY MORE OF THIS INTERPRETER. (You can for fun, of course.)
  • EvalEnv.ml: An environment-based interpreter. In this interpreter, variable bindings are stored in an environment. When we encounter a variable in our code, we look up the value of the variable in the environment. In addition, to evaluate a recursive function definition, we create a closure, which is effectively a pair of the recursive function definition and its environment.
  • EvalUtil.ml: A couple of utilities shared between both the substitution-based interpreter and the environment-based interpreter.
  • Testing.ml: A small collection of test cases. You will have to edit this file to add a few additional test cases (and then as many more as you want to test your code).
  • Printing.ml: Some pretty-printing routines. You do not have to touch this file.
  • Main.ml: A driver that runs the test cases. It is set up to test EvalEnv. Change one line to test EvalSubst, the substitution-based interpreter.

Prelude

It is a good idea to begin by familiarizing yourself with the code before you start writing any new code yourself.

  • Start by looking at Syntax.ml. That is the place where we define the datatype for the language. Bring questions about how the various elements of the language work to Precept.
  • Next, look at some of the test cases in Testing.ml. Those show you how to use the data types to construct example programs. Even though the substitution-based interpreter is incomplete, you should be able to execute a few of the examples, such as fact4. Try it. The examples also help show you how to use pairs and lists. If you aren't sure what some of the elements of the language (like pairs or lists) are suppose to do or can't figure out how some of the examples we have given you work, ask in Precept, on Piazza or go to office hours.
  • Next, look at some of the code in EvalSubst.ml. That code will illustrate the basic structure of a (substitution-based) interpreter. Your environment-based interpreter will be a little different, of course.

Part 1.1

Your main task is to implement the environment-based interpreter. To do this, you will initially have to modify 2 files:

  • In Syntax.ml, fill in the small definitions of lookup and update. (See Syntax.ml for a description of what they should do.)
  • In EvalEnv.ml, you will have to fill in the functions is_value which defines which elements of the language are values and eval_body, which is the bulk of the interpreter. We have given you 1 case of the interpreter: The case for variables. That case simply looks up a variable in the current environment. You should not change this case. The other key cases are:
    • The case for Rec. Rec is a recursive function definition. In this case, you will have to return not a Rec (like in the subsitution-based interpreter), but instead a Closure, which packages the components of the recursive function definition up with an environment.
    • The case for function application. We will let you figure out what to do there.
    Of course, you will also have to fill in cases involving pairs, lists and match statements.

Part 1.2

In Testing.ml, we ask you to create several specific new test cases. Please see that file and create the new functions that are requested. (Of course, you might want to do this while writing and testing your interpreter above.) You may create as many more test cases as you like.

Part 1.3

In functional programming language implementations, it is important to optimize the size of the closures that are created. The most naive thing one can do when creating a closure is to put ALL variables in the current execution environment in to the closure's environment. However, any given recursive function only requires access to its free variables. For a refresher on free and bound variables read this note.

Revise your implementation of the Rec case of the interpreter to create smaller closures --- closures that only contain the free variables of the recursive function. To do so, you'll need to write a function that analyzes an arbitrary function to extract the free variables of the function. Note: be careful comparing strings. You should use String.compare. We placed a couple of helper functions in Syntax.ml for comparing variables for equality and inequality (var_eq and var_neq).

Part 2: Program Correctness

In this section, you must answer several theoretical questions. You must present your proofs in the 2-column style illustrated in class with facts that you conclude (like a new expression being equal to a prior one) on the left and a justification (like by "evaluation" or "simple math") on the right. Hand in the text file proofs.txt filled in with your answers.

Part 2.0

To prepare for this part of the assignment, read the online notes on equational reasoning. Hand in proofs in the style presented in those notes.

Part 2.1: Complex Numbers

Let us represent complex numbers as a pair of integers. Consider the following addition function that adds the corresponding components of the pairs.

type complex = int * int

let cadd (c1:complex) (c2:complex) = 
  let (x1,y1) = c1 in
  let (x2,y2) = c2 in
  (x1+x2, y1+y2)
;;

Prove that for any three complex numbers, a, b, c : complex, complex addition is associative:

cadd a (cadd b c) == cadd (cadd a b) c

Part 2.2: A Minimal Proof

Consider the following code (and please note that min_int is a constant with type int that OCaml provides you with):

let max (x:int) (y:int) = 
  if x >= y then x else y
;;

let rec maxs (xs:int list) =
  match xs with
      [] -> min_int
    | hd::tail -> max hd (maxs tail)
;;

let rec append (l1:'a list) (l2:'a list) : 'a list =
  match l1 with 
    [] -> l2
  | hd::tail -> hd :: append tail l2
;;

You may use the fact that max, maxs and append are all total functions. You may also use the following simple properties of max without proving them:

For all integer values a, b, c:

(commutativity)  max a b == max b a

(associativity)  max (max a b) c == max a (max b c)

(idempotence)    max a (max a b) == max a b

(min_int)        max min_int a == a

Now, prove that for all integer lists xs and ys,

max (maxs xs) (maxs ys) == (maxs (append xs ys))

Your proof is by induction on the structure of the list xs. Mention where you use properties such as associativity, commutativity, idempotence or min_int, if and when you need to use them. Also state where it is necessary to appeal to the idea that some expression is valuable or some function is total (if necessary).

Part 2.3: Map and back

In previous assignments, we sometimes asked you to write recursive functions over lists using map and fold and some times without. Now you can prove that it doesn't matter which way you choose to write your function -- the two styles are equivalent.

let rec map (f:'a -> 'b) (xs:'a list) : 'b list =
  match xs with
      [] -> []
    | hd::tail -> f hd :: map f tail
;;

let bump1 (xs:int list) : int list = 
  map (fun x -> x+1)  xs
;;

let bump2 (xs:int list) : int list =
  match xs with
    [] -> []
  | hd::tail -> (hd+1) :: map (fun x -> x+1) tail
;;

let rec bump3 (xs:int list) : int list =
  match xs with
    [] -> []
  | hd::tail -> (hd+1) :: bump3 tail
;;

(a) Prove that for all integer lists l, bump1 l == bump2 l.

(b) Prove that for all integer lists l, bump1 l == bump3 l.

(c) In one sentence, what's the big difference between the part (a) and part (b)?

Part 2.4: Zippity do da

Consider the following functions:

let rec zip (ls:'a list * 'b list) : ('a * 'b) list =
  let (xs,ys) = ls in
  match (xs,ys) with
      ([],_) -> []
    | (_,[]) -> []
    | (x::xrest, y::yrest) -> (x,y)::zip (xrest,yrest)
;; 

let rec unzip (xs:('a * 'b) list) : 'a list * 'b list =
  match xs with
      [] -> ([],[])
    | (x,y)::tail -> 
      let (xs,ys) = unzip tail in
      (x::xs, y::ys)
;; 

Prove or disprove each of the following.

(a) For all l : ('a * 'b) list, zip(unzip l) == l.

(b) For all l1 : 'a list, l2 : 'b list, unzip(zip (l1,l2)) == (l1,l2).

NOTE: The way that you disprove a theorem is you give a counter example. For example, if a theorem says "for all x of type t, ... property about x ..." then to disprove the theorem, all you need to do is find 1 value v with type t and show that the property does not hold for v. If you can, then clearly the property does not hold "for all" x of type t.

Handin Instructions

This problem set is to be done individually.

You must hand in these files to dropbox (see link on assignment page):

  1. Syntax.ml
  2. EvalEnv.ml
  3. Testing.ml
  4. proofs.txt