ADTAbstract Data Types

From Coq Require Import Strings.String. (* for manual grading *)
From VFA Require Import Perm.
From VFA Require Import Maps.
From VFA Require Import SearchTree.

Let's consider the concept of lookup tables, indexed by keys that are natural numbers, mapping those keys to values of arbitrary (parametric) type. We can express this in Coq as follows:

Module Type TABLE.
Parameter V: Type.
Parameter default: V.
Parameter table: Type.
Definition key := nat.
Parameter empty: table.
Parameter get: key → table → V.
Parameter set: key → V → table → table.
Axiom gempty: ∀ k, (* get-empty *)
       get k empty = default.
Axiom gss: ∀ k v t, (* get-set-same *)
      get k (set k v t) = v.
Axiom gso: ∀ j k v t, (* get-set-other *)
      j ≠ k → get j (set k v t) = get j t.
End TABLE.

This means: in any Module that satisfies this Module Type, there's a type table of lookup-tables, a type V of values, and operators empty, get, set that satisfy the axioms gempty, gss, and gso.

For example, we can easily give an implementation of TABLE in terms of our function-based maps. For specificity, we'll take V to be the type bool and use a default value of false.

Module MapsTable <: TABLE.
Definition V := bool.
Definition default: V := false.
Definition key := nat.
Definition table := total_map V.
Definition empty : table := t_empty default.
Definition get (k: key) (m: table) : V := m k.
Definition set (k: key) (v: V) (m: table) : table := t_update m k v.
Theorem gempty: ∀ k,
     get k empty = default.
Proof. intros. apply t_apply_empty. Qed.
Theorem gss: ∀ k v t,
     get k (set k v t) = v.
Proof. intros. apply t_update_eq. Qed.
Theorem gso: ∀ j k v t,
     j ≠k →
     get j (set k v t) = get j t.
Proof. intros. apply t_update_neq. omega. Qed.
End MapsTable.

The specialization of TABLE to TABLE_STRING exposes the definition of V as string to the world outside the module. Without it, all that would be known is what TABLE exposes, which is that V is some Type.

Now, let's calculate: put (1,true) and then (3,false) into a map, then lookup 1.

Eval compute in MapsTable.get 1 (MapsTable.set 3 false (MapsTable.set 1 true MapsTable.empty)).
(* = true *)

An Abstract Data Type comprises:

A type with a hidden representation (in this case, t).
Interface functions that operate on that type (empty, get, set).
Axioms about the interaction of those functions (gempty, gss, gso).

So, MapsTable is an implementation of the TABLE abstract type.

Exercise: 4 stars, standard (ListsTable)

Give an alternative implementation of TABLE using lists of (key,value) pairs as the hidden representation. Again, take V to be bool and default to be false.

(* Do not modify the following line: *)
Definition manual_grade_for_ListsTable : option (nat ×string) := None.
☐

The problem with MapsTable and ListsTable is that they are each very inefficient: linear time per get operation. If you do a sequence of N get and set operations, it can take time quadratic in N. For a more efficient implementation, we can use our search trees from SearchTreeAlt.

Module TreeTable <: TABLE.
Definition V := bool.
Definition default : V := false.
Definition table := tree V.
Definition key := nat.
Definition empty : table := empty_tree V.
Definition get (k: key) (m: table) : V := lookup V default k m.
Definition set (k: key) (v: V) (m: table) : table :=
     insert V k v m.
Definition elements (m:table) : list (key ×V) :=
     elements V m.
Theorem gempty: ∀ k,
     get k empty = default.
Proof. intros. apply lookup_empty. Qed.
Theorem gss: ∀ k v t,
     get k (set k v t) = v.
Proof. intros. apply lookup_insert_eq. Qed.
Theorem gso: ∀ j k v t,
     j ≠k →
     get j (set k v t) = get j t.
Proof. intros. apply lookup_insert_neq. omega. Qed.
End TreeTable.

Now let's consider a richer interface ETABLE for Tables that support an elements operation.

Module Type ETABLE.
Parameter V: Type.
Parameter default: V.
Parameter table: Type.
Definition key := nat.
Parameter empty: table.
Parameter get: key → table → V.
Parameter set: key → V → table → table.
Parameter elements: table → list (key ×V).
Axiom gempty: ∀ k, (* get-empty *)
       get k empty = default.
Axiom gss: ∀ k v t, (* get-set-same *)
      get k (set k v t) = v.
Axiom gso: ∀ j k v t, (* get-set-other *)
     j ≠ k →
     get j (set k v t) = get j t.
Axiom gie: ∀ k v t,
     get k t = v →
     In (k ,v ) (elements t) ∨ ¬ In k (map fst (elements t)) ∧ v = default.
Axiom ieg: ∀ k v t,
     In (k ,v ) (elements t) →
     get k t = v.
Axiom edistinct : ∀ t,
     NoDup (map fst (elements t)).
End ETABLE.

Since we know our search tree implementation supports elements, let's try to use it:

Module TreeETable <: ETABLE.
Definition V := bool.
Definition default : V := false.
Definition table := tree V.
Definition key := nat.
Definition empty : table := empty_tree V.
Definition get (k: key) (m: table) : V := lookup V default k m.
Definition set (k: key) (v: V) (m: table) : table :=
     insert V k v m.
Definition elements (m:table) : list (key ×V) :=
     elements V m.
Theorem gempty: ∀ k,
     get k empty = default.
Proof. intros. apply lookup_empty. Qed.
Theorem gss: ∀ k v t,
     get k (set k v t) = v.
Proof. intros. apply lookup_insert_eq. Qed.
Theorem gso: ∀ j k v t,
     j ≠k →
     get j (set k v t) = get j t.
Proof. intros. apply lookup_insert_neq. omega. Qed.
Theorem gie: ∀ k v t,
     get k t = v →
     In (k ,v ) (elements t) ∨ ¬ In k (map fst (elements t)) ∧ v = default.
Proof.
   intros. apply lookup_In_elements; auto.
   (* oops! we don't know this *)
Admitted.
Theorem ieg: ∀ k v t,
     In (k ,v ) (elements t) →
     get k t = v.
Proof.
   intros. apply In_elements_lookup; auto.
Admitted.
Theorem edistinct : ∀ t,
     NoDup (map fst (elements t)).
Proof.
   intros. apply elements_keys_distinct; auto.
Admitted.
End TreeETable.

Giving proofs of the specifications involving elements requires that t have the BST property. This is a common problem with ADTs: we need values of the (hidden) representation type to be well-formed, i.e. to satisfy some invariant. ALthough we know that the official operations on ADTs do maintain the invariant, we must ensure that the client of an ADT cannot "forge" values, that is, cannot coerce the representation type into the abstract type; especially ill-formed values of the representation type. This "unforgeability" is enforced in some real programming languages: ML (Standard ML or Ocaml) with its module system; Java, whose Classes can have "private variables" that the client cannot see.

A Brief Excursion into Dependent Types

Let's briefly look at a different way of enforcing the representation invariant: every value of the representation type will carry along with it a proof that the value satisfies the representation invariant.

To do this, we'll use sigma types, aka dependent sums, which are defined in the Coq standard library. Suppose P is a predicate on type A, that is, P : A → Prop. Then sig A P is the type of elements of A that satisfy P. You could think of this as a subset: sig A P is the subset of values of A that satisfy P. Based on that intuition, there is a notation {x : A | P} that means sig A P.

Check sig.
(* ====>
sig : forall A : Type, (A -> Prop) -> Type *)

(* As an example, here is a type for even natural numbers. Nat.Even : nat → Prop is defined in the standard library. The A argument to
sig is implicit, so we don't write it below. *)

Definition even_nat := sig Nat.Even.
Print even_nat.
(* ====>
even_nat = {x : nat | Nat.Even x}
: Set
*)

Check exist.
(* forall {A : Type} (P : A -> Prop) (x : A),
   P x -> {x | P x}  *)
Check proj1_sig.
(* forall {A : Type} {P : A -> Prop},
   {x | P x} -> A *)
Check proj2_sig.
(* forall (A : Type) {P : A -> Prop} (e : {x | P x}),
   P (proj1_sig e) *)

We'll apply that idea to search trees. The type A will be tree V. The predicate P(x) will be BST(x).

Module TreeETable2 <: ETABLE.
Definition V := bool.
Definition default : V := false.
Definition table := {x | @BST V x }.
Definition key := nat.
Definition empty : table :=
          exist (BST V) (empty_tree V) (empty_tree_BST V).
Definition get (k: key) (m: table) : V :=
          (lookup V default k (proj1_sig m)).
Definition set (k: key) (v: V) (m: table) : table :=
   exist (BST V) (insert V k v (proj1_sig m))
          (insert_BST _ _ _ _ (proj2_sig m)).
Definition elements (m:table) := elements V (proj1_sig m).
Theorem gempty: ∀ k,
     get k empty = default.
Proof. intros. reflexivity. Qed.
Theorem gss: ∀ k v t,
     get k (set k v t) = v.
Proof. intros. apply lookup_insert_eq. Qed.
Theorem gso: ∀ j k v t,
     j ≠k →
     get j (set k v t) = get j t.
Proof. intros. apply lookup_insert_neq. omega. Qed.
Theorem gie: ∀ k v t,
     get k t = v →
     In (k ,v ) (elements t) ∨ ¬ In k (map fst (elements t)) ∧ v = default .
Proof.
    intros. apply lookup_In_elements; auto.

Now: t is a package with two components: The first component is a tree, and the second component is a proof that the first component has the SearchTree property. We can destruct t to see that more clearly.

    destruct t as [a Ha].
    (* Watch what this simpl does: *)
    simpl.
    auto.
Qed.
Theorem ieg: ∀ k v t,
     In (k ,v ) (elements t) →
     get k t = v.
Proof.
   intros. apply In_elements_lookup; auto.
   destruct t as [a Ha]. simpl. auto.
Qed.
Theorem edistinct : ∀ t,
      NoDup (map fst (elements t)).
Proof.
   intros.
   apply elements_keys_distinct; auto.
   destruct t as [a Ha]. simpl. auto.
Qed.

End TreeETable2.

Exercise: 4 stars, advanced, optional (ListsETable)

Another simple (though inefficient) representation of an ETABLE is as a list of (key,value) pairs, without duplicate key values (but not necessarily sorted). Using TreeETable2 as a model, write a module definition

Module ListsETable <: ETABLE

that uses dependent types to enforce the no-duplicates invariant.

(* Do not modify the following line: *)
Definition manual_grade_for_ListsETable : option (nat ×string) := None.
☐

(End of the brief excursion into dependent types.)

Abstracting over invariants.

Even without using dependent types, we can build the requirement for representation invariants into module signatures. Of course, we do not want the details of the invariant to be visible as part of the signature. The trick is to specify the existence of an invariant and the fact that it is maintained by suitable operations and required by others, while leaving the actual definition of the invariant abstract.

For example, we can modify the ETABLE signature to use this approach as follows.

1. We introduce an abstract invariant

is_table: table → Prop

as a new parameter of the signature.

2. We add axioms specifying that empty tables obey the invariant and that set operations maintain the invariant.

3. We add assumptions to the specifications for gie, ieg, and edistinct to require that the invariant holds.

Module Type ETABLE_INV.
Parameter V: Type.
Parameter default: V.
Parameter table: Type.
Definition key := nat.
Parameter empty: table.
Parameter get: key → table → V.
Parameter set: key → V → table → table.
Parameter elements: table → list (key ×V).
Parameter is_table: table → Prop.
Axiom is_table_empty: is_table empty.
Axiom is_table_set: ∀ t k v,
     is_table t →
     is_table (set k v t).
Axiom gempty: ∀ k, (* get-empty *)
       get k empty = default.
Axiom gss: ∀ k v t, (* get-set-same *)
      get k (set k v t) = v.
Axiom gso: ∀ j k v t, (* get-set-other *)
     j ≠ k →
     get j (set k v t) = get j t.
Axiom gie: ∀ k v t,
     is_table t →
     get k t = v →
     In (k ,v ) (elements t) ∨ ¬ In k (map fst (elements t)) ∧ v = default.
Axiom ieg: ∀ k v t,
     is_table t →
     In (k ,v ) (elements t) →
     get k t = v.
Axiom edistinct : ∀ t,
     is_table t →
     NoDup (map fst (elements t)).
End ETABLE_INV.

Now we can prove that our search trees match this extended signature, by instantiating the abstract invariant with the BST property, and invoking the lemmas we proved in SearchTree to show that this property is correctly maintained. The proofs of the elements-related specifications now take a BST assumption, and go through easily.

Module TreeETable3 <: ETABLE_INV.
Definition V := bool.
Definition default : V := false.
Definition table := tree V.
Definition key := nat.
Definition empty : table := empty_tree V.
Definition get (k: key) (m: table) : V := lookup V default k m.
Definition set (k: key) (v: V) (m: table) : table :=
     insert V k v m.
Definition elements (m:table) : list (key ×V) :=
     elements V m.
Definition is_table t := BST V t.
Theorem is_table_empty: is_table empty.
Proof. intros. apply empty_tree_BST. Qed.
Theorem is_table_set: ∀ t k v,
     is_table t →
     is_table (set k v t).
Proof. intros. apply insert_BST; auto. Qed.
Theorem gempty: ∀ k,
     get k empty = default.
Proof. intros. apply lookup_empty. Qed.
Theorem gss: ∀ k v t,
     get k (set k v t) = v.
Proof. intros. apply lookup_insert_eq. Qed.
Theorem gso: ∀ j k v t,
     j ≠k →
     get j (set k v t) = get j t.
Proof. intros. apply lookup_insert_neq. omega. Qed.
Theorem gie: ∀ k v t,
     is_table t →
     get k t = v →
     In (k ,v ) (elements t) ∨ ¬ In k (map fst (elements t)) ∧ v = default.
Proof. intros. apply lookup_In_elements; auto. Qed.
Theorem ieg: ∀ k v t,
     is_table t →
     In (k ,v ) (elements t) →
     get k t = v.
Proof. intros. apply In_elements_lookup; auto. Qed.
Theorem edistinct : ∀ t,
     is_table t →
     NoDup (map fst (elements t)).
Proof. intros. apply elements_keys_distinct; auto. Qed.
End TreeETable3.

Module signatures for model-based Specification

Finally, let's reconsider the model-based specification for search trees that we developed towards the end of SearchTree. Recall that in this style of specification, we introduce an abstraction function (or more generally relation) that associates each concrete value of the ADT implementation with a value in a more abstract, already well-understood, type, and shows that the concrete and abstract operations are related in a sensible way.

In the case of seach trees, you showed (in some exercises) that we can relate each tree to a functional partial map, such that lookup and insert operations on the tree correspond to find and update operations on the map.

We can capture this approach in a module signature by adding the abstraction function as a parameter and the lemmas involving it as axioms. Note that the abstraction function can remain abstract in the signature! That is, we don't really care how the abstraction function works, as long as it lets us prove the lemmas.

Module Type ETABLE_INV_MODEL.
Parameter V: Type.
Parameter default: V.
Parameter table: Type.
Definition key := nat.
Parameter empty: table.
Parameter get: key → table → V.
Parameter set: key → V → table → table.
Parameter elements: table → list (key ×V).
Parameter is_table: table → Prop.
Axiom is_table_empty: is_table empty.
Axiom is_table_set: ∀ t k v,
     is_table t →
     is_table (set k v t).
Parameter Abs : table → partial_map V.
Axiom empty_relate: Abs empty = Maps.empty.
Axiom lookup_relate : ∀ t k,
     is_table t →
     get k t = find V default (Abs t) k.
Axiom insert_relate : ∀ t k v,
     is_table t →
     Abs (set k v t) = update (Abs t) k v.
Axiom elements_relate : ∀ t,
     is_table t →
     list2map V (elements t) = Abs t.
End ETABLE_INV_MODEL.

Exercise: 3 stars, standard (TreeTableModel)

Give an implementation of ETABLE_INV_MODEL using the abstraction function Abs from SearchTree. All the proofs of the relate axioms should be simple applications of the lemmas already proved as exercises in that chapter. Again, take V to be bool and default to be false.

(* Do not modify the following line: *)
Definition manual_grade_for_TreeTableModel : option (nat ×string) := None.
☐

Exercise: 2 stars, advanced, optional (TreeTableModel2)

Repeat the previous exercise, this time using the alternative Abs' function from SearchTree. Hint: This requires only very small changes from the previous exercise's solution!

(* Do not modify the following line: *)
Definition manual_grade_for_TreeTableModel2 : option (nat ×string) := None.
☐

Summary of Abstract Data Type Proofs

Section ADT_SUMMARY.
Variable V: Type.
Variable default: V.

Step 1. Define a representation invariant. (In the case of search trees, the representation invariant is the BST predicate.) Prove that each operation on the data type preserves the representation invariant. For example:

Check (empty_tree_BST V).
  (*  BST (empty_tree V) *)
Check (insert_BST V).
  (* forall (k : key) (v : V) (t : tree V),
       BST t -> BST (insert k v t) *)

Notice two things: Any operator (such as insert) that takes a tree parameter can assume that the parameter satisfies the representation invariant. That is, the insert_BST theorem takes a premise, BST t.

Any operator that produces a tree result must prove that the result satisfies the representation invariant. Thus, the conclusions, BST (empty_tree V) and BST (insert k v t) of the two theorems above.

Finally, any operator that produces a result of "base type", has no obligation to prove that the result satisfies the representation invariant; that wouldn't make any sense anyway, because the types wouldn't match. That is, there's no "lookup_BST" theorem, because lookup doesn't return a result that's a tree.

Step 2 (only if using model-based specification). Define an abstraction relation. (In the case of search trees, it's the Abs function.) This relates the data structure to some mathematical value that is (presumably) simpler to reason about.

Check (Abs V). (* tree V -> partial_map V *)

For each operator, prove that: assuming each tree argument satisfies the representation invariant and the abstraction relation, prove that the results also satisfy the appropriate abstraction relation.

Check (empty_relate V). (*
       Abs V (empty_tree V) = empty   *)
Check (lookup_relate V). (*
     forall (d: V) (t : tree V) (k : key),
        BST V t ->
        lookup V default k t = find V default (Abs V t) k  *)
Check (insert_relate V). (*
     forall (t : tree V) (k : key) (v : V),
        BST V t ->
        Abs V (insert V k v t) = update (Abs V t) k v *)

Step 3. Using the representation invariant, pove that all the axioms of your ADT are valid. For example...

Check TreeETable2.gso. (*
      : forall (j k : TreeTable2.key) (v : TreeTable2.V)
         (t : TreeTable2.table),
       j <> k ->
       TreeTable2.get j (TreeTable2.set k v t) = TreeTable2.get j t  *)

End ADT_SUMMARY.

Exercise in Data Abstraction

The rest of this chapter is optional.

Require Import List.
Import ListNotations.

Here's the Fibonacci function.

Fixpoint fibonacci (n: nat) :=
match n with
| 0 ⇒ 1
| S i ⇒ match i with 0 ⇒ 1 | S j ⇒ fibonacci i + fibonacci j end
end.

Eval compute in map fibonacci [0;1;2;3;4;5;6].

Here's a silly little program that computes the Fibonacci function.

Fixpoint repeat {A} (f: A →A) (x: A) n :=
match n with O ⇒ x | S n' ⇒ f (repeat f x n') end.

Definition step (al: list nat) : list nat :=
List.cons (nth 0 al 0 + nth 1 al 0) al.

Eval compute in map (repeat step [1;0;0]) [0;1;2;3;4;5].

Definition fib n := nth 0 (repeat step [1;0;0] n) 0.

Eval compute in map fib [0;1;2;3;4;5;6].

Here's a strange "List" module.

Module Type LISTISH.
Parameter list: Type.
Parameter create : nat → nat → nat → list.
Parameter cons: nat → list → list.
Parameter nth: nat → list → nat.
End LISTISH.

Module L <: LISTISH.
Definition list := (nat ×nat ×nat)%type.
Definition create (a b c: nat) : list := (a ,b ,c ).
Definition cons (i: nat) (il : list) := match il with (a,b,c) ⇒ (i ,a,b) end.
Definition nth (n: nat) (al: list) :=
   match al with (a,b,c) ⇒
      match n with 0 ⇒ a | 1 ⇒ b | 2 ⇒ c | _ ⇒ 0 end
   end.
End L.

Definition sixlist := L.cons 0 (L.cons 1 (L.cons 2 (L.create 3 4 5))).

Eval compute in map (fun i ⇒ L.nth i sixlist) [0;1;2;3;4;5;6;7].

Module L implements approximations of lists: it can remember the first three elements, and forget the rest. Now watch:

Definition stepish (al: L.list) : L.list :=
L.cons (L.nth 0 al + L.nth 1 al) al.

Eval compute in map (repeat stepish (L.create 1 0 0)) [0;1;2;3;4;5].

Definition fibish n := L.nth 0 (repeat stepish (L.create 1 0 0) n).

Eval compute in map fibish [0;1;2;3;4;5;6].

This little theorem may be useful in the next exercise.

Lemma nth_firstn:
∀ A d i j (al: list A), i <j → nth i (firstn j al) d = nth i al d.

Proof.
induction i; destruct j,al; simpl; intros; auto; try omega.
apply IHi. omega.
Qed.

Exercise: 4 stars, standard, optional (listish_abstraction)

In this exercise we will not need a representation invariant. Define an abstraction relation:

Inductive L_Abs: L.list → List.list nat → Prop :=
(* FILL IN HERE *)
.

Definition O_Abs al al' := L_Abs al al'.

(* State these theorems using O_Abs, not L_Abs.
You'll see why below, at "Opaque". *)
Lemma create_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.

Lemma cons_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.

Lemma nth_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.

Now, we will make these operators opaque. Therefore, in the rest of the proofs in this exercise, you will not unfold their definitions. Instead, you will just use the theorems create_relate, cons_relate, nth_relate.

Opaque L.list.
Opaque L.create.
Opaque L.cons.
Opaque L.nth.
Opaque O_Abs.

Lemma step_relate:
  ∀ al al',
   O_Abs al al' →
   O_Abs (stepish al) (step al').
Proof.
(* FILL IN HERE *) Admitted.

Lemma repeat_step_relate:
∀ n al al',
O_Abs al al' →
O_Abs (repeat stepish al n) (repeat step al' n).
Proof.
(* FILL IN HERE *) Admitted.

Lemma fibish_correct: ∀ n, fibish n = fib n.
Proof. (* No induction needed in this proof! *)
(* FILL IN HERE *) Admitted.
☐

Exercise: 2 stars, standard, optional (fib_time_complexity)

Suppose you run these three programs call-by-value, that is, as if they were ML programs. fibonacci N fib N fibish N What is the asymptotic time complexity (big-Oh run time) of each, as a function of N? Assume that the plus function runs in constant time. You can use terms like "linear," "N log N," "quadratic," "cubic," "exponential." Explain your answers briefly.

fibonacci: (* fill in here *)
fib: (* fill in here *)
fibish: (* fill in here *)

☐

(* Tue Jan 28 13:58:40 EST 2020 *)