SortInsertion Sort

Sorting can be done in O(N log N) time by various algorithms (quicksort, mergesort, heapsort, etc.). But for smallish inputs, a simple quadratic-time algorithm such as insertion sort can actually be faster. And it's certainly easier to implement -- and to prove correct.

Recommended Reading

If you don't already know how insertion sort works, see Wikipedia or read any standard textbook; for example:
Sections 2.0 and 2.1 of Algorithms, Fourth Edition, by Sedgewick and Wayne, Addison Wesley 2011; or
Section 2.1 of Introduction to Algorithms, 3rd Edition, by Cormen, Leiserson, and Rivest, MIT Press 2009.

The Insertion-Sort Program

Insertion sort is usually presented as an imperative program operating on arrays. But it works just as well as a functional program operating on linked lists!

From VFA Require Import Perm.

Fixpoint insert (i:nat) (l: list nat) :=
  match l with
  | nili::nil
  | h::tif i <=? h then i::h::t else h :: insert i t
 end.

Fixpoint sort (l: list nat) : list nat :=
  match l with
  | nilnil
  | h::tinsert h (sort t)
end.

Example sort_pi: sort [3;1;4;1;5;9;2;6;5;3;5]
                 = [1;1;2;3;3;4;5;5;5;6;9].
Proof. simpl. reflexivity. Qed.
What Sedgewick/Wayne and Cormen/Leiserson/Rivest don't acknowlege is that the arrays-and-swaps model of sorting is not the only one in the world. We are writing functional programs, where our sequences are (typically) represented as linked lists, and where we do not destructively splice elements into those lists. Instead, we build new lists that (sometimes) share structure with the old ones.
So, for example:

Eval compute in insert 7 [1; 3; 4; 8; 12; 14; 18].
(* = 1; 3; 4; 7; 8; 12; 14; 18 *)
The tail of this list, 12::14::18::nil, is not disturbed or rebuilt by the insert algorithm. The nodes 1::3::4::7::_ are new, constructed by insert. The first three nodes of the old list, 1::3::4::_ will likely be garbage-collected, if no other data structure is still pointing at them. Thus, in this typical case,
  • Time cost = 4X
  • Space cost = (4-3)Y = Y
where X and Y are constants, independent of the length of the tail. The value Y is the number of bytes in one list node: 2 to 4 words, depending on how the implementation handles constructor-tags. We write (4-3) to indicate that four list nodes are constructed, while three list nodes become eligible for garbage collection.
We will not prove such things about the time and space cost, but they are true anyway, and we should keep them in consideration.

Specification of Correctness

A sorting algorithm must rearrange the elements into a list that is totally ordered.

Inductive sorted: list nat Prop :=
| sorted_nil:
    sorted nil
| sorted_1: x,
    sorted (x::nil)
| sorted_cons: x y l,
   x y sorted (y::l) sorted (x::y::l).
Is this really the right definition of what it means for a list to be sorted? Basically, this definition is saying that for any two adjacent element of the list, the first must be <= the second. But it can take some practice to read and believe in inductive specifications like this.
One might prefer a non-inductive definition of sortedness. Moreover, it might seem more obvious to state things in terms of arbitrary elements of the list, not just adjacent ones, i.e. something like this: given any two distinct elements of the list, the left one should be <= the right one. To make this precise, we can think in terms of indices into the list, and say something like: for any indices i and j, if i < j then element-at(i) <= element-at(j).
Unfortunately, encoding this idea is a little messy, because we only want to consider values of i and j that are valid indices into the list. In fact, we're forced to think about this because the any Coq function implementing the "element-at" concept must be total, so we must somewhow tell it what to do if we pass an index that is out of range for the list.
The Coq standard library actually contains two versions of this function:

Check nth.
(* ==>
nth : forall A : Type, nat -> list A -> A -> A
 *)


Check nth_error.
(* ==>
nth_error: forall A : Type, list A -> nat -> option A
*)

The difference between these two functions (other than the trivial but irritating fact that they take their element and list arguments in opposite orders) is that nth requires an additional default argument (of type A) to be returned if the index is out of range, whereas nth_error has an option return type and returns Some v if the index is in range and None otherwise.
Which one of these should we use? The type of nth may seem a little simpler, but using it requires us to guard our indices to make sure that they are valid, resulting in a definition looking something like this:

Definition sorted'' (al: list nat) :=
  i j, i < j < length al nth i al 0 nth j al 0.
Note that the choice of default value, here 0, is unimportant, because it will never be returned for the i and j we pass.
The alternative using nth_error might look like this:

Definition sorted' (al: list nat) :=
  i j iv jv, i < j nth_error al i = Some iv
                            nth_error al j = Some jv iv jv.
Here, the validity of i and j are implicit in the fact that we get Some results back from each call to nth_error. In practice, this definition is simpler to work with than sorted'' because it doesn't need to mention the length function, so we'll largely forget about sorted'' in what follows.
Both sorted and sorted' are reasonable definitions. They should be equivalent. Later on, we'll prove that they really are equivalent. For now, let's use the first one to define what it means to be a correct sorting algorthm.

Definition is_a_sorting_algorithm (f: list nat list nat) :=
   al, Permutation al (f al) sorted (f al).
The result (f al) should not only be a sorted sequence, but it should be some rearrangement (Permutation) of the input sequence.

Proof of Correctness

Exercise: 3 stars, standard (insert_perm)

Prove the following auxiliary lemma, insert_perm, which will be useful for proving sort_perm below. Your proof will be by induction, but you'll need some of the permutation facts from the library, so first remind yourself by doing Search.

Search Permutation.

Lemma insert_perm: x l, Permutation (x::l) (insert x l).
Proof.
(* FILL IN HERE *) Admitted.

Exercise: 3 stars, standard (sort_perm)

Now prove that sort is a permutation.

Theorem sort_perm: l, Permutation l (sort l).
Proof.
(* FILL IN HERE *) Admitted.

Exercise: 4 stars, standard (insert_sorted)

This one is just a bit tricky. However, there just a single induction right at the beginning, and you do not need to use insert_perm or sort_perm.

Lemma insert_sorted:
   a l, sorted l sorted (insert a l).
Proof.
(* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard (sort_sorted)

This one is easy.

Theorem sort_sorted: l, sorted (sort l).
Proof.
(* FILL IN HERE *) Admitted.
Now we wrap it all up.

Theorem insertion_sort_correct:
    is_a_sorting_algorithm sort.
Proof.
  split.
  - apply sort_perm.
  - apply sort_sorted.
Qed.

Making Sure the Specification is Right

It's really important to get the specification right. You can prove that your program satisfies its specification (and Coq will check that proof for you), but you can't prove that you have the right specification. Therefore, we took the trouble to write two different specifications of sortedness (sorted and sorted'), and now we prove that they mean the same thing. This increases our confidence that we have the right specification, though of course it doesn't prove that we do.

Exercise: 4 stars, advanced (sorted_sorted')

Lemma sorted_sorted': al, sorted al sorted' al.
Hint: Instead of doing induction on the list al, do induction on the sortedness of al. This proof is a bit tricky, so you may have to think about how to approach it, and try out one or two different ideas.
Proof.
(* FILL IN HERE *) Admitted.

Exercise: 3 stars, advanced (sorted'_sorted)

Lemma sorted'_sorted : al, sorted' al sorted al.
Proof.
Here, you can't do induction on the sorted'-ness of the list, because sorted' is not an inductive predicate. But the proof is not hard.
(* FILL IN HERE *) Admitted.

Proving Correctness from the Alternative Spec

Depending on how you write the specification of a program, it can be much harder or easier to prove correctness. We saw that the predicates sorted and sorted' are equivalent; but it is significantly harder to prove correctness of insertion sort directly from sorted'.
Give it a try! Our best proof makes essential use of the auxiliary lemma nth_default_insert, so you may want to prove that first. And some other auxiliary lemmas may be needed too. But maybe you will find a simpler appraoch!
DO NOT USE sorted_sorted', sorted'_sorted, insert_sorted, or sort_sorted in these proofs!

Exercise: 5 stars, standard, optional (insert_sorted')


Lemma nth_error_insert : l a i iv,
    nth_error (insert a l) i = Some iv
    a = iv i', nth_error l i' = Some iv.
Proof.
(* FILL IN HERE *) Admitted.

Lemma insert_sorted':
   a l, sorted' l sorted' (insert a l).
Proof.
(* FILL IN HERE *) Admitted.

Theorem sort_sorted': l, sorted' (sort l).
Proof.
  induction l.
  - unfold sorted'. intros. destruct i; inv H0.
  - simpl. apply insert_sorted'. auto.
Qed.

The Moral of This Story

The proof of insert_sorted was easy; the proof of insert_sorted' was difficult; and yet sorted al sorted' al. Different formulations of the functional specification can lead to great differences in the difficulty of the correctness proofs.
Suppose someone required you to prove sort_sorted', and never mentioned the sorted predicate to you. Instead of proving sort_sorted' directly, it would be much easier to design a new predicate (sorted), and then prove sort_sorted and sorted_sorted'.

(* Tue Jan 28 13:58:39 EST 2020 *)