It turns out that this problem really only bites hard in the
rule for if-then-else. First I'll describe how to handle everything
besides if-then-else.
The first part of defining the type
checking algorithm is defining the subtyping function sub. When
you call sub(t1,t2), the function should return true if
t1 is a subtype of t2 and return false otherwise. The
subtyping rules in the Fun language definition are nondeterministic
and cannot be implemented directly primarily because of the
transitivity rule. The transitivity rule says:
t1 < t2
        t2 < t3
------------------------
(transitivity)
t1 < t3
If we were trying to read
this rule bottom-up like a function then it would say t1 is a
subtype of t3 if we can magically conjure up some type
t2 such that t1 is a subtype of t2 and t2 is
a subtype of t3. Fortunately, if restructure our other
subtyping rules a little bit to make sure that overall our subtype in
relation is transitive, we don't really need to implement the
transitivity rule directly. In particular, this means it will be
necessary to combine the rules for tuples and check all of the
conditions for subtyping on tuples at once.
The next part of the algorithm involves implementing the type checking algorithm for expressions themselves. This algorithm will invoke sub when necessary. Notice that in the declarative rules there is a troublesome rule called subsumption:
G |-- e : t         t <
t'
-------------------------
G |-- e : t'
Implementing this rule directly in the function tc_exp is problematic because if we try to read this rule as a function bottom-up in which the arguments to the function are the context ?, the expression e and the type t' then we will somehow have to "guess" what type t to recursively invoke tc_exp with. Moreover, the subsumption rule can always be used. A stupid type checker might try to check subsumption over and over again and go into an infinite loop.
Instead, what we will do is call tc_exp ctxt e and expect it either to fail (by reporting an ErrorMsg.error) or to succeed and return the least type t that e can possibly have in the context ctxt.
If the type-checker reports an error, it shouldn't necessarily give up. It should try to recover so that it can report other errors to the user. One way to recover is just pretend that the erroneous expression has type int and hope for the best.
The only places that we will use subtyping in our algorithm is when we absolutely have to (or else type checking will fail). For instance, one place that we might need to use subtyping is when we have a function call f (e). If f has type t1 -> t2 and e has type t1' then our type checker should succeed if t1' is a subtype of t1. Then, the type of f(e) can be any supertype of t2; in order to report the least supertype of t2, then just return t2 as the result of tc_exp ("f(e)").
Note: The type-checker needs to traverse expressions and maintain a context at the same time, in a way not so different from the programs you wrote for alpha-conversion and constant-folding.
The most difficult problem is the
type-checking of if-then-else. Consider this Fun expression, in a
context where f
if x then <0, f> else
<1,2,3>
This expression is guaranteed to return a
record whose first field is an int. That is, the least supertype of
all the values that could be returned by this if-expression is
<int>. Below I will describe the high-tech solution to
this problem. Until you get everything else working, make the
following approximation: the then-clause and the else-clause must have
exactly the same type, that is,
join complain (t1,t2) = if t1=t2 then t1 else (complain
"t1 and t2 do not join"; t1)
This is not
faithful to the Fun semantics; it will reject some legal Fun
programs. However, it will not accept any illegal Fun programs. (This
algorithm is sound but not complete.)
When you call join
complain (t1,t2) it should return a third type t3 such that
t1 is a subtype of t3 and t2 is also a subtype of
t3. If no such type exists, use complain to report an
error (you'll want to pass join a complain-function that calls
ErrorMsg.error with an appropriate position). Furthermore, join
should return the "least" type t3. For example,
tc_join (<int, (int->int)>) (<int>) = <int>
Even though <> is a supertype of both
<int,(int->int)> and <int>, it is also a supertype of
<int>. Therefore, the join function should return <int>
not <>. <int> is the least supertype of both
<int,bool> and <int> and therefore it is a
"join." Hint: there is a substantial twist when implementing
the join function for function types. You'll need to implement a
second, mutually recursive helper function in this case, to compute
the greatest common subtype, called the "meet".