Typeclasses

Credits: Lecture and literate Haskell file borrowed from Ranjit Jhala, UCSD CSE 230

> {-# LANGUAGE OverlappingInstances, FlexibleInstances, TypeSynonymInstances #-}
> import Control.Arrow

We have already seen that the + operator works for a bunch of different underlying data types. For example

ghci> 2 + 3 
5
ghci> :type it
it :: Integer

ghci> 2.9 + 3.5
6.4
ghci> :type it
it :: Double

Similarly we can compare all sorts of values

ghci> 2 == 3 
False

ghci> [2.9, 3.5] == [2.9, 3.5]
True

“So?”, I can hear you shrug.

Indeed, this is quite unremarkable: languages since the dawn of time have supported some form of operator “overloading” to support this kind of ad-hoc polymorphism.

However, in Haskell, there is no caste system. There is no distinction between operators and functions. All are first class citizens in Haskell.

Well then, what type do we give to functions like + and == ? Something like

(+) :: Integer -> Integer -> Integer 

would be too anemic, since we want to add two doubles as well! Can type variables help?

(+) :: a -> a -> a 

Nope. That’s a bit too aggressive, since it doesn’t make sense to add two functions to each other! Haskell solves this problem with an insanely slick mechanism called typeclasses, introduced by Wadler and Blott.

BTW: The paper is one of the best examples of academic writing I have seen. The next time you hear a curmudgeon say all the best CS was done in the 60s, just point them to the above.

Qualified Types

To see the right type, let’s just (politely, always) ask

ghci> :type (+)
(+) :: (Num a) => a -> a -> a

We call the above a qualified type. Read it as, + takes in two a values and returns an a value for any type a that is a Num or is an instance of Num.

The name Num can be thought of as a predicate over types. Some types satisfy the Num predicate. Examples include Integer, Double etc, and any values of those types can be passed to +. Other types do not satisfy the predicate. Examples include Char, String, functions etc, and so values of those types cannot be passed to +.

ghci> 'a' + 'b'

<interactive>:1:0:
No instance for (Num Char)
arising from a use of `+' at <interactive>:1:0-8
Possible fix: add an instance declaration for (Num Char)
In the expression: 'a' + 'b'
In the definition of `it': it = 'a' + 'b'

As promised, these kinds of error messages should now make sense. Basically, Haskell is complaining that 'a' and 'b' are of type Char, which is not an instance of Num.
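To see a qualified type at work in our own code, here is a minimal sketch. The name sumSquares is ours, made up for illustration; because its body uses only +, * and a numeric literal, it can be given the same kind of Num-qualified type as + itself.

```haskell
-- Hypothetical helper, not part of the lecture's code.
-- The body uses only (+), (*) and the literal 0, all of which come from Num,
-- so the function works at every type that is an instance of Num.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
```

ghci> sumSquares [1, 2, 3]
14

ghci> sumSquares [0.5, 1.5]
2.5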

OK, so what is a Typeclass?

In a nutshell, a typeclass is a collection of operations (functions) that must exist for the underlying type. For example, let’s look at possibly the simplest typeclass, Eq

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool

That is, a type a can be an instance of Eq as long as there are two functions that determine if two a values are respectively equal or disequal. Similarly, the typeclass Show captures the requirements that make a particular datatype be viewable,

class Show a where
  show :: a -> String

Indeed, we can test this on different (built-in) types

ghci> show 2
"2"

ghci> show 3.14
"3.14"

ghci> show (1, "two", ([],[],[]))
"(1,\"two\",([],[],[]))"

When we type an expression into ghci, it computes the value and then calls show on the result. Thus, if we create a new type by

> data Unshowable = A | B | C 

then we can create values of the type,

ghci> let x = A
ghci> :type x
x :: Unshowable

but can’t view or compare them

ghci> x

<interactive>:1:0:
No instance for (Show Unshowable)
arising from a use of `print' at <interactive>:1:0
Possible fix: add an instance declaration for (Show Unshowable)
In a stmt of a 'do' expression: print it

ghci> x == x

<interactive>:1:0:
No instance for (Eq Unshowable)
arising from a use of `==' at <interactive>:1:0-5
Possible fix: add an instance declaration for (Eq Unshowable)
In the expression: x == x
In the definition of `it': it = x == x

Again, the previously incomprehensible type error message should make sense to you.

Automatic Derivation

Of course, this is lame; we should be able to compare and view them. To allow this, Haskell lets us automatically derive instances for certain key typeclasses from the standard library, such as Eq, Ord and Show.

To do so, we simply dress up the data type definition with

> data Showable = A' | B' | C' deriving (Eq, Show) 

and now we have

ghci> let x' = A'

ghci> :type x'
x' :: Showable

ghci> x'
A'

ghci> x' == x'
True
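To get a feel for what the deriving clause saves us, here is a rough sketch of instances written by hand. The type Manual and its constructors are hypothetical, and the instances only approximate what the compiler would generate for such a simple type.

```haskell
-- Hypothetical type, mirroring Showable but with hand-written instances.
data Manual = MA | MB | MC

-- Two values are equal exactly when they are built from the same constructor.
-- We define only (==); (/=) comes from the class's default implementation.
instance Eq Manual where
  MA == MA = True
  MB == MB = True
  MC == MC = True
  _  == _  = False

-- show returns the constructor name, as a derived instance would here.
instance Show Manual where
  show MA = "MA"
  show MB = "MB"
  show MC = "MC"
```

ghci> MA == MB
False

ghci> show MC
"MC"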

Standard Typeclass Hierarchy

Let us now peruse the definition of the Num typeclass.

ghci> :info Num
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  (*) :: a -> a -> a
  (-) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a

There’s quite a bit going on there. A type a can only be deemed an instance of Num if

  1. The type is also an instance of Eq and Show, and
  2. There are functions for adding, multiplying, subtracting, negating etc values of that type.

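In other words, in addition to the “arithmetic” operations, we can compare two Num values and we can view them (as a String). Note that this is the definition used by older versions of GHC; since GHC 7.4, Num no longer has Eq and Show as superclasses.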

Haskell comes equipped with a rich set of built-in classes.

[Figure: Standard Typeclass Hierarchy]

In the above picture, there is an edge from Eq and Show to Num because for something to be a Num it must also be an Eq and Show. There are a few other ones that we will come to know (and love!) in due course…
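To make the requirements of Num concrete, here is a minimal sketch of a hand-written instance. The type V2 is hypothetical and the componentwise definitions are just one reasonable choice; we derive Eq and Show because the class listing shown earlier asks for them.

```haskell
-- Hypothetical 2-D vector type, used only to illustrate a Num instance.
data V2 = V2 Double Double
  deriving (Eq, Show)

-- All operations act componentwise; fromInteger lets an integer literal
-- such as 2 stand for the vector V2 2 2.
instance Num V2 where
  V2 a b + V2 c d = V2 (a + c) (b + d)
  V2 a b * V2 c d = V2 (a * c) (b * d)
  V2 a b - V2 c d = V2 (a - c) (b - d)
  negate (V2 a b) = V2 (negate a) (negate b)
  abs    (V2 a b) = V2 (abs a) (abs b)
  signum (V2 a b) = V2 (signum a) (signum b)
  fromInteger n   = V2 (fromInteger n) (fromInteger n)
```

ghci> V2 1 2 + V2 3 4
V2 4.0 6.0

ghci> V2 1 2 * 2
V2 2.0 4.0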

Using Typeclasses

Let’s now see how slickly typeclasses integrate with the rest of Haskell’s type system by building a small library for Maps (aka associative arrays, lookup tables etc.)

> data BST k v = Empty
>              | Bind k v (BST k v) (BST k v)
>              deriving (Show)

We will call this type BST, short for Binary Search Tree: a tree whose keys are ordered such that, at each node, the keys appearing in the left and right subtrees are respectively smaller and larger than the key at the node. For example, this is what a tree that maps the strings "burrito", "chimichanga" and "frijoles" to their prices might look like

[Figure: BST example]

We must ensure that the invariant is preserved by the insert function. In the functional setting, the insert will return a brand new tree.

> insert k v (Bind k' v' l r)
>   | k == k'   = Bind k v l r
>   | k <  k'   = Bind k' v' (insert k v l) r
>   | otherwise = Bind k' v' l (insert k v r)
> insert k v Empty = Bind k v Empty Empty

The organization of the BST allows us to efficiently search the tree for a key.

> find k (Bind k' v' l r)
>   | k == k'   = Just v'
>   | k <  k'   = find k l
>   | otherwise = find k r
> find k Empty = Nothing

The BST ordering obviates the need for any backtracking: at each node we descend into exactly one subtree. If, additionally, the tree is kept balanced, a search takes time logarithmic in the number of keys.

Now, we can create a particular lookup table like so

> t0 = insert "burrito"     4.50 Empty
> t1 = insert "chimichanga" 5.25 t0
> t2 = insert "frijoles" 2.75 t1

NOTE: Each insert returns a brand new BST; this is not Java!

Of course this is a bit tedious, so it may be easier to write an ofList function that will turn an association list into an appropriate BST.

> ofList = foldl (\t (k, v) -> insert k v t) Empty

Now, we can just do

> t = ofList [("chimichanga", 5.25)
>            ,("burrito"    , 4.50)
>            ,("frijoles"   , 2.75)]

After which we can query the table

ghci> :type t
t :: BST [Char] Double

ghci> find "burrito" t
Just 4.5

ghci> find "birria" t
Nothing

Similarly, it makes sense to implement a toList which will convert the map into an association list. First, let’s write a generic foldBST

> foldBST f b Empty          = b 
> foldBST f b (Bind k v l r) = f k v (foldBST f b l) (foldBST f b r)

after which toList is simply (but rather inefficiently!)

> toList =  foldBST (\k v l r -> l ++ [(k, v)] ++ r) []

ghci> toList t
[("burrito",4.5),("chimichanga",5.25),("frijoles",2.75)]
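Since foldBST is generic, other traversals fall out of it just as easily. Here is a small sketch; the helpers size and height are our own names, not part of the lecture’s library, and the BST type and foldBST are repeated (with an explicit signature added) only so that the snippet stands alone.

```haskell
-- The BST type and foldBST are repeated from the text so the sketch stands alone.
data BST k v = Empty
             | Bind k v (BST k v) (BST k v)

foldBST :: (k -> v -> b -> b -> b) -> b -> BST k v -> b
foldBST _ b Empty          = b
foldBST f b (Bind k v l r) = f k v (foldBST f b l) (foldBST f b r)

-- Hypothetical helper: the number of key-value bindings in the tree.
size :: BST k v -> Int
size = foldBST (\_ _ l r -> 1 + l + r) 0

-- Hypothetical helper: the number of nodes on the longest root-to-leaf path.
height :: BST k v -> Int
height = foldBST (\_ _ l r -> 1 + max l r) 0
```

Neither helper inspects the keys, so neither type mentions a typeclass constraint.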

Exercise

Why is the output sorted (by key) ?

Constraint Propagation

Notice that we didn’t write down the types of any of the functions. Let’s see what the types are

ghci> :type insert
insert :: (Ord a) => a -> v -> BST a v -> BST a v

ghci> :type find
find :: (Ord a) => a -> BST a t -> Maybe t

ghci> :type ofList
ofList :: (Ord a) => [(a, v)] -> BST a v

Whoa! Look at that, Haskell tells us that we can use any a value as a key as long as the value is an instance of the Ord typeclass. You might guess from the name, that a type is an instance of Ord if there are functions that allow us to compare values of the type. In particular

ghci> :info Ord
class (Eq a) => Ord a where
  compare :: a -> a -> Ordering
  (<) :: a -> a -> Bool
  (>=) :: a -> a -> Bool
  (>) :: a -> a -> Bool
  (<=) :: a -> a -> Bool
  max :: a -> a -> a
  min :: a -> a -> a

ghci> :info Ordering
data Ordering = LT | EQ | GT -- Defined in GHC.Ordering

How did the engine figure this out? Easy enough: if you look at the body of the insert and find functions, you’ll see that we compare two key values using == and <.
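The same propagation happens for any function we write: the constraint in the type tracks the operations used in the body. Here is a small sketch with two made-up helpers (count and isSorted are our names); the signatures are written out explicitly, and they are the ones we would expect inference to report.

```haskell
-- Hypothetical helper: the body uses only (==) on elements, so Eq suffices.
count :: Eq a => a -> [a] -> Int
count x = length . filter (== x)

-- Hypothetical helper: the body uses (<=), so the stronger Ord constraint
-- is needed. It checks that each element is at most its successor.
isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))
```

ghci> count 'a' "banana"
3

ghci> isSorted [3, 1, 2]
False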

Exercise

Write a delete function of type

delete :: (Ord k) => k -> BST k v -> BST k v

Explicit Signatures

While Haskell is pretty good about inferring types in general, there are cases when the use of type classes requires explicit annotations (which change the behavior of the code.)

For example, Read is a built-in typeclass, where any instance a of Read has a function

read :: (Read a) => String -> a

which can parse a string and turn it into an a. Thus, Read is, in a sense, the opposite of Show. However, if we do

ghci> read "2"

Haskell is foxed, because it doesn’t know what to convert the string to! Did we want an Int or a Double ? Or maybe something else altogether. Thus, we get back the complaint

<interactive>:1:0:
Ambiguous type variable `a' in the constraint:
`Read a' arising from a use of `read' at <interactive>:1:0-9
Probable fix: add a type signature that fixes these type variable(s)

which clearly states what the issue is: an explicit type annotation is needed to tell Haskell what to convert the string to. If we play nice and add the types, we get

ghci> (read "2") :: Int
2

ghci> (read "2") :: Float
2.0

Note the different results due to the different types.
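The annotation does not have to sit right next to read; any surrounding signature that pins down the result type will do. Here is a minimal sketch, where parseCount and parsePrice are hypothetical names chosen for illustration.

```haskell
-- Hypothetical helpers: the declared result type alone tells read
-- which instance of Read to use.
parseCount :: String -> Int
parseCount = read

parsePrice :: String -> Double
parsePrice = read
```

ghci> parseCount "2" + 1
3

ghci> parsePrice "2"
2.0

Keep in mind that read throws a runtime error if the string does not parse at the requested type.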

Instantiating Typeclasses

So far we have seen Haskell’s nifty support for overloading by observing that

  1. some standard types are instances of standard type classes, and
  2. new types can be automatically made instances of standard type classes.

However, in many situations the automatic instantiation doesn’t quite cut it, and instead we need to (and get to!) create our own instances.

For example, you might have noticed that we didn’t bother with adding Eq to the deriving clause for our BST type. Thus, we can’t compare two BSTs for equality (!)

*Main> Empty == Empty 

<interactive>:1:0:
No instance for (Eq (BST k v))
arising from a use of `==' at <interactive>:1:0-13
Possible fix: add an instance declaration for (Eq (BST k v))
In the expression: Empty == Empty
In the definition of `it': it = Empty == Empty

Suppose we had added

data BST k v = Empty
             | Bind k v (BST k v) (BST k v)
             deriving (Eq, Show)

Now, we can compare two BST values

ghci> Empty == Empty 
True

But this equality test is too structural: it asks whether the two trees are exactly the same, rather than what we might want, which is whether the two underlying maps are the same. Consequently we get

ghci> t == ofList (toList t)
False

Ugh! Why did that happen? Well, lets see

ghci> t
Bind "chimichanga" 5.25 (Bind "burrito" 4.5 Empty Empty) (Bind "frijoles" 2.75 Empty Empty)

ghci> ofList (toList t)
Bind "burrito" 4.5 Empty (Bind "chimichanga" 5.25 Empty (Bind "frijoles" 2.75 Empty Empty))

The trees are different because they contain the keys in different (valid!) orders. To get around this, we can explicitly make BST an instance of the Eq typeclass, by implementing the relevant functions for the typeclass.

To understand how, let us look at the full definition of the Eq typeclass. Ah! The typeclass definition also provides default implementations of each operation (in terms of the other operation). Thus, all we need to do is define == and we will get /= (not-equals) for free!

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool

  {- Default Implementations -}

  x == y = not (x /= y)
  x /= y = not (x == y)

Thus, to define our own equality (and disequality) procedures we write

> instance (Eq k, Eq v) => Eq (BST k v) where
>   t1 == t2 = toList t1 == toList t2

The above instance declaration states that if k and v are instances of Eq, that is, if they can be compared for equality, then BST k v can be compared for equality via the given procedure. Thus, once we have supplied the above we get

ghci> t == ofList (toList t)
True

In general, when instantiating a typeclass, Haskell will check that we have provided a minimal implementation containing enough functions from which the remaining functions can be obtained (via their default implementations.)

ghci> t /= Empty
True
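The same idea applies to other classes. For Ord, for instance, defining compare alone is a minimal implementation, and the comparison operators come from the defaults. Here is a sketch; the type Size and the helper rank are hypothetical names used only for illustration.

```haskell
-- Hypothetical type for illustration.
data Size = Small | Medium | Large
  deriving (Eq, Show)

-- Helper mapping each constructor to its position in the intended order.
rank :: Size -> Int
rank Small  = 0
rank Medium = 1
rank Large  = 2

-- Only compare is defined; (<), (<=), (>), (>=), max and min
-- are filled in by the default implementations in the Ord class.
instance Ord Size where
  compare x y = compare (rank x) (rank y)
```

ghci> Small < Large
True

ghci> max Medium Small
Medium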

Laws

In addition to the explicit type requirements, a typeclass also encodes a set of laws that describe the relationships between the different operations. For example, the intention of the Eq typeclass is that the supplied implementations of == and /= satisfy the law

forall t1 t2, t1 == t2 <==> not (t1 /= t2)

Unfortunately, there is no way for Haskell to verify that your implementations satisfy the laws, so this is something to be extra careful about, when using typeclasses.
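While the compiler cannot prove a law, we can at least test it on sample values. Here is a rough sketch; eqLawHolds and the deliberately broken type Broken are our own hypothetical names, and passing such a test is evidence, not a proof.

```haskell
-- Hypothetical checker: tests the Eq law on every pair drawn from a
-- finite list of samples.
eqLawHolds :: Eq a => [a] -> Bool
eqLawHolds xs = and [ (x == y) == not (x /= y) | x <- xs, y <- xs ]

-- A deliberately broken instance: (/=) ignores (==) and always says False.
newtype Broken = Broken Int

instance Eq Broken where
  Broken a == Broken b = a == b
  _ /= _               = False
```

ghci> eqLawHolds "abc"
True

ghci> eqLawHolds [Broken 1, Broken 2]
False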

Creating Typeclasses

It turns out that typeclasses are useful for many different things. We will see some of those over the next few lectures, but let us conclude today’s class with a quick example that provides a (very) small taste of their capabilities.

JSON

JavaScript Object Notation or JSON is a simple format for transferring data around. Here is an example:

{ "name"    : "Ranjit"
, "age"     : 33
, "likes"   : ["guacamole", "coffee", "bacon"]
, "hates"   : ["waiting", "grapefruit"]
, "lunches" : [ {"day" : "monday",    "loc" : "zanzibar"}
              , {"day" : "tuesday",   "loc" : "farmers market"}
              , {"day" : "wednesday", "loc" : "hare krishna"}
              , {"day" : "thursday",  "loc" : "faculty club"}
              , {"day" : "friday",    "loc" : "coffee cart"} ]
}

In brief, each JSON object is either

  - a base value like a string, a number or a boolean,
  - an (ordered) array of objects, or
  - a set of string-object pairs.

Thus, we can encode (a subset of) JSON values with the datatype

> data JVal = JStr String
>           | JNum Double
>           | JBln Bool
>           | JObj [(String, JVal)]
>           | JArr [JVal]
>           deriving (Eq, Ord, Show)

Thus, the above JSON value would be represented by the JVal

> js1 =
>   JObj [("name", JStr "Ranjit")
>        ,("age", JNum 33)
>        ,("likes", JArr [ JStr "guacamole", JStr "coffee", JStr "bacon"])
>        ,("hates", JArr [ JStr "waiting" , JStr "grapefruit"])
>        ,("lunches", JArr [ JObj [("day", JStr "monday")
>                                 ,("loc", JStr "zanzibar")]
>                          , JObj [("day", JStr "tuesday")
>                                 ,("loc", JStr "farmers market")]
>                          , JObj [("day", JStr "wednesday")
>                                 ,("loc", JStr "hare krishna")]
>                          , JObj [("day", JStr "thursday")
>                                 ,("loc", JStr "faculty club")]
>                          , JObj [("day", JStr "friday")
>                                 ,("loc", JStr "coffee cart")]
>                          ])
>        ]

Serializing Haskell Values to JSON

Next, suppose that we want to write a small library to serialize Haskell values as JSON. We could write a bunch of functions like

> doubleToJSON :: Double -> JVal 
> doubleToJSON = JNum

similarly, we have

> stringToJSON :: String -> JVal
> stringToJSON = JStr
>
> boolToJSON :: Bool -> JVal
> boolToJSON = JBln

But what about collections, namely objects and arrays? We might try

> doublesToJSON :: [Double] -> JVal
> doublesToJSON = JArr . map doubleToJSON
>
> boolsToJSON :: [Bool] -> JVal
> boolsToJSON = JArr . map boolToJSON
>
> stringsToJSON :: [String] -> JVal
> stringsToJSON = JArr . map stringToJSON

which of course, you could abstract by making the individual-element-converter a parameter

> xsToJSON :: (a -> JVal) -> [a] -> JVal
> xsToJSON f = JArr . map f
>
> xysToJSON :: (a -> JVal) -> [(String, a)] -> JVal
> xysToJSON f = JObj . map (second f)

but still, this is getting rather tedious, since we have to redefine versions for each Haskell type, and instantiate them by hand for each conversion

ghci> doubleToJSON 4
JNum 4.0

ghci> xsToJSON stringToJSON ["coffee", "guacamole", "bacon"]
JArr [JStr "coffee",JStr "guacamole",JStr "bacon"]

ghci> xysToJSON stringToJSON [("day", "monday"), ("loc", "zanzibar")]
JObj [("day",JStr "monday"),("loc",JStr "zanzibar")]

and this gets more hideous when you have richer objects like

> lunches = [ [("day", "monday"),    ("loc", "zanzibar")]
>           , [("day", "tuesday"),   ("loc", "farmers market")]
>           , [("day", "wednesday"), ("loc", "hare krishna")]
>           , [("day", "thursday"),  ("loc", "faculty club")]
>           , [("day", "friday"),    ("loc", "coffee cart")]
>           ]

because we have to go through gymnastics like

ghci> xsToJSON (xysToJSON stringToJSON) lunches
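JArr [JObj [("day",JStr "monday"),("loc",JStr "zanzibar")],JObj [("day",JStr "tuesday"),("loc",JStr "farmers market")],JObj [("day",JStr "wednesday"),("loc",JStr "hare krishna")],JObj [("day",JStr "thursday"),("loc",JStr "faculty club")],JObj [("day",JStr "friday"),("loc",JStr "coffee cart")]]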

Ugh! So much for readability. Isn’t there a better way? Is it too much to ask for a magical toJSON that just works?

Typeclasses To The Rescue

Of course there is a better way, and the route is paved by typeclasses!

Let’s define a typeclass that describes any type that can be converted to JSON.

> class JSON a where
>   toJSON :: a -> JVal

Easy enough. Now, we can make all the above instances of JSON like so

> instance JSON Double where
>   toJSON = JNum
>
> instance JSON Bool where
>   toJSON = JBln
>
> instance JSON String where
>   toJSON = JStr

Now, we can just say

ghci> toJSON 4
JNum 4.0

ghci> toJSON True
JBln True

ghci> toJSON "guacamole"
JStr "guacamole"

Bootstrapping Instances

The real fun begins when we get Haskell to automatically bootstrap the above functions to work for lists and association lists!

> instance (JSON a) => JSON [a] where
>   toJSON = JArr . map toJSON

Whoa! The above says, if a is an instance of JSON, that is, if you can convert a to JVal then here’s a generic recipe to convert lists of a values!

ghci> toJSON [True, False, True]
JArr [JBln True,JBln False,JBln True]

ghci> toJSON ["cat", "dog", "Mouse"]
JArr [JStr "cat",JStr "dog",JStr "Mouse"]

ghci> toJSON [["cat", "dog"], ["mouse", "rabbit"]]
JArr [JArr [JStr "cat",JStr "dog"],JArr [JStr "mouse",JStr "rabbit"]]

Of course, we can pull the same trick with key-value lists

> instance (JSON a) => JSON [(String, a)] where
>   toJSON = JObj . map (second toJSON)

after which, we are all set!

ghci> toJSON lunches
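JArr [JObj [("day",JStr "monday"),("loc",JStr "zanzibar")],JObj [("day",JStr "tuesday"),("loc",JStr "farmers market")],JObj [("day",JStr "wednesday"),("loc",JStr "hare krishna")],JObj [("day",JStr "thursday"),("loc",JStr "faculty club")],JObj [("day",JStr "friday"),("loc",JStr "coffee cart")]]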

It is also useful to bootstrap the serialization for tuples (up to some fixed size) so we can easily write “non-uniform” JSON objects where keys are bound to values with different shapes.

> instance (JSON a, JSON b) => JSON ((String, a), (String, b)) where
>   toJSON ((k1, v1), (k2, v2)) =
>     JObj [(k1, toJSON v1), (k2, toJSON v2)]
>
> instance (JSON a, JSON b, JSON c) => JSON ((String, a), (String, b), (String, c)) where
>   toJSON ((k1, v1), (k2, v2), (k3, v3)) =
>     JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3)]
>
> instance (JSON a, JSON b, JSON c, JSON d) => JSON ((String, a), (String, b), (String, c), (String,d)) where
>   toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4)) =
>     JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4)]
>
> instance (JSON a, JSON b, JSON c, JSON d, JSON e) => JSON ((String, a), (String, b), (String, c), (String,d), (String, e)) where
>   toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4), (k5, v5)) =
>     JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4), (k5, toJSON v5)]

Now, we can simply write

> hs = (("name"   , "Ranjit")
>      ,("age"    , 33 :: Double)
>      ,("likes"  , ["guacamole", "coffee", "bacon"])
>      ,("hates"  , ["waiting", "grapefruit"])
>      ,("lunches", lunches)
>      )

which is a Haskell value that describes our running JSON example, and we can convert it directly like so

> js2 = toJSON hs

This value is exactly equal to the old “hand-serialized” JSON object js1.

ghci> js1 == js2
True

Exercise

Why did we have to write a type annotation 33 :: Double in the above example? Can you figure out a way to remove it?

To wrap everything up, let’s write a routine to serialize our BST maps.

> instance (JSON v) => JSON (BST String v) where
>   toJSON = JObj . map (second toJSON) . toList

Now let’s make up a complex Haskell value with an embedded BST.

> hs' = (("name"    , "el gordo taqueria")
>       ,("address" , "213 Delicious Street")
>       ,("menu"    , t))

and presto! our serializer just works

ghci> t
Bind "chimichanga" 5.25
(Bind "burrito" 4.5 Empty Empty)
(Bind "frijoles" 2.75 Empty Empty)

ghci> :type hs'
hs' :: (([Char], [Char]),
([Char], [Char]),
([Char], BST [Char] Double))

ghci> toJSON hs'
JObj [("name", JStr "el gordo taqueria")
,("address", JStr "213 Delicious Street")
,("menu", JObj [("burrito", JNum 4.5)
,("chimichanga", JNum 5.25)
,("frijoles",JNum 2.75)])]
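So far we have only produced JVal values, not actual JSON text. As a rough sketch of that last step, here is one way a JVal might be rendered to a string. The function render is our own hypothetical name, and quoting strings with Haskell’s show is an assumption that matches JSON for plain ASCII text but is not a complete JSON string encoder.

```haskell
import Data.List (intercalate)

-- JVal is repeated from the text so that the sketch stands alone.
data JVal = JStr String
          | JNum Double
          | JBln Bool
          | JObj [(String, JVal)]
          | JArr [JVal]
          deriving (Eq, Ord, Show)

-- Hypothetical renderer. Strings and keys are quoted with show (an
-- approximation of JSON escaping); numbers use show for Double.
render :: JVal -> String
render (JStr s)   = show s
render (JNum n)   = show n
render (JBln b)   = if b then "true" else "false"
render (JArr vs)  = "[" ++ intercalate "," (map render vs) ++ "]"
render (JObj kvs) = "{" ++ intercalate "," (map pair kvs) ++ "}"
  where pair (k, v) = show k ++ ":" ++ render v
```

ghci> putStrLn (render (JObj [("burrito", JNum 4.5)]))
{"burrito":4.5}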

That’s it for today. We will see much more typeclass awesomeness in the next few lectures…

Exercise

Why does Haskell reject the following expression?

(1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1) == (1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1)

Exercise

What are the mysterious incantations in the LANGUAGE block at the top of the file? Can you figure out why they are needed?