> {-# LANGUAGE OverlappingInstances, FlexibleInstances, TypeSynonymInstances #-}
> import Control.Arrow
We have already seen that the +
operator works for a bunch of different underlying data types. For example
ghci> 2 + 3
5
ghci> :type it
it :: Integer
ghci> 2.9 + 3.5
6.4
ghci> :type it
it :: Double
Similarly we can compare all sorts of values
ghci> 2 == 3
False
ghci> [2.9, 3.5] == [2.9, 3.5]
True
“So?”, I can hear you shrug.
Indeed, this is quite unremarkable, since languages since the dawn of time has supported some form of operator “overloading” to support this kind of ad—hoc polymorphism.
However, in Haskell, there is no caste system. There is no distinction between operators and functions. All are first class citizens in Haskell.
Well then, what type do we give to functions like +
and ==
? Something like
(+) :: Integer -> Integer -> Integer
would be too anemic, since we want to add two doubles as well! Can type variables help?
(+) :: a -> a -> a
Nope. Thats a bit too aggressive, since it doesn’t make sense, to add two functions to each other! Haskell solves this problem with an insanely slick mechanism called typeclasses, introduced by Wadler and Blott.
BTW: The paper is one of the best examples of academic writing I have seen. The next time you hear a curmudgeon say all the best CS was done in the 60s, just point them to the above.
To see the right type, lets just (politely, always) ask
ghci> :type (+)
(+) :: (Num a) => a -> a -> a
We call the above a qualified type. Read it as, +
takes in two a
values and returns an a
value for any type a
that is a Num
or is an instance of Num
.
The name Num
can be thought of as a predicate over types. Some types satisfy the Num
predicate. Examples include Integer
, Double
etc, and any values of those types can be passed to +
. Other types do not satisfy the predicate. Examples include Char
, String
, functions etc, and so values of those types cannot be passed to +
.
ghci> 'a' + 'b'
<interactive>:1:0:
No instance for (Num Char)
arising from a use of `+' at <interactive>:1:0-8
Possible fix: add an instance declaration for (Num Char)
In the expression: 'a' + 'b'
In the definition of `it': it = 'a' + 'b'
As promised, now these kinds of error messages should make sense. Basically Haskell is complaining that a
and b
are of type Char
which is not an instance of Num
.
In
a nutshell, a typeclass is a collection of operations (functions) that
must exist for the underlying type. For example, lets look at possibly
the simplest typeclass Eq
class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
That is, a type a
can be an instance of Eq
as long as there are two functions that determine if two a
values are respectively equal or disequal. Similarly, the typeclass Show
captures the requirements that make a particular datatype be viewable,
class Show a where
show :: a -> String
Indeed, we can test this on different (built-in) types
ghci> show 2
"2"
ghci> show 3.14
"3.14"
ghci> show (1, "two", ([],[],[]))
"(1,\"two\",([],[],[]))"
When we type an expression into ghci, it computes the value and then calls show
on the result. Thus, if we create a new type by
> data Unshowable = A | B | C
then we can create values of the type,
ghci> let x = A
ghci> :type x
x :: Unshowable
but can’t view or compare them
ghci> x
<interactive>:1:0:
No instance for (Show Unshowable)
arising from a use of `print' at <interactive>:1:0
Possible fix: add an instance declaration for (Show Unshowable)
In a stmt of a 'do' expression: print it
ghci> x == x
<interactive>:1:0:
No instance for (Eq Unshowable)
arising from a use of `==' at <interactive>:1:0-5
Possible fix: add an instance declaration for (Eq Unshowable)
In the expression: x == x
In the definition of `it': it = x == x
Again, the previously incomprehensible type error message should make sense to you.
Of course, this is lame; we should be able to compare and view them. To allow this, Haskell allows us automatically derive functions for certain key type classes, namely those in the standard library.
To do so, we simply dress up the data type definition with
> data Showable = A' | B' | C' deriving (Eq, Show)
and now we have
ghci> let x' = A'
ghci> :type x'
x' :: Showable
ghci> x'
A'
ghci> x' == x'
True
Let us now peruse the definition of the Num
typeclass.
ghci> :info Num
class (Eq a, Show a) => Num a where
(+) :: a -> a -> a
(*) :: a -> a -> a
(-) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
There’s quite a bit going on there. A type a
can only be deemed an instance of Num
if
Eq
and Show
, andIn other words in addition to the “arithmetic” operations, we can compare two Num
values and we can view them (as a String
.)
Haskell comes equipped with a rich set of built-in classes.
In the above picture, there is an edge from Eq
and Show
to Num
because for something to be a Num
it must also be an Eq
and Show
. There are a few other ones that we will come to know (and love!) in due course…
Lets now see how slickly typeclasses integrate with the rest of Haskell’s type system by building a small library for Maps (aka associative arrays, lookup tables etc.)
> data BST k v = Empty
> | Bind k v (BST k v) (BST k v)
> deriving (Show)
We will call this type BST
to abbreviate Binary Search Tree which are trees where keys are ordered such that at each node, the keys appearing in the left and right subtrees are respectively smaller and larger than than the key at the node. For example, this is what a tree that maps the strings "burrito"
, "chimichanga"
and "frijoles"
to their prices might look like
We must ensure that the invariant is preserved by the insert
function. In the functional setting, the insert
will return a brand new tree.
> insert k v (Bind k' v' l r)
> | k == k' = Bind k v l r
> | k < k' = Bind k' v' (insert k v l) r
> | otherwise = Bind k' v' l (insert k v r)
> insert k v Empty = Bind k v Empty Empty
The organization of the BST allows us to efficiently search the tree for a key.
> find k (Bind k' v' l r)
> | k == k' = Just v'
> | k < k' = find k l
> | otherwise = find k r
> find k Empty = Nothing
The BST ordering obviates the need for any backtracking. If additionally if the tree is kept balanced we ensure very efficient searching.
Now, we can create a particular lookup table like so
> t0 = insert "burrito" 4.50 Empty
> t1 = insert "chimichanga" 5.25 t0
> t2 = insert "frijoles" 2.75 t1
NOTE: Each insert
returns a brand new BST
, this is not Java!
Of course this is a bit tedious, so it may be easier to write an ofList
function that will turn an association list into an appropriate BST
.
> ofList = foldl (\t (k, v) -> insert k v t) Empty
Now, we can just do
> t = ofList [("chimichanga", 5.25)
> ,("burrito" , 4.50)
> ,("frijoles" , 2.75)]
After which we can query the table
ghci> :type t
t :: BST [Char] Double
ghci> find "burrito" t
Just 4.5
ghci> find "birria" t
Nothing
Similarly, it makes sense to implement a toList
which will convert the map into an association list. First, lets write a generic foldBST
> foldBST f b Empty = b
> foldBST f b (Bind k v l r) = f k v (foldBST f b l) (foldBST f b r)
after which toList
is simply (but rather inefficiently!)
> toList = foldBST (\k v l r -> l ++ [(k, v)] ++ r) []
ghci> toList t
[("burrito", 4.50), ("chimichanga", 5.25), ("frijoles", 2.75)]
Why is the output sorted (by key) ?
Notice that we didn’t write down the types of any of the functions. Lets see what the types are
ghci> :type insert
insert :: (Ord a) => a -> v -> BST a v -> BST a v
ghci> :type find
find :: (Ord a) => a -> BST a t -> Maybe t
ghci> :type ofList
insert :: (Ord a) => a -> v -> BST a v -> BST a v
Whoa! Look at that, Haskell tells us that we can use any a
value as a key as long as the value is an instance of the Ord
typeclass. You might guess from the name, that a type is an instance of Ord
if there are functions that allow us to compare values of the type. In particular
ghci> :info Ord
class (Eq a) => Ord a where
compare :: a -> a -> Ordering
(<) :: a -> a -> Bool
(>=) :: a -> a -> Bool
(>) :: a -> a -> Bool
(<=) :: a -> a -> Bool
max :: a -> a -> a
min :: a -> a -> a
ghci> :info Ordering
data Ordering = LT | EQ | GT -- Defined in GHC.Ordering
How, did the engine figure this out? Easy enough, if you look at the body of the insert
and find
functions, you’ll see that we compare two key values.
Write a delete
function of type
delete :: (Ord k) => k -> BST k v -> BST k v
While Haskell is pretty good about inferring types in general, there are cases when the use of type classes requires explicit annotations (which change the behavior of the code.)
For example, Read
is a built-in typeclass, where any instance a
of Read
has a function
read :: (Read a) => String -> a
which can parse a string and turn it into an a
. Thus, Read
is, in a sense, the opposite of Show
. However, if we do
ghci> read "2"
Haskell is foxed, because it doesn’t know what to convert the string to! Did we want an Int
or a Double
? Or maybe something else altogether. Thus, we get back the complaint
interactive>:1:0:
Ambiguous type variable `a' in the constraint:
`Read a' arising from a use of `read' at <interactive>:1:0-9
Probable fix: add a type signature that fixes these type variable(s)
which clearly states what the issue is. Thus, here an explicit type annotation is needed to tell it what to convert the string to. Thus, if we play nice and add the types we get
ghci> (read "2") :: Int
2
ghci> (read "2") :: Float
2.0
Note the different results due to the different types.
So far we have seen Haskell’s nifty support for overloading by observing that
However, in many situations the automatic instantiation doesn’t quite cut it, and instead we need to (and get to!) create our own instances.
For example, you might have noticed that we didn’t bother with adding Eq
to the deriving clause for our BST
type. Thus, we can’t compare two BST
s for equality (!)
*Main> Empty == Empty
<interactive>:1:0:
No instance for (Eq (BST k v))
arising from a use of `==' at <interactive>:1:0-13
Possible fix: add an instance declaration for (Eq (BST k v))
In the expression: Empty == Empty
In the definition of `it': it = Empty == Empty
Suppose we had added
data BST k v = Empty
| Bind k v (BST k v) (BST k v)
deriving (Eq, Show)
Now, we can compare two BST
values
ghci> Empty == Empty
True
But the equality test is rather too structural, as in, are the two trees exactly the same, rather than what we might want, which is, are the two underlying maps exactly the same. Consequently we get
ghci> t == ofList (toList t)
False
Ugh! Why did that happen? Well, lets see
ghci> t
Bind "chimichanga" 5.25 (Bind "burrito" 4.5 Empty Empty) (Bind "frijoles" 2.75 Empty Empty)
ghci> ofList (toList t)
Bind "burrito" 4.5 Empty (Bind "chimichanga" 5.25 Empty (Bind "frijoles" 2.75 Empty Empty))
The trees are different because they contain the keys in different (valid!) orders. To get around this, we can explicitly make BST
an instance of the Eq
typeclass, by implementing the relevant functions for the typeclass.
To undertand how, let us look at the full definition of the Eq
typeclass. Ah! the typeclass definition also provides default implementations of each operation (in terms of the other operation.) Thus, all we need to do is define ==
and we will get /=
(not-equals) for free!
class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
{- Default Implementations -}
x == y = not (x /= y)
x /= y = not (x == y)
Thus, to define our own equality (and disequality) procedures we write
> instance (Eq k, Eq v) => Eq (BST k v) where
> t1 == t2 = toList t1 == toList t2
The above instance declaration states that if k
and v
are instances of Eq
, that is can be compared for equality, then BST k v
can be compared for equality, via the given procedure. Thus, once we have supplied the above we get
ghci> t == ofList (toList t)
True
In general, when instantiating a typeclass, Haskell will check that we have provided a minimal implementation containing enough functions from which the remaining functions can be obtained (via their default implementations.)
ghci> t /= Empty
True
In addition to the explicit type requirements, a typeclass also encodes a set of laws that describe the relationships between the different operations. For example, the intention of the Eq
typeclass is that the supplied implementations of ==
and /=
satisfy the law
forall t1 t2, t1 == t2 <==> not t1 /= t2
Unfortunately, there is no way for Haskell to verify that your implementations satisfy the laws, so this is something to be extra careful about, when using typeclasses.
It turns out that typeclasses are useful for many different things. We will see some of those over the next few lectures, but let us conclude today’s class with a quick example that provides a (very) small taste of their capabilities.
JavaScript Object Notation or JSON is a simple format for transferring data around. Here is an example:
{ "name" : "Ranjit"
, "age" : 33
, "likes" : ["guacamole", "coffee", "bacon"]
, "hates" : [ "waiting" , "grapefruit"]
, "lunches" : [ {"day" : "monday", "loc" : "zanzibar"}
, {"day" : "tuesday", "loc" : "farmers market"}
, {"day" : "wednesday", "loc" : "harekrishna"}
, {"day" : "thursday", "loc" : "faculty club"}
, {"day" : "friday", "loc" : "coffee cart"} ]
}
In brief, each JSON object is either - a base value like a string, a number or a boolean, - an (ordered) array of objects, or - a set of string-object pairs.
Thus, we can encode (a subset of) JSON values with the datatype
> data JVal = JStr String
> | JNum Double
> | JBln Bool
> | JObj [(String, JVal)]
> | JArr [JVal]
> deriving (Eq, Ord, Show)
Thus, the above JSON value would be represented by the JVal
> js1 =
> JObj [("name", JStr "Ranjit")
> ,("age", JNum 33)
> ,("likes", JArr [ JStr "guacamole", JStr "coffee", JStr "bacon"])
> ,("hates", JArr [ JStr "waiting" , JStr "grapefruit"])
> ,("lunches", JArr [ JObj [("day", JStr "monday")
> ,("loc", JStr "zanzibar")]
> , JObj [("day", JStr "tuesday")
> ,("loc", JStr "farmers market")]
> , JObj [("day", JStr "wednesday")
> ,("loc", JStr "hare krishna")]
> , JObj [("day", JStr "thursday")
> ,("loc", JStr "faculty club")]
> , JObj [("day", JStr "friday")
> ,("loc", JStr "coffee cart")]
> ])
> ]
Next, suppose that we want to write a small library to serialize Haskell values as JSON. We could write a bunch of functions like
> doubleToJSON :: Double -> JVal
> doubleToJSON = JNum
similarly, we have
> stringToJSON :: String -> JVal
> stringToJSON = JStr
>
> boolToJSON :: Bool -> JVal
> boolToJSON = JBln
But what about collections, namely objects and arrays? We might try
> doublesToJSON :: [Double] -> JVal
> doublesToJSON = JArr . map doubleToJSON
>
> boolsToJSON :: [Bool] -> JVal
> boolsToJSON = JArr . map boolToJSON
>
> stringsToJSON :: [String] -> JVal
> stringsToJSON = JArr . map stringToJSON
which of course, you could abstract by making the individual-element-converter a parameter
> xsToJSON :: (a -> JVal) -> [a] -> JVal
> xsToJSON f = JArr . map f
>
> xysToJSON :: (a -> JVal) -> [(String, a)] -> JVal
> xysToJSON f = JObj . map (second f)
but still, this is getting rather tedious, since we have to redefine versions for each Haskell type, and instantiate them by hand for each conversion
ghci> doubleToJSON 4
JNum 4.0
ghci> xsToJSON stringToJSON ["coffee", "guacamole", "bacon"]
JArr [JStr "coffee",JStr "guacamole",JStr "bacon"]
ghci> xysToJSON stringToJSON [("day", "monday"), ("loc", "zanzibar")]
JObj [("day",JStr "monday"),("loc",JStr "zanzibar")]
and this gets more hideous when you have richer objects like
> lunches = [ [("day", "monday"), ("loc", "zanzibar")]
> , [("day", "tuesday"), ("loc", "farmers market")]
> , [("day", "wednesday"), ("loc", "hare krishna")]
> , [("day", "thursday"), ("loc", "faculty club")]
> , [("day", "friday"), ("loc", "coffee cart")]
> ]
because we have to go through gymnastics like
ghci> xsToJSON (xysToJSON stringToJSON) lunches
JArr [JObj [("day",JStr "monday"),("loc",JStr "zanzibar")],JObj [("day",JStr "tuesday"),("loc",JStr "farmers market")]]
Ugh! So much for readability. Isn’t there a better way? Is it too much to ask for a magical toJSON
that just works?
Of course there is a better way, and the the route is paved by typeclasses!
Lets define a typeclass that describes any type that can be converted to JSON.
> class JSON a where
> toJSON :: a -> JVal
Easy enough. Now, we can make all the above instances of JSON
like so
> instance JSON Double where
> toJSON = JNum
>
> instance JSON Bool where
> toJSON = JBln
>
> instance JSON String where
> toJSON = JStr
Now, we can just say
ghci> toJSON 4
JNum 4.0
ghci> toJSON True
JBln True
ghci> toJSON "guacamole"
JStr "guacamole"
The real fun begins when we get Haskell to automaticall bootstrap the above functions to work for lists and association lists!
> instance (JSON a) => JSON [a] where
> toJSON = JArr . map toJSON
Whoa! The above says, if a
is an instance of JSON
, that is, if you can convert a
to JVal
then here’s a generic recipe to convert lists of a
values!
ghci> toJSON [True, False, True]
JArr [JBln True,JBln False,JBln True]
ghci> toJSON ["cat", "dog", "Mouse"]
JArr [JStr "cat",JStr "dog",JStr "Mouse"]
ghci> toJSON [["cat", "dog"], ["mouse", "rabbit"]]
JArr [JArr [JStr "cat",JStr "dog"],JArr [JStr "mouse",JStr "rabbit"]]
Of course, we can pull the same trick with key-value lists
> instance (JSON a) => JSON [(String, a)] where
> toJSON = JObj . map (second toJSON)
after which, we are all set!
ghci> toJSON lunches
JArr [JObj [("day",JStr "monday"),("loc",JStr "zanzibar")],JObj [("day",JStr "tuesday"),("loc",JStr "farmers market")]]
It is also useful to bootstrap the serialization for tuples (upto some fixed size) so we can easily write “non-uniform” JSON objects where keys are bound to values with different shapes.
> instance (JSON a, JSON b) => JSON ((String, a), (String, b)) where
> toJSON ((k1, v1), (k2, v2)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2)]
>
> instance (JSON a, JSON b, JSON c) => JSON ((String, a), (String, b), (String, c)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3)]
>
> instance (JSON a, JSON b, JSON c, JSON d) => JSON ((String, a), (String, b), (String, c), (String,d)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4)]
>
> instance (JSON a, JSON b, JSON c, JSON d, JSON e) => JSON ((String, a), (String, b), (String, c), (String,d), (String, e)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4), (k5, v5)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4), (k5, toJSON v5)]
Now, we can simply write
> hs = (("name" , "Ranjit")
> ,("age" , 33 :: Double)
> ,("likes" , ["guacamole", "coffee", "bacon"])
> ,("hates" , ["waiting", "grapefruit"])
> ,("lunches", lunches)
> )
which is a Haskell value that describes our running JSON example, and can convert it directly like so
> js2 = toJSON hs
This value is exactly equal to the old “hand-serialized” JSON object js1
.
ghci> js1 == js2
True
Why did we have to write a type annotation 33 :: Double
in the above example? Can you figure out a way to remove it?
To wrap everything up, lets write a routine to serialize our BST
maps.
> instance (JSON v) => JSON (BST String v) where
> toJSON = JObj . map (second toJSON) . toList
Now lets make up a complex Haskell value with an embedded BST
.
> hs' = (("name" , "el gordo taqueria")
> ,("address" , "213 Delicious Street")
> ,("menu" , t))
and presto! our serializer just works
ghci> t
Bind "chimichanga" 5.25
(Bind "burrito" 4.5 Empty Empty)
(Bind "frijoles" 2.75 Empty Empty)
ghci> :type hs'
hs' :: (([Char], [Char]),
([Char], [Char]),
([Char], BST [Char] Double))
ghci> toJSON hs'
JObj [("name", JStr "el gordo taqueria")
,("address", JStr "213 Delicious Street")
,("menu", JObj [("burrito", JNum 4.5)
,("chimichanga", JNum 5.25)
,("frijoles",JNum 2.75)])]
Thats it for today. We will see much more typeclass awesomeness in the next few lectures…
Why does Haskell reject the following expression?
(1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1) == (1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1)
What are the mysterious incantations in the LANGUAGE
block at the top of the file? Can you figure out why they are needed?