January 2015 – Colour Coding

span.kw { color: #007020; font-weight: bold; }
code > span.dt { color: #902000; }
code > span.dv { color: #40a070; }
code > span.bn { color: #40a070; }
code > span.fl { color: #40a070; }
code > span.ch { color: #4070a0; }
code > span.st { color: #4070a0; }
code > span.co { color: #60a0b0; font-style: italic; }
code > span.ot { color: #007020; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #06287e; }
code > span.er { color: #ff0000; font-weight: bold; }
]]>

This is my attempt at a solution to the fox/goose/corn problem in Haskell. It was inspired by Carin Meier’s Clojure Kata for the same problem, although it deviates from the approach. A better Haskell developer might significantly improve on my version. I didn’t find much use for the standard typeclasses in this, sadly. As a consequence, however, the code is relatively understandable from the perspective of a Clojure programmer with no Haskell experience.

I’ll explain each construct as we encounter it.

Preliminaries

First, we have the namespace declaration. Unlike Clojure, we need to declare any identifiers we export. Since we’re writing an executable, we export main just as we would in C.

module Main (main) where

Data.Set exports a lot of things with the same names that Data.List exports, so it’s pretty common to import it qualified. It’s not strictly necessary for the code that follows, though.

import qualified Data.Set as S
import Data.List(unfoldr)
import Data.Foldable(Foldable,foldr,find)

The equivalent of clojure.core is the Prelude. We hide Left and Right because we’ll be using our own concept using those identifiers. We hide foldr because the version in Data.Foldable is more general.

import Prelude hiding (Left, Right, foldr)

The Haskell Prelude is actually kind of frustrating, in that it doesn’t show off the language in its full power. It’s heading in that direction, though. In particular, this particular problem is getting addressed soon. Some people opt out of the Prelude altogether and use an alternative.

Basic Data Types

We’re writing Haskell, so we should write some types down.

You’ll recognize the following declarations as being identical to Java enums. Ord means it’s orderable, which in turn means you can put it in a set (hash sets aren’t the default in Haskell), Eq means you can test for equality using it, which comes along for the ride with Ord, Show means you can print it. Haskell magically writes the code in deriving.

data Item = Fox | Goose | Corn | Me deriving (Ord, Eq, Show)
data Side = Left | Right

We’ll represent everything using only the representation of the right hand side. This has the nice property that the initial state is the empty set. So we’re travelling from the Left to the Right. If we’d used a list, some of the code below would be prettier than using a set, but I believe set is the correct representation since it’s fundamentally unordered. It’s worth considering how it would look in Clojure with Set.

This is a newtype. type would indicate a type alias (so State was exactly the same thing as S.Set Item.) A newtype can’t be mixed with a raw set (which is what a Clojure programmer would naturally do) and requires you to explicitly construct and deconstruct it from the set as necessary. This obviously has a cost in verbosity, but has no runtime overhead because it’s all optimised out. It’s especially useful if you’re dealing with two concepts wih the same type representation. In our case, State and History (defined later) could be very easily confused in Clojure.

newtype State = State (S.Set Item) deriving (Ord, Eq, Show)

`State` of Play

We’ll need some way of mapping booleans to Left/Right. We’re adopting a convention that Left = True here, and we’ve named the function to help keep this straight. Note that we have two definitions. Each definition is a pattern match on the right hand side. Basically, you need this for two things: identifying the side you’re on, and the side you’re not on, so the Bool -> Side mapping makes sense.

toRight :: Bool -> Side
toRight True = Right
toRight False = Left

Now let’s figure out which side we’re on. Here we destructure State for the first time.

onRight :: State -> Bool
onRight (State s) = S.member Me s

We also need a function that tells you what is on which side.

\ means “difference”. Since Data.Set is namespace qualified, so is the operator.
Sadly there’s no general type that subsumes sets and lists so there’s a List.\ and a Set.\ and they don’t interoperate well

Coming up with a good type system for lists and list like things is regarded as an open problem in the Haskell world and they’re not prepared to take the kinds of compromises Clojure and Scala have made. (Consider, for instance, that mapping a set returns a list.) However, in practice that means that using different types of lists together or writing general list-like code is a pain I could have introduced my own abstraction, but seriously, what’s the point?

Again, we have two definitions. This is the first time we use a where clause. A where clause is similar to a postfix let clause. Note that we don’t need type declarations for non-top-level declarations.

Also, this is an arity-2 function. Only there’s no such thing in Haskell. Haskell, like most FP languages (and unlike Clojure) only ever has functions that take one parameter and return one. So what you’re really looking at here is a function that takes a Side and returns another function which takes a State that then returns a set of items. If you just don’t apply enough parameters, you get the partial application of the function. I’ve long since been an advocate of programming Clojure like this ever since I spent a couple of hours in F#’s company.

side :: Side -> State -> S.Set Item
side Left (State s) = everyone S.\ s
  where everyone = S.fromList [Fox,Goose,Corn,Me]
side Right (State s) = s

The whole reason we’ve defined the operations above is this: after this point we’ll never destructure State again, just interact with the State functions we’ve already defined. The hope is that this enables us to think at a higher level about what we’re doing. (I’m not going to argue this point, but there’s plenty of people on the internet prepared to do so.)

Haskell! So We Can Be Safe!

Let’s figure out if a State is safe. Turns out the rules for whether or not you’re safe are pretty easy

The attended side is always safe
The unattended side is safe if only the Goose is there
No other unattended Goose is safe
Every other unattended side is safe

We’re using more Haskell features here.

. performs functional composition, so (toRight . not . onRight) is equivalent to (comp toRight not onRight).
We can have multiple definitions in a where clause.
We can call a variable _ if we don’t care about its value.
You can put a “guard” on a pattern match. I prefer to use guards and pattern matching over explicit branching primitives.
a $ b c d means the same as a (b c d). This prevents ridiculous paren buildup. Clojure has different ways of avoiding this, most obviously ->.

safe :: State -> Bool
safe s = safeSide $ side unattendedSide s
  where unattendedSide = (toRight . not . onRight) s
        safeSide l | S.member Goose l = S.size l == 1
        safeSide _ = True

In practice, the side function is only used within safe so we could have just stuck it into the where clause and saved some newtype book-keeping.

Moving the Boat

I’m not 100% happy with the readability of this next function, mostly because it’s quite long. Suggestions are welcome.

We need to find the next possible states. We’re mapping to set, because there’s no inherent ordering of the future states. You can do the same in Clojure. Unlike clojure, we need a separate map function, S.map rather than map. The good news is that it returns a set rather than a lazy list.

There is a general map function, fmap that will map anything to its correct container type (and more!) but we can’t use fmap here for technical reasons (for the curious, lookup: “Set is not a Functor in Haskell”).

Also, note that this is where we finally actually create a new State, and that we can just use State, the constructor, as a straight function that we can map over.

transitions :: State -> S.Set State
transitions state = S.map State $ S.insert moveBoat carry
  where onRight = S.member Me $ side Right state
        mySide = side (toRight onRight) state

The move command is either a delete or an insert, depending on the direction of travel. In Clojure this would be (if onRight dissoc assoc)

       move = if onRight
         then S.delete
         else S.insert

The list of items is the things that are on your side that aren’t you.

       items = S.delete Me mySide

Effectively, this next line just destructures State.

       right = side Right state

Whatever else happens, you’re definitely moving youself Note that moveBoat is the State represented by just moving yourself.

       moveBoat = move Me right

If you choose to move an item, it’s a motion on top of moveBoat, not on top of s, since you’re also moving.

We’re using flip, which swaps the parameters of move. We could also have said moveItem x = move x moveBoat or something with lambdas (IMO, lambdas are rarely the most clear option, and in this code they’re never used.) Although you could write flip in Clojure, it really isn’t Clojure “style”, but definitely is Haskell style

       moveItem = flip move moveBoat

carry is the set of states if you carry an item with you

       carry = S.map moveItem items

There’s a huge number of different types in the preceding function, and no type declarations other than the top level. You can put more type declarations in that I have, but you can’t put in fewer and compile with -Wall (If you’re OK with warnings, you can throw away the top level type declarations some of the time, but there’s a lot of reasons that’s a bad idea.)

Desperately Seeking Solution

We’ll ignore the State type completely for a while and just talk in general about how you solve this kind of problem.

We need to think about how to represent a sequence of moves. Here we newtype List (List here is a good choice of type, since history is fundamentally ordered). History is stored backwards for convenience.

newtype History a = History [a] deriving (Eq, Ord)

Let’s make history print the right way around though. To do this, we need to implement the Show typeclass by hand. (Typeclasses are a bit like interfaces, but behave very differently.)

=> is the first example in the code of a type restriction. Here we’re saying “If a is showable, then History of a is showable.” Then the implementation says “The way you show it is by taking the list, reversing it and then showing that.”

instance (Show a) => Show (History a) where
  show (History l) = show $ reverse l

How are we’re going to find the solution? You want to use your transition function to construct a set of possible histories and then search that list for a solution. You could basically do this as a breadth-first or a depth-first search. A breadth-first search will have the advantage of finding the minimal solution. To avoid wasting time on cycles such as the boat just going backwards and forwards, we’ll keep track of what positions we’ve already generated.

So, how to we go from all combinations of 2 moves to all combinations of 3 moves? We define a data structure, Generation.

data Generation a = Generation {
  previous :: S.Set a,
  states :: S.Set (History a)
}

In practice, we know that a will be State, but it’s generally good Haskell style to use the most general type possible. When you get the hang of it, this aids clarity, rather than impeding it. (See also: parametricity and theorems for free).

Generation is a record data type. Like Clojure, you can use previous and states as accessor functions. Unlike Clojure, these functions are strongly typed. That means you can’t have fields with the same name in different records (within the same file/namespace).

Working with Generations would be better if we used lenses, but lets stick to things in the base libraries.

Implementing the Search

We need to map the function that generates new states to a function that creates new Generations. In Clojure, we’d probably use reduce. In Haskell, we use foldr, which is pretty similar, modulo some laziness and argument order differences.

(a -> S.Set a) is a parameter that is a function.
We’re specifying that a implements Ord, which we need to be able to put it into a Set.
Due to the wonders of partial application, (a -> S.Set a) -> Generation a -> Generation a is exactly the same as (a -> S.Set a) -> (Generation a -> Generation a)

liftG :: (Ord a) => (a -> S.Set a) -> Generation a -> Generation a
liftG f t = foldr (stepG f) initial (states t)
  where initial = Generation {
          previous = (previous t),
          states = S.empty
        }

Actually, I’ve skipped the most important bit of this: the step function. I could have inlined it, but it’s pretty complex I prefer to give it its own top level declaration, along with a semi-scary type signature.

stepG :: (Ord a) => (a -> S.Set a) -> History a -> Generation a -> Generation a
stepG f (History h@(s : _)) t = result

The destructuring of History is a bit more complicated. Here we’re assigning h to the whole history, and s to the latest state in the history. Note that if History is empty, the pattern match won’t work. Clojure would just match it and put nil in s. Type safety is pretty cool here but it means we need a new pattern match for empty histories. Strictly speaking, they aren’t valid, but the way we defined the type they can happen. (If you’re seriously thinking you want a type system that can express “non-empty list” I have two answers for you: core.typed and Idris.) This is the point at which Haskell goes “Well, I’m trying to be a practical FP language, you know.”

  where result = Generation {

Add the new states into the list of known states.

          previous = S.union (previous t) nextStates,

Add the new histories into the current generation.

          states = S.union (states t) (S.map newHistory nextStates)
        }

The next states are the states of the transition function minus the known states.

        nextStates = f s S.\ (previous t)

The newHistory function is interesting. Observe (: h). Now (x : xs) is the same as (cons x xs) in Clojure. (x :) would be (partial cons x) and (: xs) would be #(cons % xs). So (: h) is a function that takes a t and puts it in front of the existing list. This is operator section and works for all operators (you can define your own) except (- x) (which is special cased to unary minus).

Again, History is just an ordinary function, that wouldn’t have been needed if we’d done types instead of newtypes.

        newHistory = History . (: h)

Finally, to avoid compiler warnings, tell it what happens when History is empty. This case should never happen.

stepG _ _ t = Generation { previous = previous t, states = S.empty }

The Under-Appreciated Unfold

So, now we’ve got a Generation to Generation function, how do we get the list of all possible histories? Well, we could always just write some recursive code, but like in Clojure, there’s functions that exemplify common recursion structures. In Clojure, iterate might be good choice here. In Haskell, there’s unfoldr.

The type declaration of iterate in Clojure would be iterate :: (a -> a) -> a -> [a].

In comparison, the type declaration of unfoldr is quite complex: unfoldr :: (b -> Maybe (a, b)) -> b -> [a].

You might be wondering why they’re so different. The short answer is that unfoldr is awesome. The key is the step function itself b -> Maybe (a,b). This says that it takes a b and returns either Nothing (nil) or Just a pair of a and b. (Did I mention one of the coolest things about Haskell? null/nil doesn’t exist.) The b gets passed to the next step, the a gets output. So unfoldr supports having an internal state and an external state. What happens if Nothing is returned? The list stops generating. Clojure expects you to then terminate the list in a separate step, an approach that seems simpler but falls down when you start to use things like the state monad.

So, our output a is going to be the set of states of the generation, while b is going to be the Generations themselves. We’ll return Nothing when there’s no states in the Generation.

iterations :: (Ord a) => a -> (a -> S.Set a) -> [S.Set (History a)]
iterations start f = unfoldr (forUnfoldr . (liftG f)) initial
  where forUnfoldr t | S.null (states t) = Nothing
        forUnfoldr t = Just ((states t),t)
        initial = Generation {
          previous = S.empty,
          states = S.singleton $ History [start]
        }

So we just call unfoldr with a generation producing function using forUnfoldr to adapt it to fit.

We’ve done this using unfoldr, which has explicit state. Control.Monad.Loops exposes unfoldM which could be used with a state monad to achieve a similar effect.

Fun with Types

Let’s have some fun. We’ve got a list of sets that contains the solution. There’s a perfectly good function for finding an element in a a list called find (as an aside: there’s no such perfectly reasonable function in Clojure). Small catch: it takes a Foldable (in Clojure, a reducable). List is Foldable, Set is Foldable, but a list of sets of states iterates through the sets, not the states.

We’ll do some type magic and make it iterate through the states. (Thanks to Tony Morris for pointing me to a way to achieve this. Much more brain-bending stuff is available in Control.Compose)

newtype Compose g f a = O (g (f a))

instance (Foldable f1, Foldable f2) => Foldable (Compose f1 f2) where
  foldr f start (O list) = foldr g start list
    where g = flip $ foldr f

So, here we’ve said that a foldable of a foldable of a can be used as a single foldable by using flip $ foldr f as the step function. We could have just written this function out, but hey, why not live a litte.

The Finish Line

Finally, we get to main. Often this is expressed in do notation, but I don’t feel the need here, since it’s literally one line: print solution.

main :: IO ()
main = print solution
  where solution = find success (O allHistories)
        success (History (s : _)) = side Left s == S.empty
        success _ = False
        allHistories = iterations allOnLeft next
        allOnLeft = State S.empty
        next = S.filter safe . transitions

So, you can build it, and run it. time reports that it takes 2ms on my machine. How on earth did it run so fast? Aren’t fully lazy functional languages meant to be slow? Well, there are advantages to running an optimizing compiler, but they’re helped by understanding a bit of what is going on under the hood. An unfold followed by a fold is called a hylomorphism. The thing is, you never need to build the whole structure, you could just run each iteration through the fold as it comes. The Haskell compiler is smart enough that it actually rewrites the code. So a large chunk of our code is actually running imperatively.

How much have types helped me write this code? Well, the early functions, especially safe, I needed to nail in GHCi, the Haskell REPL. On the other hand, the later parts of the code actually worked first time (after I’d managed to fix all of the type errors.). Make of that what you will.

I hope you’ve found this interesting. I’m still very much a beginner Haskell programmer, but I hope the presentation enables you to see how you can express ideas in Haskell. If you’d like to learn more, I can highly recommend starting with Brent Yorgey’s course.

Month: January 2015

Fox Goose Corn in Haskell for Clojure Programmers

Preliminaries

Basic Data Types

`State` of Play

Haskell! So We Can Be Safe!

Moving the Boat

Desperately Seeking Solution

Implementing the Search

The Under-Appreciated Unfold

Fun with Types

The Finish Line

Preliminaries

Basic Data Types

State of Play

Haskell! So We Can Be Safe!

Moving the Boat

Desperately Seeking Solution

Implementing the Search

The Under-Appreciated Unfold

Fun with Types

The Finish Line

`State` of Play