Haskell: More about TypeClasses and QuickCheck

Last time, I was looking into establishing equality on various conditions on Wai.Request, but established that this wasn’t what I was looking for. We did, however, establish how to perform casts and use polymorphic lists in a fashion that’s quite OO. Now I’m planning to drive right off road and try a bit of type-level reasoning.

Let’s start by simplifying the problem we had last time. Let’s stoop worrying about complex record types and just deal with primitive types. We’ll restrict our attention to equality and comparison conditions on those types. Let’s start by setting up some machinery.

data Equality = Eql | NEql deriving (Show, Ord, Eq, Typeable)
equalityAsFunc :: (Eq a) => Equality -> a -> a -> Bool
equalityAsFunc Eql = (==) equalityAsFunc NEql = (/=)
data Comparison = LTh | LThE | GThE | GTh deriving (Show, Eq, Ord, Typeable)
comparisonAsFunc :: (Ord a) => Comparison -> a -> a -> Bool
comparisonAsFunc LTh = (<)
comparisonAsFunc LThE = (<=)
comparisonAsFunc GTh = (>)
comparisonAsFunc GThE = (>=)
class Invertable a where
  invert :: a -> a
instance Invertable Comparison where
  invert LTh = GThE
  invert GTh = LThE
  invert GThE = LTh
  invert LThE = GTh
instance Invertable Equality where
  invert Eql = NEql
  invert NEql = Eql
-- I could make all functors of a invertable invertable, but I'm not sure that would actually be a good idea.
instance (Invertable a) => Invertable (Maybe a) where
  invert = fmap invert
-- Util
infixl 3 <|!> (<|!>) :: Maybe a -> a -> a
(<|!>) = flip fromMaybe

Now, we’re going to have a Condition typeclass, and at the very least we’re going to have instances for “always true/false”, “test for equality/inequality” and “compare against value”. And here’s the important bit: we’re going to want to analyze the relationship between them, even if they’re not the same type.

data ConditionRelationship = Same | Compatible | Incompatible | AImpliesB | BImpliesA deriving (Show, Ord, Eq, Typeable)
instance Invertable ConditionRelationship where
  invert AImpliesB = ImpliesA
  invert BImpliesA = AImpliesB
  invert x = x

Young Rankenstein

Now, how would we achieve this in an OO world? Well, we’d implement something like this:

class Condition0 c where
  analyze = (Condition0 d) => c -> d -> Maybe ConditionRelatioship

Where we’d return Nothing if c didn’t know how to analyze its relationship with d. Using the cast mechanism we’ve already seen, you can definitely implement this. And indeed, I did. (I tried an approach involving a more symmetric approach and some type magic, but ultimately couldn’t get it to fly.)

Let’s revise the definition a little so that we can actually it to test values. But we’re going to have to introduce a second type, v, the value under test.

{-# LANGUAGE RankNTypes #-}
class Condition1 v a where
  analyzeSame :: a -> a -> ConditionRelationship
  analyzeDifferent :: (Condition1 v b) => a -> b -> Maybe ConditionRelationship
  test :: a -> v -> Bool
analyze :: (Condition1 v a, Condition1 v b) => a -> b -> ConditionRelationship
analyze x y = a <|> b <|> c <|!> Compatible
                      -- If we have no idea, say it's compatible
  where a = analyzeSame x <$> (cast y)
        b = analyzeDifferent x y
        c = invert $ analyzeDifferent y x

Now, there’s actually a serious problem with this code: it doesn’t even compile! The problem is with the vs in analyze. It can’t determine that they’re the same. This I actually find weird, given that I’ve specified that a and b share a v, but it’s solvable.

First, I want to talk a bit about what the rank 2 typeclass actually is. It specifies a set of functions that can be called with an a and a v, but doesn’t restrict the a or the v in any way. So, all it’s really giving you is a relationship between the two types. And analyze never uses v, so it can’t deduce anything about it.

Now, there’s an extension called FunctionalDependencies and another called TypeFamilies that’d resolve this, but actually all we need to do is take the test method back out.

class (Eq a, Show a, Typeable a, Invertable a) => Condition2 a where
  analyzeSame2 :: a -> a -> ConditionRelationship
  analyzeDifferent2 :: (Condition2 b) => a -> b -> Maybe ConditionRelationship
class (Condition2 a) => Condition3 v a where
  test :: a -> v -> Bool
analyze2 :: (Condition2 a, Condition2 b) => a -> b -> ConditionRelationship
analyze2 x y = a <|> b <|> c <|!> Compatible -- We have no idea, say it's compatible
    where a = analyzeSame2 x <$> (cast y)
          b = analyzeDifferent2 x y
          c = invert $ analyzeDifferent2 y x

That works, doesn’t use any more extensions, (although Lord knows I’ve been playing extension pokemon recently) and also leaves us with the possibility of using different vs with the same a, even if in this particular case I can’t see why we’d wish to. For that matter, we could remove the interdependency between the two.

class Condition4 v a where
  test :: a -> v -> Bool

I found it really interesting the process I went through here: we actually ended up with a better design and separated our concerns as a consequence of the type system complaining about the functions we actually implemented.

Into The Lens

Let’s rename our concepts:

class (Eq a, Show a, Typeable a, Invertable a) => Analyzable a where
  analyzeSame :: a -> a -> ConditionRelationship
  analyzeDifferent :: (Analyzable b) => a -> b -> Maybe ConditionRelationship
class Testable v a where
  test :: a -> v -> Bool

First some definitions:

data Value c v = Value {
  _value :: v,
  _condition :: c
} deriving (Typeable, Eq, Show)
makeLenses ''Value
instance (Invertable c) => Invertable (Value c v) where
  invert = over condition invert

I’m using the lens package here, although to be honest I’m really only using it to start learning it. The actual practical benefits of it in the code I’ve written are very small, but I’m hoping to slowly pick up more aspects. In fact, of the code I’ve written so far this is the only bit that actually shows an improvement.

Breaking it down, we’re saying that if a condition is invertable, a Value using that condition is invertable by inverting the condition. Even though this is pretty elegant, but it’s going to take me a fair while to get my head around lens in general (There’s been loose talk of a lens NICTA-style course, that would be awesome.).

Scopes Monkey

So, we can declare an equality value to be a condition:

instance (Typeable a, Show a, Eq a) => Condition a (Value Equality a) where

I’m deliberately skipping the instance code because it’s pretty boring and predictable. The interesting case is when we’re implementing comparison:

analyzeEqualOther :: (Invertable b, Testable v b) => b -> Equal v -> ConditionRelationship

(I’ll skip the implementation.) Now for an instance:

instance (Typeable a, Show a, Ord a) => Analyzable (Value Comparison a) where
  analyzeSame = analyzeCompSame -- elided
  analyzeDifferent x y = analyzeEqualOther x <$> y2
    where y2 = cast y

Makes sense. Doesn’t compile. The reason’s a bit weird: it can’t figure out exactly what to cast y to. So let’s try this:

    where y2 = (cast y :: Maybe (Value Equality a))

Still doesn’t work. Here, the error message isn’t particularly helpful (unlike quite a few that just point you directly to the extension that you might need). The problem is actually that the a in the y2 expression isn’t the same as the a in the instance declaration. I don’t really understand why that decision was made (the explanation probably features the word “parametricity”), but you can reverse it by adding in another extension:

{-# LANGUAGE ScopedTypeVariables #-}

QuickCheck Yourself

There’s a plethora of things we could test, but let’s start with this one: What actually is the relationship between the Analyzeable version of condition and the Testable version of condition? Well, the answer is approximately that given two types that are both, we should be able to pick a set of vs such that we can deduce the behaviour of one from the other.

It would be lovely if we could achieve this through the type system, but I think that would be a serious reach, and even if it was possible it’s doubtful it would be readable. So instead let’s try using QuickCheck, the original property testing tool.

Aside: conversely, there should be no value of v where the behaviour of the two contradict one another. However, this latter condition is kind of hard to demonstrate using any example-based system. For that, you really do want Idris.

We need to set up some infrastructure to make cabal run tests.

Test-suite test
  Type:              exitcode-stdio-1.0
  Hs-source-dirs:    test
  Main-is:           Main.hs
  Build-depends:     base >=4.7 && <4.8,
                     tasty,
                     tasty-quickcheck,
                     derive,
                     QuickCheck,
                     myprojectname

This introduces a new target called test. I’ve not used quickcheck before so this is quite interesting. tasty appears to be the standard test running infrastructure. Note that since the test code is actually a separate executable, you need to put your own code as a dependency.

Stuck In The Middle With You

So, let’s pick five distinct values, a, b, c, d and e. We’ll put conditions at b and d and then test all five values in pairs. We can then read out from the set of pairs what the correct relationships between the two conditions is.

import Test.QuickCheck

data TestValues x = TestValues {
  a :: x,
  b :: x,
  c :: x,
  d :: x,
  e :: x
} deriving (Show, Eq, Ord)

instance (Arbitrary a, Ord a, Num a) => Arbitrary (TestValues a) where
  arbitrary = do
    a <- getPositive <$> arbitrary
    b <- (a +) <$> getPositive <$> arbitrary
    c <- (b +) <$> getPositive <$> arbitrary
    d <- (c +) <$> getPositive <$> arbitrary
    e <- (d +) <$> getPositive <$> arbitrary
    return (TestValues a b c d e)

This is all rather fun: arbitrary returns a value in the Gen monad. getPositive unwraps a Positive and return type polymorphism kicks in.

Normally, though, how to create an Arbitrary instance is obvious given its components, setting them up is going to get boring real fast, which is where the derive package kicks in

{-# LANGUAGE TemplateHaskell #-}
import Data.DeriveTH

derive makeArbitrary ''Equality
derive makeArbitrary ''Comparison
derive makeArbitrary ''Value

This now enables us to write the code we wanted:

propDeduce :: (Analyzable (Value c1 v), R.Testable v (Value c1 v), Analyzable (Value c2 v), R.Testable v (Value c2 v))
  => c1 -> c2 -> TestValues v -> Bool
propDeduce c1 c2 testValues = expected == drawConclusion x y testValues where
  x = Value c1 (b testValues)
  y = Value c2 (d testValues)
  expected = analyze x y

drawResults :: (R.Testable v (Value c1 v), R.Testable v (Value c2 v))
  => (Value c1 v) -> (Value c2 v) -> TestValues v -> [(Bool,Bool)]
drawResults x y testValues = result where
  result = f <$> ([a, b, c, d, e] <*> (pure testValues))
  f v = (test x v, test y v)

drawConclusion :: (R.Testable v (Value c1 v), R.Testable v (Value c2 v))
  => (Value c1 v) -> (Value c2 v) -> TestValues v -> ConditionRelationship
drawConclusion x y testValues = ac (length conclusions) conclusions where
  conclusions = nub $ drawResults x y testValues
  ac 4 _ = Compatible
  ac 3 _ = ac3 missingConclusion
  ac _ x | null $ x \ [(True,True),(False,False)] = Same
  ac _ _ = Incompatible
  missingConclusion = head $ [(True, True),(True, False),(False, True),(False,False)] \ conclusions
  ac3 (False, True) = BImpliesA
  ac3 (True, False) = AImpliesB
  ac3 (True, True) = Incompatible
  ac3 _ = Compatible

You can now write quickcheck properties like

prop_deduceCompEq :: Comparison -> Equality -> TestValues Int -> Bool
prop_deduceCompEq = propDeduce

(There may be a better way of instantiating propDeduce with different types, but this definitely works.)

In practice, what now happens is you spend a large amount of time actually fixing your code and your tests. What you’re seeing above is the output of that process. I learned a few things in the process.

  • Although QuickCheck is good at telling you there’s a problem, it’s got no facilities at all for telling you why.
  • Having relatively complex types makes it quite hard to reproduce a test in the repl. Conceivably the tooling for this could be improved.
  • You need to split your code up into chunks that are testable in the repl. This is a lesson Clojure taught me as well, but having access to an excellent debugger in other languages keeps unteaching it.

This is getting really long: I’ve skipped over the entire tasty code and the entire implementation.

Review

So the Condition design looks more appropriate to the aim of actually allowing us to optimize our tests, and Haskell’s led us to a typeclass design better than the original item. There are, however, certain problems. For instance, the design I’ve outlined here is incapable of spotting that “> 2” is the same as “>= 3” in the Integer domain. Pretty much the only good solution to this is to require stronger conditions than just Eq and Ord for condition values, which allow you to perform these analyses. I’m not very inclined to do that, but this problem doesn’t ruin my intended use. However, it highlights again just how challenging it is to write something truly polymorphic and correct.

It’s pretty easy to see how you can extend this into projections as well. However, in practice it gets pretty tricky, because you need to do an order 2 cast. Thankfully, I got a good answer on StackOverflow of exactly how to achieve that. Separating out the concept of the condition from the projection also seems like a strong idea. Ultimately, though, I don’t really like the way this is going. Casts work, and Maybe makes them safe, but the design feels like I’m circumventing the type system rather than using it.