MnO2's Blog

a Haskell novice

I Used to Be an Haskeller Like You, Then I Took an Arrow in the Knee

You hardly can miss Arrow on your path of learning Haskell. Just after struggling with Functor, Applicative and Monad, you convince yourself that you can finally take a look at those standard libraries requiring you the knowledge of Monad. A few Arrow interface libraries such as Hakyll, HXT, reactive-banana just punch you in the face. You just don’t understand what that is. Finishing tutorials on wikibook (link1, link2) do help, but you still can’t get the whole picture. What’s the relationship between Arrow and Monad? Is there non-trivial concrete example that can persuade me it is useful? Here I try to provide my answers.

Why Monad is not Enough

Arrow is proposed by John Hughes in 1998. As described in his paper Generalising Monads to Arrows, the purpose of Arrows is to resolve the issue of memory consumption in using moandic parser. An example from the paper is:

1
2
3
4
instance MonadPlus Parser where
    P a + P b = P (\s -> case a s of
                            Just (x, s') -> Just (x, s')
                            Nothing -> b s)

The property of Plus corresponds to backtracking parsers. The input s is retained in the memory and not garbage collected as long as the parser a succeeding in parsing. If a succeeds a large part of the input s before it eventually fails, then a great deal of space would be used to hold these already-parsed tokens. To resolve this kind of problems, the Arrows was proposed.

Besides the issue mentioned above, We know that Monads essentially provide a sequential interface to computation. One can build a computation out of a value, or sequence two computations. The structure of Monads empowers us convenience and at the same time limits its usage. Some problem domains can not be modeled with Monads, jaspervdj proposes that a building system like Hakyll is one of them.

In general, the structure that is not sequental, but is much like electronic circuits can be modeled by Arrows. Monads fails to model these structures. Its composability is not as flexible as Arrows.

Typeclassopedia

Let’s see a diagram from Typeclassopedia by Brent Yorgey.

typeclassopedia_diagram

This diagram tells us the relationship between Monads and Arrows. Unlike Functor and Applicative, they are basically different things and from different lineage.

First of all, an Arrow has to be an Category. Category generalizes the notion of function composition to the morphisms in Category Theory. Its definition in base package is as follows:

1
2
3
class Category cat where
    id :: cat a a
    (.) :: cat b c -> cat a b -> cat a c

We can see that it requires an identity morphism id, and the composition operator (.) for morphisms. Also note that Category is unequivalent to arbitrary categories in Category Theory. It only corresponds to those categories whose objects are objects of Hask.

Laying on the foundation of Category, Arrow’s definition is as follows:

1
2
3
4
5
6
class Category a => Arrow a where
    arr :: (b -> c) -> a b c
    first :: a b c -> a (b, d) (c, d)
    second :: a b c -> a (d, b) (d, c)
    (***) :: a b c -> a b' c' -> a (b, b') (c, c')
    (&&&) :: a b c -> a b c' -> a b (c, c')

There are five methods here, but actually you only have to define arr and first. Rest of them can be automatically derived.

To see how these five composing methods work, let’s try out an easy example. That is to model an OR gate with NAND gates.

or from nand

We can duplicate the input to the first level NAND gates with id and &&&, and to combine the results of first level NAND gates, we can use *** operator. In chart it looks like this.

arrow or from and

To implements it in haskell we also need to define the circuit element as instances of Category and Arrow, we call it Auto.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
{-# LANGUAGE Arrows #-}
import Prelude hiding (id, (.))
import Control.Arrow
import Control.Category

data Auto b c = Auto (b -> c)

instance Category Auto where
    id = Auto id
    Auto f . Auto g = Auto (f . g)

instance Arrow Auto where
    arr f = Auto f
    first (Auto f) = Auto $ \(x, y) -> (f x, y)

nand2 :: Auto (Bool, Bool) Bool
nand2 = arr (not . uncurry (&&))

or2 :: Auto (Bool, Bool) Bool
or2 = ((arr (id) &&& arr (id)) >>> nand2) *** ((arr (id) &&& arr (id)) >>> nand2) >>> nand2

runAutoA :: (Bool, Bool) -> Auto (Bool, Bool) a -> IO a
runAutoA x (Auto f) = return (f x)

main = do
    let comb = [(False, False), (False, True), (True, False), (True, True)]
    result <- sequence $ flip runAutoA or2 `map` comb
    putStrLn $ show result

Arrow Notation

When we are tackling larger problems, the notation soon gets complicated to handle. That’s why we want to have do block syntax sugar to make it look nicer, as we do for Monad. GHC provides Arrow Notation, but it’s currently not part of Haskell 98, you must enable the corresponding extension: with -XArrow or prepend a header line to the files {-# LANGUAGE Arrows #-}. After that, we can rewrite our OR gate example above with Arrow notation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
{-# LANGUAGE Arrows #-}
import Prelude hiding (id, (.))
import Control.Arrow
import Control.Category

data Auto b c = Auto (b -> c)

instance Category Auto where
    id = Auto id
    Auto f . Auto g = Auto (f . g)

instance Arrow Auto where
    arr f = Auto f
    first (Auto f) = Auto $ \(x, y) -> (f x, y)

nand2 :: Auto (Bool, Bool) Bool
nand2 = arr (not . uncurry (&&))

or2' :: Auto (Bool, Bool) Bool
or2' = proc (i0, i1) -> do
        m1 <- nand2 -< (i0,i0)
        m2 <- nand2 -< (i1,i1)
        nand2 -< (m1,m2)

runAutoA :: (Bool, Bool) -> Auto (Bool, Bool) a -> IO a
runAutoA x (Auto f) = return (f x)

main = do
    let comb = [(False, False), (False, True), (True, False), (True, True)]
    result' <- sequence $ flip runAutoA or2' `map` comb
    putStrLn $ show result'

The pattern is keyword proc followed with the input symbols, then -> do. We describe each elemment in each line. m1 <- nand2 -< (i0,i0) means nand2 accept a tuple (i0, i0) and the its output is bound to the name m1. The last line must be an expression, just like Monad, its value denotes the value of the whole do block. In this case, its the output of nand2 -< (m1,m2)

A More Complicated Example

Already know how to simulate OR with NAND, let’s implement a hald-adder.

half adder

With the help of Arrow Notation, all we need to do is to define the instances, and drawing circuits with <- and -< symbols in the program.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
{-# LANGUAGE Arrows #-}
import Prelude hiding (id, (.))
import Control.Arrow
import Control.Category

data Auto b c = Auto (b -> c)

instance Category Auto where
    id = Auto id
    Auto f . Auto g = Auto (f . g)

instance Arrow Auto where
    arr f = Auto f
    first (Auto f) = Auto $ \(x, y) -> (f x, y)

and2 :: Auto (Bool, Bool) Bool
and2 = arr (uncurry (&&))

or2 :: Auto (Bool, Bool) Bool
or2 = arr (uncurry (||))

nand2 :: Auto (Bool, Bool) Bool
nand2 = arr (not . uncurry (&&))

xor2 :: Auto (Bool, Bool) Bool
xor2 = proc i -> do
        m0 <- nand2 -< i
        m1 <- or2 -< i
        and2 -< (m0, m1)

hAdder :: Auto (Bool, Bool) (Bool, Bool)
hAdder = proc i -> do
          o0 <- xor2 -< i
          o1 <- and2 -< i
          id -< (o1, o0)

runAutoA :: (Bool, Bool) -> Auto (Bool, Bool) a -> IO a
runAutoA x (Auto f) = return (f x)

main = do
    putStr "Hello "
    pairs <- runAutoA (True, True) hAdder
    putStrLn $ show pairs

Postscript

As you can see from typeclassopedia diagram, there are variants of Arrow. They are ArrowZero, ArrowPlus, ArrowLoop, ArrowChoice and ArrowApply, where ArrowApply is equivalent to Monad. I’ll write another post to introduce them another days.

What Is Rank-N Types in Haskell

For a beginner of haskell coming from popular imperative languages with “Generic” or “Template” construct in them, like C++, Java, it is not too hard to accept the parametric polymorphism at first glance. The type signature of id :: a -> a can quickly remind him of C++ template.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <iostream>
#include <string>

using namespace std;

template <typename T>
T id(const T &x)
{
        return x;
}

int main()
{
        cout << id(1) << endl;
        cout << id(string("foo")) << endl;
        return 0;
}

However, the thing would become complicated when he bumped to the type signature of runST

1
runST :: (forall s. ST s a) -> a

A few questions pop out. What the hell is that forall, and what are Rank2Types and RankNTypes documented in the GHC manual?

Higher Rank Polymorphism

The parametric polymorphism you usually see is actually Rank-1 Polymorphism. There is actually Rank-2 polymorphism … to Rank-N polymorphism. In not quite exact language, a Rank-1 polymorphic function is that it doesn’t depend on its input/output types. The type become parameters, and when you call the function, the types are “instantiated.” And Rank-2 polymorphic function is a function gets called with its parameters parameterized, or to be Rank-1 polymorphic. So on, Rank-N polymorphic function is a function gets called with its parameters to be Rank-(N-1) polymorphic.

Let’s look at a Rank-2 example. Suppose we have a list of Int and a list of Char, we want to write a function to check if the results of applying a function to both lists are equal. Use case would look like this.

1
2
3
4
5
check f l1 l2 = (f l1) == (f l2)

main = do
    print $ check length [1,2,3,4] ['a','b','c','d']
    print $ check ((+2) . length) [1,2] ['a']

We can see that we want parameter f of check to be a Rank-1 polymorphic function, so that we can apply it to both [Int] and [Char]. In such case, what would be the type signature of check? Let’s look at how we can declare it in haskell.

forall

Parametric polymorphism is implicit in haskell. When you write a -> a, actually it is forall a. a -> a. forall is the keyword to explicitly declare what is your type parameter, or type variable. A few examples from standard lib can be explicitly written like this.

1
2
3
length :: forall a. [a] -> Int
fst :: forall a b. (a,b) -> a
map :: forall a b. (a -> b) -> [a] -> [b]

Writing out explicitly, we declare the type signature of check as check :: (forall a. [a] -> Int) -> forall b. [b] -> forall c. [c] -> Bool. Notice that the effective range of forall greedily extends to the most right end, so the type signature above is equivalent to check :: (forall a. a -> [Int]) -> (forall b. [b] -> (forall c. [c] -> Bool)). But so many forall in the type signature, you would ask where should I put the forall to express a Rank-2 polymorphic function. Just as this link said, you can consider forall as a type level lambda /\. This lambda /\ has some operational similarity with \, instead of declare a variable binding, it binds type variables. A parallel in C++ would be binding T of template <typename T> to the corresponding template.

With this in mind, basically we can write out the type variables and normal variables at the same time in /\ and \. It would look like check = \f -> /\b -> \l1 -> /\c -> \l2 -> Bool, where f is a function which is /\a -> \l -> Int.

Here is a working example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
{-# LANGUAGE RankNTypes #-}
module Main(main) where

import Prelude hiding (length)

length :: forall a. [a] -> Int
length (x:xs) = 1 + (length xs)
length [] = 0

check :: (forall a. [a] -> Int) -> forall b. [b] -> forall c. [c] -> Bool
check f l1 l2 = (f l1) == (f l2)

main = do
    print $ check length [1] ['a','b']

You can play with Rank-N Type with N greater than 2. The spirit is the same. This is an example from here.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
{-# LANGUAGE RankNTypes #-}
module Main (main) where

g1 :: a -> a
g1 x = x

g2 :: (forall a. a -> a) -> (Bool, Char)
g2 f = (f True, f 'a')

g3 :: ((forall a. a -> a) -> (Bool, Char)) -> (Char, Bool)
g3 f = (\x -> (snd x, fst x)) (f g1)

g4 :: (((forall a. a-> a) -> (Bool, Char)) -> (Char, Bool)) -> (Bool, Char)
g4 f = (\x -> (snd x, fst x)) (f g2)

main = do
    putStrLn "Rank-2:"
    putStrLn . show . fst $ (g2 g1)
    putStrLn . show . snd $ (g2 g1)
    putStrLn "Rank-3:"
    putStrLn . show . fst . g3 $ g2
    putStrLn . show . snd . g3 $ g2
    putStrLn "Rank-4:"
    putStrLn . show . fst . g4 $ g3
    putStrLn . show . snd . g4 $ g3

Possible Confusion

Due to the usage of forall keyword, it is very easy to confuse RankNTypes, Existential Types and Scoped Type Variables. You can go to check out an excellent discussion on stackoverflow. Also don’t miss out John Lato’s and ezyang’s explanations on Existenatial Types.

中東現場與拉丁美洲真相之路

這是一篇欠了很久的讀書心得。 拖了很久是因為每每讀完這類開拓視野、發人深省的書便在下筆與反覆咀嚼兩端擺盪。 想下筆是因為覺得這些書可以讓閱讀的人更進一歩接近事情的本質,想透過推薦找出願意品嘗知識所造成的孤獨感的人。就算小眾至少世上是有人可以分享的。 猶豫著反思又是怕說自己所學甚淺,未累積足夠智識可以找出脈絡獨立判斷閱讀的資料。 閱讀過後的衝擊、激動或許是一種危險、缺乏冷靜的判斷。

特別是像這兩本著名的香港獨立記者張翠容所寫的中東現場拉丁美洲真相之路, 每每讀到這類各國現況的所聞紀實、關鍵人物採訪好似在提醒自己所學有多麼淺薄。 了解政治、人民生活的現況往往需要歷史脈絡的追尋、政治左右派之爭、經濟史各個流派強堆的學說、彼此串連糾纏在一起才能窺得真相的某一面。 讀翠容的書的好處是她用淺顯的文字,在行文中也融進了消化上數知識後的分析見解。 可以讓你自己需要做的功課少了許多。 但這樣的缺點就是會讓你的讀書心得寫不下手, 因為覺得怎麼寫都覺得自己寫的心得沒什麼附加價值,浪費了看這篇心得的人的時間。 還不如直接附上連結,指著說「趕快去看本書吧。」

這邊只分享一個特別有印象的事: 眾所皆知猶太人於二戰被屠殺,是被迫害者。 二戰後以色列立國,不出幾年過去的迫害者變成了擴張主義的征服者。 撇開巴勒斯坦土地不說, 1982年在以色列的介入下,黎巴嫩的巴人難民營被血洗、屠殺。 成了難民逃到別國還要被趕盡殺絕,這是什麼道理。 同樣距今不到十年,2002另一個名為Jenin難民營亦遭到以色列屠殺。 我知道這件事的時候完全可以體會自殺炸彈這種攻擊是怎麼來的, 當你被逼到絕境,生命毫無出路時,死亡並不是一件可怕的事, 可以順便把你恨透的敵人一起拖下水可說是絕佳划算。 歷史或許就在人性中不斷地糾纏發展, 過去的被害者也不一定會記住教訓,而不要讓事情再發生。

最後,儘管在現今Twitter、Facebook等短文平台被號稱為新媒體的現在, 還是來列一些早些年所謂的獨立媒體或是公民媒體平台好了。 雖然以我的學識也沒有辦法嚴格區分這些的定義有什麼不同。 但至少在這個世界受到主流美國媒體強烈主導下,提供些不同有組織性的管道可以了解世界。 相信其中也是有不少像翠容一樣智識的人在付出。

最有名的是不外乎就是Global Voices,也是流量最大的。 他是Ethan Zuckerman跟Rebecca MacKinnon在2004年在哈佛法學院的贊助下成立的。 也是目前看到的平台中最容易透過多國語言翻譯來閱讀的。

另外就是在1999年透過反全球化活動、希望聚集眾人力量的前提下而重力的Independent Media Center。 算是串連世界各地獨立媒體的平台。只要符合一些條件,確認你可以自己運作一個獨立媒體組織就可以申請加入串連。

至於一些地區性的著名獨立媒體羅列如下