Haskell basics

Haskell is a functional language, that means that everything is a function. For every input you give to a function, it gives you same output every time. It is called that these functions are pure.
Functions which are not pure have some side effects, for example changing global variable in Imperative language like C or Python.

Lists, map, foldr,

List comrpehension

[(a,b)| a<-[1..5],b<-["dog","cat","crocodile"]]


 map (\arg1 -> arg1*arg1) [1..10]


zip [1..5] ["baf","haf","meow","kek","kuk"]


zipWith (\ x y -> (x*2,y,y++y,"WHATEVER")) [1..5] ["baf","haf","meow","kek","kuk"]


Example with brontosaurus
foldl (\x y -> x++[y] ) [] ['b','r','o','n','t','o','s','a','u','r','u','s']

Language basics

You can use let to define local functions and local variables of type:

    numbeeers num = let oneLess int = int - 1
                        twoLess int = int - 2
                    in (oneLess num) + (twoLess num)

Where is used to define local functions and variables at the end of the function.

    numbeeers num =  (oneLess num) + (twoLess num)
            where oneLess int = int - 1
                  twoLess int = int - 2

Guards is a case like structure, it goes from top to bottom and compares conditions.

    doYouLikeCake answer
        | answer == "Yes" = "Good boi."
        | answer == "No"  = "No cake for you."
        | otherwise       = "Sort it out, Yes or No?"





Haskell type system, static typing, type variables, typeclasses

 :set -fbreak-on-exception

HTML parser

Let's define html data type following way:

data HTML = Html  [HTML] | Head   [HTML] | Body  [HTML] | P String  | Script  String deriving (Show)

Here is a function that takes this data and creates list of tags and strings

    parse (Html body) = ["<html>"]++( concat $ map parse body ) ++ ["</html>"]
    parse (Head body) = ["<head>"]++( concat $ map parse body ) ++ ["</head>"]
    parse (Body body) = ["<body>"]++( concat $ map parse body ) ++ ["</body>"]
    parse (P    body) = ["<p>"]++[body]++ ["</p>"]
    parse (Script body) = ["<script>"]++[body]++["</head>"]
So let's create sample data:
let web1 = Html [Head [Script "javascript"],Body [P "Dog is colorful.", P "What is this article"] ]

Applying parse on web1 gets:

["<html>","<head>","<script>","javascript","</head>","</head>","<body>","<p>","Dog is colorful.","</p>","<p>","What is this article","</p>","</body>","</html>"]

Now we need to print it to output:

print2 list = do if ( length list >0)
                    then  do putStrLn $ head list
                             print2 $ tail list
                    else  do return ()

And output is:

Dog is colorful.
What is this article

Okay...but this is rather dull, lets add padding so it looks like real html!

    parse (Html body) level = [(level,"<html>")]++( concat $ map (\x -> parse x (level+1)) body  ) ++ [(level,"</html>")]
    parse (Head body) level = [(level,"<head>")]++( concat $ map (\x -> parse x (level+1)) body ) ++ [(level,"</head>")]
    parse (Body body) level = [(level,"<body>")]++( concat $ map (\x -> parse x (level+1)) body ) ++ [(level,"</body>")]
    parse (P    body) level = [(level,"<p>")]++[(level+1,body)]++ [(level,"</p>")]
    parse (Script body) level = [(level,"<script>")]++[(level+1,body)] ++ [(level,"</head>")]

    print2 list = do if ( length list >0)
                        then  do putStrLn $  (padding $ fst $ head list)++(snd $ head list)
                                 print2 $ tail list
                        else  do return ()

    padding num = take newNum $ repeat ' '
        where newNum = num*4

Output is:

            Dog is colorful.
            What is this article

Huffman coding

Huffman coding is works like this. 1.) scan file you want to compress and compute list of (character,frequency) 2.) create huffman tree, which is binary tree with leaves that are characters and their frequencies 3.) encode original file with this tree

Simple database

We will create simple database.

Binary Trees

Binary tree is declared like this:

    data BT a = Leaf a | (BST a) (BST a) deriving (Show)

N-ary trees

N-ary trees can have arbitrary numbers of descendants, and are defined like this:

data NT a = N a [NT a] deriving Show

--number a number
aNumber::   a ->            -- thing to number
            Int->           -- starting number
            ((a,            -- thing
            Int),           -- number
            Int)            -- next number

aNumber a aNum = ((a,aNum),aNum+1)

--number a tree

nNumber:: NT a   ->  -- thing to number
          Int    ->  -- starting number
         (NT (a,Int) -- numbered thing
         ,Int)       -- next number

nNumber (N a list) i0 = (  N nThing numberedList ,i2)
    where   (nThing,i1)= aNumber a i0
            ( numberedList, i2) = ntsNumber list i1

-- -number a list of trees
ntsNumber:: [NT a] -- thing to number
            -> Int -- starting ntsNumber
            -> ([NT (a,Int)], -- thing to number
            Int    -- next following number
ntsNumber []     i0 = ([],i0)
ntsNumber (x:xs) i0 = (tree:treeList,i2)
    where   (tree,i1)      = nNumber x i0
            (treeList,i2) = ntsNumber xs i1



Aho Corasick

Graph representation

We will use following data:

    a b 2
    a c 4
    b a 2
    b d 3
    b g 4
    c a 4
    c d 5
    c e 6
    d b 3
    d c 5
    d f 6
    e c 6
    e f 7
    f d 6
    f e 7
    f g 9
    g b 4
    g f 9

Lets transform this into haskell friendly form, line by line,

    a b 2



Then we'll have list of these triplets, as edge list with weights.
Here is how we do it:

    import System.IO
    import Data.List.Split

    main = do
        inh <- openFile "data01.txt" ReadMode
        outh <- openFile "output.txt" WriteMode
        hPutChar outh '['
        mainloop "" inh outh
        hClose inh
        hClose outh

    mainloop::String->Handle -> Handle -> IO ()
    mainloop nextWord  inh outh =
        do  ineof <- hIsEOF inh
            if ineof
                then do hPutChar outh ']'
                        return ()
                else do if (nextWord=="")
                            then  do inpStr <- hGetLine inh
                                     hPutStr outh $ transform inpStr
                                     mainloop inpStr inh outh

                            else do  hPutChar outh ','
                                     inpStr <- hGetLine inh
                                     hPutStr outh $ transform inpStr
                                     mainloop inpStr inh outh

    transform inpStr = let parts =  splitOn " " inpStr
                       in "("++(parts!!0)++","++(parts!!1)++","++(parts!!2)++")"



Hmm it would be better to have graph in Adjacency list format.

elToAl is a function that does exactly that.

    elToAl::(Eq a,Integral b)=>[(a,a,b)]->[(a,[(a,b)])]
    elToAl edgeList = elToAl2 edgeList []

    elToAl2::(Eq a,Integral b)=>[(a,a,b)]->[(a,[(a,b)])]->[(a,[(a,b)])]
    elToAl2 [] acc = acc
    elToAl2 ((from,to,dist):es) acc =elToAl2 es  (addToAl (from,to,dist) acc)

    -- takes first edge and add it to adjacency list, return new adjacency list
    addToAl::(Eq a,Integral b)=>(a,a,b)->[(a,[(a,b)])]->[(a,[(a,b)])]
    addToAl (from,to,dist) adjList
        | (length adjList) == 0 = [(from,[(to,dist)])]
        | from==from2 = (from2,neighs++[(to,dist)]):(tail adjList)
        | from/=from2 = (from2,neighs):(addToAl (from,to,dist) (tail adjList) )
        | otherwise  =  error "Whut"
        where (from2,neighs)= head adjList

And here is resulting adjacency list


Dijkstra algorithm

    --nodes have to have additional info, so lets create custom type for them

    data Node =  Node   { from :: Char
                        , adjList:: [(Char,Int)]
                        , dist :: Int
                        , prev:: Char
                        } deriving (Show)
    type EdgeList = [(Char,[(Char,Int)])]

    -- data MinQueue
    type Queue = [Node]

    -- dijsktra::EdgeList->Queue->Node->Node->([Node],Int)
    -- dijkstra

    dijkstra edgeList queue source target = if (length queue) > 0 -- || (u/=target)
        then    let
                    newEdgeList  =  (changeDistances queue edgeList)
                in dijkstra newEdgeList (newQueue)  source target

        else (findPath edgeList source target )

        where   (u,newQueue) = minNode queue
                initializeNodes = initialize  edgeList
                source2 = Node ( from source) ( adjList source) (0) (prev source)

    changeDistances queue edgelist = edgelist --TODO

    findPath  edgeList source target = ([source],4) --TODO

    minNode queue = (head queue,queue) --TODO

    initialize edgeList = map (\x ->  Node (fst x) (snd x) (maxBound) (' ') ) edgeList



Coin change

Water jugs

Hanoi towers