Yes. I’m divided into “hum… 100 lines is larger than I expected” and “what did he mean ‘from scratch’? did he write the parser combinators? if so, 100 lines is crazy small!”
But I’m settling in believing 80 of those lines are verbose type declarations.
I decided to write it myself for fun. I decided that “From Scratch” means:
No parser libraries (parsec/happy/etc)
No using read from Prelude
No hacky meta-parsing
Here is what I came up with (using my favourite parsing method: parser combinators):
import Control.Monad ((>=>))
import Control.Applicative (Alternative (..), asum, optional)
import Data.Maybe (fromMaybe)
import Data.Functor (($>))
import Data.List (singleton)
import Data.Map (Map, fromList)
import Data.Bifunctor (second)
import Data.Char (toLower, chr)
newtype Parser i o = Parser { parse :: i -> Maybe (i, o) } deriving (Functor)
instance Applicative (Parser i) where
pure a = Parser $ \i -> Just (i, a)
a <*> b = Parser $ parse a >=> \(i, f) -> second f <$> parse b i
instance Alternative (Parser i) where
empty = Parser $ const Nothing
a <|> b = Parser $ \i -> parse a i <|> parse b i
instance Monad (Parser i) where
a >>= f = Parser $ parse a >=> \(i, b) -> parse (f b) i
instance Semigroup o => Semigroup (Parser i o) where
a <> b = (<>) <$> a <*> b
instance Monoid o => Monoid (Parser i o) where
mempty = pure mempty
type SParser = Parser String
charIf :: (a -> Bool) -> Parser [a] a
charIf cond = Parser $ \i -> case i of
(x:xs) | cond x -> Just (xs, x)
_ -> Nothing
char :: Eq a => a -> Parser [a] a
char c = charIf (== c)
one :: Parser i a -> Parser i [a]
one = fmap singleton
str :: Eq a => [a] -> Parser [a] [a]
str (c:cs) = one (char c) <> str cs
str _ = pure []
sepBy :: Parser i a -> Parser i b -> Parser i [a]
sepBy a b = (one a <> many (b *> a)) <|> mempty
data Decimal = Decimal { mantissa :: Integer, exponent :: Int } deriving Show
data JSON = Object (Map String JSON) | Array [JSON] | Bool Bool | Number Decimal | String String | Null deriving Show
whitespace :: SParser String
whitespace = many $ asum $ map char [' ', '\t', '\r', '\n']
digit :: Int -> SParser Int
digit base = asum $ take base [asum [char c, char (toLower c)] $> n | (c, n) <- zip (['0'..'9'] <> ['A'..'Z']) [0..]]
unsignedInteger :: Int -> SParser Integer
unsignedInteger base = foldl (\acc x -> acc * fromIntegral base + fromIntegral x) 0 <$> some (digit base)
integer :: SParser Integer
integer = do
sign <- fromIntegral <$> asum [char '-' $> (-1), char '+' $> 1, str "" $> 1]
(sign *) <$> unsignedInteger 10
-- This is the ceil of the log10 and also very inefficient
log10 :: Integer -> Int
log10 n
| n < 1 = 0
| otherwise = 1 + log10 (n `div` 10)
jsonNumber :: SParser Decimal
jsonNumber = do
whole <- integer
fraction <- fromMaybe 0 <$> optional (str "." *> unsignedInteger 10)
e <- fromIntegral <$> fromMaybe 0 <$> optional ((str "E" <|> str "e") *> integer)
pure $ Decimal (whole * 10^(log10 fraction) + signum whole * fraction) (e - log10 fraction)
escapeChar :: SParser Char
escapeChar = char '\\'
*> asum [
str "'" $> '\'',
str "\"" $> '"',
str "\\" $> '\\',
str "n" $> '\n',
str "r" $> '\r',
str "t" $> '\t',
str "b" $> '\b',
str "f" $> '\f',
str "u" *> (chr . fromIntegral <$> unsignedInteger 16)
]
jsonString :: SParser String
jsonString =
char '"'
*> many (asum [charIf (\c -> c /= '"' && c /= '\\'), escapeChar])
<* char '"'
jsonObjectPair :: SParser (String, JSON)
jsonObjectPair = (,) <$> (whitespace *> jsonString <* whitespace <* char ':') <*> json
json :: SParser JSON
json =
whitespace *>
asum [
Object <$> fromList <$> (char '{' *> jsonObjectPair `sepBy` char ',' <* char '}'),
Array <$> (char '[' *> json `sepBy` char ',' <* char ']'),
Bool <$> asum [str "true" $> True, str "false" $> False],
Number <$> jsonNumber,
String <$> jsonString,
Null <$ str "null"
]
<* whitespace
main :: IO ()
main = interact $ show . parse json
This parses numbers as my own weird Decimal type, in order to preserve all information (converting to Double is lossy). I didn’t bother implementing any methods on the Decimal, because there are other libraries that do that and we’re just writing a parser.
It’s also slow as hell but hey, that’s naive implementations for you!
It ended up being 113 lines. I think I could reduce it a bit more if I was willing to sacrifice readability and/or just inline things instead of implementing stdlib typeclasses.
Just looking at the image, yeah he’s a little parser combinator library entirely from scratch.
Not sure what you mean by verbose type declarations. It looks to be 2 type declarations in a few lines of code (a newtype for the parser and a sum type to represent the different types of JSON values). It’s really not much at all.
Yes. I’m divided into “hum… 100 lines is larger than I expected” and “what did he mean ‘from scratch’? did he write the parser combinators? if so, 100 lines is crazy small!”
But I’m settling in believing 80 of those lines are verbose type declarations.
I decided to write it myself for fun. I decided that “From Scratch” means:
read
from PreludeHere is what I came up with (using my favourite parsing method: parser combinators):
This parses numbers as my own weird
Decimal
type, in order to preserve all information (converting toDouble
is lossy). I didn’t bother implementing any methods on theDecimal
, because there are other libraries that do that and we’re just writing a parser.It’s also slow as hell but hey, that’s naive implementations for you!
It ended up being 113 lines. I think I could reduce it a bit more if I was willing to sacrifice readability and/or just inline things instead of implementing stdlib typeclasses.
So, ARE you bringing a girl?
I’m not coming to my parents for this new year’s because I might get arrested and/or sent to die in a war. But once Putin dies, yes, I am
Didn’t know where you were talking about til you said Putin.
So that’s two things to look forward to!
Haskell is succinct.
Just looking at the image, yeah he’s a little parser combinator library entirely from scratch.
Not sure what you mean by verbose type declarations. It looks to be 2 type declarations in a few lines of code (a newtype for the parser and a sum type to represent the different types of JSON values). It’s really not much at all.
You could probably write a very basic parser combinator library, enough to parse JSON, in 100 lines of Haskell
Judging by the Parser newtype, he did.
With recursive list comprehensions you can cram quite some complexity into one line of code.