I wrote a small piece of code to match a regex to a string in Haskell. The code is based on the example from Beautiful code.
match :: String -> String -> Bool
match ('^':reg) s = matchhere reg s
match reg s@(x:xs) = if matchhere reg s
then True
else matchhere reg xs
matchhere :: String -> String -> Bool
matchhere [] _ = True
matchhere ('$':[]) [] = True
matchhere (x:'*':xs) s = matchstar x xs s
matchhere (r:rs) (x:xs) | r == x = matchhere rs xs
| r == '.' = matchhere rs xs
| otherwise = False
matchstar :: Char -> String -> String -> Bool
matchstar c [] [] = True
matchstar c _ [] = False
matchstar c reg (x:xs) = matchhere reg xs
|| ((c == '.' || c == x) && matchstar c reg xs)
The matchstar
function in this example is non-greedy, because it will try to
match the character after *
before it tries to match the star.
The other way this could be written is:
matchstar c reg str@(x:xs) = matchstar c reg xs || matchhere reg str
This will first try to matchstar and only if that fails, try to match it with matchhere. A string like:
aba*ab
will never match. Because the last a
will always be matched by the a*
.
I also wrote this simple test suite with this code:
main = do
plan 11
ok $ match "es" "test"
ok $ match "test" "test"
ok $ match "^test$" "test"
ok $ not $ match "adf" "test"
ok $ match "^..$" "aa"
ok $ match ".*" "aa"
ok $ match "abc.*" "abcaa"
ok $ match "abc.*abc" "abcaaabc"
ok $ match "abc.*a" "abcaaa"
ok $ match "^abc.*a" "abcaaa"
ok $ match "^abc.*a$" "abcaaa"
return ()
ok :: Bool -> IO ()
ok True = do putStrLn "ok"
ok False = do putStrLn "not ok"
plan :: Int -> IO ()
plan tests = do putStrLn $ "1.." ++ (show tests)
return ()
Each test will print ok
(if it passed) or not ok
(if it didn’t). It look
like TAP, because it is easy to do. I used Test::Harness 3.10, which is a perl
module. Test::Harness contains a program called prove
. This will run code and
gather information about the TAP output. It works really nice together.
prove -e runhaskell Regex.hs
And the output is:
Regex.......ok
All tests successful.
Files=1, Tests=11, 1 wallclock secs ( 0.00 usr 0.00 sys + 0.29 cusr 0.02 csys = 0.31 CPU)
Result: PASS
All tests pass. My little test framework doesn’t implement everything in the TAP specification. The most important missing things are the test numbers and an error message when a tests fails. Maybe I will look at this sometime, but at the moment I’m with how this works.