Haskell: Processing text from a file

Hello everyone,

1. What do I want to do?

I have a file containing a single line of text:

"Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

I want to read this text from the file and convert it to:

[[Just 3, Nothing, Just 1, Nothing], [Nothing, Nothing, Nothing, Nothing], [Nothing, Nothing, Just 4, Nothing], [Nothing, Just 3, Nothing, Nothing]]

which has type [[Maybe Integer]].

2. What have I already done?

I can convert a plain String into Maybe Integer values.

My string:

xxx = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

After running stripChars ",]" $ drop 10 xxx I get:

"Just 3 Nothing Just 1 Nothing [Nothing Nothing Nothing Nothing [Nothing Nothing Just 4 Nothing [Nothing Just 3 Nothing Nothing"

After the next command, map (splitOn " ") $ splitOn "[", I have:

[["Just","3","Nothing","Just","1","Nothing",""],["Nothing","Nothing","Nothing","Nothing",""],["Nothing","Nothing","Just","4","Nothing",""],["Nothing","Just","3","Nothing","Nothing"]]

Now I have to strip out the empty strings "" with cleany and finally use cuty to change [[String]] into [[Maybe Integer]]:

[[Just 3,Nothing,Just 1,Nothing],[Nothing,Nothing,Nothing,Nothing],[Nothing,Nothing,Just 4,Nothing],[Nothing,Just 3,Nothing,Nothing]]

This is exactly what I want!

3. The problem is...

...how do I apply this function:

parse xxx = cuty $ cleany $ map (splitOn " ") $ splitOn "[" $ stripChars ",]" $ drop 10 xxx

to text read from a file (which has type IO String)?

This is my first Haskell project, so my functions may be reinventing the wheel or doing something even worse :/

Functions used:

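main :: IO ()
main = do
      text <- readFile "test.txt"
      -- inside the do block, text is a plain String, so parse can be
      -- applied to each line directly; print shows each result
      mapM_ (print . parse) (lines text)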



-- deletes unwanted characters from a String
stripChars :: String -> String -> String
stripChars = filter . flip notElem


-- converts String to Maybe a   
maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
    [(x,"")] -> Just x
    _ -> Nothing

-- convert(with subfunction conv, because I don't know how to make it one function)

conv :: [String] -> [Maybe Integer]
conv [] = []
conv (x:xs) = if x == "Just" then conv xs
                else maybeRead x : conv xs

convert :: [[String]] -> [[Maybe Integer]]
convert [] = []
convert (x:xs) = conv x : convert xs

-- cleany (with subfunction clean, because I don't know how to make it one function)    

clean :: [String] -> [String]
clean [] = []
clean (x:xs) = if x == "" then clean xs
                else x : clean xs

cleany :: [[String]] -> [[String]]
cleany [] = []
cleany (x:xs) = clean x : cleany xs
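
On the "I don't know how to make it one function" comments above: a minimal sketch of how each pair of helpers could be collapsed using map and filter. It swaps the hand-written maybeRead for readMaybe from Text.Read, which is an assumption on my part rather than something from the question; for single tokens such as "3" or "Nothing" the two should behave the same.

```haskell
import Text.Read (readMaybe)

-- Remove the empty strings left behind by splitting, in every inner list.
cleany :: [[String]] -> [[String]]
cleany = map (filter (not . null))

-- Drop the "Just" tokens, then read what is left as an Integer.
-- "3" becomes Just 3; "Nothing" (or any other non-number) becomes Nothing.
convert :: [[String]] -> [[Maybe Integer]]
convert = map (map readMaybe . filter (/= "Just"))
```

Note that, like the original conv, this silently turns any unreadable token into Nothing, so malformed input is not reported.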

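I'll assume you are fine with a parser that does little to no error checking. Haskell has good parsing libraries, and I'll amend my answer later with some alternatives you should look at.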

Instead of using splitOn, I suggest writing these functions:

takeList :: String -> (String, String)
-- returns the match text and the text following the match
-- e.g. takeList " [1,2,3] ..."  returns ("[1,2,3]", " ...")

takeLists :: String -> [String]
-- parses a sequence of lists separated by spaces
-- into a list of matches

I'll leave takeList as an exercise. For simple parsers like these I like to use span and break from Data.List.
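
For readers who want to check their attempt at the exercise, here is one possible sketch of takeList built on break. It is only an illustration under the assumption that the lists are flat (no nested brackets), and it is not necessarily what the answer's author had in mind.

```haskell
-- One possible takeList: skip to the first '[', then take everything
-- up to and including the first ']'. Assumes no nested brackets.
takeList :: String -> (String, String)
takeList str =
  let s1           = dropWhile (/= '[') str   -- discard text before '['
      (body, rest) = break (== ']') s1        -- body stops just before ']'
  in case rest of
       (']' : after) -> (body ++ "]", after)  -- put the ']' back on the match
       _             -> (body, rest)          -- no closing bracket was found
```

With this definition, takeList " [1,2,3] ..." evaluates to ("[1,2,3]", " ..."), matching the comment in the signature above.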

In terms of takeList, you can write takeLists like this:

takeLists :: String -> [ String ]
takeLists str =
  let s1 = dropWhile (/= '[') str
  in if null s1
       then []
       else let (s2,s3) = takeList s1
            in   s2 : takeLists s3

For example, takeLists " [123] [4,5,6] [7,8] " will return:

[ "[123]", "[4,5,6]", "[7,8]" ]

Finally, to convert each string in this list into a Haskell value, just use read.

answer :: [ [Int] ]
answer = map read (takeLists " [123] [4,5,6] [7,8] ")
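
One caveat: read throws a runtime error on malformed input. If that matters, a hedged alternative is readMaybe from Text.Read combined with traverse, so that a single bad chunk makes the whole result Nothing. The name parseRows is hypothetical and chosen only for this sketch; the element type is set to Maybe Integer to match the question.

```haskell
import Text.Read (readMaybe)

-- Read every bracketed chunk as a list of Maybe Integer.
-- The result is Nothing if any chunk fails to parse.
parseRows :: [String] -> Maybe [[Maybe Integer]]
parseRows = traverse readMaybe
```

For instance, parseRows could be applied to the output of takeLists on the line read from the file.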

Update

Using the ReadP and ReadS parsers available in the base library:

import Text.ParserCombinators.ReadP

bang :: ReadP [[Maybe Int]]
bang = do string "Bangabang"
          skipSpaces
          xs <- sepBy1 (readS_to_P reads) skipSpaces
          eof
          return xs

input = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

runParser p input = case (readP_to_S p) input of
                      [] -> error "no parses"
                      ((a,_):_) -> print a

example = runParser bang input
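
A variation on the parser above that may be more convenient for file input, offered as a sketch rather than as the answer's own code. It returns a Maybe instead of printing or calling error, and it adds a skipSpaces before eof, because text obtained from readFile usually ends with a newline that eof would otherwise reject. The names bangRows and parseBang are hypothetical.

```haskell
import Text.ParserCombinators.ReadP

-- Same shape as bang, but tolerant of trailing whitespace.
bangRows :: ReadP [[Maybe Integer]]
bangRows = do
  _  <- string "Bangabang"
  skipSpaces
  xs <- sepBy1 (readS_to_P reads) skipSpaces
  skipSpaces            -- added: swallow a trailing newline before eof
  eof
  return xs

-- Run the parser, returning Nothing when there is no complete parse.
parseBang :: String -> Maybe [[Maybe Integer]]
parseBang s = case readP_to_S bangRows s of
  ((rows, _) : _) -> Just rows
  []              -> Nothing
```

It could then be used on file contents with something like fmap parseBang (readFile "test.txt").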

You can use the Read instance directly.

data Bangabang = Bangabang [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer] deriving (Read, Show)

Now you can use all of the Read machinery (read, reads, readIO, ...), with the type inferred from context. For example:

readBangabang :: String -> Bangabang
readBangabang = read

If the data comes from a file:

readFile "foo.txt" >>= print . readBangabang
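
To get from Bangabang to the [[Maybe Integer]] the question asks for, a small unwrapping function is enough. The sketch below also uses readMaybe from Text.Read instead of read so that malformed input yields Nothing rather than an exception; toRows and readRows are hypothetical names introduced here, not part of the answer.

```haskell
import Text.Read (readMaybe)

data Bangabang = Bangabang [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer] deriving (Read, Show)

-- Collect the four fields into a list of rows.
toRows :: Bangabang -> [[Maybe Integer]]
toRows (Bangabang a b c d) = [a, b, c, d]

-- Parse a whole line; Nothing if it is not a well-formed Bangabang value.
readRows :: String -> Maybe [[Maybe Integer]]
readRows = fmap toRows . readMaybe
```

Usage would look like readFile "foo.txt" >>= print . readRows. Because the data type has exactly four fields, a line with any other number of lists is rejected.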