Haskell: Processing text from a file

Question

Hi everyone,

1. What do I want to do?

I have a one-line file containing the text

"Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

I want to read this text from the file and convert it to:

[[Just 3, Nothing, Just 1, Nothing], [Nothing, Nothing, Nothing, Nothing], [Nothing, Nothing, Just 4, Nothing], [Nothing, Just 3, Nothing, Nothing]]

This has type [[Maybe Integer]].

2. What have I done so far?

I can already convert a plain String into Maybe Integer values.

My string:

xxx = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

After running stripChars ",]" $ drop 10 xxx I get:

"Just 3 Nothing Just 1 Nothing [Nothing Nothing Nothing Nothing [Nothing Nothing Just 4 Nothing [Nothing Just 3 Nothing Nothing"

After the next command, map (splitOn " ") $ splitOn "[", I have:

[["Just","3","Nothing","Just","1","Nothing",""],["Nothing","Nothing","Nothing","Nothing",""],["Nothing","Nothing","Just","4","Nothing",""],["Nothing","Just","3","Nothing","Nothing"]]

Now I have to remove the empty strings "" with cleany, and finally use cuty to change the [[String]] into a [[Maybe Integer]]:

 [[Just 3,Nothing,Just 1,Nothing],[Nothing,Nothing,Nothing,Nothing],[Nothing,Nothing,Just 4,Nothing],[Nothing,Just 3,Nothing,Nothing]]

That is exactly what I want!

3. The problem is...

...how do I apply this function:

parse xxx = cuty $ cleany $ map (splitOn " ") $ splitOn "[" $ stripChars ",]" $ drop 10 xxx

to text read from a file (which has type IO String)?

This is my first Haskell project, so my functions may be reinventing the wheel or doing something even worse :/

Functions used:

main do     
      text <- readFile "test.txt"
      let l = lines
      map parse . l



-- deletes unwanted characters from a String
stripChars :: String -> String -> String
stripChars = filter . flip notElem


-- converts String to Maybe a   
maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
    [(x,"")] -> Just x
    _ -> Nothing

-- convert (with subfunction conv, because I don't know how to make it one function)

conv :: [String] -> [Maybe Integer]
conv [] = []
conv (x:xs) = if x == "Just" then conv xs
                else maybeRead x : conv xs

convert :: [[String]] -> [[Maybe Integer]]
convert [] = []
convert (x:xs) = conv x : convert xs

-- cleany (with subfunction clean, because I don't know how to make it one function)    

clean :: [String] -> [String]
clean [] = []
clean (x:xs) = if x == "" then clean xs
                else x : clean xs

cleany :: [[String]] -> [[String]]
cleany [] = []
cleany (x:xs) = clean x : cleany xs

Answer 1

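I'm assuming you are fine with a parser that does zero-to-minimal error checking. Haskell has very good parsing libraries, and I'll amend my answer later with some alternatives you should look at.

Instead of using splitOn, I suggest writing these functions: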

takeList :: String -> (String, String)
-- returns the match text and the text following the match
-- e.g. takeList " [1,2,3] ..."  returns ("[1,2,3]", " ...")

takeLists :: String -> [String]
-- parses a sequence of lists separated by spaces
-- into a list of matches

I'll leave takeList as an exercise. For simple parsers like these, I like to use span and break from Data.List.

In terms of takeList, you can write takeLists like this:
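
One possible shape for that exercise, as a minimal sketch rather than the intended solution: it assumes the lists are flat (not nested), so the first ']' closes the list, and it skips anything before the opening '['. Note that span and break are also exported by the Prelude, so no import is needed.

```haskell
-- Sketch of takeList, assuming flat (non-nested) lists.
-- Returns the matched "[...]" text and whatever follows it.
takeList :: String -> (String, String)
takeList str =
  let s1 = dropWhile (/= '[') str   -- skip text before the opening bracket
  in case break (== ']') s1 of
       -- break stops *before* the ']', so put it back onto the match
       (body, ']' : rest) -> (body ++ "]", rest)
       -- no closing bracket found: rest is empty here
       (body, rest)       -> (body, rest)
```

With this definition, takeList " [1,2,3] ..." gives ("[1,2,3]", " ..."), matching the comment in the signature above.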

takeLists :: String -> [ String ]
takeLists str =
  let s1 = dropWhile (/= '[') str
  in if null s1
       then []
       else let (s2,s3) = takeList s1
            in   s2 : takeLists s3

For example, takeLists " [123] [4,5,6] [7,8] " will return:

[ "[123]", "[4,5,6]", "[7,8]" ]

Finally, to convert each string in this list into a Haskell value, just use read.

answer :: [ [Int] ]
answer = map read (takeLists " [123] [4,5,6] [7,8] ")
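
Two things are worth keeping in mind about read: it picks the parser from the type you ask for, and it raises a runtime error on malformed input. As a hedged sketch, readMaybe from Text.Read can be swapped in to get a Maybe back instead. Here readRows is a hypothetical helper name, and its input is assumed to be the list of strings that takeLists produces.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse every bracketed chunk, or fail as a whole.
-- The element type is fixed to [Maybe Integer] to match the question.
readRows :: [String] -> Maybe [[Maybe Integer]]
readRows = traverse readMaybe
```

For example, readRows ["[Just 3, Nothing]", "[Nothing]"] yields Just [[Just 3,Nothing],[Nothing]], while a malformed chunk such as "[Just 3" makes the whole result Nothing.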

Update

Using the ReadP and ReadS parsers available in the base library:

import Text.ParserCombinators.ReadP

bang :: ReadP [[Maybe Int]]
bang = do string "Bangabang"
          skipSpaces
          xs <- sepBy1 (readS_to_P reads) skipSpaces
          eof
          return xs

input = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

runParser p input = case (readP_to_S p) input of
                      [] -> error "no parses"
                      ((a,_):_) -> print a

example = runParser bang input
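
The same parser can be applied to text read from a file. The following is only a sketch under a few assumptions: the file holds one record per line, the element type is switched to Integer to match the question, and bangLine, parseBang and printBangFile are made-up names. It returns Nothing instead of calling error, and an extra skipSpaces before eof tolerates trailing whitespace such as a final newline.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, eof, readP_to_S, readS_to_P, sepBy1, skipSpaces, string)

-- Same grammar as above: the keyword, then space-separated lists.
bangLine :: ReadP [[Maybe Integer]]
bangLine = do
  _ <- string "Bangabang"
  skipSpaces
  rows <- sepBy1 (readS_to_P reads) skipSpaces
  skipSpaces   -- allow trailing whitespace before the end of input
  eof
  return rows

-- Pure wrapper: Just the rows on a complete parse, Nothing otherwise.
parseBang :: String -> Maybe [[Maybe Integer]]
parseBang s =
  case [rows | (rows, "") <- readP_to_S bangLine s] of
    (rows : _) -> Just rows
    []         -> Nothing

-- readFile yields an IO String; binding it with <- gives a plain String
-- that the pure parser can be applied to, line by line.
printBangFile :: FilePath -> IO ()
printBangFile path = do
  text <- readFile path
  mapM_ (print . parseBang) (lines text)
```

Keeping parseBang pure and doing the file access only in printBangFile means the parser can be tried out in GHCi on a literal string without touching the file system.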

Answer 2

You can use a Read instance directly.

data Bangabang = Bangabang [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer] deriving (Read, Show)

Now you can use all of the Read machinery (read, reads, readIO, ...), with the parser inferred from the type. For example:

readBangabang :: String -> Bangabang
readBangabang = read

If the data comes from a file:

readFile "foo.txt" >>= print . readBangabang
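
To get from here to the [[Maybe Integer]] the question asks for, one option is to unpack the constructor after reading. This sketch assumes every line of the file holds one record with exactly four lists; toRows, parseLine and processFile are hypothetical names, and readMaybe is used in place of read so that a bad line yields Nothing rather than an exception. For the IO String part of the question, the key point is that `text <- readFile path` inside a do block binds a plain String, to which a pure function can then be applied.

```haskell
import Text.Read (readMaybe)

data Bangabang = Bangabang [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer] deriving (Read, Show)

-- Flatten the record into the nested list the question wants.
toRows :: Bangabang -> [[Maybe Integer]]
toRows (Bangabang a b c d) = [a, b, c, d]

-- Pure parsing step: Nothing if the line does not match the record shape.
parseLine :: String -> Maybe [[Maybe Integer]]
parseLine = fmap toRows . readMaybe

-- IO step: read the file, then apply the pure function to each line.
processFile :: FilePath -> IO ()
processFile path = do
  text <- readFile path
  mapM_ (print . parseLine) (lines text)
```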
