Haskell: Processing text from a file
Hi everyone,

1. What do I want to do?

I get a one-line file containing the text

    "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

I want to read this text from the file and convert it to:

    [[Just 3, Nothing, Just 1, Nothing], [Nothing, Nothing, Nothing, Nothing], [Nothing, Nothing, Just 4, Nothing], [Nothing, Just 3, Nothing, Nothing]]

which has type [[Maybe Integer]].

2. What have I done so far?

I can turn an ordinary String into Maybe Integer values.

My string:

    xxx = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

After running stripChars ",]" $ drop 11 xxx I get:

    "Just 3 Nothing Just 1 Nothing [Nothing Nothing Nothing Nothing [Nothing Nothing Just 4 Nothing [Nothing Just 3 Nothing Nothing"

After the next step, map (splitOn " ") $ splitOn "[", I have:

    [["Just","3","Nothing","Just","1","Nothing",""],["Nothing","Nothing","Nothing","Nothing",""],["Nothing","Nothing","Just","4","Nothing",""],["Nothing","Just","3","Nothing","Nothing"]]

Now I have to use cleany to remove the empty strings "", and finally convert to turn the [[String]] into [[Maybe Integer]]:

    [[Just 3,Nothing,Just 1,Nothing],[Nothing,Nothing,Nothing,Nothing],[Nothing,Nothing,Just 4,Nothing],[Nothing,Just 3,Nothing,Nothing]]

That is what I wanted!

3. The problem is...

...how do I apply this function:

    parse xxx = convert $ cleany $ map (splitOn " ") $ splitOn "[" $ stripChars ",]" $ drop 11 xxx

to text read from a file (which has type IO String)?

This is my first Haskell project, so my functions may be reinventing the wheel or doing something worse :/

Functions used:

    import Data.List.Split (splitOn)  -- from the split package; needed by parse

    main = do
        text <- readFile "test.txt"
        let l = lines
        map parse . l

    -- deletes unwanted characters from a String
    stripChars :: String -> String -> String
    stripChars = filter . flip notElem

    -- converts String to Maybe a
    maybeRead :: Read a => String -> Maybe a
    maybeRead s = case reads s of
        [(x, "")] -> Just x
        _         -> Nothing

    -- convert (with subfunction conv, because I don't know how to make it one function)
    conv :: [String] -> [Maybe Integer]
    conv [] = []
    conv (x:xs) = if x == "Just" then conv xs
                  else maybeRead x : conv xs

    convert :: [[String]] -> [[Maybe Integer]]
    convert [] = []
    convert (x:xs) = conv x : convert xs

    -- cleany (with subfunction clean, because I don't know how to make it one function)
    clean :: [String] -> [String]
    clean [] = []
    clean (x:xs) = if x == "" then clean xs
                   else x : clean xs

    cleany :: [[String]] -> [[String]]
    cleany [] = []
    cleany (x:xs) = clean x : cleany xs
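
I'm assuming you are fine with a parser that does little to no error checking. Haskell has good parsing libraries, and I'll update my answer later with some alternatives you should look at.

Instead of using splitOn, I suggest writing these functions:

    takeList :: String -> (String, String)
    -- returns the matched text and the text following the match
    -- e.g. takeList " [1,2,3] ..." returns ("[1,2,3]", " ...")

    takeLists :: String -> [String]
    -- parses a sequence of lists separated by spaces
    -- into a list of matches

I'll leave takeList as an exercise. For simple parsers like these I like to use span and break from Data.List.

In terms of takeList, you can write takeLists like this: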

    takeLists :: String -> [String]
    takeLists str =
      let s1 = dropWhile (/= '[') str
      in if null s1
           then []
           else let (s2, s3) = takeList s1
                in s2 : takeLists s3

For example, takeLists " [123] [4,5,6] [7,8] " returns:

    [ "[123]", "[4,5,6]", "[7,8]" ]

Finally, to convert each string in this list into a Haskell value, just use read.

    answer :: [[Int]]
    answer = map read (takeLists " [123] [4,5,6] [7,8] ")
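
For reference, here is one possible solution to the takeList exercise, sketched with break. It assumes the lists are not nested and that every '[' has a matching ']'. parseRows is a hypothetical helper name added only for illustration, and takeLists is repeated so the snippet stands on its own.

```haskell
-- One possible takeList: skip to the first '[', then cut just after the first ']'.
-- Assumes flat (non-nested) lists.
takeList :: String -> (String, String)
takeList str =
  case break (== ']') (dropWhile (/= '[') str) of
    (body, ']' : rest) -> (body ++ "]", rest)  -- keep the closing bracket with the match
    (body, rest)       -> (body, rest)         -- no closing bracket found

-- Same definition as in the answer above.
takeLists :: String -> [String]
takeLists str =
  let s1 = dropWhile (/= '[') str
  in if null s1
       then []
       else let (s2, s3) = takeList s1
            in s2 : takeLists s3

-- Hypothetical helper: read every bracketed chunk as a row.
-- read throws an exception on malformed input, so this does no error checking.
parseRows :: String -> [[Maybe Integer]]
parseRows = map read . takeLists
```

Because takeLists skips everything before the first '[', the leading "Bangabang " does not need to be dropped by hand. Applied to the file, this could look like `readFile "test.txt" >>= print . map parseRows . lines`.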

Update

Using the ReadP and ReadS parsers available in the base library:

    import Text.ParserCombinators.ReadP

    bang :: ReadP [[Maybe Int]]
    bang = do _ <- string "Bangabang"
              skipSpaces
              xs <- sepBy1 (readS_to_P reads) skipSpaces
              eof
              return xs

    input :: String
    input = "Bangabang [Just 3, Nothing, Just 1, Nothing] [Nothing, Nothing, Nothing, Nothing] [Nothing, Nothing, Just 4, Nothing] [Nothing, Just 3, Nothing, Nothing]"

    -- prints the first complete parse, or fails if there is none
    runParser :: Show a => ReadP a -> String -> IO ()
    runParser p str = case readP_to_S p str of
      []        -> error "no parses"
      ((a,_):_) -> print a

    example :: IO ()
    example = runParser bang input
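
To connect this to the file from the question, the parser can be wrapped in a pure function and mapped over the lines of the file. The following is only a sketch: bangP, parseBang and parseFile are hypothetical names, the cells are Integer rather than Int to match the type asked for, and an extra skipSpaces before eof is assumed to be wanted so that trailing whitespace is accepted.

```haskell
import Text.ParserCombinators.ReadP

-- Same grammar as bang, with Integer cells and tolerance for trailing whitespace.
bangP :: ReadP [[Maybe Integer]]
bangP = do
  _ <- string "Bangabang"
  skipSpaces
  rows <- sepBy1 (readS_to_P reads) skipSpaces
  skipSpaces
  eof
  return rows

-- Returns Nothing instead of calling error when the line does not parse.
parseBang :: String -> Maybe [[Maybe Integer]]
parseBang s = case readP_to_S bangP s of
  ((rows, _) : _) -> Just rows
  []              -> Nothing

-- Parses every line of the file; each line yields Just the grid or Nothing.
parseFile :: FilePath -> IO [Maybe [[Maybe Integer]]]
parseFile path = map parseBang . lines <$> readFile path
```

The point is that the parser itself never mentions IO: readFile produces the String inside IO, and the pure parseBang is applied to it with <$>. In GHCi, `parseFile "test.txt" >>= print` would show the result.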

You can use the Read instance directly.

    data Bangabang = Bangabang [Maybe Integer]
                               [Maybe Integer]
                               [Maybe Integer]
                               [Maybe Integer] deriving (Read, Show)

Now you can use all of the Read machinery (read, reads, readIO, ...), with the instance picked from the types. For example

    readBangabang :: String -> Bangabang
    readBangabang = read

If the data comes from a file

    readFile "foo.txt" >>= print . readBangabang
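
Since read throws an exception on malformed input, a variant with readMaybe from Text.Read may be preferable. This is a sketch under a few assumptions: rows, readRows and loadRows are hypothetical helper names, the data declaration is repeated (with Eq added) so the snippet is self-contained, and the file is assumed to hold exactly four lists.

```haskell
import Text.Read (readMaybe)

-- Same type as above, repeated here with Eq so results can be compared.
data Bangabang = Bangabang [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
                           [Maybe Integer]
  deriving (Read, Show, Eq)

-- Hypothetical helper: flatten the four fields into the [[Maybe Integer]] asked for.
rows :: Bangabang -> [[Maybe Integer]]
rows (Bangabang a b c d) = [a, b, c, d]

-- Nothing if the text is not a well-formed Bangabang value.
readRows :: String -> Maybe [[Maybe Integer]]
readRows = fmap rows . readMaybe

-- Reads the whole file and parses it as a single Bangabang value.
loadRows :: FilePath -> IO (Maybe [[Maybe Integer]])
loadRows path = readRows <$> readFile path
```

For example, `loadRows "foo.txt" >>= print` prints either Just the grid or Nothing. The derived Read instance expects exactly four lists after the constructor name, so a line with a different number of lists yields Nothing.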