xml-conduit: parsing XML attributes
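While parsing XML with xml-conduit I stumbled over the following problem: when an element has several attributes that share the same base name but differ in prefix, only the first one (in lexical order) is parsed.
If both a prefixed and an unprefixed version of an attribute are present, how can I get the prefixed value?
Minimal non-working example: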
Main.hs
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text.Lazy (Text)
import qualified Data.Text.Lazy as T
import Text.XML (parseText, def, elementAttributes, documentRoot)

main :: IO ()
main = do
  -- test: attributes here, is:here, not:here
  putStrLn "Example1: only the first element is parsed"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines test)
  -- test without the unprefixed attribute: is:here, not:here
  putStrLn "Example2: this behaviour is independent of both having a prefix"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines $ dropAt 1 test)
  -- test without is:here: here, not:here
  putStrLn "Example3: also no difference if there is just one attribute with prefix"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines $ dropAt 2 test)
  -- test with only not:here left
  putStrLn "Example4: on its own the last element can be parsed"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines $ dropAt 1 $ dropAt 1 test)
  putStrLn "==============="
  -- test2: attributes is:here, here, not:here
  putStrLn "Example1: it is always the first element parsed"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines test2)
  -- test2 without is:here: here, not:here
  putStrLn "Example2: really just the first"
  putStrLn "========\n"
  print $ elementAttributes . documentRoot <$> parseText def (T.unlines $ dropAt 1 test2)

-- One XML element, one attribute per line; none of the prefixes is declared.
test :: [Text]
test =
  [ "<Root"
  , "here = \"ok\""
  , "is:here = \"ok\""
  , "not:here=\"nok\">"
  , "</Root>"
  ]

-- Same as test, with the first two attributes swapped.
test2 :: [Text]
test2 =
  [ "<Root"
  , "is:here = \"ok\""
  , "here = \"ok\""
  , "not:here=\"nok\">"
  , "</Root>"
  ]

-- Remove the element at index i (0-based) from the list.
dropAt :: Int -> [a] -> [a]
dropAt i xs =
  let (hd, tl) = splitAt i xs
  in hd ++ drop 1 tl
attr.cabal
  build-depends:       base >= 4.7 && < 5
                     , xml-conduit
                     , text
> stack exec attr
Example1: only the first element is parsed
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Nothing},"ok")])
Example2: this behaviour is independent of both having a prefix
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Just "is"},"ok")])
Example3: also no difference if there is just one attribute with prefix
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Nothing},"ok")])
Example4: on its own the last element can be parsed
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Just "not"},"nok")])
===============
Example1: it is always the first element parsed
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Just "is"},"ok")])
Example2: really just the first
========
Right (fromList [(Name {nameLocalName = "here", nameNamespace = Nothing, namePrefix = Nothing},"ok")])
Quoting the documentation for Text.XML.Name:
Prefixes are not semantically important; they are included only to simplify pass-through parsing. When comparing names with Eq or Ord methods, prefixes are ignored.
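The quoted behaviour can be checked directly on Name values. The following is a minimal sketch, assuming containers is added to build-depends next to xml-conduit; the names plain, prefixed and namespaced are made up for illustration. Because the attribute map is keyed by Name, keys that compare equal cannot coexist in it.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as Map
import Text.XML (Name (..))

-- Illustrative names (not from the original post).
-- Field order: nameLocalName, nameNamespace, namePrefix.
plain, prefixed, namespaced :: Name
plain      = Name "here" Nothing Nothing
prefixed   = Name "here" Nothing (Just "is")
namespaced = Name "here" (Just "http://example.com") (Just "is")

main :: IO ()
main = do
  -- Same local name and namespace, different prefix: considered equal.
  print (plain == prefixed)    -- True
  -- Same local name, different namespace: considered different.
  print (plain == namespaced)  -- False
  -- A Map keyed by Name keeps only one entry per (namespace, local name).
  print (Map.size (Map.fromList [(plain, 1 :: Int), (prefixed, 2), (namespaced, 3)]))  -- 2
```

This only shows why the prefixed and unprefixed attributes collide as map keys; which of the colliding attributes survives parsing is decided inside xml-conduit and is not modelled here.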
The semantic distinction lies in the namespace, so the following solves your problem:
test :: [Text]
test =
  [ "<Root xmlns:is=\"http://example.com\" xmlns:not=\"http://example.com/2\""
  , "here = \"ok\""
  , "is:here = \"ok\""
  , "not:here=\"nok\">"
  , "</Root>"
  ]
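This also makes sense: the same namespace can be bound to different prefixes in different places, and it should still be treated as the same namespace. I also believe that using a prefix without binding it to a namespace is not valid XML.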
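Once the prefixes are bound to namespaces, the individual attributes can be looked up by namespace rather than by prefix. The following is a rough sketch, assuming containers is available as a dependency; the value doc is an illustrative single-line version of the corrected test input. It relies on the IsString instance of Name, which reads the "{namespace}local" notation when OverloadedStrings is enabled.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as Map
import Data.Text.Lazy (Text)
import Text.XML (def, documentRoot, elementAttributes, parseText)

-- Illustrative input: the corrected document from above on a single line.
doc :: Text
doc = "<Root xmlns:is=\"http://example.com\" xmlns:not=\"http://example.com/2\" here=\"ok\" is:here=\"ok\" not:here=\"nok\"></Root>"

main :: IO ()
main =
  case parseText def doc of
    Left err -> print err
    Right d -> do
      let attrs = elementAttributes (documentRoot d)
      -- Unprefixed attribute: no namespace.
      print (Map.lookup "here" attrs)
      -- Attribute written as is:here, addressed through its namespace.
      print (Map.lookup "{http://example.com}here" attrs)
      -- Attribute written as not:here, addressed through its namespace.
      print (Map.lookup "{http://example.com/2}here" attrs)
```

Looking attributes up by namespace keeps the code independent of whichever prefix a particular document happens to choose for that namespace.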