Haskell's hxt fails if I add another line to a function

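I'm trying to parse a COLLADA file using Haskell's hxt package.

It has been going well so far, but I've run into a strange bug (or, more likely, a mistake of my own).

I have an arrow that looks like this: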

processGeometry = proc x -> do
    geometry <- atTag "geometry" -< x
    meshID <- getAttrValue "id" -< geometry
    meshName <- getAttrValue "name" -< geometry
    mesh <- atTag "mesh" -< geometry
    sources <- hasName "source" <<< getChildren -< mesh
    positionSource <- hasAttrValue "id" ("-positions" `isSuffixOf`) -< sources
    positionArray  <- processFloatSource -< positionSource
    returnA -< positionArray

Adding the line

normalSource <- hasAttrValue "id" ("-normals" `isSuffixOf`) -< sources

near the bottom, however, makes the entire arrow fail.

This happens no matter what I return, even if I return the original x.

Here is my atTag function:

atTag tag = deep (isElem >>> hasName tag)

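Here is a sample COLLADA file that I'm trying to parse: https://pastebin.com/mDSTH2TW

Why does adding one line completely change the result of the arrow, when it shouldn't be doing anything at all?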

TL;DR: If you're looking for two separate child elements, call getChildren separately for each one.

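Your variable sources does not represent a list of all the source elements. Instead, it is a single source. If you check the type of sources, you'll see that it is XmlTree. So when you use hasAttrValue on it twice, you're looking for a single source element that matches both conditions.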

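As for why it doesn't matter what you return: every line is executed, even if its value is never used. In fact, unless you're using the output, you don't even have to give it a name: a line consisting of just hasAttrValue "id" (isSuffixOf "-normals") -< sources (with normalSource <- removed) has the same effect. So if you return x, it still returns x only when it can find that impossible source element.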
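To make the last two points concrete, here is a rough analogy in the plain list monad, which behaves much like hxt's list arrows for this purpose. The ids and the hasSuffix helper are made up for illustration and are not part of hxt.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical ids standing in for the <source> elements of one mesh.
sourceIds :: [String]
sourceIds = ["Cube-mesh-positions", "Cube-mesh-normals"]

-- Hypothetical helper: keeps its input only when it ends with the suffix,
-- loosely mirroring what hasAttrValue does with a predicate.
hasSuffix :: String -> String -> [String]
hasSuffix suffix s = [s | suffix `isSuffixOf` s]

-- source is bound to one id at a time, never to the whole list.
-- One filter on it: only the "-positions" id survives.
onlyPositions :: [String]
onlyPositions = do
    source <- sourceIds
    _ <- hasSuffix "-positions" source
    return source

-- A second filter on the same source. Its result is discarded, but it
-- still runs, and no single id ends with both suffixes.
withUnusedNormals :: [String]
withUnusedNormals = do
    source <- sourceIds
    _ <- hasSuffix "-positions" source
    _ <- hasSuffix "-normals" source
    return source
```

In GHCi, onlyPositions evaluates to a one-element list while withUnusedNormals is empty, even though both return the same variable.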

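You can get your code to find the two separate source elements by making two separate calls to getChildren, one for each element you're looking for, and checking the "id" attribute of each one separately.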
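Applied to the arrow from the question, the fix might look roughly like the sketch below. sourceWithSuffix and findSourceIds are hypothetical names, and the sketch returns the two id attributes rather than calling processFloatSource, which the question doesn't show.

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow (returnA, (>>>))
import Data.List (isSuffixOf)
import Text.XML.HXT.Core

-- atTag as defined in the question.
atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep (isElem >>> hasName tag)

-- Hypothetical helper: does its own getChildren, then keeps the
-- <source> children whose id ends with the given suffix.
sourceWithSuffix :: ArrowXml a => String -> a XmlTree XmlTree
sourceWithSuffix suffix =
    getChildren >>> hasName "source" >>> hasAttrValue "id" (suffix `isSuffixOf`)

-- Hypothetical variant of processGeometry that yields the ids of both sources.
findSourceIds :: ArrowXml a => a XmlTree (String, String)
findSourceIds = proc x -> do
    geometry <- atTag "geometry" -< x
    mesh     <- atTag "mesh" -< geometry
    -- Two independent walks over the children of mesh, so the two
    -- variables may be bound to different <source> elements.
    positionSource <- sourceWithSuffix "-positions" -< mesh
    normalSource   <- sourceWithSuffix "-normals" -< mesh
    positionId <- getAttrValue "id" -< positionSource
    normalId   <- getAttrValue "id" -< normalSource
    returnA -< (positionId, normalId)
```

In the real code, each source would be passed on to processFloatSource instead of getAttrValue.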


In case the above is unclear, here is a self-contained example.

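{-# LANGUAGE Arrows #-}

import Prelude hiding (id, (.))
import Control.Category
import Control.Arrow

-- a minimal rose tree: a node's content plus its children
data Tree a = Tree a [Tree a]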

exampleTree :: Tree String
exampleTree = Tree "root" [Tree "childA" [], Tree "childB" []]

newtype ListArrow a b = ListArrow { runListArrow :: a -> [b] }

instance Category ListArrow where
    id = ListArrow (\x -> [x])
    (ListArrow g) . (ListArrow f) = ListArrow (\x -> concatMap g (f x))

instance Arrow ListArrow where
    arr f = ListArrow (\x -> [f x])
    first (ListArrow f) = ListArrow (\(a, b) -> [ (a', b) | a' <- f a ])

getChildren :: ListArrow (Tree a) (Tree a)
getChildren = ListArrow gc where
    gc (Tree _ children) = children

hasContent :: Eq a => a -> ListArrow (Tree a) (Tree a)
hasContent content = ListArrow hc where
    hc cur@(Tree c _) = if content == c then [cur] else []

getContent :: ListArrow (Tree a) a
getContent = ListArrow gc where
    gc (Tree c _) = [c]

-- this has the same problem as the code in the question
findBothChildrenBad :: ListArrow (Tree String) (String, String)
findBothChildrenBad = proc root -> do
    -- child is a (single) child of the root
    child <- getChildren -< root

    -- childA == child, and filter to only cases where its content is "childA"
    childA <- hasContent "childA" -< child

    -- childB == child, and filter to only cases where its content is "childB"
    childB <- hasContent "childB" -< child
    -- now the content has to be both "childA" and "childB" -- so we're stuck

    childAContent <- getContent -< childA
    childBContent <- getContent -< childB
    returnA -< (childAContent, childBContent)

-- this is the fixed version
findBothChildren :: ListArrow (Tree String) (String, String)
findBothChildren = proc root -> do
    -- childA is a (single) child of the root
    childA <- getChildren -< root

    -- filter to only cases where its content is "childA"
    hasContent "childA" -< childA

    -- childB is a (potentially different) child of the root
    childB <- getChildren -< root

    -- filter to only cases where its content is "childB"
    hasContent "childB" -< childB
    -- we're not stuck here

    childAContent <- getContent -< childA
    childBContent <- getContent -< childB
    returnA -< (childAContent, childBContent)