Parsing XML response using 'servant-client' and 'servant-xml'

I want to parse an API response into data types using the servant-client, servant-xml and xmlbf libraries.

Here is an example API response:

<GoodreadsResponse>
   <Request>
      <authentication>true</authentication>
      <key>api_key</key>
      <method>search_index</method>
   </Request>
   <search>
      <query>Ender's Game</query>
      <results-start>1</results-start>
      <results-end>20</results-end>
   </search>
</GoodreadsResponse>

These are the data types I want to parse it into:

data GoodreadsRequest = 
        GoodreadsRequest { authentication :: Text
                         , key            :: Text
                         , method         :: Text
                         }


data GoodreadsSearch = 
        GoodreadsSearch { query        :: Text
                        , resultsStart :: Int
                        , resultsEnd   :: Int
                        }


data GoodreadsResponse = 
        GoodreadsResponse { goodreadsRequest :: GoodreadsRequest
                          , goodreadsSearch  :: GoodreadsSearch
                          }

This is the servant API type I want to use:

type API
  = "search" :> "index.xml" :> QueryParam "key" Key :> QueryParam "q" Query :> Get '[XML] GoodreadsResponse

It builds an endpoint like this:

https://www.goodreads.com/search/index.xml?key=api_key&q=Ender%27s+Game

After writing the rest of the scaffolding code (clientM, baseURL, client environment, etc.), the error I get is:

No instance for (FromXml GoodreadsResponse) arising from a use of 'client'

Writing

instance FromXml GoodreadsResponse where
    fromXml = undefined

suppresses the error, so I think I'm on the right track, but I don't know how to go about writing the parser.


Edit: the results from a different endpoint contain a list of 'works':

<GoodreadsResponse>
   <Request>
      <authentication>true</authentication>
      <key>api_key</key>
      <method>search_index</method>
   </Request>
   <search>
      <query>Ender's Game</query>
      <results-start>1</results-start>
      <results-end>20</results-end>
      <results>
            <work>
                <id type="integer">2422333</id>
                <average_rating>4.30</average_rating>
                <best_book type="Book">
                    <id type="integer">375802</id>
                    <title>Ender's Game (Ender's Saga, #1)</title>
                </best_book>
            </work>
            <work>
                <id type="integer">4892733</id>
                <average_rating>2.49</average_rating>
                <best_book type="Book">
                    <id type="integer">44687</id>
                    <title>Enchanters' End Game (The Belgariad, #5)</title>
                </best_book>
            </work>
            <work>
                <id type="integer">293823</id>
                <average_rating>2.30</average_rating>
                <best_book type="Book">
                    <id type="integer">6393082</id>
                    <title>Ender's Game, Volume 1: Battle School (Ender's Saga)</title>
                 </best_book>
            </work>
      </results>
   </search>
</GoodreadsResponse>

to be parsed into:

data GoodreadsResponse = 
        GoodreadsResponse { goodreadsRequest :: GoodreadsRequest
                          , goodreadsSearch  :: GoodreadsSearch
                          }

data GoodreadsRequest = 
        GoodreadsRequest { authentication :: Text
                         , key            :: Text
                         , method         :: Text
                         }

data GoodreadsSearch = 
        GoodreadsSearch { query        :: Text
                        , resultsStart :: Int
                        , resultsEnd   :: Int
                        , results      :: GoodreadsSearchResults
                        }

data GoodreadsSearchResults = GooreadsSearchResults { works :: [Work] }

data Work = Work { workID               :: Int
                 , workAverageRating    :: Double
                 , workBestMatchingBook :: Book
                 }

data Book = Book { bookID    :: Int
                 , bookTitle :: Text
                 }

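Wow, xmlbf has no examples or predefined instances, and its documentation contains several errors. Anyway, after playing with it for a while, it looks like this is how you do it: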

{-# LANGUAGE OverloadedStrings #-}

import Data.Text.Lazy (unpack)
import Text.Read (readEither)
import Xmlbf

instance FromXml GoodreadsRequest where
  fromXml = pElement "Request" $ do
    a <- pElement "authentication" pText
    k <- pElement "key" pText
    m <- pElement "method" pText
    pure GoodreadsRequest{ authentication = a, key = k, method = m }

instance FromXml GoodreadsSearch where
  fromXml = pElement "search" $ do
    q <- pElement "query" pText
    s <- pElement "results-start" pText
    s' <- either fail return . readEither $ unpack s
    e <- pElement "results-end" pText
    e' <- either fail return . readEither $ unpack e
    pure GoodreadsSearch{ query = q, resultsStart = s', resultsEnd = e' }

instance FromXml GoodreadsResponse where
  fromXml = pElement "GoodreadsResponse" $ do
    r <- fromXml
    s <- fromXml
    pure GoodreadsResponse{ goodreadsRequest = r, goodreadsSearch = s }

Here it is in use with your example XML:

GHCi, version 8.8.2: https://www.haskell.org/ghc/  :? for help
Prelude> :l Main.hs
[1 of 1] Compiling Main             ( Main.hs, interpreted )
Ok, one module loaded.
*Main> :set -XOverloadedStrings
*Main> import Xmlbf.Xeno
*Main Xmlbf.Xeno> fromRawXml "<GoodreadsResponse>\n   <Request>\n      <authentication>true</authentication>\n      <key>api_key</key>\n      <method>search_index</method>\n   </Request>\n   <search>\n      <query>Ender's Game</query>\n      <results-start>1</results-start>\n      <results-end>20</results-end>\n   </search>\n</GoodreadsResponse>" >>= runParser fromXml :: Either String GoodreadsResponse
Right (GoodreadsResponse {goodreadsRequest = GoodreadsRequest {authentication = "true", key = "api_key", method = "search_index"}, goodreadsSearch = GoodreadsSearch {query = "Ender's Game", resultsStart = 1, resultsEnd = 20}})
*Main Xmlbf.Xeno>
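
The numeric fields in the instances above are first read as text and then converted with readEither from base; a Left result is handed to fail, so a non-numeric value makes the parser fail instead of crashing. Below is a minimal sketch of just that conversion step. It uses plain String rather than Text so that it depends only on base, and the names parseCount and parseRating are made up for illustration (they are not part of xmlbf):

```haskell
import Text.Read (readEither)

-- Hypothetical helper: convert the text of a field such as
-- <results-start> or <results-end> to an Int.
parseCount :: String -> Either String Int
parseCount = readEither

-- Hypothetical helper: convert the text of a decimal field such as
-- <average_rating> to a Double.
parseRating :: String -> Either String Double
parseRating = readEither
```

Loaded in GHCi, parseCount "20" evaluates to Right 20, while parseCount "twenty" evaluates to a Left carrying an error message, which is what `either fail return` forwards to the parser.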

Edit: here's how you'd use it with lists, for your other endpoint:

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (Alternative(many))
import Data.Text.Lazy (unpack)
import Text.Read (readEither)
import Xmlbf

instance FromXml GoodreadsResponse where
  fromXml = pElement "GoodreadsResponse" $ do
    r <- fromXml
    s <- fromXml
    pure GoodreadsResponse{ goodreadsRequest = r, goodreadsSearch = s }

instance FromXml GoodreadsRequest where
  fromXml = pElement "Request" $ do
    a <- pElement "authentication" pText
    k <- pElement "key" pText
    m <- pElement "method" pText
    pure GoodreadsRequest{ authentication = a, key = k, method = m }

instance FromXml GoodreadsSearch where
  fromXml = pElement "search" $ do
    q <- pElement "query" pText
    s <- pElement "results-start" pText
    s' <- either fail return . readEither $ unpack s
    e <- pElement "results-end" pText
    e' <- either fail return . readEither $ unpack e
    r <- fromXml
    pure GoodreadsSearch{ query = q, resultsStart = s', resultsEnd = e', results = r }

instance FromXml GoodreadsSearchResults where
  fromXml = pElement "results" $ do
    w <- many fromXml
    pure GooreadsSearchResults{ works = w }

instance FromXml Work where
  fromXml = pElement "work" $ do
    i <- pElement "id" pText -- the type attribute is ignored
    i' <- either fail return . readEither $ unpack i
    r <- pElement "average_rating" pText
    r' <- either fail return . readEither $ unpack r
    b <- fromXml
    pure Work{ workID = i', workAverageRating = r', workBestMatchingBook = b }

instance FromXml Book where
  fromXml = pElement "best_book" $ do -- the type attribute is ignored
    i <- pElement "id" pText -- the type attribute is ignored
    i' <- either fail return . readEither $ unpack i
    t <- pElement "title" pText
    pure Book{ bookID = i', bookTitle = t }

Result:

GHCi, version 8.8.2: https://www.haskell.org/ghc/  :? for help
Prelude> :l Main.hs
[1 of 1] Compiling Main             ( Main.hs, interpreted )
Ok, one module loaded.
*Main> :set -XOverloadedStrings
*Main> import Xmlbf.Xeno
*Main Xmlbf.Xeno> fromRawXml "<GoodreadsResponse>\n   <Request>\n      <authentication>true</authentication>\n      <key>api_key</key>\n      <method>search_index</method>\n   </Request>\n   <search>\n      <query>Ender's Game</query>\n      <results-start>1</results-start>\n      <results-end>20</results-end>\n      <results>\n            <work>\n                <id type=\"integer\">2422333</id>\n                <average_rating>4.30</average_rating>\n                <best_book type=\"Book\">\n                    <id type=\"integer\">375802</id>\n                    <title>Ender's Game (Ender's Saga, #1)</title>\n                </best_book>\n            </work>\n            <work>\n                <id type=\"integer\">4892733</id>\n                <average_rating>2.49</average_rating>\n                <best_book type=\"Book\">\n                    <id type=\"integer\">44687</id>\n                    <title>Enchanters' End Game (The Belgariad, #5)</title>\n                </best_book>\n            </work>\n            <work>\n                <id type=\"integer\">293823</id>\n                <average_rating>2.30</average_rating>\n                <best_book type=\"Book\">\n                    <id type=\"integer\">6393082</id>\n                    <title>Ender's Game, Volume 1: Battle School (Ender's Saga)</title>\n                 </best_book>\n            </work>\n      </results>\n   </search>\n</GoodreadsResponse>" >>= runParser fromXml :: Either String GoodreadsResponse
Right (GoodreadsResponse {goodreadsRequest = GoodreadsRequest {authentication = "true", key = "api_key", method = "search_index"}, goodreadsSearch = GoodreadsSearch {query = "Ender's Game", resultsStart = 1, resultsEnd = 20, results = GooreadsSearchResults {works = [Work {workID = 2422333, workAverageRating = 4.3, workBestMatchingBook = Book {bookID = 375802, bookTitle = "Ender's Game (Ender's Saga, #1)"}},Work {workID = 4892733, workAverageRating = 2.49, workBestMatchingBook = Book {bookID = 44687, bookTitle = "Enchanters' End Game (The Belgariad, #5)"}},Work {workID = 293823, workAverageRating = 2.3, workBestMatchingBook = Book {bookID = 6393082, bookTitle = "Ender's Game, Volume 1: Battle School (Ender's Saga)"}}]}}})
*Main Xmlbf.Xeno>

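The key new concept in this one is Control.Applicative.many. It keeps running an Alternative until it fails, then puts all of the successful results into a list. In this case, that means repeating fromXml :: Parser Work until it starts failing (hopefully because there are no <work> elements left). Note that there is a flaw in how many works in this case (IMO, because xmlbf's parser interface isn't very good): a malformed <work> element just causes everything from it to </results> to be ignored, rather than the error bubbling up. If you need to, you can fix that with slightly more complicated code involving pChildren.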
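
To make the behaviour of many concrete without depending on xmlbf, here is a minimal sketch with a toy parser over String. Toy, runToy and number are names invented for this illustration and are not part of xmlbf; only many and the Alternative class come from base. The toy parser is assumed to behave like xmlbf's only in the sense that a failing alternative consumes nothing.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit, isSpace)

-- A toy parser (not xmlbf): consumes a prefix of the input,
-- or fails with Nothing.
newtype Toy a = Toy { runToy :: String -> Maybe (a, String) }

instance Functor Toy where
  fmap f (Toy p) = Toy $ \s -> case p s of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Toy where
  pure a = Toy $ \s -> Just (a, s)
  Toy pf <*> Toy pa = Toy $ \s -> case pf s of
    Nothing        -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing         -> Nothing
      Just (a, rest') -> Just (f a, rest')

instance Alternative Toy where
  empty = Toy (const Nothing)
  -- Try the left parser; fall back to the right one only on failure.
  Toy p <|> Toy q = Toy $ \s -> case p s of
    Nothing -> q s
    ok      -> ok

-- Parse one non-negative integer and skip the whitespace after it.
number :: Toy Int
number = Toy $ \s -> case span isDigit s of
  ("", _)    -> Nothing
  (ds, rest) -> Just (read ds, dropWhile isSpace rest)
```

Loaded in GHCi, runToy (many number) "1 2 3" evaluates to Just ([1,2,3],""), while runToy (many number) "1 x 3" evaluates to Just ([1],"x 3"): the malformed item ends the list silently and the rest of the input is left unconsumed, which mirrors the flaw described above.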