Combining multiple lists of strings in Haskell

For an assignment, I am trying to merge 4 lists of scraped data into 1. All 4 are sorted correctly and print as follows.

["Een gezonde samenleving? Het belang van sporten wordt onderschat","Zo vader, zo dochter","Milieuvriendelijk vervoer met waterstof","\"Ik heb zin in wat nog komen gaat\"","Oorlog in Oekraïne"]
["Teamsport","Carsten en Kirsten","Kennisclip","Master Mind","Statement van het CvB"]
["16 maart 2022","10 maart 2022","09 maart 2022","08 maart 2022","07 maart 2022"]
["Directie","Bot","CB","Moniek","Christian"]

The output I want looks like this:

[["Een gezonde samenleving? Het belang van sporten wordt onderschat", "Teamsport", "16 maart 2022", "Directie"], [...], [...], [...], [...]]

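I have tried some solutions I found on the Internet, but I do not understand some of them. Most of them only deal with 2 lists, or they produce errors when I try to implement them.

For reference, my code looks like this: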

{-# LANGUAGE OverloadedStrings #-}
import Text.HTML.Scalpel

urlString :: String
urlString = "https://www.example.com"

-- Main function in which we call the other functions
main :: IO ()
main = do
    resultTitle <- scrapeURL urlString scrapeHANTitle
    resultSubtitle <- scrapeURL urlString scrapeHANSubtitle
    resultDate <- scrapeURL urlString scrapeHANDate
    resultAuthor <- scrapeURL urlString scrapeHANAuthor
    print resultTitle
    print resultSubtitle
    print resultDate
    print resultAuthor

scrapeHANTitle :: Scraper String [String]
scrapeHANTitle =
    chroots ("div" @: [hasClass "card-news__body"]) scrapeTitle

scrapeHANSubtitle :: Scraper String [String]
scrapeHANSubtitle =
    chroots ("div" @: [hasClass "card-news__body"]) scrapeSubTitle

scrapeHANDate :: Scraper String [String]
scrapeHANDate = 
    chroots ("div" @: [hasClass "card-article__meta__body"]) scrapeDate

scrapeHANAuthor :: Scraper String [String]
scrapeHANAuthor =
    chroots ("div" @: [hasClass "card-article__meta__body"]) scrapeAuthor

-- gets the title of news items
-- https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128&utf8=dec
-- some titles contain special characters so use this utf8 table to add conversion
scrapeTitle :: Scraper String String
scrapeTitle = do
    text $ "a" @: [hasClass "card-news__body__title"]

-- gets the subtitle of news items
scrapeSubTitle :: Scraper String String
scrapeSubTitle = do
    text $ "span" @: [hasClass "card-news__body__eyebrow"]

--gets the date on which the news item was posted
scrapeDate :: Scraper String String 
scrapeDate = do
    text $ "div" @: [hasClass "card-news__footer__body__date"]

--gets the author of the news item
scrapeAuthor :: Scraper String String 
scrapeAuthor = do
    text $ "div" @: [hasClass "card-news__footer__body__author"]

I also tried the approach below, but it gives me a bunch of type errors.

mergeLists :: Maybe [String] -> Maybe [String] ->Maybe [String] -> Maybe [String] -> Maybe [String]
mergeLists = \s1 -> \s2 -> \s3 -> \s4 ->s1 ++ s2 ++ s3 ++ s4

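The type errors arise because (++) works on lists, whereas s1 to s4 are Maybe [String] values. You can use the Semigroup/Monoid instance of Maybe instead and write: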

mergeLists :: Maybe [String] -> Maybe [String] -> Maybe [String] -> Maybe [String] -> Maybe [String]
mergeLists s1 s2 s3 s4 = s1 <> s2 <> s3 <> s4
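To illustrate what (<>) does on Maybe [String], here is a minimal sketch with made-up sample values (the names titles, dates, bothPresent and oneMissing are only for illustration). It assumes a GHC recent enough that (<>) is exported from the Prelude.

```haskell
-- Maybe's Semigroup instance combines the wrapped lists with (++)
-- and treats Nothing as the identity element.
titles, dates :: Maybe [String]
titles = Just ["Zo vader, zo dochter"]
dates  = Just ["10 maart 2022"]

-- Both values present: the inner lists are concatenated.
bothPresent :: Maybe [String]
bothPresent = titles <> dates
-- Just ["Zo vader, zo dochter","10 maart 2022"]

-- One value missing: the other one is returned unchanged.
oneMissing :: Maybe [String]
oneMissing = Nothing <> dates
-- Just ["10 maart 2022"]
```

Note that this yields one flat list rather than one group per news item, and that a failed scrape (Nothing) is dropped silently instead of making the whole result Nothing.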

However, you are scraping the same page each time, so you can combine the data inside a single scraper:

myScraper :: Scraper String [String]
myScraper = do
    da <- scrapeHANTitle
    db <- scrapeHANSubtitle
    dc <- scrapeHANDate
    dd <- scrapeHANAuthor
    return (da ++ db ++ dc ++ dd)

Then run it with:

main :: IO()
main = do
    result <- scrapeURL urlString myScraper
    print result

Or, shorter:

main :: IO()
main = scrapeURL urlString myScraper >>= print
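Both variants above produce one flat list. If the grouped shape from the question is wanted (one inner list per news item), a possible sketch is to combine the four Maybe [String] results with sequenceA and transpose from Data.List. The helper name groupRows and the sample data are assumptions made for this example, not part of the original code.

```haskell
import Data.List (transpose)

-- Hypothetical helper: takes the four column lists (titles, subtitles,
-- dates, authors) and returns one row per news item.
-- If any column is Nothing, the whole result is Nothing.
groupRows :: [Maybe [String]] -> Maybe [[String]]
groupRows columns = transpose <$> sequenceA columns

example :: Maybe [[String]]
example = groupRows
    [ Just ["Zo vader, zo dochter", "Milieuvriendelijk vervoer met waterstof"]
    , Just ["Carsten en Kirsten", "Kennisclip"]
    , Just ["10 maart 2022", "09 maart 2022"]
    , Just ["Bot", "CB"]
    ]
-- Just [["Zo vader, zo dochter","Carsten en Kirsten","10 maart 2022","Bot"]
--      ,["Milieuvriendelijk vervoer met waterstof","Kennisclip","09 maart 2022","CB"]]
```

In main this could be called as groupRows [resultTitle, resultSubtitle, resultDate, resultAuthor]. One caveat: when the columns have different lengths, transpose skips the missing elements instead of truncating, so rows can end up shorter than four entries.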

You can use zip4 from Data.List to combine the four lists.

import Data.List

list1 = ["Een gezonde samenleving? Het belang van sporten wordt onderschat","Zo vader, zo dochter","Milieuvriendelijk vervoer met waterstof","\"Ik heb zin in wat nog komen gaat\"","Oorlog in Oekraïne"]
list2 = ["Teamsport","Carsten en Kirsten","Kennisclip","Master Mind","Statement van het CvB"]
list3 = ["16 maart 2022","10 maart 2022","09 maart 2022","08 maart 2022","07 maart 2022"]
list4 = ["Directie","Bot","CB","Moniek","Christian"]

result = zip4 list1 list2 list3 list4

result2 = [[x1,x2,x3,x4] | (x1,x2,x3,x4) <- zip4 list1 list2 list3 list4]

The two results differ slightly. result is a list of tuples. result2 is a list of lists, as requested. The list of tuples may be the better choice because:

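  • A list can hold any number of values, all of the same type (Haskell lists are homogeneous)
  • A tuple can hold values of different types, so it is more flexible
  • A tuple with two values and a tuple with three values are different types, so if you collect four values in a tuple, nobody can squeeze in a group of three or five values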
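Taking the fixed-shape argument one step further, a record gives each of the four fields a name. The following is only a sketch: the NewsItem type, its field names and toNewsItems are hypothetical and do not appear in the question. It uses zipWith4, which lives in Data.List next to zip4.

```haskell
import Data.List (zipWith4)

-- Hypothetical record type; the field names are illustrative only.
data NewsItem = NewsItem
    { title    :: String
    , subtitle :: String
    , date     :: String
    , author   :: String
    } deriving (Show, Eq)

-- zipWith4 applies the constructor to the four lists element by element
-- and stops at the end of the shortest list.
toNewsItems :: [String] -> [String] -> [String] -> [String] -> [NewsItem]
toNewsItems = zipWith4 NewsItem

items :: [NewsItem]
items = toNewsItems
    ["Zo vader, zo dochter", "Milieuvriendelijk vervoer met waterstof"]
    ["Carsten en Kirsten", "Kennisclip"]
    ["10 maart 2022", "09 maart 2022"]
    ["Bot", "CB"]
```

Like a 4-tuple, each NewsItem always has exactly four fields, and map author items reads more clearly than pattern matching on tuple positions.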