Parsing RSS with Groovy

Question

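I am trying to parse an RSS feed with Groovy. I only want to extract the values of the title and description tags. I am using the following snippet to do this: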

rss = new XmlSlurper().parse(url)
rss.channel.item.each {
    titleList.add(it.title)
    descriptionList.add(it.description)
}

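After this, I access these values in a JSP page. The problem is that the description value I get contains not only <description> (a child of <item>) but also <media:description> (another, optional child of <item>). What can I change to extract only the value of <description> and omit the value of <media:description>?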

Edit: To reproduce this behavior, you can run the following code on this site: http://www.tutorialspoint.com/execute_groovy_online.php

def url = "http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml"
rss = new XmlSlurper().parse(url)
rss.channel.item.each {
    println "${it.title}"
    println "${it.description}"
}

You will see that the media description tag is also printed to the console.

Answer 1

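You can tell XmlSlurper and XmlParser not to handle namespaces by passing flags to the constructor. The two boolean arguments are validating and namespaceAware; with namespace awareness turned off, <media:description> keeps its prefixed name and no longer matches description. I believe this does what you need: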

'http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml'.toURL().withReader { r ->
    new XmlSlurper(false, false).parse(r).channel.item.each {
        println it.title
        println it.description
    }
}
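
To see the difference without hitting the network, here is a minimal sketch that parses an inline feed instead of the NYT URL. The sample XML and the variable names (xml, merged, plain, media) are made up for illustration, and the import assumes Groovy 3 or later; on Groovy 2.x, XmlSlurper lives in groovy.util and the import line should be dropped.

```groovy
// Assumption: Groovy 3+; on Groovy 2.x remove this import (groovy.util.XmlSlurper is a default import there).
import groovy.xml.XmlSlurper

// Hypothetical sample feed with both a plain and a media-namespaced description.
def xml = '''<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>Sample title</title>
      <description>Plain description</description>
      <media:description>Media description</media:description>
    </item>
  </channel>
</rss>'''

// Default (namespace-aware) slurper: "description" matches by local name,
// so both elements are selected and their text is concatenated.
def merged = new XmlSlurper().parseText(xml).channel.item[0].description.text()

// Namespace handling off: element names keep their prefix,
// so "description" matches only the unprefixed element.
def item = new XmlSlurper(false, false).parseText(xml).channel.item[0]
def plain = item.description.text()
def media = item.'media:description'.text()

println merged
println plain
println media
```

With namespace handling off, the prefixed element is still reachable if you need it, using the quoted property name 'media:description' as shown.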
