How to get information from an XML file in Java

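I am trying to use XPath to get information from an XML file, but I can't get it to work. I tried to get the summary and content information inside wiki (at the bottom of the XML below) with: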

String xpath="/*[local-name(.)='wiki']/*[local-name(.)='summary']";

But I get nothing back. Is my XPath wrong, or is it because of the CDATA? I'm a complete beginner; any tips?

<lfm status="ok">
<album>
<name>Believe</name>
<artist>Cher</artist>
<id>2026126</id>
<mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid>
<url>http://www.last.fm/music/Cher/Believe</url>
<releasedate>5 Jul 2005, 00:00</releasedate>
<image size="small">http://userserve-ak.last.fm/serve/34s/88057565.png</image>
<image size="medium">http://userserve-ak.last.fm/serve/64s/88057565.png</image>
<image size="large">
http://userserve-ak.last.fm/serve/174s/88057565.png
</image>
<image size="extralarge">
http://userserve-ak.last.fm/serve/300x300/88057565.png
</image>
<image size="mega">
http://userserve-ak.last.fm/serve/_/88057565/Believe.png
</image>
<listeners>259410</listeners>
<playcount>1501557</playcount>
<tracks>
<track rank="1">
<name>Believe</name>
<duration>239</duration>
<mbid>403ceb02-581b-4c36-8814-6f2a29a3d213</mbid>
<url>http://www.last.fm/music/Cher/_/Believe</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="2">
<name>The Power</name>
<duration>233</duration>
<mbid>6b3de6b5-db70-49c9-b58d-e132a3eb1a36</mbid>
<url>http://www.last.fm/music/Cher/_/The+Power</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="3">
<name>Runaway</name>
<duration>286</duration>
<mbid>379f760d-1f29-4317-ab04-06a8218a874d</mbid>
<url>http://www.last.fm/music/Cher/_/Runaway</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="4">
<name>All or Nothing</name>
<duration>238</duration>
<mbid>a88735e6-b35c-4379-8ef7-bbd2b793ccf4</mbid>
<url>http://www.last.fm/music/Cher/_/All+or+Nothing</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="5">
<name>Strong Enough</name>
<duration>220</duration>
<mbid>26107af6-7dda-4844-85a5-8d61f24f4fc2</mbid>
<url>http://www.last.fm/music/Cher/_/Strong+Enough</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="6">
<name>Dov'è L'amore</name>
<duration>258</duration>
<mbid>58153307-25dd-4ff6-87f0-e08777e34539</mbid>
<url>
http://www.last.fm/music/Cher/_/Dov%27%C3%A8+L%27amore
</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="7">
<name>Takin' Back My Heart</name>
<duration>272</duration>
<mbid>07a38e80-ba81-494a-a61a-e8d81a40413e</mbid>
<url>
http://www.last.fm/music/Cher/_/Takin%27+Back+My+Heart
</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="8">
<name>Taxi Taxi</name>
<duration>304</duration>
<mbid>66f526c9-b135-4458-86cf-77065ce8f0aa</mbid>
<url>http://www.last.fm/music/Cher/_/Taxi+Taxi</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="9">
<name>Love Is the Groove</name>
<duration>271</duration>
<mbid>832f8f9a-95e4-476b-b108-14dec1dc84ba</mbid>
<url>http://www.last.fm/music/Cher/_/Love+Is+the+Groove</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="10">
<name>We All Sleep Alone</name>
<duration>236</duration>
<mbid>2286a77a-644a-4c86-9d43-31c029c3625b</mbid>
<url>http://www.last.fm/music/Cher/_/We+All+Sleep+Alone</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
</tracks>
<toptags>
<tag>
<name>sourabh</name>
<url>http://www.last.fm/tag/sourabh</url>
</tag>
<tag>
<name>albums</name>
<url>http://www.last.fm/tag/albums</url>
</tag>
<tag>
<name>pop</name>
<url>http://www.last.fm/tag/pop</url>
</tag>
<tag>
<name>90s</name>
<url>http://www.last.fm/tag/90s</url>
</tag>
<tag>
<name>dance</name>
<url>http://www.last.fm/tag/dance</url>
</tag>
</toptags>
<wiki>
<published>Sat, 6 Mar 2010 16:48:03 +0000</published>
<summary>
<![CDATA[
Believe is the twenty-third studio album by American singer-actress Cher, released on November 10, 1998 by Warner Bros. Records. The RIAA certified it Quadruple Platinum on December 23, 1999, recognizing four million shipments in the United States; Worldwide, the album has sold more than 20 million copies, making it the biggest-selling album of her career. In 1999 the album received three Grammy Awards nominations including &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot; for the single &quot;Believe&quot;.
]]>
</summary>
<content>
<![CDATA[
Believe is the twenty-third studio album by American singer-actress Cher, released on November 10, 1998 by Warner Bros. Records. The RIAA certified it Quadruple Platinum on December 23, 1999, recognizing four million shipments in the United States; Worldwide, the album has sold more than 20 million copies, making it the biggest-selling album of her career. In 1999 the album received three Grammy Awards nominations including &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot; for the single &quot;Believe&quot;.

 It was released by Warner Bros. Records at the end of 1998. The album was executive produced by Rob Dickens. Upon its debut, critical reception was generally positive. Believe became Cher's most commercially-successful release, reached number one and Top 10 all over the world. In the United States, the album was released on November 10, 1998, and reached number four on the Billboard 200 chart, where it was certified four times platinum.

 The album featured a change in Cher's music; in addition, Believe presented a vocally stronger Cher and a massive use of vocoder and Auto-Tune. In 1999, the album received 3 Grammy Awards nominations for &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot;. Throughout 1999 and into 2000 Cher was nominated and winning many awards for the album including a Billboard Music Award for &quot;Female Vocalist of the Year&quot;, Lifelong Contribution Awards and a Star on the Walk of Fame shared with former Sonny Bono. The boost in Cher's popularity led to a very successful Do You Believe? Tour.

 The album was dedicated to Sonny Bono, Cher's former husband who died earlier that year from a skiing accident.

 Cher also recorded a cover version of &quot;Love Is in the Air&quot; during early sessions for this album. Although never officially released, the song has leaked on file sharing networks.

 Singles


 &quot;Believe&quot;
 &quot;Strong Enough&quot;
 &quot;All or Nothing&quot;
 &quot;Dov'è L'Amore&quot; User-contributed text is available under the Creative Commons By-SA License and may also be available under the GNU FDL.
]]>
</content>
</wiki>
</album>
</lfm>

And the Java part:

String urlToRead =" http://ws.audioscrobbler.com/2.0/?method=album.getinfo&api_key=1b76cd3eaf8349f06fb4e0a9e06e0760&artist=Cher&album=Believe";
URL url;
HttpURLConnection conn;
BufferedReader rd;
String line;
String result = "";
try {
    url = new URL(urlToRead);
    conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));

    while ((line = rd.readLine()) != null) {

        result += line;
    }
    rd.close();
} catch (Exception e) {
    e.printStackTrace();
}
String  out = result;
SAXReader reader = new SAXReader(false);
reader.setIncludeInternalDTDDeclarations(false);
reader.setIncludeExternalDTDDeclarations(false);
String xpath="/*[local-name(.)='wiki']/*[local-name(.)='summary']";
Document document = null;
try {
    document = reader.read(new StringReader(out));
} catch (DocumentException e) {
    e.printStackTrace();
}
List nodelist = document.selectNodes(xpath);

ArrayList outputList = new ArrayList();
ArrayList outputXmlList = new ArrayList();

String val = null;
String xmlVal = null;
for (Iterator iter = nodelist.iterator(); iter.hasNext();) {
    Node element = (Node) iter.next();
    xmlVal = element.asXML();
    val = element.getStringValue();
    if (val != null && !val.equals("")) {
        outputList.add(val);
        outputXmlList.add(xmlVal);

    }

}
System.out.println(outputList.get(0));

The XPath you provided in the question:

/*[local-name(.)='wiki']/*[local-name(.)='summary']

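is an absolute path starting at the document's root node, so for it to match, the wiki element would have to be the document's root element, with summary as its direct child. That does not match the XML you gave, whose root lfm contains a child album, which in turn contains the wiki element.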
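To see the mismatch concretely, here is a minimal sketch that counts how many nodes each expression selects. It makes a few assumptions: it uses the JDK's built-in javax.xml.xpath API rather than dom4j (so it runs without extra dependencies), the XML is a trimmed-down stand-in for the Last.fm response, and the class and method names (AbsolutePathDemo, countMatches) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbsolutePathDemo {
    // Trimmed-down stand-in for the response in the question: lfm > album > wiki > summary.
    static final String XML =
        "<lfm status=\"ok\"><album><name>Believe</name>"
        + "<wiki><summary><![CDATA[Believe is the twenty-third studio album]]></summary></wiki>"
        + "</album></lfm>";

    // Returns the number of nodes the given XPath expression selects in XML.
    static int countMatches(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so local-name() behaves predictably
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Path from the question: requires wiki to be the root element.
        System.out.println(countMatches(
            "/*[local-name(.)='wiki']/*[local-name(.)='summary']"));
        // Same local-name style, but walking down from the real root lfm.
        System.out.println(countMatches(
            "/*[local-name(.)='lfm']/*[local-name(.)='album']"
            + "/*[local-name(.)='wiki']/*[local-name(.)='summary']"));
    }
}
```

The first expression selects nothing because its first step only tests the root element, which is lfm; the second walks lfm, album, wiki in order and reaches the single summary element.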

Given that your sample XML does not involve any namespaces, you can drop the local-name trick and simply use a path like

/lfm/album/wiki/summary
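
As a rough illustration of the simpler path, the sketch below reads the summary text with it. The assumptions are the same as before: it uses the JDK's javax.xml.xpath API instead of dom4j, the XML is a shortened stand-in for the real response, and SummaryExtractor and summaryText are hypothetical names. It also suggests the CDATA section is not the culprit: to the parser, CDATA content is ordinary character data, so it comes back as the element's string value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SummaryExtractor {
    // Shortened stand-in for the response; the summary keeps its CDATA wrapper
    // and the surrounding line breaks seen in the question.
    static final String XML =
        "<lfm status=\"ok\"><album><name>Believe</name><wiki>"
        + "<summary><![CDATA[\n"
        + "Believe is the twenty-third studio album by American singer-actress Cher, "
        + "released on November 10, 1998 by Warner Bros. Records.\n"
        + "]]></summary>"
        + "</wiki></album></lfm>";

    // Evaluates the absolute path from the real root and returns the trimmed text.
    static String summaryText() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, Object) returns the string value of the first match.
            return xpath.evaluate("/lfm/album/wiki/summary", doc).trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read summary", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(summaryText());
    }
}
```

In the dom4j code from the question, changing only the xpath string to /lfm/album/wiki/summary should be enough, since selectNodes takes the same XPath 1.0 syntax; that part is not exercised by this sketch.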