RSS Feed - Exception occurs while parsing closing tag
I am using rome-1.5.jar to parse RSS feeds. For some feeds, however, parsing fails with an error about a closing meta tag.
RSS feed link: New York Times RSS Feed Link
Here is the code:
public static SyndFeed getRssFeed(String rsslUrl) {
    try {
        URL url = new URL(rsslUrl);
        HttpURLConnection httpcon = (HttpURLConnection) url.openConnection();
        httpcon.addRequestProperty("User-Agent", "Mozilla/4.76");
        SyndFeedInput input = new SyndFeedInput();
        return input.build(new XmlReader(httpcon.getInputStream()));
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}
Here is the exception:
com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 45: The element type "meta" must be terminated by the matching end-tag "</meta>".
at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:215)
at com.rometools.rome.io.SyndFeedInput.build(SyndFeedInput.java:133)
at com.gold.eloop.server.util.RssUtil.getRssFeed(RssUtil.java:132)
at com.gold.eloop.server.util.RssUtil.getRssForProfile(RssUtil.java:228)
at com.gold.eloop.server.util.RssUtil.mergeRssProfiles(RssUtil.java:269)
at com.gold.eloop.server.util.outbound.MailMerger.getTransmission(MailMerger.java:581)
at com.gold.eloop.server.services.MessageServiceImpl.sendTestMessage(MessageServiceImpl.java:192)
at com.gold.eloop.server.remoteservices.MessageServiceRemote.sendTestMessage(MessageServiceRemote.java:309)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.google.gwt.user.server.rpc.RPC.invokeAndEncodeResponse(RPC.java:562)
at com.google.gwt.user.server.rpc.RemoteServiceServlet.processCall(RemoteServiceServlet.java:188)
at com.google.gwt.user.server.rpc.RemoteServiceServlet.processPost(RemoteServiceServlet.java:224)
at com.google.gwt.user.server.rpc.AbstractRemoteServiceServlet.doPost(AbstractRemoteServiceServlet.java:62)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:362)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:843)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:647)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)
Caused by: org.jdom2.input.JDOMParseException: Error on line 45: The element type "meta" must be terminated by the matching end-tag "</meta>".
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:232)
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:303)
at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1196)
at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:212)
... 34 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 45; columnNumber: 9; The element type "meta" must be terminated by the matching end-tag "</meta>".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:217)
... 37 more
What am I doing wrong in this code? Please help me resolve this error.
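The URL you specified, http://www.nytimes.com/services/xml/rss/index.html, does not return an RSS document.
It contains markup such as:
<meta name="PT" content="Member Center">
<meta name="PST" content="RSS Page">
These meta tags are never closed, which is acceptable in HTML but not in well-formed XML, so the RSS processor fails on them.
That page is a list of RSS feeds, not an RSS feed itself.
The first link on it is http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml; try passing that to your RSS processor.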
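If you want to catch this kind of mistake before handing the content to ROME, one option is to peek at the root element first. The following is only a rough sketch using the JDK's StAX parser; the class name FeedSniffer and its two methods are made up for illustration and are not part of ROME. It assumes you have already read the response body into a String, and it treats rss, feed (Atom) and RDF (RSS 1.0) as feed roots.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper (not part of ROME): inspects the root element of a document.
class FeedSniffer {

    // Returns the local name of the first element, or null if the text
    // cannot be parsed as XML up to that point.
    static String rootElementName(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or fetch external entities while sniffing.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getLocalName();
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed before any element was seen.
            return null;
        }
        return null;
    }

    // True when the root element is one used by RSS 2.0, Atom, or RSS 1.0 (rdf:RDF).
    static boolean looksLikeFeed(String xml) {
        String root = rootElementName(xml);
        return "rss".equals(root) || "feed".equals(root) || "RDF".equals(root);
    }
}
```

For a page like the NYT index above, rootElementName would report html rather than rss, so you could log a clear "this URL is not a feed" message instead of getting a ParsingFeedException from deep inside the parser. The check stops at the first element, so it does not validate the rest of the document.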