com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method “appendChild” of null
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method “appendChild” of null
使用htmlunit webclient.getPage()方法打开login.html,从ajax请求result.html获取html,无法执行body.appendChild。因为 document.body 为空。示例:
login.html代码:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<script>
function getContent(){
var url= "result.html";
var xhr=new (window.XMLHttpRequest||window.ActiveXObject)("Microsoft.XMLHTTP");
xhr.onreadystatechange = function() {
if (xhr.readyState == 4 && xhr.status == 200) {
document.write(xhr.responseText);
document.close();
}
};
xhr.open("GET",url,false); xhr.send();
}
getContent();
</script>
</html>
result.html代码:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
<title>login</title>
</head>
<body>
<script type="text/javascript">
var d = document, b = d.body;
var n = d.createElement("div");
n.innerHTML = "<div> I was appended... </div>";
b.appendChild(n);
</script>
</body>
</html>
测试用例代码:
@Test public void testExecScript() throws Exception {
WebClient client = new WebClient(BrowserVersion.CHROME);
client.getOptions().setUseInsecureSSL(true);
client.getOptions().setJavaScriptEnabled(true);
client.getOptions().setCssEnabled(false);
client.getOptions().setThrowExceptionOnScriptError(false);
client.getOptions().setTimeout(10000);
String url = "http://localhost/login.html";
HtmlPage loginPage = client.getPage(url);
logger.info("{}\n{}", loginPage.getTitleText(), loginPage.asXml());
}
异常输出:
EcmaError: lineNumber=[1] column=[0] lineSource=[<no source>] name=[TypeError] sourceName=[script in http://localhost/login.html from (1, 1462) to (1, 1780)] message=[TypeError: Cannot call method "appendChild" of null (script in login.html from (1, 1462) to (1, 1780)#1)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method "appendChild" of null (script in login.html from (1, 1462) to (1, 1780)#1)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:847)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:733)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:708)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:982)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:351)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:411)
at com.gargoylesoftware.htmlunit.html.HtmlScript.execute(HtmlScript.java:276)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:290)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:793)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:751)
at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170)
at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072)
at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
at org.cyberneko.html.HTMLScanner.evaluateInputSource(HTMLScanner.java:608)
at org.cyberneko.html.HTMLConfiguration.evaluateInputSource(HTMLConfiguration.java:342)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.pushInputString(HTMLParser.java:420)
at com.gargoylesoftware.htmlunit.html.HtmlPage.writeInParsedStream(HtmlPage.java:2375)
at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.write(HTMLDocument.java:683)
at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.write(HTMLDocument.java:569)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:384)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doRun(JavaScriptEngine.java:772)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:832)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:779)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.setState(XMLHttpRequest.java:218)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.doSend(XMLHttpRequest.java:762)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.send(XMLHttpRequest.java:598)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:448)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doRun(JavaScriptEngine.java:724)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:832)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:733)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:708)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:982)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:351)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:411)
at com.gargoylesoftware.htmlunit.html.HtmlScript.execute(HtmlScript.java:276)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:290)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:793)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:751)
at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170)
at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072)
at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:920)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1017)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:248)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:194)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:471)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:345)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:410)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:395)
at com.aduan.study.test.web.crawler.HtmlUnitTest.testExecScript(HtmlUnitTest.java:186)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access[=13=]0(ParentRunner.java:53)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
如果 result.html 没有
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
<title>login</title>
</head>
或在login.html
中添加正文标签
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<body>
<script>
function getContent(){
var url= "result.html";
var xhr=new (window.XMLHttpRequest||window.ActiveXObject)("Microsoft.XMLHTTP");
xhr.onreadystatechange = function() {
if (xhr.readyState == 4 && xhr.status == 200) {
document.write(xhr.responseText);
document.close();
}
};
xhr.open("GET",url,false); xhr.send();
}
getContent();
</script>
</body>
</html>
一定会成功的。谁能告诉我为什么?
result.html javascript 添加
d.write("d:" + d + "<br/>");
d.write("b:" + b + "<br/>");
输出:
d:[object HTMLDocument]
<br/>
b:[object HTMLBodyElement]
<br/>
<div>
<div>
I was appended...
</div>
</div>
使用WebConnectionWrapper 添加插入标签可以解决问题。
client.setWebConnection(
new WebConnectionWrapper(client) {
public WebResponse getResponse(WebRequest request) throws IOException {
WebResponse response = super.getResponse(request);
String content = response.getContentAsString("UTF-8");
if(content != null) {
if(!content.contains("<body>") && content.contains("</head>")) {
content = content.replace("</head>", "</head>\n<body>");
if(!content.contains("</body>") && content.contains("</html>")) {
content = content.replace("</html>", "</body>\n</html>");
}
}
}
logger.info("response: {}", content);
WebResponseData data = new WebResponseData(content.getBytes("UTF-8"),
response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
response = new WebResponse(data, request, response.getLoadTime());
return response;
}
});
使用htmlunit webclient.getPage()方法打开login.html,从ajax请求result.html获取html,无法执行body.appendChild。因为 document.body 为空。示例:
login.html代码:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<script>
function getContent(){
var url= "result.html";
var xhr=new (window.XMLHttpRequest||window.ActiveXObject)("Microsoft.XMLHTTP");
xhr.onreadystatechange = function() {
if (xhr.readyState == 4 && xhr.status == 200) {
document.write(xhr.responseText);
document.close();
}
};
xhr.open("GET",url,false); xhr.send();
}
getContent();
</script>
</html>
result.html代码:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
<title>login</title>
</head>
<body>
<script type="text/javascript">
var d = document, b = d.body;
var n = d.createElement("div");
n.innerHTML = "<div> I was appended... </div>";
b.appendChild(n);
</script>
</body>
</html>
测试用例代码:
@Test public void testExecScript() throws Exception {
WebClient client = new WebClient(BrowserVersion.CHROME);
client.getOptions().setUseInsecureSSL(true);
client.getOptions().setJavaScriptEnabled(true);
client.getOptions().setCssEnabled(false);
client.getOptions().setThrowExceptionOnScriptError(false);
client.getOptions().setTimeout(10000);
String url = "http://localhost/login.html";
HtmlPage loginPage = client.getPage(url);
logger.info("{}\n{}", loginPage.getTitleText(), loginPage.asXml());
}
异常输出:
EcmaError: lineNumber=[1] column=[0] lineSource=[<no source>] name=[TypeError] sourceName=[script in http://localhost/login.html from (1, 1462) to (1, 1780)] message=[TypeError: Cannot call method "appendChild" of null (script in login.html from (1, 1462) to (1, 1780)#1)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method "appendChild" of null (script in login.html from (1, 1462) to (1, 1780)#1)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:847)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:733)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:708)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:982)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:351)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:411)
at com.gargoylesoftware.htmlunit.html.HtmlScript.execute(HtmlScript.java:276)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:290)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:793)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:751)
at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170)
at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072)
at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
at org.cyberneko.html.HTMLScanner.evaluateInputSource(HTMLScanner.java:608)
at org.cyberneko.html.HTMLConfiguration.evaluateInputSource(HTMLConfiguration.java:342)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.pushInputString(HTMLParser.java:420)
at com.gargoylesoftware.htmlunit.html.HtmlPage.writeInParsedStream(HtmlPage.java:2375)
at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.write(HTMLDocument.java:683)
at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.write(HTMLDocument.java:569)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:384)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doRun(JavaScriptEngine.java:772)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:832)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:779)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.setState(XMLHttpRequest.java:218)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.doSend(XMLHttpRequest.java:762)
at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.send(XMLHttpRequest.java:598)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:448)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doRun(JavaScriptEngine.java:724)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:832)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:733)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:708)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:982)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:351)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:411)
at com.gargoylesoftware.htmlunit.html.HtmlScript.execute(HtmlScript.java:276)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:290)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:793)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:751)
at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170)
at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072)
at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:920)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1017)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:248)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:194)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:471)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:345)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:410)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:395)
at com.aduan.study.test.web.crawler.HtmlUnitTest.testExecScript(HtmlUnitTest.java:186)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access[=13=]0(ParentRunner.java:53)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
如果 result.html 没有
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
<title>login</title>
</head>
或在login.html
中添加正文标签<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<body>
<script>
function getContent(){
var url= "result.html";
var xhr=new (window.XMLHttpRequest||window.ActiveXObject)("Microsoft.XMLHTTP");
xhr.onreadystatechange = function() {
if (xhr.readyState == 4 && xhr.status == 200) {
document.write(xhr.responseText);
document.close();
}
};
xhr.open("GET",url,false); xhr.send();
}
getContent();
</script>
</body>
</html>
一定会成功的。谁能告诉我为什么?
result.html javascript 添加
d.write("d:" + d + "<br/>");
d.write("b:" + b + "<br/>");
输出:
d:[object HTMLDocument]
<br/>
b:[object HTMLBodyElement]
<br/>
<div>
<div>
I was appended...
</div>
</div>
使用WebConnectionWrapper 添加插入标签可以解决问题。
client.setWebConnection(
new WebConnectionWrapper(client) {
public WebResponse getResponse(WebRequest request) throws IOException {
WebResponse response = super.getResponse(request);
String content = response.getContentAsString("UTF-8");
if(content != null) {
if(!content.contains("<body>") && content.contains("</head>")) {
content = content.replace("</head>", "</head>\n<body>");
if(!content.contains("</body>") && content.contains("</html>")) {
content = content.replace("</html>", "</body>\n</html>");
}
}
}
logger.info("response: {}", content);
WebResponseData data = new WebResponseData(content.getBytes("UTF-8"),
response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
response = new WebResponse(data, request, response.getLoadTime());
return response;
}
});