Using jsoup getting java.net.SocketTimeoutException: Read timed out exception

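import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

private static void getNiftyFutureOIReader() {
    String url = "https://www1.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=NIFTY&instrument=FUTIDX&type=-&strike=-&expiry=30JAN2020";
    try {
        // Fetch and parse the page, waiting at most 15 seconds.
        Document doc = Jsoup.connect(url).timeout(15 * 1000).get();
        // getElementById returns null when the element is missing.
        Element content = doc.getElementById("responseDiv");
        if (content == null) {
            System.err.println("responseDiv not found in the response");
            return;
        }
        String jsonCont = content.html();
        System.out.println(jsonCont);
    } catch (IOException e) {
        e.printStackTrace();
    }
}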

I am using Jsoup to call a website URL and read its content, but the call fails with java.net.SocketTimeoutException: Read timed out.

Error log:

java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at sun.security.ssl.InputRecord.readFully(Unknown Source)
    at sun.security.ssl.InputRecord.read(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
    at sun.security.ssl.AppInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at java.net.HttpURLConnection.getResponseCode(Unknown Source)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(Unknown Source)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:750)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:722)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:306)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:295)
    at code.test.BankNiftyFutureOIReader.getNiftyFutureOIReader(BankNiftyFutureOIReader.java:19)
    at code.test.BankNiftyFutureOIReader.main(BankNiftyFutureOIReader.java:53)

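The problem may be due to one of the following:

  1. Make sure you are connected to the internet. Try opening the same URL in a browser and see whether the page loads, or check from your VM that the URL is reachable with a simple curl / wget call.

  2. Specify a longer timeout on the Jsoup connection before fetching the document.

Reference: https://www.javacodeexamples.com/jsoup-sockettimeoutexception-read-timed-out-connect-timed-out-fix/775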
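To make the difference between a short and a long read timeout concrete, here is a minimal sketch that uses only JDK classes rather than jsoup. It assumes a hypothetical local server (`startSlowServer`) that stands in for a slow website, and a helper `fetch` whose name and "TIMEOUT" return value are made up for illustration. The stack trace in the question fails inside `parseHTTPHeader`, which suggests the connection itself was established and the client gave up while waiting for the response; the sketch reproduces that situation.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class ReadTimeoutDemo {

    /** Hypothetical stand-in for a slow site: waits delayMillis, then answers "ok". */
    public static HttpServer startSlowServer(long delayMillis) throws IOException {
        // Port 0 lets the OS pick a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            try {
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } catch (IOException ignored) {
                // The client may already have given up and closed the socket.
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    /** Returns the response body, or "TIMEOUT" if nothing arrives within readTimeoutMillis. */
    public static String fetch(int port, int readTimeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection();
        conn.setConnectTimeout(2_000);           // limit for establishing the connection
        conn.setReadTimeout(readTimeoutMillis);  // limit for waiting on response data
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (SocketTimeoutException e) {
            return "TIMEOUT";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startSlowServer(1_000);
        try {
            int port = server.getAddress().getPort();
            System.out.println(fetch(port, 200));    // shorter than the server delay
            System.out.println(fetch(port, 5_000));  // longer than the server delay
        } finally {
            server.stop(0);
        }
    }
}
```

With the short timeout the first call gives up before the server answers, while the second call waits long enough to receive the body. If a longer timeout still fails against the real site, the server is probably not answering this client at all, and a bigger number will not help.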

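According to the answer to "JSoup UserAgent, how to set it right?", the website may be checking the User-Agent or other headers to verify that you are not a bot, so it may be worth trying the following. I would expect a "live quotes" page to have such counter-measures.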

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Send browser-like headers so the request looks less like a bot.
Connection.Response response = Jsoup.connect("https://www1.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=NIFTY&instrument=FUTIDX&type=-&strike=-&expiry=30JAN2020")
        .ignoreContentType(true)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
        .referrer("https://www.nseindia.com")
        .timeout(15_000)
        .followRedirects(true)
        .execute();
// TODO: verify Response status code here!
Document doc = response.parse();
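To illustrate why the User-Agent header can change the outcome, and why the status code is worth checking, here is a self-contained sketch using only JDK classes. Everything in it is an assumption for demonstration: `startPickyServer` is a hypothetical local server that returns 403 unless the User-Agent starts with "Mozilla/", and `statusFor` is a made-up helper. The real site may behave differently, for example by never answering at all, which would show up as the read timeout from the question.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

public class UserAgentDemo {

    /** Hypothetical server: 200 for browser-like User-Agents, 403 for everything else. */
    public static HttpServer startPickyServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean looksLikeBrowser = ua != null && ua.startsWith("Mozilla/");
            // A length of -1 means the response has no body.
            exchange.sendResponseHeaders(looksLikeBrowser ? 200 : 403, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Sends a GET with the given User-Agent (null keeps Java's default) and returns the status code. */
    public static int statusFor(int port, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection();
        conn.setConnectTimeout(2_000);
        conn.setReadTimeout(5_000);
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startPickyServer();
        try {
            int port = server.getAddress().getPort();
            System.out.println("default agent: " + statusFor(port, null));
            System.out.println("browser agent: " + statusFor(port,
                    "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0"));
        } finally {
            server.stop(0);
        }
    }
}
```

The same idea applies to the Jsoup snippet: inspect `response.statusCode()` before calling `response.parse()`, so that a rejection from the server is reported as such instead of being parsed as if it were the expected page.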