java.io.IOException: Server returned HTTP response code: 403 for URL
I want to open a link from a URL: "http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular", and sometimes I get:
java.io.IOException: Server returned HTTP response code: 403 for URL. But it works fine when I open the URL in a browser. Below is part of my code:
URL url = new URL("http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular");
InputStream is = url.openConnection().getInputStream();
Error details:
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.kohls.com/search.jsp?N=0&search=jacket&WS=96
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1627)
at Links.main(Links.java:41)
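The particular web server you are trying to access is checking the User-Agent HTTP header and denying access to anything that doesn't look like a regular browser, in order to block bots (which is probably what you are writing).
You just need to set the header as part of the request in Java.
How you set the header depends on how the connection is made, but if you are using a plain URLConnection, this will work: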
URLConnection conn = url.openConnection();
conn.setRequestProperty("User-Agent", "Mozilla/5.0");
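Typically a "real" User-Agent contains a lot of extra information, but this web server seems to look only for the basic browser type.
You can demonstrate this using wget and its -U (User-Agent) option: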
$ wget "http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular"
--2015-05-07 16:08:46-- http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular
2015-05-07 16:08:46 ERROR 403: Forbidden.
$ wget -U "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0" "http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular"
--2015-05-07 16:08:49-- http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular
awaiting response... 200 OK
...
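Putting the pieces of this answer together, a fuller version of the Java snippet might look like the sketch below. The class name UserAgentFetch and the helper openWithUserAgent are hypothetical names chosen for illustration, and the UTF-8 charset is an assumption about the response encoding. The server's behavior may also have changed since the wget output above was captured, so treat this as an illustration of setting the header rather than a guaranteed fix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class UserAgentFetch {

    // Hypothetical helper: opens a connection and sets the User-Agent header.
    // Nothing is sent over the network until the response is requested.
    static HttpURLConnection openWithUserAgent(URL url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.kohls.com/search.jsp?search=jacket&submit-search=web-regular");
        HttpURLConnection conn = openWithUserAgent(url, "Mozilla/5.0");

        // Check the status first, so a 403 is reported here
        // instead of surfacing as an IOException from getInputStream().
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            System.err.println("Request failed with HTTP status " + status);
            return;
        }

        // Assumption: the response body is UTF-8 text.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Calling getResponseCode() before getInputStream() is a design choice, not a requirement: it lets the program print the status code for any non-200 response rather than failing with the stack trace shown in the question.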