URL in Java: Why is the part of the String after "+" not considered?

Question

I am working with URLs, more precisely with those of Stack Overflow.

The questions section of the site's URLs has this structure:

/questions/tagged/tag+anotherTag+lastTag

When I try to use such a URL, I only get the questions for the first tag.

Example

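import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class TaggedQuestions {
    public static void main(String[] args) {
        URL url = null;
        String line;

        try {
            url = new URL("https://whosebug.com/questions/tagged/cobol+hibernate");
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
                while ((line = br.readLine()) != null) {
                    // Print only the lines that open a question's tag list
                    if (line.contains("<div class=\"tags")) {
                        System.out.println(line);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println(url);
    }
}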

Output

<div class="tags t-cobol">
<div class="tags t-batch-file t-cobol t-mainframe t-vsam">
<div class="tags t-cobol t-mainframe">
<div class="tags t-cobol t-opencobol t-microfocus">
<div class="tags t-cobol">
https://whosebug.com/questions/tagged/cobol+hibernate

Expected output

// Nothing because there is no question under both tags
https://whosebug.com/questions/tagged/cobol+hibernate

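The actual link leads to an empty page (no question has ever been posted with both tags together), and as you can see, the code only picks up questions for the first tag.

Cobol+Hibernate is just an example that illustrates the problem well; I know that combining these two tags makes no sense.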

Answer 1

This curl command and its output shed some light on the problem:

$ curl 'http://whosebug.com/questions/tagged/cobol+hibernate'
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/questions/tagged/cobol">here</a>.</h2>
</body></html>

That is, the request is redirected, and the second tag is dropped.

Here is also an excerpt from the curl -v ... output:
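The same redirect can be observed from Java. HttpURLConnection follows same-protocol redirects by default, which is why the code in the question silently ends up on the single-tag page. The following is only a minimal sketch: the class name RedirectProbe and the method probe are made up for illustration, and what it prints depends on how the live site responds at the time.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RedirectProbe {
    /** Sends one GET without following redirects; returns "<status> <Location header>". */
    static String probe(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        // HttpURLConnection follows same-protocol redirects by default; switch that off
        conn.setInstanceFollowRedirects(false);
        try {
            // getHeaderField returns null when the response has no Location header
            return conn.getResponseCode() + " " + conn.getHeaderField("Location");
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(probe("https://whosebug.com/questions/tagged/cobol+hibernate"));
        } catch (IOException e) {
            System.out.println("Request failed: " + e);
        }
    }
}
```

If the status is a 3xx code and the Location header lacks the second tag, the "+" part was dropped by the server, not by java.net.URL.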

< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< Location: /questions/tagged/cobol
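To make the dropped tag explicit, the requested path can be compared with the Location value. This is a rough sketch under the assumption that tags are simply joined with "+" after /questions/tagged/, as in the URLs above; TagPaths, tagsOf and droppedTags are hypothetical names, not part of any library.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TagPaths {
    static final String PREFIX = "/questions/tagged/";

    /** Splits the tag segment of a /questions/tagged/ URL or path on '+'. */
    static List<String> tagsOf(String address) {
        // getRawPath leaves '+' untouched, so nothing is lost on the client side
        String path = URI.create(address).getRawPath();
        if (!path.startsWith(PREFIX)) {
            return List.of();
        }
        return Arrays.asList(path.substring(PREFIX.length()).split("\\+"));
    }

    /** Tags present in the request but missing from the redirect target. */
    static List<String> droppedTags(String requested, String redirectedTo) {
        List<String> dropped = new ArrayList<>(tagsOf(requested));
        dropped.removeAll(tagsOf(redirectedTo));
        return dropped;
    }

    public static void main(String[] args) {
        String requested = "https://whosebug.com/questions/tagged/cobol+hibernate";
        System.out.println(URI.create(requested).getRawPath()); // /questions/tagged/cobol+hibernate
        System.out.println(tagsOf(requested));                  // [cobol, hibernate]
        System.out.println(droppedTags(requested, "/questions/tagged/cobol")); // [hibernate]
    }
}
```

The first printed line shows that the URL object still carries both tags; only the redirect target loses one.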

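It seems you need some reputation to search for multiple tags at once. If I open http://whosebug.com/questions/tagged/cobol+hibernate in an incognito window (where I am not logged in), the second and any further tags are dropped.

So if you want to run this query from Java, it seems you need to log in programmatically.

My guess is that searching on multiple tags puts a load on the database, so it is restricted to experienced users. You can probably get a definitive answer on Meta Stack Exchange (MSE).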

Tags: java, url, inputstream
