API's HTTP response yields the entire HTML page instead of the response body

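I am currently writing a Java program that uses the FreeCite API (a citation extraction service). The API guide is defined here, and it includes an example in Ruby. For several days I have been trying to call the API from Java (Apache HttpClient), but it does not work as expected.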

Here is the example in Ruby:

Code:

require 'net/http'

Net::HTTP.start('localhost', 3000) do |http|
  response = http.post('/citations/create',
    'citation=A. Bookstein and S. T. Klein,  \
    Detecting content-bearing words by serial clustering,  \
Proceedings of the Nineteenth Annual International ACM SIGIR Conference \
on Research and Development in Information Retrieval,   \
pp. 319327,   1995.',
'Accept' => 'text/xml')

  puts "Code: #{response.code}"
  puts "Message: #{response.message}"
  puts "Body:\n #{response.body}"
end

N.B.: localhost here stands for the FreeCite host. The expected response code is 201, and the response body is XML.

Result:

<citations>
  <citation valid=true>
  <authors>
    <author>I S Udvarhelyi</author>
    <author>C A Gatsonis</author>
    <author>A M Epstein</author>
    <author>C L Pashos</author>
    <author>J P Newhouse</author>
    <author>B J McNeil</author>
  </authors>
  <title>Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes</title>
  <journal>Journal of the American Medical Association</ journal>
  <pages>18--2530</pages>
  <year>1992</year>
  <raw_string>Udvarhelyi, I.S., Gatsonis, C.A., Epstein, A.M., Pashos, C.L., Newhouse, J.P. and McNeil, B.J. Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes. Journal of the American Medical Association, 1992; 18:2530-2536.</raw_string>
  <ctx:context-objects xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx' xmlns:ctx='info:ofi/fmt:xml:xsd:ctx'>
    <ctx:context-object timestamp='2008-07-11T00:57:33-04:00'
    encoding='info:ofi/enc:UTF-8' version='Z39.88-2004' identifier=''>
      <ctx:referent>
        <ctx:metadata-by-val>
          <ctx:format>info:ofi/fmt:xml:xsd:journal</ctx:format>
          <ctx:metadata>
            <journal xmlns:rft='info:ofi/fmt:xml:xsd:journal' xsi:schemaLocation='info:ofi/fmt:xml:xsd:journal http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:journal'>
              <rft:atitle>Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes</rft:atitle>
              <rft:spage>18</rft:spage>
              <rft:date>1992</rft:date>
              <rft:stitle>Journal of the American Medical Association</rft:stitle>
              <rft:genre>article</rft:genre>
              <rft:epage>2530</rft:epage>
              <rft:au>I S Udvarhelyi</rft:au>
              <rft:au>C A Gatsonis</rft:au>
              <rft:au>A M Epstein</rft:au>
              <rft:au>C L Pashos</rft:au>
              <rft:au>J P Newhouse</rft:au>
              <rft:au>B J McNeil</rft:au>
            </journal>
          </ctx:metadata>
        </ctx:metadata-by-val>
      </ctx:referent>
    </ctx:context-object>
  </ctx:context-objects>
  </citation>
</citations>

Here is my program:

Code:

import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.IOUtils;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;

public class HttpClientTest {

    public static void main(String[] args) throws UnsupportedEncodingException {
        HttpClient httpclient = HttpClients.createDefault();
        HttpPost httppost = new HttpPost("http://freecite.library.brown.edu/citations/create");

        // Request parameters and other properties.
        List<NameValuePair> params = new ArrayList<NameValuePair>();
        params.add(new BasicNameValuePair("citation", "A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, "
                + "Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 319327, 1995."));
        httppost.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

        //Execute and get the response.
        HttpResponse response = null;
        try {
            response = httpclient.execute(httppost);
            response.setHeader("Content-Type", "text/xml");

        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        HttpEntity entity = response.getEntity();
        System.out.println(response.getStatusLine());

        if (entity != null) {
            InputStream instream = null;
            try {
                instream = entity.getContent();
                // NB: does not close inputStream, you can use IOUtils.closeQuietly for that
                String theString = IOUtils.toString(instream, "UTF-8"); 
                IOUtils.closeQuietly(instream);
                System.out.println(theString);
            } catch (UnsupportedOperationException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
            try {
                // do something useful
            } finally {
                try {
                    instream.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

Result:

I get the entire HTML page instead of XML, and response code 200 instead of 201.

HTTP/1.1 200 

<script src="/javascripts/prototype.js?1218559878" type="text/javascript"></script>
<link href="/stylesheets/citation.css?1218559878" media="screen" rel="stylesheet" type="text/css" />
<table>

  <tr>
  <td>
  <span class="citation"> <span class="authors"> <span class="author"> A Bookstein</span> <span class="author"> S T Klein</span> </span> <span class="title"> Detecting content-bearing words by serial clustering</span> <span class="booktitle"> Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</span> <span class="pages"> 319327</span> <span class="year"> 1995</span> <br> <span class="raw_string"> A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 319327, 1995.</span> </span> 
  <br>
  <code> &lt;ctx:context-objects xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx' xmlns:ctx='info:ofi/fmt:xml:xsd:ctx'&gt;&lt;ctx:context-object timestamp='2016-10-29T02:43:38-04:00' encoding='info:ofi/enc:UTF-8' version='Z39.88-2004' identifier=''&gt;&lt;ctx:referent&gt;&lt;ctx:metadata-by-val&gt;&lt;ctx:format&gt;info:ofi/fmt:xml:xsd:book&lt;/ctx:format&gt;&lt;ctx:metadata&gt;&lt;book xmlns:rft='info:ofi/fmt:xml:xsd:book' xsi:schemaLocation='info:ofi/fmt:xml:xsd:book http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:book'&gt;&lt;rft:atitle&gt;Detecting content-bearing words by serial clustering&lt;/rft:atitle&gt;&lt;rft:date&gt;1995&lt;/rft:date&gt;&lt;rft:btitle&gt;Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval&lt;/rft:btitle&gt;&lt;rft:genre&gt;proceeding&lt;/rft:genre&gt;&lt;rft:pages&gt;319327&lt;/rft:pages&gt;&lt;rft:au&gt;A Bookstein&lt;/rft:au&gt;&lt;rft:au&gt;S T Klein&lt;/rft:au&gt;&lt;/book&gt;&lt;/ctx:metadata&gt;&lt;/ctx:metadata-by-val&gt;&lt;/ctx:referent&gt;&lt;/ctx:context-object&gt;&lt;/ctx:context-objects&gt; </code>
  </td>

  <td bgcolor="FF9999" class='choose_option'>
    <input id="unusable"
      name="citation_rating_13655375"
      type="radio"
      value="unusable"
      onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
       />
    <label for='unusable'>unusable</label>
  </td>

  <td bgcolor="FFFFCC" class='choose_option'>
    <input id="usable"
      name="citation_rating_13655375"
      type="radio"
      value="usable"
      onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
       />
    <label for='usable'>good enough</label>
  </td>

  <td bgcolor="CCFFCC" class='choose_option'>
    <input id="perfect"
      name="citation_rating_13655375"
      type="radio"
      value="perfect"
    onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
       />
    <label for='perfect'>perfect</label>
  </td>

  </tr>

</table>

<br>
Key:
<span title="author" class="author">Authors</span>
<span title="title" class="title">Title</span>
<span title="journal" class="journal">Journal</span>
<span title="booktitle" class="booktitle">Booktitle</span>
<span title="editor" class="editor">Editor</span>
<span title="volume" class="volume">Volume</span>
<span title="publisher" class="publisher">Publisher</span>
<span title="institution" class="institution">Institution</span>
<span title="location" class="location">Location</span>
<span title="number" class="number">Number</span>
<span title="pages" class="pages">Pages</span>
<span title="year" class="year">Year</span>
<span title="tech" class="tech">Tech</span>
<span title="note" class="note">Note</span>
<br>
<span class="raw_string">Original citation string</span>
<br>
<code>ContextObject</code>
<br>
<a href="/welcome">Home</a>

N.B.: inside the <code> tag above, there is this XML data:

<rft:atitle>Detecting content-bearing words by serial clustering</rft:atitle>
<rft:date>1995</rft:date>
<rft:btitle>Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</rft:btitle>
<rft:genre>proceeding</rft:genre>
<rft:pages>319327</rft:pages>
<rft:au>A Bookstein</rft:au>
<rft:au>S T Klein</rft:au>

Question: Where is the error, and how do I fix it so that I get the XML response (with response code 201)?

This is what you are doing in Ruby:

response = http.post('/citations/create',
   'citation=A. Bookstein and S. T. Klein,  \
   Detecting content-bearing words by serial clustering,  \
   Proceedings of the Nineteenth Annual International ACM SIGIR Conference \
   on Research and Development in Information Retrieval,   \
   pp. 319327,   1995.',
 'Accept' => 'text/xml')

This is what you are doing in Java:

HttpClient httpclient = HttpClients.createDefault();
HttpPost httppost = new HttpPost(
    "http://freecite.library.brown.edu/citations/create");

// Request parameters and other properties.
List<NameValuePair> params = new ArrayList<NameValuePair>();
params.add(new BasicNameValuePair("citation", 
    "A. Bookstein and S. T. Klein, Detecting content-bearing " +
    "words by serial clustering, " +
    "Proceedings of the Nineteenth Annual International ACM SIGIR " +
    "Conference on Research and Development in Information " +
    "Retrieval, pp. 319327, 1995."));
httppost.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

...
response = httpclient.execute(httppost);
response.setHeader("Content-Type", "text/xml");

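See the difference?

In the Java case:

  • You are setting Content-Type rather than Accept.
  • You are setting it on the response object rather than on the HttpPost object.
  • You are setting it after executing the request.

Now, Accept and Content-Type mean different things. The first says "I want you to send me something of this type". The second says "I am sending you something of this type".

Of course, setting the content type on a response you have just received is worse than useless. It actually clobbers the real content type in the response, which was probably "text/html" because your request did not specify anything.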
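To make the distinction concrete, here is a minimal sketch that builds a form POST without sending it and puts both headers on the request. It uses the JDK's built-in java.net.http classes (Java 11+) rather than the Apache HttpClient from the question, purely so that it runs without extra dependencies. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of FreeCite or Apache HttpClient.
class AcceptHeaderSketch {

    // Builds (but does not send) a form POST that asks for an XML response.
    static HttpRequest buildRequest(String endpoint, String citation) {
        String form = "citation=" + URLEncoder.encode(citation, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint))
                // Content-Type describes the body this request is sending.
                .header("Content-Type", "application/x-www-form-urlencoded")
                // Accept describes the body we would like back in the response.
                .header("Accept", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "http://freecite.library.brown.edu/citations/create",
                "A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, "
                        + "Proceedings of the Nineteenth Annual International ACM SIGIR Conference on "
                        + "Research and Development in Information Retrieval, pp. 319327, 1995.");
        // Both headers live on the request object, before anything is sent.
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse("(none)"));
        System.out.println("Content-Type: " + request.headers().firstValue("Content-Type").orElse("(none)"));
    }
}
```

Both headers belong to the outgoing request: one describes what is being sent, the other what is being asked for.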

What you should actually do is call

httppost.setHeader("Accept", "text/xml");

before executing the request.
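The effect of the missing header can be reproduced locally. The sketch below starts a small stub server that answers with HTML and 200 by default, and with XML and 201 when the request carries Accept: text/xml. That behaviour is an assumption modelled on the symptoms in the question, not FreeCite's actual implementation. It again uses only JDK classes (java.net.http and com.sun.net.httpserver), and all names are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; the stub only mimics the behaviour seen in the question.
class ContentNegotiationSketch {

    // Starts a stand-in server on a free local port.
    static HttpServer startStubServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/citations/create", exchange -> {
            String accept = exchange.getRequestHeaders().getFirst("Accept");
            boolean wantsXml = accept != null && accept.contains("text/xml");
            byte[] body = (wantsXml ? "<citations/>" : "<table></table>")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getRequestBody().readAllBytes(); // drain the posted form
            exchange.getResponseHeaders().set("Content-Type", wantsXml ? "text/xml" : "text/html");
            exchange.sendResponseHeaders(wantsXml ? 201 : 200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Posts a form to the stub; pass null to omit the Accept header entirely.
    static HttpResponse<String> post(int port, String acceptHeader)
            throws IOException, InterruptedException {
        HttpRequest.Builder builder = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/citations/create"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString("citation=example"));
        if (acceptHeader != null) {
            builder.header("Accept", acceptHeader); // set on the request, before sending
        }
        HttpClient client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
        return client.send(builder.build(), HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startStubServer();
        try {
            int port = server.getAddress().getPort();
            HttpResponse<String> withoutAccept = post(port, null);
            HttpResponse<String> withAccept = post(port, "text/xml");
            System.out.println(withoutAccept.statusCode() + " " + withoutAccept.body());
            System.out.println(withAccept.statusCode() + " " + withAccept.body());
        } finally {
            server.stop(0);
        }
    }
}
```

With this stub, the request without Accept gets the HTML variant and the one with Accept: text/xml gets the XML variant. In the Apache HttpClient code from the question, the equivalent change is the single setHeader call on httppost shown above.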