How to make batch HTTP calls in Java
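I'm trying to fetch data from another service over HTTP using HttpClient. The URI should look like endpoint:80/.../itemId.
I'm wondering whether there is a way to make a batch call that specifies a set of itemIds. I did find someone suggesting .setHeader(HttpHeaders.CONNECTION, "keep-alive") when creating the request. If I do that, how do I release the client after I have fetched all the data?
Also, it seems this approach still has to receive one response before sending the next request. Is it possible to do this asynchronously, and if so, how? By the way, for some reason I can't use AsyncHttpClient in this case.
Since I know almost nothing about HttpClient, this question may seem silly. I really hope someone can help me.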
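API server support
There is a small chance that the API supports requesting multiple IDs at once (e.g., using a URL of the form http://endpoint:80/.../itemId1,itemId2,itemId3).
Check the API documentation to see whether this is available, because if it is, it would be the best solution.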
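If the API does turn out to accept several IDs in one request, building the batch URI could look roughly like the sketch below. The comma-separated path format is only the hypothetical form mentioned above, and the `BatchUri` class, the `build` method, and the `http://example.invalid/items` base URI are placeholder names invented for illustration; the real format depends entirely on the API documentation.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: assumes the server accepts comma-separated IDs in the path.
class BatchUri {
    static String build(String baseUri, List<String> itemIds) {
        // Encode each ID so that reserved characters cannot break the URI.
        // Note: URLEncoder uses form encoding, so a space becomes '+'.
        String joined = itemIds.stream()
                .map(id -> URLEncoder.encode(id, StandardCharsets.UTF_8))
                .collect(Collectors.joining(","));
        return baseUri.endsWith("/") ? baseUri + joined : baseUri + "/" + joined;
    }

    public static void main(String[] args) {
        // The base URI is a placeholder, not the real endpoint.
        System.out.println(build("http://example.invalid/items",
                List.of("itemId1", "itemId2", "itemId3")));
    }
}
```

The resulting string can be passed to `new HttpGet(...)` like any other URI, so a single request replaces one request per item.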
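Persistent connections
Apache HttpClient appears to use persistent ("keep alive") connections by default (see the Connection Management tutorial). The logging facilities can help verify that connections are being reused across multiple requests.
To release the client, use the close() method. From 2.3.4. Connection manager shutdown: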
When an HttpClient instance is no longer needed and is about to go out of scope it is important to shut down its connection manager to ensure that all connections kept alive by the manager get closed and system resources allocated by those connections are released.
CloseableHttpClient httpClient = <...>
httpClient.close();
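Persistent connections eliminate the overhead of establishing new connections, but as you noted, the client still waits for the response before sending the next request.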
Multithreading and connection pooling
You can make the program multithreaded and use a PoolingHttpClientConnectionManager to control the number of connections made to the server. Here is an example based on 2.3.3. Pooling connection manager and 2.4. Multithreaded request execution:
import java.io.*;
import org.apache.http.*;
import org.apache.http.client.*;
import org.apache.http.client.methods.*;
import org.apache.http.client.protocol.*;
import org.apache.http.impl.client.*;
import org.apache.http.impl.conn.*;
import org.apache.http.protocol.*;
// ...
PoolingHttpClientConnectionManager cm =
        new PoolingHttpClientConnectionManager();
cm.setMaxTotal(200); // increase max total connection to 200
cm.setDefaultMaxPerRoute(20); // increase max connection per route to 20
CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(cm)
        .build();
String[] urisToGet = { ... };
// start a thread for each URI
// (if there are many URIs, a thread pool would be better)
Thread[] threads = new Thread[urisToGet.length];
for (int i = 0; i < threads.length; i++) {
    HttpGet httpget = new HttpGet(urisToGet[i]);
    threads[i] = new Thread(new GetTask(httpClient, httpget));
    threads[i].start();
}
// wait for all the threads to finish
for (int i = 0; i < threads.length; i++) {
    threads[i].join();
}

class GetTask implements Runnable {
    private final CloseableHttpClient httpClient;
    private final HttpContext context;
    private final HttpGet httpget;

    public GetTask(CloseableHttpClient httpClient, HttpGet httpget) {
        this.httpClient = httpClient;
        this.context = HttpClientContext.create();
        this.httpget = httpget;
    }

    @Override
    public void run() {
        try {
            CloseableHttpResponse response = httpClient.execute(
                    httpget, context);
            try {
                HttpEntity entity = response.getEntity();
            } finally {
                response.close();
            }
        } catch (ClientProtocolException ex) {
            // handle protocol errors
        } catch (IOException ex) {
            // handle I/O errors
        }
    }
}
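Multithreading helps saturate the link (keep as much data flowing as possible), because while one thread is sending a request, other threads can be receiving responses and making use of the downlink.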
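As the comment in the example notes, a thread pool is probably a better fit than one thread per URI when there are many URIs. The following is a minimal sketch of that idea using only the JDK's ExecutorService. The `PooledFetcher` class, the `fetchAll` method, and the `fetchOne` parameter are hypothetical names; `fetchOne` stands in for whatever code performs a single request (for instance, a call to `httpClient.execute` that returns the response body), and in `main` it is replaced by a stub so the sketch runs without a network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: runs one fetch per URI on a fixed-size thread pool.
class PooledFetcher {
    static List<String> fetchAll(List<String> uris,
                                 Function<String, String> fetchOne,
                                 int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String uri : uris) {
                tasks.add(() -> fetchOne.apply(uri));
            }
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // rethrows a task failure as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        // Stub fetcher used instead of a real HTTP call.
        List<String> bodies = fetchAll(List.of("a", "b", "c"),
                uri -> "body of " + uri, 2);
        System.out.println(bodies);
    }
}
```

Keeping the pool size at or below the connection manager's per-route limit seems sensible, since extra threads would otherwise just block waiting for a free connection.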
Pipelining
HTTP/1.1 supports pipelining, which sends multiple requests on a single connection without waiting for the responses. The Asynchronous I/O based on NIO tutorial has an example in section 3.10. Pipelined request execution:
HttpProcessor httpproc = <...>
HttpAsyncRequester requester = new HttpAsyncRequester(httpproc);
HttpHost target = new HttpHost("www.apache.org");
List<BasicAsyncRequestProducer> requestProducers = Arrays.asList(
        new BasicAsyncRequestProducer(target, new BasicHttpRequest("GET", "/index.html")),
        new BasicAsyncRequestProducer(target, new BasicHttpRequest("GET", "/foundation/index.html")),
        new BasicAsyncRequestProducer(target, new BasicHttpRequest("GET", "/foundation/how-it-works.html"))
);
List<BasicAsyncResponseConsumer> responseConsumers = Arrays.asList(
        new BasicAsyncResponseConsumer(),
        new BasicAsyncResponseConsumer(),
        new BasicAsyncResponseConsumer()
);
HttpCoreContext context = HttpCoreContext.create();
Future<List<HttpResponse>> future = requester.executePipelined(
        target, requestProducers, responseConsumers, pool, context, null);
The HttpCore Examples ("Pipelined HTTP GET requests") include a complete version of this example.
Older web servers may not handle pipelined requests correctly.