HttpClient doesn't load the content of the valid link
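I'm using HttpClient to fetch web page content from the internet and have run into strange behavior.
Some sites load perfectly, but some requests fail with a timeout. The puzzling part is that the same links work fine in a browser.
For example, I have the following link: https://www.luisaviaroma.com/en-gb/shop/women/shoes?lvrid=_gw_i4. I can open it in a browser, but my code does not work: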
var httpClient = new HttpClient();
var response = await httpClient.GetAsync("https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");
What could be causing this?
Could the problem be the _ symbol? If so, how can I fix it?
I also tried a third-party library, RestSharp, but got the same result.
The exception is:
System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
---> System.TimeoutException: The operation was canceled.
---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
--- End of inner exception stack trace ---
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at System.Net.Http.HttpClient.HandleFailure(Exception e, Boolean telemetryStarted, HttpResponseMessage response, CancellationTokenSource cts, CancellationToken cancellationToken, CancellationTokenSource pendingRequestsCts)
at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
at Suits.Scheduler.SchedulerHostedService.UpdateClothesLinksAsync(SuitsDbContext dbContext) in C:\Repo\server\src\Suits.Scheduler\SchedulerHostedService.cs:line 98
I got a 403 error when the Accept header was not set to an appropriate value.
The response is text/html, so the matching header needs to be added:
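// Build the request explicitly so that headers can be set on it.
HttpRequestMessage msg = new HttpRequestMessage(
    HttpMethod.Get,
    "https://www.luisaviaroma.com/en-gb/shop/women/shoes?lvrid=_gw_i4"
);
msg.Headers.Add("Accept", "text/html");
HttpClient client = new HttpClient();
// Await the call rather than blocking on .Result, which can deadlock in some contexts.
var response = await client.SendAsync(msg);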
Edit:
In this case, the OP also needs to add the Accept-Encoding header, as D A's answer points out. The code to add the field:
msg.Headers.Add("Accept-Encoding", "br");
This is the correct request, which returns 200 OK:
HttpRequestMessage msg = new HttpRequestMessage(HttpMethod.Get, "https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");
msg.Headers.Add("Accept", "text/html");
// Advertise the compressed encodings the client is willing to receive.
msg.Headers.Add("Accept-Encoding", "gzip, deflate, br");
HttpClient client = new HttpClient();
var response1 = await client.SendAsync(msg);
The response is sent compressed, which is why you ran into the problem.
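Setting Accept-Encoding by hand tells the server that compressed responses are fine, but HttpClient will then hand back the body still compressed. A minimal sketch of one way to handle this, assuming a .NET Core / .NET 5+ runtime (where DecompressionMethods.Brotli is available): let the handler negotiate and decompress the encodings itself. PageFetcher and CreateClient are hypothetical names used only for illustration, and whether this particular site then answers with 200 still depends on the server.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class PageFetcher
{
    // Hypothetical helper: builds one client that can be reused for all requests.
    public static HttpClient CreateClient()
    {
        var handler = new HttpClientHandler
        {
            // With this set, the handler sends Accept-Encoding for these methods
            // and transparently decompresses the response body.
            AutomaticDecompression = DecompressionMethods.GZip
                                   | DecompressionMethods.Deflate
                                   | DecompressionMethods.Brotli
        };

        var client = new HttpClient(handler);

        // Typed equivalent of Headers.Add("Accept", "text/html"), applied to every request.
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("text/html"));

        return client;
    }

    public static async Task Main()
    {
        using var client = CreateClient();

        // URL taken from the question; the request is awaited rather than blocked on.
        using var response = await client.GetAsync(
            "https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");

        Console.WriteLine((int)response.StatusCode);

        // The body arrives already decompressed, so it can be read as text.
        string html = await response.Content.ReadAsStringAsync();
        Console.WriteLine(html.Length);
    }
}
```

With this setup there is no need to add Accept-Encoding to each HttpRequestMessage manually, and the same client instance can serve every request the scheduler makes.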