HttpClient doesn't load the content of the valid link

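I'm using HttpClient to fetch the content of web pages from the internet, and I've run into some strange behavior.

Some websites load perfectly, but some requests fail with a timeout. The thing is, the links work fine in a browser. For example, I have the following link: https://www.luisaviaroma.com/en-gb/shop/women/shoes?lvrid=_gw_i4. I can open it in the browser, but my code doesn't work: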

var httpClient = new HttpClient();
var response = await httpClient.GetAsync("https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");

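What could be causing this? Could the problem be the _ character? If so, how can I fix it?

I also tried third-party libraries such as RestSharp, but got the same result.

The exception is: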

System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
 ---> System.TimeoutException: The operation was canceled.
 ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   --- End of inner exception stack trace ---
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpClient.HandleFailure(Exception e, Boolean telemetryStarted, HttpResponseMessage response, CancellationTokenSource cts, CancellationToken cancellationToken, CancellationTokenSource pendingRequestsCts)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at Suits.Scheduler.SchedulerHostedService.UpdateClothesLinksAsync(SuitsDbContext dbContext) in C:\Repo\server\src\Suits.Scheduler\SchedulerHostedService.cs:line 98

I got a 403 error when the Accept header was not set to an appropriate value.
The result is text/html, so you need to add the matching header:

HttpRequestMessage msg = new HttpRequestMessage(
  HttpMethod.Get,
  "https://www.luisaviaroma.com/en-gb/shop/women/shoes?lvrid=_gw_i4"
);
msg.Headers.Add("Accept", "text/html");
HttpClient client = new HttpClient();
// Await instead of blocking on .Result (assumes an async calling method, as in the question).
var response = await client.SendAsync(msg);

Edit:
In this case, the OP needs to add the Accept-Encoding header too. D A's answer pointed that out. The code to add the field:

msg.Headers.Add("Accept-Encoding", "br");

This is the correct request, which returns a 200 OK status code:

HttpRequestMessage msg = new HttpRequestMessage(HttpMethod.Get, "https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");
msg.Headers.Add("Accept", "text/html");
msg.Headers.Add("Accept-Encoding", "gzip, deflate, br");
HttpClient client = new HttpClient();
// Await instead of blocking on .Result (assumes an async calling method, as in the question).
var response1 = await client.SendAsync(msg);

The response is sent compressed, which is why you were running into the problem.
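
One thing to keep in mind: when Accept-Encoding is added by hand as above, HttpClient does not decompress the body for you, so reading the content as a string would give compressed bytes. A possible alternative, sketched below, is to enable AutomaticDecompression on the handler, which advertises the supported encodings on requests that don't already carry an Accept-Encoding header and decompresses the response transparently. This assumes .NET Core 3.0 or later (where DecompressionMethods.All exists); the class and method names (CompressedFetch, CreateHandler, CreateRequest, FetchHtmlAsync) are made up for illustration, and I have not verified it against this particular site.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answers.
public static class CompressedFetch
{
    // Handler that requests gzip/deflate/br and decompresses responses automatically.
    public static HttpClientHandler CreateHandler() => new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.All
    };

    // GET request with the Accept header the server appears to require.
    // Accept-Encoding is left out on purpose: the handler adds it.
    public static HttpRequestMessage CreateRequest(string url)
    {
        var msg = new HttpRequestMessage(HttpMethod.Get, url);
        msg.Headers.Add("Accept", "text/html");
        return msg;
    }

    // Sends the request and returns the decompressed HTML as a string.
    public static async Task<string> FetchHtmlAsync(string url)
    {
        using var handler = CreateHandler();
        using var client = new HttpClient(handler);
        using var request = CreateRequest(url);
        using var response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

Usage would look something like `var html = await CompressedFetch.FetchHtmlAsync("https://www.luisaviaroma.com/en-us/sw/women?lvrid=_gw");`. In a long-running service such as a hosted scheduler it is usually better to create the HttpClient once and reuse it rather than building a new one per request.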