net/http automatically redirects a webpage to another language
I'm trying to use open-uri to scrape data from:
https://www.zomato.com/grande-lisboa/fu-hao-massamá
However, the website automatically redirects to:
https://www.zomato.com/pt/grande-lisboa/fu-hao-massamá
I don't want the Portuguese version; I want the English one. How do I tell Ruby to stop doing this?
This is called content negotiation: the web server redirects based on your request. pt (Portuguese) seems to be the default, at least from my location:
$ curl -I https://www.zomato.com/grande-lisboa/fu-hao-massam%C3%A1
HTTP/1.1 301 Moved Permanently
Set-Cookie: zl=pt; ...
Location: https://www.zomato.com/pt/grande-lisboa/fu-hao-massam%C3%A1
You can request another language by sending an Accept-Language header. Here is the response for Accept-Language: es (Spanish):
$ curl -I https://www.zomato.com/grande-lisboa/fu-hao-massam%C3%A1 -H "Accept-Language: es"
HTTP/1.1 301 Moved Permanently
Set-Cookie: zl=es_cl; ...
Location: https://www.zomato.com/es/grande-lisboa/fu-hao-massam%C3%A1
And here is the response for Accept-Language: en (English):
$ curl -I https://www.zomato.com/grande-lisboa/fu-hao-massam%C3%A1 -H "Accept-Language: en"
HTTP/1.1 200 OK
Set-Cookie: zl=en; ...
This seems to be the resource you are looking for.
In Ruby, you would use:
require 'nokogiri'
require 'open-uri'
url = 'https://www.zomato.com/grande-lisboa/fu-hao-massam%C3%A1'
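# Ask the server for the English version of the page
headers = {'Accept-Language' => 'en'}
# URI.open is required on Ruby 3.0+, where Kernel#open no longer opens URLs
doc = Nokogiri::HTML(URI.open(url, headers))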
doc.at('html')[:lang]
#=> "en"
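Since the title mentions net/http, here is a minimal sketch of the same idea using Net::HTTP directly. The helper names build_localized_request and fetch_localized are made up for illustration and are not part of any library. Net::HTTP does not follow redirects on its own, so this can also be used to check whether the server answers with a 301 or a 200 for a given language.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a GET request that carries the
# Accept-Language header, without sending it yet.
def build_localized_request(url, language)
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  request['Accept-Language'] = language
  request
end

# Hypothetical helper: sends the request and returns the raw response.
# Net::HTTP does not follow redirects, so a language mismatch shows up
# as a redirect response with a Location header.
def fetch_localized(url, language)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_localized_request(url, language))
  end
end
```

Usage would look like response = fetch_localized(url, 'en'), after which response.code and response['Location'] can be inspected the same way as the curl -I output above.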