在 ruby 中使用 Mechanize 抓取网页时如何解决 HTTP500 错误?
How do i resolve an HTTP500 Error while web scraping with Mechanize in ruby?
我想从该网站(“https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp”)检索我的驾驶执照号码 issue_date 和 expiry_date。当我尝试获取它时,出现错误 Mechanize::ResponseCodeError: 500 => Net::HTTPInternalServerError for https://sarathi.nic.in:8443/nrportal/sarathi/DlDetRequest.jsp -- unhandled response
.
这是我写的用来抓取的代码:
require 'mechanize'
require 'logger'
require 'nokogiri'
require 'open-uri'
require 'openssl'
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
agent = Mechanize.new
agent.log = Logger.new "mech.log"
agent.user_agent_alias = 'Mac Safari 4'
Mechanize.new.get("https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp")
page=agent.get('https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp') # opening home page.
page = agent.page.links.find { |l| l.text == 'Status of Licence' }.click # click the link.
page.forms_with(:name=>"dlform").first.field_with(:name=>"dlform:DLNumber").value="TN38 20120001119" #user input to text field.
page.form_with(:name=>"dlform").field_with(:name=>"javax.faces.ViewState").value="SUBMIT" #submit button value assigning.
page.form(:name=>"dlform",:action=>"/nrportal/sarathi/DlDetRequest.jsp") #to specify the form i need.
agent.cookie_jar.clear!
gg=agent.submit page.forms.last #submitting my form
它不起作用,因为您在提交表单之前清除了 cookie,因此删除了您提供的所有输入数据。我可以通过简单地删除它来让它工作:
...
page.forms_with(:name=>"dlform").first.field_with(:name=>"dlform:DLNumber").value="TN38 20120001119" #user input to text field
form = page.form(:name=>"dlform",:action=>"/nrportal/sarathi/DlDetRequest.jsp")
gg = agent.submit form, form.buttons.first
请注意,您无需为#submit 按钮设置值,而是在提交表单时传递提交按钮。
我想从该网站(“https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp”)检索我的驾驶执照号码 issue_date 和 expiry_date。当我尝试获取它时,出现错误 Mechanize::ResponseCodeError: 500 => Net::HTTPInternalServerError for https://sarathi.nic.in:8443/nrportal/sarathi/DlDetRequest.jsp -- unhandled response
.
这是我写的用来抓取的代码:
require 'mechanize'
require 'logger'
require 'nokogiri'
require 'open-uri'
require 'openssl'
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
agent = Mechanize.new
agent.log = Logger.new "mech.log"
agent.user_agent_alias = 'Mac Safari 4'
Mechanize.new.get("https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp")
page=agent.get('https://sarathi.nic.in:8443/nrportal/sarathi/HomePage.jsp') # opening home page.
page = agent.page.links.find { |l| l.text == 'Status of Licence' }.click # click the link.
page.forms_with(:name=>"dlform").first.field_with(:name=>"dlform:DLNumber").value="TN38 20120001119" #user input to text field.
page.form_with(:name=>"dlform").field_with(:name=>"javax.faces.ViewState").value="SUBMIT" #submit button value assigning.
page.form(:name=>"dlform",:action=>"/nrportal/sarathi/DlDetRequest.jsp") #to specify the form i need.
agent.cookie_jar.clear!
gg=agent.submit page.forms.last #submitting my form
它不起作用,因为您在提交表单之前清除了 cookie,因此删除了您提供的所有输入数据。我可以通过简单地删除它来让它工作:
...
page.forms_with(:name=>"dlform").first.field_with(:name=>"dlform:DLNumber").value="TN38 20120001119" #user input to text field
form = page.form(:name=>"dlform",:action=>"/nrportal/sarathi/DlDetRequest.jsp")
gg = agent.submit form, form.buttons.first
请注意,您无需为#submit 按钮设置值,而是在提交表单时传递提交按钮。