Error 403: Request disallowed by robots.txt on Python

I'm trying to fill out a form with mechanize in Python. When I run the code, I get this error:

Error 403: request disallowed by robots.txt

I looked at similar questions that have been answered before and found that adding br.set_handle_robots(False) should solve the problem, but I still get the same error. So what am I missing here?

import re
import mechanize
from mechanize import Browser
br = mechanize.Browser()
br.set_handle_equiv(False)
br.set_handle_robots(False)
br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:18.0)Gecko/20100101 Firefox/18.0 (compatible;)'),('Accept', '*/*')]
text = "1500103233"
browser = Browser()
browser.open("http://kuhs.ac.in/results.htm")
browser.select_form(nr=0)
browser['Stream']=['Medical']
browser['Level']=['UG']
browser['Course']=['MBBS']
browser['Scheme']=['MBBS 2015 Admissions']
browser['Year']=['Ist Year MBBS']
browser['Examination']=['First Professional MBBS Degree Regular(2015 Admissions) Examinations,August2016']
browser['Reg No']=text
response = browser.submit()
  1. You create br = mechanize.Browser() and configure it, but then you create a second, unconfigured browser = Browser() and use that one instead, so set_handle_robots(False) is never applied to the browser that actually opens the page.
  2. Link: http://kuhs.ac.in/results.htm. If you look at the page source, you can see that the results form is actually loaded from src="http://14.139.185.148/kms/index.php/results/create".
  3. The real form field names can also be seen in that page source. In your case, the Stream dropdown is actually name="Results[streamId]" (a short sketch for listing these names follows this list).
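
For points 2 and 3, a quick way to discover the real form action and the actual control names is to let mechanize list them for you. This is only a minimal sketch, assuming the inner results page at http://14.139.185.148/kms/index.php/results/create can be opened directly:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)  # must be set on the same instance that opens the page
br.open("http://14.139.185.148/kms/index.php/results/create")

# print each form with the type and name of its controls,
# e.g. "select Results[streamId]", so you know which keys to assign
for form in br.forms():
    print form.name
    for control in form.controls:
        print control.type, control.name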

So, you can try this:

import mechanize
br = mechanize.Browser()
br.set_handle_equiv(False)
br.set_handle_robots(False)  # ignore robots.txt on this browser instance
br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:18.0)Gecko/20100101 Firefox/18.0 (compatible;)'),('Accept', '*/*')]
text = "1500103233"
br.open("http://14.139.185.148/kms/index.php/results/create")
# list the forms so you can see the real control names, e.g. Results[streamId]
for forms in br.forms():
    print forms
br.select_form(nr=0)
br['Results[streamId]'] = ['1']  # '1' corresponds to Medical
#etc..
response = br.submit()
print response.read()

You can see more on that here: Submitting a form with mechanize (TypeError: ListControl, must set a sequence)
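
That TypeError comes up because mechanize treats a <select> dropdown as a list control: it has to be assigned a sequence of option values, while a plain text input takes a bare string. A minimal illustration (the text field name Results[regNo] below is only a placeholder, check the real name with the form listing above):

# a SELECT (dropdown) control expects a sequence of option values
br['Results[streamId]'] = ['1']       # OK
# br['Results[streamId]'] = '1'       # TypeError: ListControl, must set a sequence

# a TEXT control takes a plain string (the field name here is a placeholder)
# br['Results[regNo]'] = '1500103233'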

Hope this helps, it worked for me!