readlines() error with for-loop in Python
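This error is hard to describe because I can't figure out how the loop affects the readline() and readlines() methods. When I use the former, I get these unexpected Traceback errors. When I use the latter, my code runs and nothing happens. I'm sure the error is in the first eight lines. The first few lines of the Topics.txt file are posted below.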
Code
import requests
from html.parser import HTMLParser
from bs4 import BeautifulSoup
Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "Topics.txt"
Topicfile = open(Topicfilename, 'r')
Line = Topicfile.readlines()
Linenumber = 0
for Line in Topicfile:
    Linenumber += 1
    print("Reading line", Linenumber)
    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    Link = Url.join(Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)
When I run this bit, it prints the URL preceded by a character of the line. For example, for 24 Hour Fitness it prints 2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ and so on.
Topics.txt
- 21st Century Fox
- 24 Hour Fitness
- 2K Games
- 3M
Full Error
Reading line 1
24HourFitness
2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s
Traceback (most recent call last):
  File "C:\Users\Caden\Desktop\Programs\LususStudios\AutoDealBot\HashtagScanner.py", line 17, in <module>
    Sourcecode = requests.get(Link)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 579, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 653, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s'
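First, the Python convention is to lowercase all variable names.
Second, by calling readlines() up front you exhaust the file pointer, and then you go on to loop over the same file, which has nothing left to yield.
Try simply opening the file and then iterating over it: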
linenumber = 0
with open("Topics.txt") as topicfile:
    for line in topicfile:
        # do work
        linenumber += 1
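To make the exhaustion visible, here is a minimal sketch that uses an in-memory io.StringIO as a stand-in for Topics.txt (an assumption, so the example runs without the real file). It suggests why readlines() leaves the loop with nothing to do, and why readline() makes the loop start at the second topic, which matches the "Reading line 1 / 24HourFitness" output in the question.

```python
import io

# Hypothetical stand-in for Topics.txt, using the sample topics from the question.
sample = "21st Century Fox\n24 Hour Fitness\n2K Games\n3M\n"

# Case 1: readlines() consumes the whole file, so the loop has nothing left.
topicfile = io.StringIO(sample)
all_lines = topicfile.readlines()
after_readlines = [line for line in topicfile]

# Case 2: readline() consumes only the first line, so the loop starts at line 2.
topicfile = io.StringIO(sample)
first_line = topicfile.readline()
after_readline = [line for line in topicfile]

print(len(all_lines), after_readlines)              # 4 []
print(repr(first_line), after_readline[0].strip())  # '21st Century Fox\n' 24 Hour Fitness
```

In both cases the fix is the same: either loop over the list that readlines() returned, or skip readline()/readlines() and loop over the file object directly.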
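Then, as for the problem in the traceback: if you look closely, you are building a very long string that is definitely not a URL, so requests throws an error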
InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/...
If you debug it, you can see that Url.join(Newtopic) is "interleaving" the Url string between each character of Newtopic, which is exactly what str.join does.
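A short sketch of that interleaving, using the question's base URL and the topic "3M" from Topics.txt. The lowercase variable names are chosen here for illustration only.

```python
url = "https://ritetag.com/best-hashtags-for/"
newtopic = "3M"

# str.join treats the string it is called on as a separator and places it
# between the items of its argument; a string argument is split into characters.
interleaved = url.join(newtopic)   # "3" + url + "M"

# Plain concatenation is what the question actually needs.
concatenated = url + newtopic

print(interleaved)    # 3https://ritetag.com/best-hashtags-for/M
print(concatenated)   # https://ritetag.com/best-hashtags-for/3M
```

The interleaved result no longer starts with "https://", which is why requests cannot find a connection adapter for it.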
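I think there are two problems:
- You seem to be iterating over Topicfile rather than over the result of Topicfile.readlines().
- Url.join(Newtopic) doesn't return what you think it does. .join takes a list (here the string is treated as a list of characters) and inserts Url between each item.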
Here is the code with those problems fixed:
import requests
Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "topics.txt"
Topicfile = open(Topicfilename, 'r')
# Read every line once, then loop over the resulting list
Lines = Topicfile.readlines()
Linenumber = 0
for Line in Lines:
    Linenumber += 1
    print("Reading line", Linenumber)
    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    # Append the topic to the base URL instead of using Url.join()
    Link = '{}{}'.format(Url, Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)
By the way, I'd also suggest using lowercase variable names, since capitalized CamelCase names are usually reserved for class names in Python :)
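Putting both answers together, one possible lowercase version might look like the sketch below. The helper name build_links, the use of enumerate, strip() in place of strip("\n"), and the skipping of blank lines are all assumptions added for illustration. The network call is left out so the snippet runs offline.

```python
URL = "https://ritetag.com/best-hashtags-for/"


def build_links(lines, base_url=URL):
    """Return (line_number, link) pairs for each non-empty topic line."""
    links = []
    for line_number, line in enumerate(lines, start=1):
        # Same clean-up as the question, but strip() also removes "\r" and spaces.
        topic = line.strip().replace(" ", "").replace(",", "")
        if not topic:
            continue  # assumption: blank lines carry no topic and are skipped
        links.append((line_number, base_url + topic))
    return links


# Sample lines standing in for the contents of Topics.txt.
sample_lines = ["21st Century Fox\n", "24 Hour Fitness\n", "\n", "3M\n"]
for line_number, link in build_links(sample_lines):
    print("Reading line", line_number, link)
```

Because the function accepts any iterable of lines, an open file can be passed straight in, for example inside `with open("Topics.txt") as topicfile:`, and each resulting link can then be handed to requests.get.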