Reading and appending to a file with a context manager: it doesn't seem to read, only write
I am trying to read from and append to a file, but it doesn't seem to work when I use a context manager.
In this code I am trying to collect every link on the site that contains an item from my 'serien' list. Before appending a link I first check whether it is already in the file; if the link is found, it should not be appended again. But it is.
My guess is that I am not using the right mode, or that I have messed up my context manager somehow. Or am I completely wrong?
import requests
from bs4 import BeautifulSoup

serien = ['izombie', 'grandfathered', 'new-girl']
serien_links = []

# Gets chapter links
def episode_links(index_url):
    r = requests.get(index_url)
    soup = BeautifulSoup(r.content, 'lxml')
    links = soup.find_all('a')
    url_list = []
    for url in links:
        url_list.append(url.get('href'))
    return url_list

urls_unfiltered = episode_links('http://watchseriesus.tv/last-350-posts/')

with open('link.txt', 'a+') as f:
    for serie in serien:
        for x in urls_unfiltered:
            # check whether link is already in file. If not, write link to file
            if serie in x and serie not in f.read():
                f.write('{}\n'.format(x))
This is my first time using a context manager. Hints would be much appreciated.
Edit: A similar project without a context manager. Here I also tried to use a context manager, but gave up after running into the same problem.
file2_out = open('url_list.txt', 'a')  # local url list for chapter check
for x in link_list:
    # Checking chapter existence in folder and downloading chapter
    if x not in open('url_list.txt').read():  # Is url of chapter in local url list?
        # push = pb.push_note(get_title(x), x)
        file2_out.write('{}\n'.format(x))  # adding downloaded chapter to local url list
        print('{} saved.'.format(x))
file2_out.close()
And with a context manager:
with open('url_list.txt', 'a+') as f:
    for x in link_list:
        # Checking chapter existence in folder and downloading chapter
        if x not in f.read():  # Is url of chapter in local url list?
            # push = pb.push_note(get_title(x), x)
            f.write('{}\n'.format(x))  # adding downloaded chapter to local url list
            print('{} saved.'.format(x))
As @martineau pointed out, f.read() consumes the whole stream, so every later call returns an empty string. On top of that, mode 'a+' opens the file with the position already at the end, so even the first read inside the loop yields ''. The check serie not in f.read() therefore always succeeds, and every link gets appended again.
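To see that behavior in isolation, here is a quick standalone demonstration (a hypothetical snippet, assuming link.txt already contains a few lines):

with open('link.txt', 'a+') as f:
    print(repr(f.read()))  # '' -- 'a+' starts with the position at end of file
    f.seek(0)              # rewind to the beginning
    print(repr(f.read()))  # the full file contents
    print(repr(f.read()))  # '' again -- the stream has been consumed

Try the code below. It rewinds the file, reads the contents into a list once, and then compares against that list: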
import requests
from bs4 import BeautifulSoup

serien = ['izombie', 'grandfathered', 'new-girl']
serien_links = []

# Gets chapter links
def episode_links(index_url):
    r = requests.get(index_url)
    soup = BeautifulSoup(r.content, 'lxml')
    links = soup.find_all('a')
    url_list = []
    for url in links:
        url_list.append(url.get('href'))
    return url_list

urls_unfiltered = episode_links('http://watchseriesus.tv/last-350-posts/')

with open('link.txt', 'a+') as f:
    f.seek(0)  # 'a+' starts at end of file, so rewind before reading
    cont = f.read().splitlines()
    for serie in serien:
        for x in urls_unfiltered:
            # check whether link is already in file. If not, write link to file
            if (serie in x) and (x not in cont):
                f.write('{}\n'.format(x))
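If the file grows large, a set makes the membership test O(1) and also prevents appending the same link twice within a single run. A minimal sketch under the same assumptions (same link.txt and the urls_unfiltered list from above):

with open('link.txt', 'a+') as f:
    f.seek(0)                          # rewind; 'a+' starts at end of file
    seen = set(f.read().splitlines())  # links already in the file, one per line
    for serie in serien:
        for x in urls_unfiltered:
            if serie in x and x not in seen:
                f.write('{}\n'.format(x))
                seen.add(x)            # also dedupe within this run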