Everything but the first link is ignored when re-entering code that gathers article text with newspaper
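I need to collect the text of articles from several URLs. The code works perfectly while I am typing it in. However, when I export the output to CSV by re-entering print(first_article.text), only the first article appears. Is there a reason this happens? How can I export the text from all of them?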
import newspaper
from newspaper import Article
lista = ['url','url']
for list in lista:
    first_article = Article(url="%s" % list, language='en')
    first_article.download()
    first_article.parse()
    print(first_article.text)
    #This prints all articles
print(first_article)
#This prints only one
Reference: Downloading articles from multiple urls with newspaper
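I think I see the problem. You want a list of articles. The name first_article is reassigned on every pass through the loop, so once the loop has finished it refers to a single Article only, and anything you do with it outside the loop sees just that one. You can keep all of them by appending each text to a list: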
from newspaper import Article
lista = ['url','url']
articles = [] # initialize a list
for url in lista: # 'url' avoids shadowing the built-in name 'list'
    first_article = Article(url=url, language='en')
    first_article.download()
    first_article.parse()
    articles.append(first_article.text) # add this article's text to the list
    print(first_article.text)
print(articles) # print all articles
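Since the goal is a CSV file, the same idea can be taken one step further. The following is only a minimal sketch under a few assumptions: the helper names collect_texts, fetch_with_newspaper and write_csv, the output file name articles.csv, and the two-column layout (url, text) are all made up for illustration and do not come from the question. It collects one row per URL and writes every row at once with the standard csv module, so no article is left out.

```python
import csv


def collect_texts(urls, fetch_text):
    """Return one (url, text) pair per URL, in the order given."""
    return [(url, fetch_text(url)) for url in urls]


def fetch_with_newspaper(url):
    """Download and parse one article, as in the loop above."""
    # Imported here so the CSV helpers work even without newspaper installed.
    from newspaper import Article

    article = Article(url=url, language='en')
    article.download()
    article.parse()
    return article.text


def write_csv(rows, path):
    """Write a header line followed by one CSV row per (url, text) pair."""
    # newline='' lets the csv module handle line endings inside article text.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['url', 'text'])
        writer.writerows(rows)


if __name__ == '__main__':
    lista = ['url', 'url']  # placeholder URLs, as in the question
    rows = collect_texts(lista, fetch_with_newspaper)
    write_csv(rows, 'articles.csv')  # hypothetical output file name
```

Passing the fetch function in as an argument is a design choice for this sketch: it keeps the CSV part independent of newspaper, so it can be tried out with any function that maps a URL to a string.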