Python: return empty value on exception

Question

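I have some experience with Python, but because I lack formal training I have never used try/except to catch errors.

I am trying to extract several articles from Wikipedia. For this I have a list of titles, a few of which turn out to have no article or search result. I would like the page retrieval function to simply skip those few names and carry on running the rest of the script. Reproducible code follows.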

import wikipedia
# This one works.
links = ["CPython"]
test = [wikipedia.page(link, auto_suggest=False) for link in links]
test = [testitem.content for testitem in test]
print(test)

#The sequence breaks down if there is no wikipedia page.
links = ["CPython","no page"]
test = [wikipedia.page(link, auto_suggest=False) for link in links]
test = [testitem.content for testitem in test]
print(test)

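The library does this using the method below. Normally this would be very bad practice, but since this is only a one-off data extraction, I am willing to change my local copy of the library to get it working. Edit: I have now included the complete function.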

def page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False):
  '''
  Get a WikipediaPage object for the page with title `title` or the pageid
  `pageid` (mutually exclusive).

  Keyword arguments:

  * title - the title of the page to load
  * pageid - the numeric pageid of the page to load
  * auto_suggest - let Wikipedia find a valid page title for the query
  * redirect - allow redirection without raising RedirectError
  * preload - load content, summary, images, references, and links during initialization
  '''
  if title is not None:
    if auto_suggest:
      results, suggestion = search(title, results=1, suggestion=True)
      try:
        title = suggestion or results[0]
      except IndexError:
        # if there is no suggestion or search results, the page doesn't exist
        raise PageError(title)
    return WikipediaPage(title, redirect=redirect, preload=preload)
  elif pageid is not None:
    return WikipediaPage(pageid=pageid, preload=preload)
  else:
    raise ValueError("Either a title or a pageid must be specified")

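What should I do to retrieve only the pages that do not raise an error? Perhaps there is a way to filter out all items in the list that raise this error, or any error. Returning "NA" or something similar for pages that do not exist would be fine. Skipping them without notice would be fine too. Thanks!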

Answer 1

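Understand that this would be bad practice, but for a one-off quick and dirty script you can:

Edit: wait, sorry. I just noticed the list comprehension. I'm actually not sure this is doable without breaking it apart: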

links = ["CPython", "no page"]
test = []
for link in links:
    try:
        page = wikipedia.page(link, auto_suggest=False)
        test.append(page)
    except wikipedia.exceptions.PageError:
        pass
test = [testitem.content for testitem in test]
print(test)

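The except clause catches the PageError, and pass (a statement that does nothing) basically tells Python to trust you and ignore the error so it can go on with its day.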
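If a placeholder such as "NA" is preferred over silently dropping the title, the except branch can append one instead of using pass. A minimal sketch of that variation: fetch_content and FAKE_PAGES are hypothetical stand-ins for wikipedia.page(link, auto_suggest=False).content, and LookupError stands in for wikipedia.exceptions.PageError, so the snippet runs without the library or network access.

```python
# Hypothetical stand-in data; not part of the wikipedia library.
FAKE_PAGES = {"CPython": "CPython is the reference implementation of Python."}


def fetch_content(title):
    """Hypothetical stand-in for wikipedia.page(title, auto_suggest=False).content."""
    if title not in FAKE_PAGES:
        # Stands in for wikipedia.exceptions.PageError.
        raise LookupError(title)
    return FAKE_PAGES[title]


links = ["CPython", "no page"]
contents = []  # one entry per link, "NA" where the lookup failed
skipped = []   # titles that raised, kept for later inspection

for link in links:
    try:
        text = fetch_content(link)
    except LookupError:
        contents.append("NA")
        skipped.append(link)
    else:
        # Runs only when the try block raised nothing.
        contents.append(text)

print(contents)
print(skipped)
```

Keeping the skipped titles in a separate list makes it easy to check afterwards which names had no page.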

Answer 2

If the page does not exist, the function wikipedia.page raises wikipedia.exceptions.PageError. That is the error you want to catch.

import wikipedia
links = ["CPython","no page"]
test=[]
for link in links:
    try:
        #try to load the wikipedia page
        page=wikipedia.page(link, auto_suggest=False)
        test.append(page)
    except wikipedia.exceptions.PageError:
        #if a "PageError" was raised, ignore it and continue to next link
        continue

You have to wrap the call to wikipedia.page in a try block, so I'm afraid you can't use a list comprehension.
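A try block cannot appear inside the comprehension expression itself, but it can live in a small helper function that the comprehension calls. A rough sketch of that idea: call_or_default is a hypothetical helper (not part of the wikipedia library), and fake_page with LookupError stands in for wikipedia.page and wikipedia.exceptions.PageError so the example runs offline.

```python
def call_or_default(func, *args, exceptions=(Exception,), default=None, **kwargs):
    """Call func(*args, **kwargs); return default if one of `exceptions` is raised.

    Hypothetical helper for illustration only.
    """
    try:
        return func(*args, **kwargs)
    except exceptions:
        return default


def fake_page(title, auto_suggest=True):
    """Hypothetical stand-in for wikipedia.page; only "CPython" exists."""
    if title != "CPython":
        raise LookupError(title)  # stands in for wikipedia.exceptions.PageError
    return "content of " + title


links = ["CPython", "no page"]

# The try/except lives in the helper, so a comprehension works again.
results = [
    call_or_default(fake_page, link, auto_suggest=False, exceptions=(LookupError,))
    for link in links
]

# Drop the placeholders if missing pages should simply be skipped.
found = [r for r in results if r is not None]

print(results)
print(found)
```

With the real library the call would presumably look like call_or_default(wikipedia.page, link, auto_suggest=False, exceptions=(wikipedia.exceptions.PageError,)), and passing default="NA" would give the placeholder the question mentions.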

