Better way to try-except multiple checks
Suppose I have some (simplified) BeautifulSoup code that pulls data into a dictionary:
tournament_info = soup.find_all('li')
stats['Date'] = tournament_info[0].text
stats['Location'] = tournament_info[1].text
stats['Prize'] = tournament_info[3].text.split(':')[1].strip()
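If the initial find_all raises an exception, I want every dictionary entry to be 'None'. If any individual dictionary assignment raises an exception, I want 'None' for that entry as well.
Is there a nicer way to write this than something horrible like the following?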
try:
    tournament_info = soup.find_all('li')
except:
    m_stats['Date'] = 'None'
    m_stats['Location'] = 'None'
    m_stats['Prize'] = 'None'
try:
    m_stats['Date'] = tournament_info[0].text
except:
    m_stats['Date'] = 'None'
try:
    m_stats['Location'] = tournament_info[1].text
except:
    m_stats['Location'] = 'None'
try:
    m_stats['Prize'] = tournament_info[3].text.split(':')[1].strip()
except:
    m_stats['Prize'] = 'None'
Here is what I can suggest for your code:
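# Note: the bare `return` assumes this snippet sits inside a function.
info = soup.find_all('li')
if not info:
    # No <li> elements found: set every existing key to None.
    m_stats = dict.fromkeys(m_stats, None)
    return

# Map each dictionary key to the index of the <li> it comes from.
mappings = {
    'Date': 0,
    'Location': 1,
    'Prize': 3
}
for key in mappings:
    value = None
    try:
        value = info[mappings[key]].text
        if mappings[key] == 3:
            # Keep only the part after the first colon.
            value = value.split(':')[1].strip()
    except IndexError:
        pass
    m_stats[key] = value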
Alternatively, you can write a function that handles the exception for you:
def get_value(idx):
    value = None
    try:
        value = info[idx].text
    except IndexError:
        pass
    return value

m_stats['Date'] = get_value(0)
m_stats['Location'] = get_value(1)
m_stats['Prize'] = get_value(3)
if m_stats['Prize']:
    # Assign the result back; split() and strip() return new strings.
    m_stats['Prize'] = m_stats['Prize'].split(':')[1].strip()
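A variation on the helper-function idea, in case it is useful: instead of passing an index, pass a small callable that does the whole lookup, so the post-processing of the prize is covered by the same try/except. This is only a sketch. The name `safe` is made up for illustration, and the `SimpleNamespace` objects are stand-ins for the tags that `soup.find_all('li')` would return, so the snippet runs without BeautifulSoup.

```python
from types import SimpleNamespace


def safe(extract, default=None):
    """Call extract() and return default if the lookup fails.

    `safe` is a hypothetical helper name, not part of BeautifulSoup.
    """
    try:
        return extract()
    except (IndexError, AttributeError):
        # IndexError: list too short, or no ':' in the text.
        # AttributeError: the item has no .text attribute.
        return default


# Stand-in for soup.find_all('li'); only two items, so index 3 is missing.
info = [
    SimpleNamespace(text='1 June'),
    SimpleNamespace(text='London'),
]

m_stats = {
    'Date': safe(lambda: info[0].text),
    'Location': safe(lambda: info[1].text),
    'Prize': safe(lambda: info[3].text.split(':')[1].strip()),
}
print(m_stats)
```

Each lambda is evaluated only inside `safe`, so a failure in one lookup cannot affect the others, and the default can be changed per key with the `default` argument.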
Create your own class:
class Stats(dict):
    tournament_info = []

    def __init__(self, tournament_info, **kwargs):
        super(Stats, self).__init__(**kwargs)
        self.tournament_info = tournament_info
        self['Date'] = self.get_tournament_info_text(0)
        self['Location'] = self.get_tournament_info_text(1)
        prize = self.get_tournament_info_text(2)
        if prize is not None:
            prize = prize.split(':')[1].strip()
        self['Prize'] = prize

    def get_tournament_info_text(self, index):
        try:
            return self.tournament_info[index]['text']
        except (IndexError, KeyError):
            # Missing list item or missing 'text' key.
            return None

# Sample data standing in for the scraped items.
tournament_info = [
    {
        'text': 'aaa'
    },
    {},
    {
        'text': 'bbb:ccc '
    }
]
m_stats = Stats(tournament_info)
print(m_stats)
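The solution I went with was to create a blank template dictionary (actually JSON) with every key set to 'None'.
Each time a page is scraped, m_stats is first initialised from this blank dictionary (loaded from JSON). If an exception occurs, it is simply passed over (with some logging) and the value stays 'None'. That way there is no need to assign 'None' explicitly every time.
Not sure whether it is right to mark this as the "answer", since it is quite specific to my needs, but I did so anyway.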
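For anyone who wants to try the template approach, here is a rough sketch of how it might look. It is not the exact code I use: `TEMPLATE_JSON`, `EXTRACTORS` and `scrape` are names invented for this example, the template is an inline string rather than a file, it uses JSON null (so real `None` rather than the string 'None'), and `SimpleNamespace` objects stand in for the tags from `soup.find_all('li')`.

```python
import json
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)

# Hypothetical blank template; in practice it would be loaded from a file.
# JSON null becomes Python None when parsed.
TEMPLATE_JSON = '{"Date": null, "Location": null, "Prize": null}'

# One extractor per key; each may raise if the page is missing data.
EXTRACTORS = {
    'Date': lambda info: info[0].text,
    'Location': lambda info: info[1].text,
    'Prize': lambda info: info[3].text.split(':')[1].strip(),
}


def scrape(info):
    """Fill a fresh copy of the template; failed keys keep their default."""
    m_stats = json.loads(TEMPLATE_JSON)  # a new dict on every call
    for key, extract in EXTRACTORS.items():
        try:
            m_stats[key] = extract(info)
        except (IndexError, AttributeError) as exc:
            # Log and move on; the template value stays in place.
            logger.warning('Could not extract %s: %r', key, exc)
    return m_stats


# Stand-ins for the tags returned by soup.find_all('li').
full_page = [SimpleNamespace(text=t)
             for t in ('1 June', 'London', 'ignored', 'Prize: 100')]
print(scrape(full_page))
print(scrape([]))
```

Parsing the template on every call means each page gets its own dictionary, so values from one scrape cannot leak into the next.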