Edit BS4 to return None when a table is not available in a webpage
The following code works when an h2 tag with the text "Content Logical Definition" is present in the HTML page, for example https://www.hl7.org/fhir/valueset-account-status.html:
import requests
from bs4 import BeautifulSoup

def extract_table(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'lxml')
    h2 = soup.find(lambda elm: elm.name == 'h2' and 'Content Logical Definition' in elm.text)
    div = h2.find_next_sibling('div')
    return div.find('table')
However, for a page that does not contain an h2 with "Content Logical Definition", such as https://www.hl7.org/fhir/valueset-cpt-all.html, it raises the following error:
'NoneType' object has no attribute 'find_next_sibling'
How can I edit the code so that it returns None for the table when the page has no h2 with "Content Logical Definition"?
You can do this in two common ways:
LBYL
- look before you leap:
h2 = soup.find(lambda elm: elm.name == 'h2' and 'Content Logical Definition' in elm.text)
return h2.find_next_sibling('div').find('table') if h2 else None
EAFP
- easier to ask for forgiveness than permission:
try:
    h2 = soup.find(lambda elm: elm.name == 'h2' and 'Content Logical Definition' in elm.text)
    div = h2.find_next_sibling('div')
    return div.find('table')
except AttributeError:
    return None
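
For reference, here is a minimal sketch of the explicit-check approach written as a complete function. It makes a few assumptions that are not in the original question: the function name `extract_table_from_html` is hypothetical, it takes an HTML string instead of a URL so it can be tried without a network request, it uses the built-in `html.parser` rather than `lxml`, and it also guards against the `div` being missing.

```python
from typing import Optional

from bs4 import BeautifulSoup
from bs4.element import Tag


def extract_table_from_html(html: str) -> Optional[Tag]:
    """Return the table that follows the 'Content Logical Definition' h2, or None.

    Hypothetical helper for illustration; it accepts raw HTML instead of a URL.
    """
    soup = BeautifulSoup(html, "html.parser")
    h2 = soup.find(
        lambda elm: elm.name == "h2" and "Content Logical Definition" in elm.text
    )
    if h2 is None:
        # No matching heading on this page.
        return None
    div = h2.find_next_sibling("div")
    if div is None:
        # Heading exists but no sibling div follows it.
        return None
    # find() already returns None when the div contains no table.
    return div.find("table")


# Made-up sample pages, not taken from hl7.org.
with_table = """
<h2>Content Logical Definition</h2>
<div><table><tr><td>active</td></tr></table></div>
"""
without_heading = "<h2>Something else</h2><div><table></table></div>"

print(extract_table_from_html(with_table) is not None)  # True
print(extract_table_from_html(without_heading))         # None
```

To use it with a live page, you could pass `requests.get(url).content` as the argument. Compared with catching `AttributeError`, the explicit checks make it clear which step came back empty.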