How do I extract the contents of rel in an <a> tag?
<a href="#" class="tip" rel="
Principal Name - S. BALKAR SINGH
Mobile No. - 8146611008
Email ID - gsssdhapaiasr@gmail.com
" style="user-select: text;">View Contact Details<span
class="caret"></span></a>
The principal name, mobile number, and email ID are what I'm interested in. When I use soup.find('a', {'class':'tip'}), it only gives me "View Contact Details".
Is there a way to extract the contents of rel?
rel is an attribute, so you have to use ['rel'], i.e. soup.find('a', {'class':'tip'})['rel']
Working example:
data = '''<a href="#" class="tip" rel="
Principal Name - S. BALKAR SINGH
Mobile No. - 8146611008
Email ID - gsssdhapaiasr@gmail.com
" style="user-select: text;">View Contact Details<span
class="caret"></span></a>'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'html.parser')
item = soup.find('a', {'class':'tip'})
print('text:', item.text)
print(' rel:', item['rel'])
print(' rel:', ' '.join(item['rel']))
Result:
text: View Contact Details
rel: ['', 'Principal', 'Name', '-', 'S.', 'BALKAR', 'SINGH', 'Mobile', 'No.', '-', '8146611008', 'Email', 'ID', '-', 'gsssdhapaiasr@gmail.com', '']
rel: Principal Name - S. BALKAR SINGH Mobile No. - 8146611008 Email ID - gsssdhapaiasr@gmail.com
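BS returns a list for rel, not a string, because it treats rel as one of the Multi-valued attributes and splits the value on whitespace. That is also why the list starts and ends with an empty string here: the attribute value begins and ends with a line break.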
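If you would rather get rel back as a single string with its line breaks intact, one possible approach is to switch the splitting off when building the soup. This is only a sketch: it assumes your bs4 version accepts the multi_valued_attributes constructor argument (added in 4.8.0, as far as I know), and the details dict and the ' - ' separator handling are my own illustration based on the sample markup, not something the site guarantees.

```python
from bs4 import BeautifulSoup

data = '''<a href="#" class="tip" rel="
Principal Name - S. BALKAR SINGH
Mobile No. - 8146611008
Email ID - gsssdhapaiasr@gmail.com
" style="user-select: text;">View Contact Details<span
class="caret"></span></a>'''

# multi_valued_attributes=None tells bs4 not to split any attribute on whitespace,
# so rel (and class) come back as plain strings
soup = BeautifulSoup(data, 'html.parser', multi_valued_attributes=None)
item = soup.find('a', {'class': 'tip'})

# each non-empty line looks like "label - value" in the sample markup (assumption)
details = {}
for line in item['rel'].strip().splitlines():
    label, _, value = line.partition(' - ')
    details[label.strip()] = value.strip()

print(details)
```

Keep in mind that this setting applies to the whole soup, so item['class'] is a string as well, not a list.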
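EDIT: To get the table data you have to send a POST request together with all the data the browser normally sends to the server. That means the form fields: a value can even be an empty string, but the server has to receive every field.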
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0'}

# form fields sent to the server (values may be empty, but the fields must be present)
params = {
    'SchoolType': '',
    'Dist1': '',
    'Sch1': '',
    'SearchString': ''
}

r = requests.post('http://www.registration.pseb.ac.in/School/Schoollist', headers=headers, data=params)
soup = BeautifulSoup(r.text, 'html.parser')

all_a = soup.find_all('a', {'class':'tip'})

# print the link text and the rel contents of every matching link
for item in all_a:
    print('text:', item.text)
    print(' rel:', item['rel'])
    print(' rel:', ' '.join(item['rel']))
    print('-----')
Result (one block like this is printed for each matching link):
text: View Contact Details
rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
rel: Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM
-----
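To find out which form fields the server expects, you could read them from the form markup itself and not copy them by hand from the browser's network tab. The following is only a rough sketch: form_html is made-up markup standing in for the real search page (I have not reproduced the site's actual form), and form_fields is a hypothetical helper, not part of requests or bs4.

```python
from bs4 import BeautifulSoup

# hypothetical markup standing in for the real search form
form_html = '''
<form action="/School/Schoollist" method="post">
  <select name="SchoolType"><option value="">All</option><option value="G">Govt</option></select>
  <select name="Dist1"><option value="" selected="selected">All</option></select>
  <select name="Sch1"></select>
  <input type="text" name="SearchString" value="">
  <input type="submit" value="Search">
</form>
'''

def form_fields(html):
    """Return {name: default value} for every named control in the first form."""
    form = BeautifulSoup(html, 'html.parser').find('form')
    fields = {}
    for control in form.find_all(['input', 'select', 'textarea']):
        name = control.get('name')
        if not name:
            continue  # controls without a name are never submitted
        if control.name == 'select':
            # use the selected option, falling back to the first one
            chosen = control.find('option', selected=True) or control.find('option')
            fields[name] = chosen.get('value', '') if chosen else ''
        elif control.name == 'textarea':
            fields[name] = control.get_text()
        else:
            fields[name] = control.get('value', '')
    return fields

params = form_fields(form_html)
print(params)
```

The resulting dict can be passed as data=params to requests.post(), just like the hand-written one.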