Stuck trying to select an HTML element of a given class with XPath: where am I going wrong?
After doing some research on selecting an HTML element of a given class, I'm stuck: I can't select the table element with the class name "treasuries-table" from https://www.buybitcoinworldwide.com/treasuries/. I noticed that there is
<table class="treasuries-index treasuries-table treasuries-table--smaller">
on the page, but it is not my target; only
<table class="treasuries-table">
is.
I tried
//table[contains(concat(" ", normalize-space(@class), " "), " treasuries-table ")]
as well as the less verbose, but not necessarily correct,
//table[@class="treasuries-table"]
, all to no avail.
Where am I going wrong? I usually start by locating the element with this online tester; could the tester be the problem?
P.S. Sorry if this looks like a duplicate, but the solutions mentioned in similar questions don't seem to work for me.
Actually, '//table[contains(@class,"treasuries-table")]'
selects 5 tables, whereas '//table[@class="treasuries-index treasuries-table treasuries-table--smaller"]' selects only the first table, which is equivalent to (//table[contains(@class,"treasuries-table")])[1]
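To make the difference between these predicates concrete, here is a minimal sketch using only Python's standard library. The two-table snippet is a made-up stand-in for the real page (an assumption, not the actual markup), and because `xml.etree.ElementTree` supports only a small XPath subset, the substring and token tests are emulated in plain Python rather than run as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page: one "index" table and one target table.
SNIPPET = """
<html><body>
  <table class="treasuries-index treasuries-table treasuries-table--smaller"><tr><td>index</td></tr></table>
  <table class="treasuries-table"><tr><td>target</td></tr></table>
</body></html>
""".strip()

root = ET.fromstring(SNIPPET)
tables = root.findall(".//table")

# Like //table[@class="treasuries-table"]: the whole attribute must equal the string.
exact = root.findall(".//table[@class='treasuries-table']")

# Like contains(@class, "treasuries-table"): a plain substring test.
substring = [t for t in tables if "treasuries-table" in t.get("class", "")]

# Like the concat/normalize-space idiom: match one whitespace-separated class token.
token = [t for t in tables if "treasuries-table" in t.get("class", "").split()]

print(len(exact), len(substring), len(token))  # prints: 1 2 2
```

In this sketch both the substring test and the token test also match the index table, because `treasuries-table` is one of its class tokens as well; only the exact comparison isolates the target.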
Try:
'//table[contains(@class,"treasuries-table")]'
Have you tried this?
table[class*="treasuries-table"]
or
table[class*=" treasuries-table "]
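For what it's worth, the CSS attribute operators behave differently from one another here. The sketch below only mimics the matching rules of `=`, `*=` and `~=` on a single attribute value; `css_attr_match` is a hypothetical helper written for illustration, not part of any library, and the two class strings are the ones quoted in the question.

```python
def css_attr_match(value: str, op: str, needle: str) -> bool:
    """Mimic three CSS attribute-selector operators on one attribute value."""
    if op == "=":    # [class="x"]: the whole attribute equals x
        return value == needle
    if op == "*=":   # [class*="x"]: x occurs anywhere as a substring
        return needle in value
    if op == "~=":   # [class~="x"]: x is one of the whitespace-separated words
        return needle in value.split()
    raise ValueError(f"unsupported operator: {op}")

index_cls = "treasuries-index treasuries-table treasuries-table--smaller"
target_cls = "treasuries-table"

checks = [
    ("*=", "treasuries-table"),    # substring, no padding
    ("*=", " treasuries-table "),  # substring padded with spaces
    ("~=", "treasuries-table"),    # class-token match
    ("=", "treasuries-table"),     # exact attribute value
]
for op, needle in checks:
    print(op, repr(needle),
          css_attr_match(index_cls, op, needle),
          css_attr_match(target_cls, op, needle))
```

Under these rules the space-padded `*=` form matches the index table but not the target, since the target's attribute value has no surrounding spaces, and only the plain `=` form matches the target alone.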
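Hover over the element in the Inspect window, then right-click > Copy > Copy XPath.
I'm not sure which scraper you are using, but this returned the data for me using Beautiful Soup:
# soup is assumed to be a BeautifulSoup object built from the page HTML
sp = soup.find_all("table", attrs={"class":"treasuries-table"})[1]
This returns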
Entity Country Symbol:Exchange Filings & Sources # of BTC Value Today % of 21m MicroStrategy MSTR:NADQ Filing | News 129,218 129218 0.615% Tesla, Inc TSLA:NADQ Filing | News 42,902 42902 0.204% Galaxy Digital Holdings BRPHF:OTCMKTS Filing | News 16,400 16400 0.078% Voyager Digital LTD VOYG:TSX Filing | News 12,260 12260 0.058% Marathon Digital Holdings Inc MARA:NADQ Filing | News 9,373 9373 0.045% Square Inc. SQ:NYSE Filing | News 8,027 8027 0.038% Hut 8 Mining Corp HUT:NASDAQ Filing | News 6,460 6460 0.031% Riot Blockchain, Inc. RIOT:NADQ Filing | News 6,320 6320 0.03% Bitfarms Limited BITF:NASDAQ Filing | News 5,646 5646 0.027% Core Scientific CORZ:NASDAQ Filing | News 5,296 5296 0.025% Coinbase Global, Inc. COIN:NADQ Filing | News 4,482 4482 0.021% Bitcoin Group SE BTGGF:TCMKTS Filing | News 3,947 3947 0.019% Hive Blockchain HIVE:NASDAQ Filing | News 2,832 2832 0.013% Argo Blockchain PLC ARBKF:OTCMKTS Filing | News 2,685 2685 0.013% NEXON Co. Ltd NEXOF:OTCMKTS Filing | News 1,717 1717 0.008% Exodus Movement Inc :OTCMKTS Filing | News 1,300 1300 0.006% Brooker Group's BROOK (BKK) BROOK:BKK Filing | News 1,150 1150 0.005% Meitu HKD:HKG Filing | News 941 941 0.004% Bit Digital, Inc. BTBT:NASDAQ Filing | News 832 832 0.004% Digihost Technology Inc.
HSSHF:OTCMKTS Filing | News 797 797 0.004% BIGG Digital Assets Inc. BBKCF:OTCMKTS Filing | News 575 575 0.003% DMG Blockchain Solutions Inc. DMGGF:OTCMKTS Filing | News 432 432 0.002%
CleanSpark Inc CLSK:NASDAQ Filing | News 420 420 0.002% Cypherpunk Holdings Inc. HODL:OTCMKTS Filing | News 386 386 0.002% Advanced Bitcoin Technologies AG ABT:DUS Filing | News 254 254 0.001% DigitalX DGGXF:OTCMKTS Filing | News 216 216 0.001% Neptune Digital Assets NPPTF:OTCMKTS Filing | News 194 194 0.001% Cathedra Bitcoin Inc (Fortress Blockchain) CBIT:CVE Filing | News 169 169 0.001% MercadoLibre, Inc. MELI:NADQ Filing | News 150 150 0.001% LQwD
FinTech Corp OTC:INLAF Filing | News 139 139 0.001% Banxa Holdings Inc BNXAF:OTCMKTS Filing | News 136 136 0.001% Phunware, Inc. PHUN:NASDAQ Filing | News 127 127 0.001% BTCS Inc.
BTCS:OTCMKTS Filing | News 90 90 0.0% FRMO Corp. FRMO:OTCMKTS Filing | News 63 63 0.0% Canada Computational Unlimited Corp. SATO:TSXV Filing | News 37 37 0.0% Metromile MILE:NASDAQ Filing | News 25 25 0.0% MOGO Financing MOGO:Nasdaq Filing | News 18 18 0.0% Net Holding
Anonim Sirketi NTHOL TI:IST Filing | News 3 3 0.0% Totals: 266019 266019 1.267%
Anyway, just add [1] to get the second element of the list: two tables share the same class name, so the first is at index 0 and the second at index 1.
For Selenium:
driver.find_elements(By.CLASS_NAME, value="treasuries-table")[1]
Hope this helps.
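One thing that is easy to trip over: a Python list index of [1] means the second match, whereas an XPath positional predicate like (...)[1] means the first. A small standard-library sketch of the list side, assuming a made-up two-table snippet and a hypothetical `TableClassCollector` helper that matches on class tokens (roughly how I understand class lookups to behave in Beautiful Soup and Selenium):

```python
from html.parser import HTMLParser

class TableClassCollector(HTMLParser):
    """Collect the class attribute of every <table> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.table_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # A missing class attribute is stored as an empty string.
            self.table_classes.append(dict(attrs).get("class") or "")

# Hypothetical stand-in for the page, not the real markup.
SNIPPET = """
<table class="treasuries-index treasuries-table treasuries-table--smaller"></table>
<table class="treasuries-table"></table>
"""

collector = TableClassCollector()
collector.feed(SNIPPET)

# Keep tables that carry the class token, whatever other classes they have.
matches = [c for c in collector.table_classes if "treasuries-table" in c.split()]

first = matches[0]   # the index table
second = matches[1]  # the table whose class is exactly "treasuries-table"
print(second)
```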
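Do you need to use XPath here? Or even Selenium? I would also consider using pandas to parse the <table> tags. This returns 4 tables with the treasuries-table class. I'm just printing the first one here.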
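from io import StringIO
import pandas as pd
import requests
response = requests.get('https://www.buybitcoinworldwide.com/treasuries/')
# Wrap the HTML in StringIO: recent pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(response.text), attrs={'class':'treasuries-table'})[0]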
Output:
print(df)
Entity Country ... Value Today % of 21m
0 NaN NaN ... 266019 1.267%
1 MicroStrategy NaN ... 129218 0.615%
2 Tesla, Inc NaN ... 42902 0.204%
3 Galaxy Digital Holdings NaN ... 16400 0.078%
4 Voyager Digital LTD NaN ... 12260 0.058%
5 Marathon Digital Holdings Inc NaN ... 9373 0.045%
6 Square Inc. NaN ... 8027 0.038%
7 Hut 8 Mining Corp NaN ... 6460 0.031%
8 Riot Blockchain, Inc. NaN ... 6320 0.03%
9 Bitfarms Limited NaN ... 5646 0.027%
10 Core Scientific NaN ... 5296 0.025%
11 Coinbase Global, Inc. NaN ... 4482 0.021%
12 Bitcoin Group SE NaN ... 3947 0.019%
13 Hive Blockchain NaN ... 2832 0.013%
14 Argo Blockchain PLC NaN ... 2685 0.013%
15 NEXON Co. Ltd NaN ... 1717 0.008%
16 Exodus Movement Inc NaN ... 1300 0.006%
17 Brooker Group's BROOK (BKK) NaN ... 1150 0.005%
18 Meitu NaN ... 941 0.004%
19 Bit Digital, Inc. NaN ... 832 0.004%
20 Digihost Technology Inc. NaN ... 797 0.004%
21 BIGG Digital Assets Inc. NaN ... 575 0.003%
22 DMG Blockchain Solutions Inc. NaN ... 432 0.002%
23 CleanSpark Inc NaN ... 420 0.002%
24 Cypherpunk Holdings Inc. NaN ... 386 0.002%
25 Advanced Bitcoin Technologies AG NaN ... 254 0.001%
26 DigitalX NaN ... 216 0.001%
27 Neptune Digital Assets NaN ... 194 0.001%
28 Cathedra Bitcoin Inc (Fortress Blockchain) NaN ... 169 0.001%
29 MercadoLibre, Inc. NaN ... 150 0.001%
30 LQwD FinTech Corp NaN ... 139 0.001%
31 Banxa Holdings Inc NaN ... 136 0.001%
32 Phunware, Inc. NaN ... 127 0.001%
33 BTCS Inc. NaN ... 90 0.0%
34 FRMO Corp. NaN ... 63 0.0%
35 Canada Computational Unlimited Corp. NaN ... 37 0.0%
36 Metromile NaN ... 25 0.0%
37 MOGO Financing NaN ... 18 0.0%
38 Net Holding Anonim Sirketi NaN ... 3 0.0%
[39 rows x 7 columns]