Stuck trying to select an HTML element of a given class with XPath: where am I wrong?

After researching how to select an HTML element of a given class, I'm stuck: I can't select the table element with the class name "treasuries-table" from https://www.buybitcoinworldwide.com/treasuries/. I noticed there is

<table class="treasuries-index treasuries-table treasuries-table--smaller">

on the page, but it is not my target; only

<table class="treasuries-table">

is. I tried

//table[contains(concat(" ", normalize-space(@class), " "), " treasuries-table ")]

as well as the less verbose, but not necessarily correct

//table[@class="treasuries-table"]

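all to no avail.
Where am I wrong? I usually look for the element with this online tester first; could the tester be the problem? P.S. Sorry if this looks like a duplicate, but the solutions mentioned in similar questions don't seem to work for me.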

Actually, '//table[contains(@class,"treasuries-table")]' selects 5 tables, whereas '//table[@class="treasuries-index treasuries-table treasuries-table--smaller"]' selects only the first table, which is equivalent to (//table[contains(@class,"treasuries-table")])[1]
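
To illustrate why these expressions select different sets of tables, here is a minimal sketch using only Python's standard library. The markup in SAMPLE is a hypothetical, simplified stand-in and not a copy of the real page. Because xml.etree.ElementTree does not support contains(), the substring and token tests are emulated in plain Python, while the exact match uses a real ElementTree path.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; NOT a copy of the real page.
SAMPLE = """
<html><body>
  <table class="treasuries-index treasuries-table treasuries-table--smaller"><tr><td>index</td></tr></table>
  <table class="treasuries-table"><tr><td>first</td></tr></table>
  <table class="treasuries-table"><tr><td>second</td></tr></table>
</body></html>
"""

root = ET.fromstring(SAMPLE)
tables = list(root.iter("table"))

# Substring test, like contains(@class, "treasuries-table").
substring = [t for t in tables if "treasuries-table" in t.get("class", "")]

# Whole-token test, like the concat(" ", normalize-space(@class), " ") idiom.
token = [t for t in tables if "treasuries-table" in t.get("class", "").split()]

# Exact attribute match, like @class="treasuries-table".
exact = root.findall(".//table[@class='treasuries-table']")

print(len(substring), len(token), len(exact))  # 3 3 2
```

In this sample, both the substring test and the token test also pick up the index table, because "treasuries-table" is one of its class tokens. Only the exact comparison excludes it.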

Try:

'//table[contains(@class,"treasuries-table")]'

Have you tried this?

table[class*="treasuries-table"]

table[class*=" treasuries-table "]
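
As a rough sketch of how these two CSS selectors behave: the [attr*="value"] form matches when the attribute value contains the given string as a substring. The helper attr_contains below is a hypothetical name that only emulates that rule in plain Python; it is not part of any library.

```python
def attr_contains(attr_value: str, needle: str) -> bool:
    """Emulate the CSS [attr*="needle"] rule (hypothetical helper).

    An empty needle never matches, mirroring the behaviour of the selector.
    """
    return needle != "" and needle in attr_value


plain = "treasuries-table"
index = "treasuries-index treasuries-table treasuries-table--smaller"

# class*="treasuries-table" matches both class attributes.
print(attr_contains(plain, "treasuries-table"))    # True
print(attr_contains(index, "treasuries-table"))    # True

# class*=" treasuries-table " needs a space on both sides, so it only
# matches when the class sits in the middle of a multi-class attribute.
print(attr_contains(plain, " treasuries-table "))  # False
print(attr_contains(index, " treasuries-table "))  # True
```

Under that rule, the space-padded variant would skip a table whose class attribute is exactly "treasuries-table" and match only the multi-class one.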

In the browser's Inspect window, hover over the element, then right-click > Copy > Copy XPath

I'm not sure which scraper you're using, but Beautiful Soup returns the data for me:

sp = soup.find_all("table", attrs={"class":"treasuries-table"})[1]

This returns

 Entity Country Symbol:Exchange Filings & Sources # of BTC Value Today % of 21m    MicroStrategy    MSTR:NADQ Filing | News 129,218 129218 0.615%   Tesla, Inc    TSLA:NADQ Filing | News 42,902 42902 0.204%   Galaxy Digital Holdings    BRPHF:OTCMKTS Filing | News 16,400 16400 0.078%   Voyager Digital LTD    VOYG:TSX Filing | News 12,260 12260 0.058%   Marathon Digital Holdings Inc    MARA:NADQ Filing | News 9,373 9373 0.045%   Square Inc.    SQ:NYSE Filing | News 8,027 8027 0.038%   Hut 8 Mining Corp    HUT:NASDAQ Filing | News 6,460 6460 0.031%   Riot Blockchain, Inc.    RIOT:NADQ Filing | News 6,320 6320 0.03%   Bitfarms Limited    BITF:NASDAQ Filing | News 5,646 5646 0.027%   Core Scientific    CORZ:NASDAQ Filing | News 5,296 5296 0.025%   Coinbase Global, Inc.    COIN:NADQ Filing | News 4,482 4482 0.021%   Bitcoin Group SE    BTGGF:TCMKTS Filing | News 3,947 3947 0.019%   Hive Blockchain    HIVE:NASDAQ Filing | News 2,832 2832 0.013%   Argo Blockchain PLC    ARBKF:OTCMKTS Filing | News 2,685 2685 0.013%   NEXON Co. Ltd    NEXOF:OTCMKTS Filing | News 1,717 1717 0.008%   Exodus Movement Inc    :OTCMKTS Filing | News 1,300 1300 0.006%   Brooker Group's BROOK (BKK)    BROOK:BKK Filing | News 1,150 1150 0.005%   Meitu    HKD:HKG Filing | News 941 941 0.004%   Bit Digital, Inc.    BTBT:NASDAQ Filing | News 832 832 0.004%   Digihost Technology Inc.    
HSSHF:OTCMKTS Filing | News 797 797 0.004%   BIGG Digital Assets Inc.    BBKCF:OTCMKTS Filing | News 575 575 0.003%   DMG Blockchain Solutions Inc.    DMGGF:OTCMKTS Filing | News 432 432 0.002%   
CleanSpark Inc    CLSK:NASDAQ Filing | News 420 420 0.002%   Cypherpunk Holdings Inc.    HODL:OTCMKTS Filing | News 386 386 0.002%   Advanced Bitcoin Technologies AG    ABT:DUS Filing | News 254 254 0.001%   DigitalX    DGGXF:OTCMKTS Filing | News 216 216 0.001%   Neptune Digital Assets    NPPTF:OTCMKTS Filing | News 194 194 0.001%   Cathedra Bitcoin Inc (Fortress Blockchain)    CBIT:CVE Filing | News 169 169 0.001%   MercadoLibre, Inc.    MELI:NADQ Filing | News 150 150 0.001%   LQwD 
FinTech Corp    OTC:INLAF Filing | News 139 139 0.001%   Banxa Holdings Inc    BNXAF:OTCMKTS Filing | News 136 136 0.001%   Phunware, Inc.    PHUN:NASDAQ Filing | News 127 127 0.001%   BTCS Inc.  
  BTCS:OTCMKTS Filing | News 90 90 0.0%   FRMO Corp.    FRMO:OTCMKTS Filing | News 63 63 0.0%   Canada Computational Unlimited Corp.    SATO:TSXV Filing | News 37 37 0.0%   Metromile    MILE:NASDAQ Filing | News 25 25 0.0%   MOGO Financing    MOGO:Nasdaq Filing | News 18 18 0.0%   Net Holding 
Anonim Sirketi    NTHOL TI:IST Filing | News 3 3 0.0%       Totals: 266019 266019  1.267%

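In any case, just add [1] to get the second element of the list: two tables share the same class name, so the first has index 0 and the second has index 1. For Selenium: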

driver.find_elements(By.CLASS_NAME, value="treasuries-table")[1]
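
The indexing point can be sketched without a browser. The markup below is a hypothetical stand-in with two tables that share the same class. Note that Python lists are 0-based while XPath positions are 1-based, so the pure-XPath equivalent of [1] here would be (//table[@class="treasuries-table"])[2].

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with two tables sharing the same class.
SAMPLE = """
<div>
  <table class="treasuries-table"><tr><td>first</td></tr></table>
  <table class="treasuries-table"><tr><td>second</td></tr></table>
</div>
"""

root = ET.fromstring(SAMPLE)

# findall returns the matches in document order as a Python list.
matches = root.findall(".//table[@class='treasuries-table']")

# Python lists are 0-based, so [1] is the second matching table.
second = matches[1]
print(second.findtext(".//td"))  # second
```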

Hope this helps.

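Do you need to use XPath here? Or even Selenium? I would also consider using pandas to parse the <table> tags. This returns 4 tables with the treasuries-table class. I'm just printing the first one here.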

import pandas as pd
import requests

response = requests.get('https://www.buybitcoinworldwide.com/treasuries/')
df = pd.read_html(response.text, attrs={'class':'treasuries-table'})[0]

Output:

print(df)
                                        Entity  Country  ... Value Today % of 21m
0                                          NaN      NaN  ...      266019   1.267%
1                                MicroStrategy      NaN  ...      129218   0.615%
2                                   Tesla, Inc      NaN  ...       42902   0.204%
3                      Galaxy Digital Holdings      NaN  ...       16400   0.078%
4                          Voyager Digital LTD      NaN  ...       12260   0.058%
5                Marathon Digital Holdings Inc      NaN  ...        9373   0.045%
6                                  Square Inc.      NaN  ...        8027   0.038%
7                            Hut 8 Mining Corp      NaN  ...        6460   0.031%
8                        Riot Blockchain, Inc.      NaN  ...        6320    0.03%
9                             Bitfarms Limited      NaN  ...        5646   0.027%
10                             Core Scientific      NaN  ...        5296   0.025%
11                       Coinbase Global, Inc.      NaN  ...        4482   0.021%
12                            Bitcoin Group SE      NaN  ...        3947   0.019%
13                             Hive Blockchain      NaN  ...        2832   0.013%
14                         Argo Blockchain PLC      NaN  ...        2685   0.013%
15                               NEXON Co. Ltd      NaN  ...        1717   0.008%
16                         Exodus Movement Inc      NaN  ...        1300   0.006%
17                 Brooker Group's BROOK (BKK)      NaN  ...        1150   0.005%
18                                       Meitu      NaN  ...         941   0.004%
19                           Bit Digital, Inc.      NaN  ...         832   0.004%
20                    Digihost Technology Inc.      NaN  ...         797   0.004%
21                    BIGG Digital Assets Inc.      NaN  ...         575   0.003%
22               DMG Blockchain Solutions Inc.      NaN  ...         432   0.002%
23                              CleanSpark Inc      NaN  ...         420   0.002%
24                    Cypherpunk Holdings Inc.      NaN  ...         386   0.002%
25            Advanced Bitcoin Technologies AG      NaN  ...         254   0.001%
26                                    DigitalX      NaN  ...         216   0.001%
27                      Neptune Digital Assets      NaN  ...         194   0.001%
28  Cathedra Bitcoin Inc (Fortress Blockchain)      NaN  ...         169   0.001%
29                          MercadoLibre, Inc.      NaN  ...         150   0.001%
30                           LQwD FinTech Corp      NaN  ...         139   0.001%
31                          Banxa Holdings Inc      NaN  ...         136   0.001%
32                              Phunware, Inc.      NaN  ...         127   0.001%
33                                   BTCS Inc.      NaN  ...          90     0.0%
34                                  FRMO Corp.      NaN  ...          63     0.0%
35        Canada Computational Unlimited Corp.      NaN  ...          37     0.0%
36                                   Metromile      NaN  ...          25     0.0%
37                              MOGO Financing      NaN  ...          18     0.0%
38                  Net Holding Anonim Sirketi      NaN  ...           3     0.0%

[39 rows x 7 columns]