JSOUP .attr() method doesn't extract data from working HTML

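I have a problem with the .attr() method: it doesn't work for any attribute other than "class". I tried to extract the "alt" attribute to get the shop names, but it doesn't work. I tried the same with "src" and "data-original", and nothing was printed.

Here is the full method I use to extract the data.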

public List<String> getShops() {
    Elements elements = document.select(".store-logo");
    System.out.println(elements.html());
    for (Element image : elements) {
        System.out.println(image.attr("alt"));
    }
    return null;
}

To make sure I wasn't working with an empty document, I printed the HTML of all the selected elements, which looks like this:

<img src="//image.ceneostatic.pl/imageschain/data/shops_s/20853/logo.jpg;data/custom_images/590/custom_image.png" alt="nalepsze.pl">
<img src="/content/img/icons/pix-empty.png" alt="allegro.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/20136/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="avans.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/18601/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="proshop.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/29068/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="g2a.com" data-original="//image.ceneostatic.pl/imageschain/data/shops/23040/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="fotosoft.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/3914/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="techsat24.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/5666/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="imperiumpc.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/12579/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="domsary.eu" data-original="//image.ceneostatic.pl/imageschain/data/shops/4725/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="net-s.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/3653/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="sferis.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/4614/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="morele.net" data-original="//image.ceneostatic.pl/imageschain/data/shops/379/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="zakupy.vip" data-original="//image.ceneostatic.pl/imageschain/data/shops/29402/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="fotoelektro.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/1671/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="3kropki.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/357/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="electro.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/16202/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="allegro.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/20136/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="avans.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/18601/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">
<img src="/content/img/icons/pix-empty.png" alt="proshop.pl" data-original="//image.ceneostatic.pl/imageschain/data/shops/29068/logo.jpg;data/custom_images/585/custom_image.png" class="js_lazy">

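The data from this extraction is correct, but the next step, the for-each loop, doesn't work: I get an empty string for every element, which is strange because I can extract the "class" attribute.

I would appreciate any hints on this topic.

P.S. The jsoup version is 1.11.3.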

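In your example you select by the class "store-logo", but none of the img elements in the HTML you attached have this class. If you replace the class name with "js_lazy", your code extracts the alt attribute.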
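A likely explanation, though the full page markup is not shown in the question: Elements.html() returns the inner HTML of the matched elements, so the printed img tags are probably children of the ".store-logo" elements rather than the matched elements themselves. That would also explain why attr("class") returns a value while attr("alt") is empty. Note too that the first img in the output has no "js_lazy" class, so selecting by that class alone would skip it. The sketch below assumes ".store-logo" is a wrapper around each image; the class name ShopNames and the sample markup are hypothetical, made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ShopNames {

    // Collects the alt text of every <img> nested inside a .store-logo element.
    // Assumption: .store-logo is a wrapper around the image, not the image itself.
    static List<String> getShops(Document document) {
        List<String> shops = new ArrayList<>();
        // "img[alt]" keeps only images that actually carry an alt attribute.
        for (Element image : document.select(".store-logo img[alt]")) {
            shops.add(image.attr("alt"));
        }
        return shops;
    }

    public static void main(String[] args) {
        // Hypothetical markup modelled on the output shown in the question.
        String html =
              "<div class=\"store-logo\"><img src=\"a.jpg\" alt=\"nalepsze.pl\"></div>"
            + "<div class=\"store-logo\"><img src=\"b.png\" alt=\"allegro.pl\" class=\"js_lazy\"></div>";
        Document document = Jsoup.parse(html);
        System.out.println(getShops(document));
    }
}
```

With the sample markup above, both shop names are collected, including the image that has no "js_lazy" class. To check which elements a selector really matches, printing elements.outerHtml() instead of elements.html() shows the matched tags themselves.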