Pinterest web scrape image

I'm trying to get the URL of an image from a user's public Pinterest profile page and send that URL, but it returns undefined.

My code:

const Command = require("../../structures/Command");
const cheerio = require("cheerio");
const rp = require("request-promise");
const { head } = require("request");

module.exports = class Pinterest extends Command {
  constructor(client) {
    super(client);
    this.client = client;

    this.name = "pinterest";
    this.category = "Dono";
    this.aliases = [];

    this.enabled = true;
    this.guildOnly = true;
  }
  async run({ message, args, prefix, author }, t) {
    if (message.author.id !== "196679829800747017") return;

    const URL = "https://br.pinterest.com/n1cotin3/_created/";
    const headerObj = {
      uri: URL
    };
    rp(headerObj).then(html => {
      var $ = cheerio.load(html);

      const avatar = $("#mweb-unauth-container > div > div:nth-child(2) > div:nth-child(3) > div.F6l.ZZS.k1A.zI7.iyn.Hsu > div > div > div > div:nth-child(1) > div:nth-child(1) > div > div > div > div > div > a > div > div > div > div > div.XiG.zI7.iyn.Hsu > img").attr("src");
      console.log(avatar);
      message.react(``);
    });
  }
};

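The problem is that the page is still loading when you read it. #mweb-unauth-container > div > div:nth-child(2) does not exist, because #mweb-unauth-container > div has only one child div, and that child is the loading icon. The images are inserted later by client-side JavaScript, which cheerio never runs: it only parses the HTML string you give it. I don't think this is something you can do with cheerio alone; you'll have to use an alternative that can execute JavaScript (e.g. Puppeteer).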
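To illustrate the Puppeteer route, here is a rough sketch rather than a tested scraper. It assumes puppeteer is installed, and the function names, the generic "img" selector and the i.pinimg.com host filter are all my own assumptions about the rendered page, not anything Pinterest documents.

```javascript
// Sketch only: assumes `npm install puppeteer`. The "img" selector and the
// i.pinimg.com host filter are assumptions about the rendered page.

// Keep only the URLs that point at Pinterest's image host.
function keepPinImages(urls) {
  return urls.filter((u) => typeof u === "string" && u.includes("i.pinimg.com"));
}

// Hypothetical helper: render the profile in a headless browser, then read the DOM.
async function scrapeImageUrls(profileUrl) {
  const puppeteer = require("puppeteer"); // loaded lazily, only when scraping
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so client-side rendering has run.
    await page.goto(profileUrl, { waitUntil: "networkidle2" });
    await page.waitForSelector("img");
    // Collect the src of every <img> present in the rendered DOM.
    const srcs = await page.$$eval("img", (imgs) => imgs.map((img) => img.src));
    return keepPinImages(srcs);
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

Inside your run method you could then do something like const urls = await scrapeImageUrls(URL); and log urls[0]. Expect to adjust the selector, since the generated class names in your original selector are likely to change between deployments.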

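Alternatively, if you don't want to scrape the page, you can use the private API (it could change at any time, but it will certainly be more performant):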

https://widgets.pinterest.com/v3/pidgets/users/n1cotin3/pins/

Example:

const res = await requestThatEndpointSomehow();
const images = res.data.pins.map(({ images }) => images['564x']);

// `images` will be a list of URLs.
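For completeness, here is one way the placeholder request could be filled in with request-promise, the library the question already uses. Treat it as a sketch: the response shape ({ data: { pins: [...] } }) is taken from the snippet above, while the function names and the fallback to a .url field are assumptions on my part in case each size entry turns out to be an object rather than a plain string.

```javascript
// Sketch: pull image URLs out of a body shaped like { data: { pins: [...] } }.
// The shape follows the snippet above; the `.url` fallback is an assumption.
function extractImageUrls(body, size = "564x") {
  const pins = (body && body.data && body.data.pins) || [];
  return pins
    .map((pin) => pin.images && pin.images[size])
    // Accept either a plain URL string or an object carrying a `url` field.
    .map((entry) => (entry && typeof entry === "object" ? entry.url : entry))
    .filter((url) => typeof url === "string");
}

// Hypothetical helper: fetch a user's pins from the private widget endpoint.
async function fetchUserImages(username) {
  const rp = require("request-promise"); // loaded lazily, only when fetching
  const body = await rp({
    uri: `https://widgets.pinterest.com/v3/pidgets/users/${username}/pins/`,
    json: true, // have request-promise parse the JSON body
  });
  return extractImageUrls(body);
}
```

In the command you would call const images = await fetchUserImages("n1cotin3"); and send images[0]. Because the endpoint is private, it is worth handling an empty list or a rejected promise gracefully.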