Scrapy: how do you scrape a div that has this _ngcontent attribute before the class?

Question

Suppose the div contains a product name:

<div _ngcontent-serverapp-c225 class="shelfProductTile-content">

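In Scrapy, response.css('div.shelfProductTile-content') returns an empty list. How do you solve this?

Edit: It was suggested that content rendered by JavaScript frameworks such as AngularJS and React cannot be retrieved by Scrapy, and that tools like Splash or Selenium should be used instead. That is true in general, but it was not the cause in my case: I tried both tools and neither solved the problem. The problem was the user agent, which needed to be changed. Please check the accepted answer below. Thanks to everyone who helped.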

Answer 1

The following code should match your element:

response.xpath("//div[@class='shelfProductTile-content']")

Answer 2

I changed the user agent in the settings file, and that solved the problem:

USER_AGENT = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
