Scrapy XPath selector for each row
I'm trying to scrape the page "https://myanimelist.net/anime.php?letter=A". I've found the information I want, but I'd like to get it for each row and strip the \n characters.
for anime in tree.xpath('//*[@id="content"]/div[5]/table//tr'):
    data = {"title" : anime.xpath("//strong//text()").extract(),
            "synopsis" : anime.xpath("//td[2]//text()").extract(),
            "type_" : anime.xpath("//td[3]//text()").extract(),
            "episodes" : anime.xpath("//td[4]//text()").extract(),
            "score" : anime.xpath("//td[5]//text()").extract()}
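Also, I'm not even sure I'm seeing all of the anime that appear on the page.
If someone could also show me a CSS approach, that would be great (the goal is to learn).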
As requested, here is a CSS example for a few of the data points; the others are left for you to explore:
In [1]: fetch('https://myanimelist.net/anime.php?letter=A')
2018-11-06 23:15:40 [scrapy.core.engine] INFO: Spider opened
2018-11-06 23:15:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://myanimelist.net/anime.php?letter=A> (referer: None)
In [2]: for tr_sel in response.css('div.js-categories-seasonal tr ~ tr'):
   ...:     sample_data = {
   ...:         'title': tr_sel.css('a[id] strong::text').extract_first(),
   ...:         'type': tr_sel.css('td:nth-child(3)::text').extract_first(),
   ...:     }
   ...:     print(sample_data)