Is it possible to modify the response content through Scrapy Selector?
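I'm using Scrapy to deep-copy some content from a page: I scrape the content, download the images inside it, and update each image's "original" value accordingly.
For example, I have: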
<div class="A">
<img original="example1.com/1/1.png"></img>
</div>
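I need to download the image and update its "original" value to the new location (for example, mysite.com/1/1.png), then save the content.
What I should end up with is: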
<div class="A">
<img original="mysite.com/1/1.png"></img>
</div>
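as well as the images on my disk.
Is it possible to modify the value through Selector?
Or do I have to download the images first and update the "original" value separately? Is there a better solution?
I got a reply from a Scrapy dev: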
Is it possible to modify the response content through Scrapy Selector?
No.
Selectors are meant to address parts of a document, not to transform it.
Although some elementary things are possible,
like stripping namespaces and running a regex over a string result,
transformations are out of the scope of this project for now
(and practically they will remain so in the near future).
You should look into XSLT or some similar technology.
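
Since the Selector only addresses parts of the document, one possible approach is to take the extracted fragment as a string and rewrite it with a separate parser. The following is only a minimal sketch, not something from the reply above: it uses Python's standard-library xml.etree.ElementTree and assumes the fragment is well-formed markup, as in the example. The function name rewrite_image_sources and its parameters are hypothetical. Real-world HTML is often not well-formed, in which case a lenient parser such as lxml.html would be a better fit.

```python
import xml.etree.ElementTree as ET


def rewrite_image_sources(fragment, old_host, new_host, attr="original"):
    """Rewrite <img> URLs in a well-formed fragment.

    Hypothetical helper: returns the rewritten markup and the list of
    URLs found, so the caller can download them separately.
    """
    root = ET.fromstring(fragment)
    found = []
    for img in root.iter("img"):
        url = img.get(attr)
        if url is None:
            continue
        found.append(url)
        # Only touch URLs that start with the old host.
        if url.startswith(old_host + "/"):
            img.set(attr, new_host + url[len(old_host):])
    # short_empty_elements=False keeps <img ...></img> instead of <img ... />.
    markup = ET.tostring(root, encoding="unicode", short_empty_elements=False)
    return markup, found


html = '<div class="A">\n<img original="example1.com/1/1.png"></img>\n</div>'
rewritten, urls = rewrite_image_sources(html, "example1.com", "mysite.com")
print(rewritten)
print(urls)
```

In a spider callback, the fragment could come from something like response.css("div.A").get(), and the collected URLs could be handed to Scrapy's ImagesPipeline through the item's image_urls field so that the download happens separately from the rewrite.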