Running Scrapy without creating a Scrapy project

Question

I created a Scrapy project in Python, so I have two scripts, dmoz_spider.py and items.py:

$ cat dmoz_spider.py
import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2] + '.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

$ cat items.py
# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html

import scrapy


##class TutorialItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
 ##   pass

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

What I really want to know is how to change my code so that I can run it with the following command:

$ python dmoz_spider.py

and then get my results.

How should I change my code?

Answer 1

What you are looking for is running a Scrapy spider from a script. You can find a guide for that here:

http://doc.scrapy.org/en/latest/topics/practices.html
