Running multiple Scrapy spiders at once on Windows

逃离我推掉我的手 2022-07-14 06:12

1. In your Scrapy project, create a directory named commands at the same level as spiders (include an empty __init__.py so it can be imported as a package):

```
cd path/to/your_project
mkdir commands
```

2. Under commands, add a file crawlall.py with the following code (note that it is written against the pre-1.0 Scrapy API; a sketch for current releases follows after step 4):

```python
from scrapy.command import ScrapyCommand
from scrapy.utils.project import get_project_settings
from scrapy.crawler import Crawler


class Command(ScrapyCommand):
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        settings = get_project_settings()
        # Create, configure and start one Crawler per spider registered in the project
        for spider_name in self.crawler.spiders.list():
            crawler = Crawler(settings)
            crawler.configure()
            spider = crawler.spiders.create(spider_name)
            crawler.crawl(spider)
            crawler.start()
        # Finally start the command's own crawler
        self.crawler.start()
```

3. Register the commands module in settings.py:

```python
COMMANDS_MODULE = 'yourprojectname.commands'
```

4. Add the `scrapy crawlall` command to a cron job and you are done. ...But Windows has no cron jobs, so what then?
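A note on the code in step 2: it relies on the pre-1.0 Crawler API (`Crawler(settings)`, `configure()`, the `spiders` manager), which has been removed from current Scrapy releases. On Scrapy 1.0 and later the same idea, a custom command that schedules every spider in the project and starts them together, can be expressed through `self.crawler_process`. A minimal sketch, assuming a standard project layout:

```python
from scrapy.commands import ScrapyCommand


class Command(ScrapyCommand):
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # Schedule every spider known to the project, then start the reactor once
        for spider_name in self.crawler_process.spider_loader.list():
            self.crawler_process.crawl(spider_name)
        self.crawler_process.start()
```

Steps 3 and 4 stay the same: point COMMANDS_MODULE at the package and run `scrapy crawlall`.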
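As for the closing question: Windows does not ship cron, but the built-in Task Scheduler fills the same role. One way to schedule `scrapy crawlall`, sketched with placeholder paths (run_crawlall.bat and the project path are assumptions, adjust them to your setup), is to wrap the command in a small batch file and register it with schtasks:

```
:: run_crawlall.bat (hypothetical path, adjust to your environment)
cd /d C:\path\to\your_project
scrapy crawlall

:: register the batch file to run daily at 02:00 via the Windows Task Scheduler
schtasks /create /tn "scrapy-crawlall" /tr "C:\path\to\run_crawlall.bat" /sc daily /st 02:00
```

The same task can also be created interactively through the Task Scheduler GUI if you prefer not to use the command line.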