11.2. scrapy Commands

  neo@MacBook-Pro ~/Documents/crawler % scrapy
  Scrapy 1.4.0 - project: crawler

  Usage:
    scrapy <command> [options] [args]

  Available commands:
    bench         Run quick benchmark test
    check         Check spider contracts
    crawl         Run a spider
    edit          Edit spider
    fetch         Fetch a URL using the Scrapy downloader
    genspider     Generate new spider using pre-defined templates
    list          List available spiders
    parse         Parse URL (using its spider) and print the results
    runspider     Run a self-contained spider (without creating a project)
    settings      Get settings values
    shell         Interactive scraping console
    startproject  Create new project
    version       Print Scrapy version
    view          Open URL in browser, as seen by Scrapy

  Use "scrapy <command> -h" to see more info about a command

11.2.1. Create a new project

  neo@MacBook-Pro ~/Documents % scrapy startproject crawler
  New Scrapy project 'crawler', using template directory '/usr/local/lib/python3.6/site-packages/scrapy/templates/project', created in:
      /Users/neo/Documents/crawler

  You can start your first spider with:
      cd crawler
      scrapy genspider example example.com
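
startproject scaffolds a complete project skeleton. With Scrapy 1.4 the generated layout is roughly the following (minor differences are possible between versions):

  crawler/
      scrapy.cfg            # deploy configuration file
      crawler/              # the project's Python module
          __init__.py
          items.py          # item definitions
          middlewares.py    # spider and downloader middlewares
          pipelines.py      # item pipelines
          settings.py       # project settings
          spiders/          # directory where the spiders live
              __init__.py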

11.2.2. Create a new spider

  neo@MacBook-Pro ~/Documents/crawler % scrapy genspider netkiller netkiller.cn
  Created spider 'netkiller' using template 'basic' in module:
    crawler.spiders.netkiller
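
The 'basic' template writes a minimal spider skeleton to crawler/spiders/netkiller.py. With Scrapy 1.4 it looks roughly like this (the exact template output may differ slightly between versions):

  # -*- coding: utf-8 -*-
  import scrapy


  class NetkillerSpider(scrapy.Spider):
      name = 'netkiller'
      allowed_domains = ['netkiller.cn']
      start_urls = ['http://netkiller.cn/']

      def parse(self, response):
          # Extraction logic goes here
          pass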

11.2.3. List available spiders

  neo@MacBook-Pro ~/Documents/crawler % scrapy list
  bing
  book
  example
  netkiller
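
The names printed here come from each spider class's `name` attribute under the project's spiders module, so the listing reflects every spider the project can currently run.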

11.2.4. Run a spider

  neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller

To write the crawl results to a JSON file, add the -o option:

  neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller -o output.json
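
output.json only receives data if the spider's parse callback yields items or dicts; the skeleton generated above yields nothing. A minimal sketch is shown below, where the CSS selectors and field names are illustrative assumptions rather than anything from the original article:

  import scrapy


  class NetkillerSpider(scrapy.Spider):
      name = 'netkiller'
      allowed_domains = ['netkiller.cn']
      start_urls = ['http://netkiller.cn/']

      def parse(self, response):
          # Each yielded dict becomes one JSON object in output.json.
          # The selector and field names below are assumptions for illustration.
          for link in response.css('a'):
              yield {
                  'text': link.css('::text').extract_first(),
                  'url': link.css('::attr(href)').extract_first(),
              }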

Original source: the Netkiller series handbooks
Author: 陈景峯
Please contact the author before reprinting, and always include the original source, the author information, and this notice.
