11.2. scrapy commands
neo@MacBook-Pro ~/Documents/crawler % scrapy
Scrapy 1.4.0 - project: crawler

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
11.2.1. Create a project
neo@MacBook-Pro ~/Documents % scrapy startproject crawler
New Scrapy project 'crawler', using template directory '/usr/local/lib/python3.6/site-packages/scrapy/templates/project', created in:
    /Users/neo/Documents/crawler

You can start your first spider with:
    cd crawler
    scrapy genspider example example.com
11.2.2. Create a new spider
neo@MacBook-Pro ~/Documents/crawler % scrapy genspider netkiller netkiller.cn
Created spider 'netkiller' using template 'basic' in module:
  crawler.spiders.netkiller
11.2.3. List available spiders
neo@MacBook-Pro ~/Documents/crawler % scrapy list
bing
book
example
netkiller
11.2.4. Run a spider
neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller
Write the crawl results to a JSON file:
neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller -o output.json
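The same crawl can also be driven from a Python script. The following is a minimal sketch, assuming the `scrapy` executable is on the PATH and that the feed file holds a single JSON array of items. The helper names `build_crawl_command`, `run_crawl`, and `load_items` are hypothetical and not part of Scrapy.

```python
import json
import subprocess


def build_crawl_command(spider, output=None):
    """Build the argument list for `scrapy crawl`, optionally with `-o <file>`."""
    command = ["scrapy", "crawl", spider]
    if output is not None:
        command += ["-o", output]
    return command


def run_crawl(spider, output=None, project_dir="."):
    """Run the crawl inside the project directory; raises if scrapy exits non-zero."""
    subprocess.run(build_crawl_command(spider, output), cwd=project_dir, check=True)


def load_items(path):
    """Read the items written by `-o <file>.json` (assumed to be one JSON array)."""
    with open(path, encoding="utf-8") as fh:
        items = json.load(fh)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of scraped items")
    return items
```

Calling `run_crawl("netkiller", "output.json")` from the project directory and then `load_items("output.json")` would give the scraped items as a list of dicts.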
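Source: Netkiller 系列 手札 (the Netkiller handbook series)
Author: 陈景峯
To reprint, please contact the author, and be sure to credit the original source and author and to include this notice.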