dono118 / crawler

crawler

A distributed crawler based on Golang.


Dependencies:

// 1. docker 18.06.1-ce
// 2. Install Elasticsearch
docker pull docker.elastic.co/elasticsearch/elasticsearch:6.3.2
// 3. Install the elastic client:
go get -v gopkg.in/olivere/elastic.v5

Run:

docker ps

// If an Elasticsearch container is already listed, kill it by its CONTAINER ID
docker kill <CONTAINER ID>

docker run -d --name es -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.3.2

Concurrent crawler
go run crawler/main.go

Distributed crawler
go run crawler-distributed/persist/itemsaver.go --port=1234
go run crawler-distributed/worker/server/worker.go --port=9000
go run crawler-distributed/worker/server/worker.go --port=9001
go run crawler-distributed/worker/server/worker.go --port=9002
go run crawler-distributed/main.go --itemsaver_host=":1234" --worker_hosts=":9000,:9001,:9002"
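
The last command passes the item saver address and a comma-separated list of worker addresses as flags. As a rough sketch of how such flags could be read with Go's standard `flag` package: the flag names `itemsaver_host` and `worker_hosts` come from the command above, while `splitHosts` and the variable names are hypothetical and may differ from what `crawler-distributed/main.go` actually does.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Flag names mirror the command line shown above; the variable names
// and help strings are assumptions for illustration.
var (
	itemSaverHost = flag.String("itemsaver_host", "", "item saver address, e.g. :1234")
	workerHosts   = flag.String("worker_hosts", "", "comma-separated worker addresses, e.g. :9000,:9001")
)

// splitHosts turns ":9000,:9001,:9002" into a slice of addresses,
// trimming whitespace and dropping empty entries.
func splitHosts(s string) []string {
	var hosts []string
	for _, h := range strings.Split(s, ",") {
		if h = strings.TrimSpace(h); h != "" {
			hosts = append(hosts, h)
		}
	}
	return hosts
}

func main() {
	flag.Parse()
	fmt.Println("item saver:", *itemSaverHost)
	for i, h := range splitHosts(*workerHosts) {
		fmt.Printf("worker %d: %s\n", i, h)
	}
}
```

Each address in the list corresponds to one `worker.go` process started with a matching `--port` value, so the three workers on ports 9000 to 9002 must be running before the main program is launched.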

About

A Golang crawler for a dating website (concurrent and distributed versions).
https://gitee.com/dono118/crawler.git
git@gitee.com:dono118/crawler.git