# coding:utf-8
time_out: 200 # timeout for crawling and storing user info
min_crawl_interal: 10 # min interval between http requests
max_crawl_interal: 20 # max interval between http requests
excp_interal: 5*60 # time to sleep when crawling raises an exception
# TODO set a default value for max_value of crawling
max_search_page: 50 # max number of search pages to crawl
max_home_page: 50 # max number of user home pages to crawl
max_comment_page: 2000 # max number of comment pages to crawl
max_repost_page: 2000 # max number of repost pages to crawl
max_retries: 5 # max number of retries when crawling
# Set the options below if you log in from an unusual location.
# They are used for verification code recognition.
yundama_username: xxxxxx # account for yundama
yundama_passwd: xxxxxx # password for yundama
# The value of running_mode can be normal or quick.
# Normal mode is more stable. Quick mode crawls much faster, but the weibo
# account will almost certainly be banned.
running_mode: normal
# The value of crawling_mode can be accurate or normal.
# In normal mode, the spider won't crawl the weibo content behind "展开全文" ("expand full text") when executing
# home crawl tasks or search crawl tasks, so it runs much faster.
# In accurate mode, the spider crawls the content behind "展开全文", which is slower but gives more detail.
crawling_mode: normal
# the max number of hosts that can share each cookie
# if you choose quick mode, your cookie will be used until it's banned
share_host_count: 5
# the expiry time (hours) of each weibo cookie
cookie_expire_time: 23
db:
    host: 127.0.0.1
    port: 3306
    user: root
    password: 123456
    db_name: weibo
    db_type: mysql
redis:
    host: 1.1.1.1
    port: 6379
    password: abcd
    cookies: 1 # store and fetch cookies
    # store fetched urls and results, so you can decide whether to retry crawling the urls or not
    urls: 2
    broker: 5 # broker for celery
    backend: 6 # backend for celery
    id_name: 8 # user ids and names, for repost info analysis
    # expire_time (hours) for redis db2; if the data is useless to you, you can set a smaller value
    expire_time: 48
    # redis sentinel; if you don't need it, just set sentinel: '' and delete the "- host" and "port" entries,
    # like this: sentinel: ''
    sentinel:
        - host: 2.2.2.2
          port: 26379
        - host: 3.3.3.3
          port: 26379
        - host: 4.4.4.4
          port: 26380
    master: mymaster # redis sentinel master name; if you don't need it, just set master: ''
    socket_timeout: 5 # socket timeout for redis sentinel; if you don't need sentinel, just set master: ''
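    # A sketch of the sentinel-free setup described in the comments above
    # (an illustration assembled from those comments, not a tested configuration):
    #   sentinel: ''
    #   master: ''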
# warning by email
email:
    # SMTP and POP3 services must be enabled for your email account
    server: smtp.sina.com
    port: 587
    from: xxxx@sina.com # sending email account
    password: xxxxx # your email password
    to: xxxx@139.com # bind a 139 mailbox, so your phone will receive the warning message
    subject: Warning Of Weibo Spider
    warning_info: Please find out the reason why the spider stops working