
NamePoet / SHU

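This repository does not declare an open-source license file (LICENSE); before using it, check the project description and the upstream dependencies of its code.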
spider.py 1.33 KB
NamePoet committed on 2021-06-25 00:15 . crawler
import requests
from bs4 import BeautifulSoup
headers = {
'Cookie' : 'BIDUPSID=88E9080BC131A4F08646437415ED7B4E; PSTM=1620745238; BAIDUID=88E9080BC131A4F012AA6D2226D14362:FG=1; BD_UPN=12314753; __yjs_duid=1_2832afa1e0a0242b4dd86fdd3cc3f2141620884273901; BAIDUID_BFESS=18E5466CB3A8DB180EE0B0887E26C70F:FG=1; delPer=0; BD_HOME=1; BD_CK_SAM=1; COOKIE_SESSION=624656_0_5_5_11_4_0_0_5_3_0_0_624483_0_0_0_1623758806_0_1623758806%7C9%230_0_1623758806%7C1; BDRCVFR[pbtmRM77MQ6]=mk3SLVN4HKm; PSINO=3; H_PS_PSSID=34131_34099_31660_33607_34106_34135; H_PS_645EC=5445VtEjj6RxFO%2FjAwHTlxV4J3pbTcDVAkGzFkcjAT0QKv%2B1vKuiv1i7zWM; BA_HECTOR=04a4210l2h84010hv61gd99500q; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598',
'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.40 Safari/537.36'
}
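# Fetch the Baidu results page; the wd parameter is the percent-encoded query 上海大学.
html = requests.get('https://www.baidu.com/s?tn=51076811_dg&wd=%E4%B8%8A%E6%B5%B7%E5%A4%A7%E5%AD%A6', headers=headers, timeout=10)
# html = requests.get('https://www.baidu.com/s?tn=51076811_dg&wd=python', headers=headers, timeout=10)
# Use the encoding detected from the response body so Chinese text decodes correctly.
html.encoding = html.apparent_encoding
# soup = BeautifulSoup(html.text, 'lxml')
soup = BeautifulSoup(html.text, 'html.parser')
# Select the title link inside each result container once and reuse it below.
links = soup.select('div.result.c-container h3 a')
# Print every result URL first, then every result title.
for link in links:
    print(link['href'])
for link in links:
    print(link.text)
# print(soup.title)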
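
The script hard-codes the search keyword as a percent-encoded string in the URL. As a minimal sketch, the same URL could be built from a plain keyword with the standard library. The names `BASE_URL`, `build_search_url`, and `keyword_from_url` are hypothetical helpers introduced here for illustration and are not part of the original file; the `tn` value is simply copied from the URL above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the same endpoint and tn value that spider.py uses.
BASE_URL = 'https://www.baidu.com/s'


def build_search_url(keyword, tn='51076811_dg'):
    """Return a search URL with the keyword percent-encoded as UTF-8."""
    query = urlencode({'tn': tn, 'wd': keyword})
    return f'{BASE_URL}?{query}'


def keyword_from_url(url):
    """Recover the plain keyword from the wd parameter of a search URL."""
    return parse_qs(urlsplit(url).query)['wd'][0]


if __name__ == '__main__':
    url = build_search_url('上海大学')
    print(url)
    print(keyword_from_url(url))
```

With a helper like this, switching between the two queries in the script (上海大学 and `python`) would only mean changing the argument, and the result could be passed straight to `requests.get(url, headers=headers)`.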