Scraping a Novel Site with BeautifulSoup and openpyxl in Python and Saving the Results to Excel - 网页编程网


Views: 4153 Posted: 2020-03-22 11:25:09

# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
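# Fetch the home page; stop early on a network or HTTP error
resp = requests.get('https://www.lingdianshuwu.com', timeout=10)
resp.raise_for_status()
# Decode the body as GB2312 (the encoding this script assumes for the site)
resp.encoding = 'gb2312'
bs = BeautifulSoup(resp.text, 'html.parser')
# Find every <li> tag whose class is new_2
li_tags = bs.find_all('li', class_='new_2')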

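# Header row first, then one [title, url] row per novel
lst = [['Title', 'URL']]
for item in li_tags:
    title = item.get_text(strip=True)
    print(title)
    link = item.find('a')  # the <a> tag inside the <li>
    if link is None or not link.get('href'):
        continue  # skip entries that carry no link
    # lstrip('/') avoids a double slash when the href starts with '/'
    lst.append([title, 'https://www.lingdianshuwu.com/' + link['href'].lstrip('/')])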

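import openpyxl
wb = openpyxl.Workbook()
# Use the default worksheet that a new workbook starts with
sheet = wb.active
sheet.title = 'My Novels'
# Write one row per entry; the first row is the header
for item in lst:
    sheet.append(item)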

wb.save('story.xlsx')
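
The loop above builds each absolute URL by string concatenation, which only works while every href is a plain relative path. As a possible alternative, the minimal sketch below resolves links with `urljoin` from the standard library. The helper name `absolute_url` and the sample href values are made up for illustration and are not taken from the site.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.lingdianshuwu.com/'

def absolute_url(href, base=BASE_URL):
    """Resolve an href found on the page against the site root."""
    return urljoin(base, href.strip())

# Hypothetical href values, for illustration only
samples = [
    'book/1.html',                                # relative path
    '/book/1.html',                               # root-relative path
    'https://www.lingdianshuwu.com/book/1.html',  # already absolute
]
for href in samples:
    print(absolute_url(href))
```

All three samples resolve to the same absolute URL, so the rows written to the workbook stay consistent whichever form the page happens to use.
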
# Download and save an image
# Note: turn on "Disable cache" in the browser
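import requests
resp = requests.get('https://www.lingdianshuwu.com/images/toolbar.png', timeout=10)
resp.raise_for_status()
# An image is binary data, so read resp.content (bytes) rather than resp.text
rc = resp.content
# Save it locally; the .png extension matches the file being downloaded
with open('pic.png', 'wb') as file:
    file.write(rc)
print('Image downloaded successfully')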
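
The script above hard-codes the local file name, so the extension can drift away from the real image type. One way to avoid that, sketched here under the assumption that the last path segment of the URL is a usable file name, is to derive the name from the URL itself. The helper `filename_from_url` and the fallback name `download.bin` are hypothetical and introduced only for this example.

```python
import posixpath
from urllib.parse import urlsplit

def filename_from_url(url, default='download.bin'):
    """Return the last path segment of a URL, ignoring any query string."""
    path = urlsplit(url).path        # e.g. '/images/toolbar.png'
    name = posixpath.basename(path)  # e.g. 'toolbar.png'
    return name or default           # fall back when the path ends in '/'

print(filename_from_url('https://www.lingdianshuwu.com/images/toolbar.png'))
```

The returned name can be passed to `open(..., 'wb')` in place of a fixed string when saving `resp.content`.
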
