Downloading Manga with a Python Crawler

Let's start with a picture to celebrate in advance~~~ (image: 斗罗大陆)

Why I made this

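With nothing better to do, I suddenly felt like reading some manga, and then suddenly realised I was flat broke (I'm still a middle schooler, after all). What follows may not be very kind to big players such as 某讯 and 飒某某, but I trust they won't mind losing a few cents, right~~~ (image: 滑稽 emoji)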

The manga site

https://www.nadu8.com/

Required libraries

pip install requests

Everything else ships with Python.

And so, here comes the code~~~

(image: 樱岛麻衣)

The helper module

OK, enough chatter!!! Bring on the code!!!

Name this file Photos.py. I wrapped the logic in a function because I'd like to extend it later (downloading videos, for example).

from urllib.request import urlopen
from re import findall
from requests import get
from os import makedirs

def get_photos(url_middle, path_save):
    b = 0  # running image counter, also used as the file name
    url = 'https://www.nadu8.com'  # target site
    makedirs(path_save, exist_ok=True)  # create the save directory up front so no image is skipped
    found_main = urlopen(url + url_middle)
    html_main = found_main.read().decode('utf-8')
    # Pick out the part of the page we want (prefix rule (.*?) suffix rule)
    url_main = findall('" href="(.*?)"><time class="chapter-time">', html_main)
    # Walk the chapter list from its last entry to its first
    # (the old index len(url_main) - a was out of range for a = 0 and never reached entry 0)
    for chapter in reversed(url_main):
        try:
            found_min = urlopen(url + chapter)
            html_min = found_min.read().decode('utf-8')
            # Same prefix/suffix trick for the image addresses; this assumes a single
            # space separates the two attributes in the page source
            photo = findall('" data-src="(.*?)"><div class="error" '
                            'style="display:none"><p class="comic-error-msg">', html_min)
            for src in photo:
                b += 1
                print('Download:', str(b))
                url_min = 'https:' + src + '-webp'
                get_it = get(url_min)
                with open(path_save + '/' + str(b) + '.jpg', 'wb') as save:
                    save.write(get_it.content)
        except Exception as error:
            # Report what went wrong instead of silently swallowing everything
            print('Error!', error)
    print('Finish!!!')
    print('Finish!!!')
    print('Finish!!!')
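
The "prefix rule (.*?) suffix rule" comment in Photos.py is the whole extraction trick: findall returns whatever sits between two fixed pieces of markup. Here is a minimal sketch of that idea on a made-up HTML snippet. The markup, the paths and the helper name chapter_links are assumptions for illustration only, not the site's real page. The list is reversed because the original loop walks the chapters from the end, presumably because the catalogue lists the newest chapter first.

```python
from re import findall

# Made-up catalogue markup, only for illustration (not copied from the site).
SAMPLE_HTML = (
    '<a class="chapter" href="/ac/386/3"><time class="chapter-time">03</time></a>'
    '<a class="chapter" href="/ac/386/2"><time class="chapter-time">02</time></a>'
    '<a class="chapter" href="/ac/386/1"><time class="chapter-time">01</time></a>'
)

def chapter_links(html):
    """Return the chapter paths in reverse page order (hypothetical helper)."""
    # prefix + (.*?) + suffix: the non-greedy group captures only the href value
    links = findall('" href="(.*?)"><time class="chapter-time">', html)
    return list(reversed(links))

print(chapter_links(SAMPLE_HTML))
```

Because (.*?) is non-greedy, each match stops at the first occurrence of the suffix, so one call returns every link on the page instead of one giant match.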

The main program

It is the main program, but it still needs work; there isn't much to it yet. Let's name this one Main.py.

import Photos
url_middle = input('Enter the manga sub-directory (e.g. /ac/386): ')
path_save = input('Enter the save directory (e.g. C:/Users/Administrator/OneDrive/桌面/): ')
Photos.get_photos(url_middle, path_save)

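!!! A few notes:
1. In the sub-directory prompt, the example "/ac/386" is found like this: search the manga site (linked above) for the manga you want and open its catalogue page. The same kind of text then appears right after "https://www.nadu8.com" in the address bar~~~
2. Be sure to write the save directory with " / " rather than " \ "!!!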
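
Both notes could also be handled in code instead of being left to whoever is typing. The sketch below is one possible approach and is not part of the original scripts: catalog_url and normalize_save_path are hypothetical helper names, and it assumes the sub-directory is always a path on https://www.nadu8.com.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.nadu8.com'  # the site used throughout this post

def catalog_url(url_middle):
    """Build the catalogue URL from a sub-directory such as '/ac/386'."""
    # Tolerate stray whitespace and a missing leading slash.
    return urljoin(BASE_URL + '/', url_middle.strip().lstrip('/'))

def normalize_save_path(path_save):
    """Replace backslashes with forward slashes (hypothetical helper)."""
    return path_save.strip().replace('\\', '/')

print(catalog_url('ac/386'))
print(normalize_save_path('C:\\Users\\Administrator\\Desktop'))
```

Main.py could pass its two inputs through helpers like these before calling Photos.get_photos, so a pasted Windows path with backslashes would still work.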