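Starting today I'll be posting a series of web scraping tutorials~ If you spot anything wrong or missing, feel free to point it out~
This program automatically scrapes brain teasers (脑筋急转弯), moving from page to page on its own, and saves them to a file!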
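import re

import requests

# The listing is spread over pages 1.htm through 73.htm.
for a in range(1, 74):
    url = "https://www.2345.com/inner/jzw/" + str(a) + ".htm"
    what = requests.get(url)
    # Each match holds one riddle and its answer, still joined by the markup between them.
    message = re.findall('''<li><span class="table_left">(.*)">点击显示答案</a></span></li>''', what.text)
    # print(message)
    for i in range(len(message)):
        try:
            # The delimiter has to sit on one line: "." in the regex above never matches
            # a newline, so a delimiter containing a line break could never be found.
            # The single space before onclick is an assumption about the page markup.
            FenGe = message[i].split("""</span><span class="table_right"><a href="javascript:;" class="answer" onclick="MM_popupMsg""")
            # print(FenGe[0]+FenGe[1])
            with open(r"56.txt", "a", encoding="utf-8") as f:
                f.write(FenGe[0] + FenGe[1] + "\n")
        except IndexError:
            # The delimiter was not found, so FenGe[1] does not exist.
            print("Page %d, line %d is malformed and was skipped automatically!" % (a, i))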
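
If you want to check the parsing step without hitting the site, it can be pulled out into a small function and run on a string. What follows is only a rough sketch, not part of the original script: `parse_riddles`, `ITEM_RE`, `DELIMITER` and the `SAMPLE` markup are names and data I made up for illustration, and the sample was written to fit the pattern rather than copied from the real page. It also swaps the greedy `(.*)` for a non-greedy `(.*?)` so that two entries on the same line don't get merged into one match.

```python
import re

# Pattern and delimiter mirror the ones in the script above; the single space
# before onclick is an assumption about the page markup.
ITEM_RE = re.compile(r'<li><span class="table_left">(.*?)">点击显示答案</a></span></li>')
DELIMITER = '</span><span class="table_right"><a href="javascript:;" class="answer" onclick="MM_popupMsg'


def parse_riddles(html):
    """Return (question, raw_answer) pairs plus the indexes of entries that failed to split."""
    riddles, skipped = [], []
    for index, item in enumerate(ITEM_RE.findall(html)):
        # partition() returns an empty separator when the delimiter is missing,
        # so no exception handling is needed here.
        question, sep, raw_answer = item.partition(DELIMITER)
        if sep:
            riddles.append((question, raw_answer))
        else:
            skipped.append(index)
    return riddles, skipped


# Hypothetical sample markup: one well-formed entry followed by one broken entry.
SAMPLE = (
    '<li><span class="table_left">What has keys but no locks?' + DELIMITER
    + "('A piano')" + '">点击显示答案</a></span></li>'
    '<li><span class="table_left">broken entry">点击显示答案</a></span></li>'
)

riddles, skipped = parse_riddles(SAMPLE)
print(riddles)   # the entries that split cleanly
print(skipped)   # indexes of the entries that were skipped
```

Returning the skipped indexes instead of printing inside the function leaves it up to the caller how to report them, for example together with the page number as the script above does.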