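Several of our products get heavy traffic, so the SMS and VoIP services they rely on burn through credit very quickly. To keep the business unaffected, my boss asked ops to check and log the remaining balance with each provider every day. There are several accounts, and checking them by hand is tedious. I've heard that a real geek scripts anything that has to be done more than twice, so this newbie decided to give it a shot: a script that simulates the account login, queries the balance, recognizes the captcha, and so on. This post is a memo to myself. Life is short, so of course I used Python!
The hardest part is captcha recognition. I used pytesseract; with the default, untrained model its accuracy on captchas is low, but fortunately the captchas here are simple, so I just loop, fetching a new captcha each time, until one is recognized: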
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import requests
import pytesseract
from PIL import Image, ImageFile, ImageEnhance
# Let Pillow load images whose data is truncated
ImageFile.LOAD_TRUNCATED_IMAGES = True
'''
Dependencies List :
1: yum install tesseract tesseract-devel -y
2: yum install libjpeg libjpeg-devel -y
3: pip install pytesseract
4: pip install pillow
5: pip install tesseract-ocr
'''
def get_captcha(self):
    api_captcha = 'http://sms.wpon.cn/main/GetCode.asp'
    get_headers = {
        'Host': 'sms.wpon.cn',
        'Cache-Control': 'max-age=0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36',
        'Referer': 'http://sms.wpon.cn/index.asp',
        'Accept-Encoding': 'gzip, deflate, sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        'Connection': 'close',
    }
    while True:
        # Fetch a fresh captcha image and save it to disk
        r = requests.get(api_captcha, timeout=5, headers=get_headers)
        with open('captcha.jpg', 'wb') as f:
            for chunk in r.iter_content(1024):
                f.write(chunk)
        img = Image.open('captcha.jpg')
        text = pytesseract.image_to_string(img).strip()
        # Accept the result only if it is exactly 4 digits
        if len(text) == 4 and text.isdigit():
            break
    # These cookies identify the session the captcha was issued for
    return text, r.cookies
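The `while True` loop above never gives up, so a site outage or a captcha style that tesseract cannot read would hang the daily check. Here is a minimal sketch of a bounded variant. The names `is_four_digits`, `read_captcha`, `fetch_image`, `ocr` and `max_attempts` are hypothetical and not from my script; `fetch_image` and `ocr` are passed in so that the `requests.get` call and `pytesseract.image_to_string` can be plugged in. It also validates the text with a regular expression instead of `int()`, because `int()` accepts strings such as `-123` or `+123`, which are four characters long but not four digits.

```python
import re

# Exactly four ASCII digits, nothing else
FOUR_DIGITS = re.compile(r"[0-9]{4}")


def is_four_digits(text):
    """Return True only if text consists of exactly four ASCII digits."""
    return FOUR_DIGITS.fullmatch(text) is not None


def read_captcha(fetch_image, ocr, max_attempts=20):
    """Fetch and OCR captchas until one looks valid, at most max_attempts times.

    fetch_image() must return (image, cookies); ocr(image) must return a string.
    Both are supplied by the caller (hypothetical interface).
    """
    for _ in range(max_attempts):
        image, cookies = fetch_image()
        text = ocr(image).strip()
        if is_four_digits(text):
            # Return the cookies that came with this particular captcha
            return text, cookies
    raise RuntimeError("no valid captcha after %d attempts" % max_attempts)
```

Raising an exception after the last attempt lets whatever schedules the check (cron, for example) report the failure instead of waiting forever.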
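Once a captcha is recognized, the function returns the recognized text together with the cookies. You have to log in with that cookie and the captcha together, otherwise the site reports a captcha mismatch. Login uses the requests library, whose advantage is that you don't have to handle cookies by hand, so it is very simple: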
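#!/usr/bin/env python
# -*- coding: UTF-8 -*-
__author__='will@wpon.cn'
import requests
''' pip install requests'''
def login(self):
    # api_login, api_balance, postdata, url_params and post_headers are
    # defined in the part of the script that is not shown here; postdata
    # has to carry the captcha text, get_data[0].
    while True:
        # get_data is (captcha text, captcha cookies) from get_captcha()
        get_data = self.get_captcha()
        start = requests.session()
        login = start.post(api_login, data=postdata, params=url_params,
                           headers=post_headers, cookies=get_data[1])
        # Check that the post-login cookies were received, to make sure
        # the login succeeded; otherwise try again with a new captcha
        if len(start.cookies) == 7:
            break
    # Query the balance with the logged-in session
    bal = start.get(api_balance)
    .....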
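One way to avoid passing cookies around by hand at all is to fetch the captcha and submit the login through the same session object, so that the cookie set along with the captcha is sent back automatically. The sketch below assumes a session that behaves like `requests.Session` (only `get`, `post` and `cookies` are used). `login_with_session`, `solve`, `build_form`, `logged_in` and `max_attempts` are hypothetical names, and the form fields and the success check are left to the caller because this post does not show them.

```python
def login_with_session(session, api_captcha, api_login, solve, build_form,
                       logged_in, max_attempts=10):
    """Log in through one session so the captcha cookie is reused automatically.

    session:    object with get(), post() and cookies, e.g. requests.Session()
    solve:      callable, captcha image bytes -> text, or None if unreadable
    build_form: callable, captcha text -> dict of POST fields
    logged_in:  callable, session -> bool (the site-specific success check)
    """
    for _ in range(max_attempts):
        # The session stores whatever cookie the captcha response sets ...
        image_bytes = session.get(api_captcha, timeout=5).content
        code = solve(image_bytes)
        if code is None:
            continue  # OCR failed, request a new captcha
        # ... and sends it back here without any cookies= argument.
        session.post(api_login, data=build_form(code), timeout=5)
        if logged_in(session):
            return session
    raise RuntimeError("login failed after %d attempts" % max_attempts)
```

With requests this could be called with `requests.Session()` as the first argument and `lambda s: len(s.cookies) == 7` as `logged_in`, mirroring the cookie-count check above; the returned session can then be used for `session.get(api_balance)`.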
All in all it is quite simple, so I won't post the full code.