Work has been busy lately, so I haven't touched Python for a while. This afternoon I found some time to look at proxy IPs.
proxy = {'http': 'http://120.26.140.95:81'}  # proxy IP; only one is listed here as an example
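A side note on that dict: requests looks up the proxy by URL scheme, so an 'http' entry only applies to http:// URLs. Here is a minimal sketch of a helper that fills in both schemes. The name build_proxies is made up for this post and is not part of requests, and whether a given free proxy can actually carry HTTPS traffic is an assumption you would have to check yourself.

```python
def build_proxies(host, port):
    """Return a requests-style proxies mapping for one HTTP proxy."""
    proxy_url = 'http://%s:%d' % (host, port)
    # Route both plain-HTTP and HTTPS requests through the same proxy.
    return {'http': proxy_url, 'https': proxy_url}


proxy = build_proxies('120.26.140.95', 81)
print(proxy)
```

The result can be passed straight to the proxies argument of requests.get.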
To check your own IP, you can visit http://www.whatismyip.com.tw/
So all we need to do is request that site with requests and it will show us our IP. Not much more to say, so here is the code:
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import requests

# Proxy used for http:// URLs
proxy = {'http': 'http://120.26.140.95:81'}
# Browser-like headers so the request looks like a normal page visit
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
url = 'http://www.whatismyip.com.tw/'
response = requests.get(url, proxies=proxy, headers=header)
response.encoding = 'utf-8'
# The page body contains the IP address that the site sees
print(response.text)
As the screenshot below shows, the IP it reports is the proxy IP I used.

image.png
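Instead of reading the page by eye, the check can also be done in code. This is only a rough sketch: it assumes the response body contains the visitor's address as plain IPv4 text somewhere, and the sample HTML below is made up for illustration, not copied from the real site. extract_ips and proxy_in_use are hypothetical helper names.

```python
import re

# Matches anything shaped like a dotted IPv4 address (no range validation).
IPV4_PATTERN = re.compile(r'\b(?:\d{1,3}\.){3}\d{1,3}\b')


def extract_ips(page_text):
    """Return every IPv4-looking string found in the page text, in order."""
    return IPV4_PATTERN.findall(page_text)


def proxy_in_use(page_text, proxy_host):
    """Return True if the page reports the proxy's address as our IP."""
    return proxy_host in extract_ips(page_text)


# Made-up response body, for illustration only.
sample = '<html><body><h1>IP Address</h1><span>120.26.140.95</span></body></html>'
print(proxy_in_use(sample, '120.26.140.95'))
```

In the script above you would pass response.text in place of sample.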
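Proxy IPs can be scraped from Xici Daili (西刺代理), so I'll paste that code here as well.
Don't hit the site too frequently, otherwise... well, that is how I got my IP banned.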

image.png
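One way to avoid a ban like this is to pause for a random interval between requests. Below is a minimal sketch of that idea. fetch_politely is a made-up name, the 2 to 5 second range is my own guess rather than a limit published by the site, and the fetch and sleep functions are passed in so the example runs without touching the network.

```python
import random
import time


def fetch_politely(urls, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results


# Demonstration with a stand-in fetch function and no real waiting.
pages = fetch_politely(['page1', 'page2', 'page3'],
                       fetch=str.upper, sleep=lambda seconds: None)
print(pages)
```

With requests, fetch could be something like lambda u: requests.get(u, headers=header, timeout=10).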
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import random
import time

import requests
from bs4 import BeautifulSoup

# Pages 1 to 10 of the proxy list
url_list = ['http://www.xicidaili.com/nn/%d' % x for x in range(1, 11)]
# A dict can only hold one value per key, so repeating "http" would keep
# just the last proxy. Store them in a list and pick one per request instead.
proxy_list = [
    "http://120.26.140.95:81",
    "http://117.21.234.107:8080",
    "http://122.192.74.83:8080",
    "http://117.122.240.153:8088",
]
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}


def get_ip():
    for url in url_list:
        proxies = {"http": random.choice(proxy_list)}
        response = requests.get(url=url, headers=header, proxies=proxies, timeout=10)
        response.encoding = 'utf-8'
        html = BeautifulSoup(response.text, 'html.parser')
        rows = html.find_all('tr')
        for row in rows[1:]:  # skip the table header row
            ip_data = row.find_all('td')
            ip = ip_data[1].get_text()
            port = ip_data[2].get_text()
            print('%s:%s' % (ip, port))
        time.sleep(2)  # pause between pages so we are less likely to get banned


if __name__ == '__main__':
    get_ip()
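Free proxies die often, so once you have a list of them it helps to try another one when a request fails. Here is a rough sketch of that retry idea. fetch_with_rotation and fake_fetch are made-up names, and the fetch function is passed in so the example runs offline; with requests it could be lambda u, p: requests.get(u, proxies=p, timeout=10).

```python
import random


def fetch_with_rotation(url, proxy_urls, fetch, max_attempts=3):
    """Try fetch(url, proxies) with different proxies until one succeeds."""
    candidates = list(proxy_urls)
    random.shuffle(candidates)
    last_error = None
    for proxy_url in candidates[:max_attempts]:
        proxies = {'http': proxy_url}
        try:
            return fetch(url, proxies)
        except Exception as error:  # a real script could catch requests.RequestException
            last_error = error
    raise RuntimeError('all proxies failed') from last_error


def fake_fetch(url, proxies):
    """Stand-in for a real request: pretend the proxy on port 81 is dead."""
    if proxies['http'].endswith(':81'):
        raise ConnectionError('proxy down')
    return 'fetched %s via %s' % (url, proxies['http'])


print(fetch_with_rotation(
    'http://example.com/',
    ['http://120.26.140.95:81', 'http://117.21.234.107:8080'],
    fake_fetch,
))
```

Shuffling spreads the load across proxies, and max_attempts keeps one bad page from burning through the whole list.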
Next up, I plan to spend some time learning scrapy~