Python crawler (selenium, requests, bs4)

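Tags: python, crawler

The example used here is a simulated login to Amazon China.

1. Installing Python

First we need a Python 3 environment. Download it from https://www.python.org/downloads/ and pick the Python 3 release that matches your system.

After downloading, run the installer with its default options. One of the installation steps should offer to add Python to the PATH environment variable; if it does not, add it manually (remember to put pip on the PATH as well).

When that is done, open a command line and check that the installation succeeded with the commands below.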

python -V

pip -V
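
If `python -V` reports a Python 2 release, or it is unclear which installation the command line picked up, the same check can be made from inside the interpreter. The following is a minimal sketch using only the standard library; the helper name `describe_interpreter` is made up for this illustration.

```python
import sys


def describe_interpreter(version_info=sys.version_info, executable=sys.executable):
    """Return a short report on an interpreter (hypothetical helper).

    By default it describes the interpreter that is running this script.
    """
    return {
        "version": "{}.{}.{}".format(*version_info[:3]),
        "is_python3": version_info[0] == 3,
        "executable": executable,  # path of the python binary in use
    }


if __name__ == "__main__":
    print(describe_interpreter())
```

The `executable` entry shows which Python installation the PATH actually resolves to, which helps when several versions are installed side by side.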

2. Installing third-party libraries

pip install -i https://pypi.douban.com/simple requests selenium beautifulsoup4 lxml

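requests : Requests describes itself as the only Non-GMO HTTP library for Python, safe for human consumption.

selenium : Selenium tests run directly in the browser, just as a real user would operate it.

beautifulsoup4 : Beautiful Soup is a Python library for extracting data from HTML or XML files.

lxml : The Python standard library ships with an xml module, but its performance is limited and it lacks some convenient APIs. By contrast, the third-party lxml library is implemented in Cython and adds many practical features, which makes it a handy tool for processing web page data in a crawler.

3. Browser drivers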
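
To confirm that the four packages were actually installed into the interpreter you are using, something like the following sketch may help. It only uses the standard library; the name `check_packages` and the `PACKAGES` table are made up for this illustration. Note that beautifulsoup4 is imported under the module name `bs4`.

```python
from importlib.util import find_spec

# Maps the name used with "pip install" to the module name used with "import".
PACKAGES = {
    "requests": "requests",
    "selenium": "selenium",
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
}


def check_packages(packages=PACKAGES):
    """Return {pip_name: True/False} depending on whether each module can be found."""
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in packages.items()
    }


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(name, "OK" if found else "missing")
```

Any package reported as missing can be installed again with the pip command above.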

Browser	Driver
chrome	chromedriver
firefox	geckodriver

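Make sure the driver you install matches your browser.

Hunting for drivers online is tedious, and what you find is not necessarily the latest version.
I suggest installing nodejs and using npm, node's package manager, to install the drivers.

Install nodejs from http://nodejs.cn/download/ and make sure nodejs and npm are on the PATH.

Use the commands below to check that the installation succeeded.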
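
One way to check the pairing is to compare version numbers. The sketch below assumes that the driver and browser are expected to share the same major version, which is how recent chromedriver releases are numbered; older chromedriver 2.x releases and geckodriver use their own numbering, so there you would consult the driver's compatibility notes instead. The function names and the sample version strings are illustrative only.

```python
import re


def major_version(version_output):
    """Extract the first major version number from a version string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in: " + repr(version_output))
    return int(match.group(1))


def versions_match(browser_output, driver_output):
    """True when browser and driver report the same major version (assumed rule)."""
    return major_version(browser_output) == major_version(driver_output)


if __name__ == "__main__":
    # Sample strings in the style printed by the --version flag of each tool.
    print(versions_match("Google Chrome 114.0.5735.90",
                         "ChromeDriver 114.0.5735.16"))
```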

node -v
npm -v

Because the default npm registry is hosted abroad, we switch to the Taobao mirror.

npm install -g cnpm --registry=https://registry.npm.taobao.org

Next, we use cnpm to install the drivers.

cnpm install -g chromedriver geckodriver

-g installs both packages globally, which has the same effect as adding them to the PATH.

4. Simulated login in practice
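
Selenium looks for the driver executables on the PATH, so it is worth confirming that the global install made them visible. A minimal sketch with the standard library; `find_drivers` is a made-up helper name.

```python
import shutil


def find_drivers(names=("chromedriver", "geckodriver")):
    """Return {name: full path or None} for each driver executable on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_drivers().items():
        print(name, "->", path or "not found on PATH")
```

If a driver shows up as not found, open a new command line first, since PATH changes usually do not reach terminals that were already open.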

# _*_ coding: utf-8 _*_

__author__ = 'lemon'
__date__ = '2018/3/20 17:51'

import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

def login(account, password):
    # Amazon's login page URL has to be taken from the home page
    login_url = get_login_url()
    options = webdriver.ChromeOptions()  # Chrome settings
    prefs = {"profile.managed_default_content_settings.images": 2}
    options.add_experimental_option("prefs", prefs)  # do not load images
    # options.add_argument("--headless")    # run Chrome headless
    options.add_argument("--disable-gpu")
    # "options=" replaces the deprecated "chrome_options=" keyword
    driver = webdriver.Chrome(options=options)  # create the driver
    wait = WebDriverWait(driver, 10)  # explicit wait, up to 10 seconds
    driver.get(login_url)
    account_input = wait.until(EC.presence_of_element_located((By.ID, 'ap_email')))
    password_input = wait.until(EC.presence_of_element_located((By.ID, 'ap_password')))
    submit = wait.until(EC.element_to_be_clickable((By.ID, 'signInSubmit')))
    account_input.send_keys(account)
    password_input.send_keys(password)
    submit.click()
    # wait five seconds, then shut the browser down
    time.sleep(5)
    driver.quit()  # quit() also ends the driver process, close() only closes the window

# Get the login URL
def get_login_url():
    headers = {
        "Host": 'www.amazon.cn',
        "User-Agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.75 Safari/537.36'
    }
    # send a GET request with requests and fetch the page source
    r = requests.get('https://www.amazon.cn', headers=headers).text
    # parse the page source with bs4
    soup = BeautifulSoup(r, 'lxml')
    # pick the element with a CSS selector
    url_params = soup.select('#nav-link-yourAccount')[0]['href']
    login_url = 'https://www.amazon.cn' + url_params
    print(login_url)
    return login_url

if __name__ == '__main__':
    login('your_account', 'your_password')  # replace with your own credentials
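
get_login_url() builds the address by string concatenation, which only works while the href is a path relative to the site root. If the page ever returns an absolute URL there, the result would be malformed. A slightly more tolerant variant could use `urljoin` from the standard library, as in the sketch below; the helper name `build_login_url` and the sample paths are illustrative and not taken from Amazon's actual markup.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.amazon.cn"


def build_login_url(href, base=BASE_URL):
    """Resolve an href against the site root, whether it is relative or absolute."""
    return urljoin(base, href)


if __name__ == "__main__":
    # Hypothetical href values, for illustration only.
    print(build_login_url("/ap/signin?x=1"))
    print(build_login_url("https://www.amazon.cn/ap/signin?x=1"))
```

Both calls resolve to the same address, so the rest of login() would not need to care which form the page used.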

For other selenium APIs, see http://www.selenium.org.cn/
The official requests documentation is recommended reading: http://docs.python-requests.org/zh_CN/latest/
beautifulsoup4 documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/index.zh.html
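
The explicit wait used in login() can be pictured as a polling loop: call a condition over and over until it returns something truthy or the timeout runs out. The sketch below is a simplified stand-in written for illustration, not selenium's own implementation; `wait_until` and its parameters are made-up names, and the 0.5 second default interval is chosen to resemble WebDriverWait's default polling frequency.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it is truthy or the timeout elapses.

    Returns the truthy value, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within {} seconds".format(timeout))
        time.sleep(poll_interval)


if __name__ == "__main__":
    start = time.monotonic()
    # The condition becomes true roughly one second after start.
    wait_until(lambda: time.monotonic() - start > 1.0, timeout=5)
    print("condition met")
```

Compared with a fixed time.sleep(), this returns as soon as the condition holds and only waits the full timeout in the failure case.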
