
Let's go straight to an example.
URL: http://quotes.toscrape.com/

1. Getting an XPath:
Open the page in Google Chrome, right-click and choose Inspect, select the tag, then choose Copy > Copy XPath.

Copy XPath gives: /html/body/div/div[2]/div[1]/div[1]/span[1]

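2. How to get the page content:
In a terminal inside Jupyter (the Jupyter terminal does not run on Windows), enter scrapy shell http://quotes.toscrape.com/
Request information is printed and a response object is returned, which contains all of the page's content.
I had Anaconda with Python 3.6 installed, but Scrapy was not bundled with it. I also had Python 2.7 installed, and Scrapy installed there successfully. I added it to the PATH environment variable, so the scrapy command is recognized from any directory on the command line; without that, the command is only recognized inside its own folder. Open the Windows command prompt and type the same command: scrapy shell http://quotes.toscrape.com/
Request information is returned on lines starting with [s].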

response is the object returned by the request; a status of 200 means the request succeeded.
To verify whether an expression is correct, use this response object, which contains the entire content of the page:

```python
>>> response.xpath('/html/body/div/div[2]/div[1]/div[1]/span[1]/text()')
>>> response.xpath('/html/body/div/div[2]/div[1]/div[1]/span[1]/text()').extract()
```
Compare the three calls: the first returns a selector object, the second returns a list, and the third returns a string.

```python
>>> response.xpath('/html/body/div/div[2]/div[1]/div[1]/span[1]/text()')
>>> response.xpath('/html/body/div/div[2]/div[1]/div[1]/span[1]/text()').extract()
[u'\u201cThe world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.\u201d']
>>> response.xpath('/html/body/div/div[2]/div[1]/div[1]/span[1]/text()').extract_first()
u'\u201cThe world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.\u201d'
```
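The xpath function built into response is used to verify whether a path expression is correct. Here the element was located with Chrome's built-in tools, and the calls above returned the content of that tag on the page, so the expression is correct.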
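The difference between the list result and the first-match result can also be tried without a running Scrapy shell. The following is only a minimal sketch under stated assumptions: it uses Python's standard-library `xml.etree.ElementTree` as a stand-in for Scrapy (it supports only a subset of XPath and has no `text()` step), and the `SAMPLE` markup is a made-up, simplified imitation of the quotes page, not its real HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup loosely modeled on the quotes page.
SAMPLE = """
<html>
  <body>
    <div>
      <div class="quote">
        <span class="text">First quote</span>
        <span>by <small class="author">Author A</small></span>
      </div>
      <div class="quote">
        <span class="text">Second quote</span>
        <span>by <small class="author">Author B</small></span>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# Positional path in the spirit of Chrome's "Copy XPath" output.
# ElementTree paths are relative to the root, so there is no leading /html.
path = "body/div/div[1]/span[1]"

matches = root.findall(path)          # list of Element objects (compare: the selector result)
texts = [el.text for el in matches]   # list of strings (compare: .extract())
first = root.findtext(path)           # first string, or None (compare: .extract_first())
missing = root.findtext("body/div/div[9]/span[1]")  # a path that matches nothing

print(matches)
print(texts)
print(first)
print(missing)
```

A positional path like this one matches exactly one element, which is why the list has a single entry; when nothing matches, the first-match call gives `None` rather than raising an error.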

3. How to write your own XPath to get all the content under the same kind of tag:

Approach: every quote box is inside a span with class="text".

```python
>>> response.xpath('//span[@class="text"]/text()').extract()
```
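Analysis: response is the object returned by scrapy shell after requesting the page. Its xpath function takes a path expression: // selects matching nodes anywhere in the document, and @ refers to an attribute. The call returns selector objects, so to get the text, append .extract(), which returns a list. What is extracted here is the quotes,
as follows: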
![8](http://upload-images.jianshu.io/upload_images/5076126-c4a786c56eefb44e.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Extracting the authors:
![9](http://upload-images.jianshu.io/upload_images/5076126-6bb485b4996cc0b3.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
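Selecting by attribute instead of by position can be sketched the same way. This is a rough illustration only: it again swaps in the standard-library `xml.etree.ElementTree` for Scrapy, and both the `PAGE` markup and the `small class="author"` element are assumptions made for the example rather than something taken from the screenshots above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the class names are assumptions for this sketch.
PAGE = """
<html>
  <body>
    <div>
      <div class="quote">
        <span class="text">First quote</span>
        <span>by <small class="author">Author A</small></span>
      </div>
      <div class="quote">
        <span class="text">Second quote</span>
        <span>by <small class="author">Author B</small></span>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# ".//" searches all descendants of the root, at any depth.
# [@class='text'] keeps only elements whose class attribute equals "text".
quotes = [el.text for el in root.findall(".//span[@class='text']")]
authors = [el.text for el in root.findall(".//small[@class='author']")]

# Pair each quote with its author by position in the two lists.
pairs = list(zip(quotes, authors))

for quote, author in pairs:
    print(author, "said:", quote)
```

Unlike the positional path copied from Chrome, the attribute-based expression matches every quote on the page at once, which is the point of writing the XPath by hand.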
Exiting the shell:
![10](http://upload-images.jianshu.io/upload_images/5076126-1f3186bdcbfedfa1.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**Summary:** scrapy shell is a tool for verifying that your extraction is correct; once it is, you can confidently go on to write the code.

Supplement: scrapy commands

![Paste_Image.png](http://upload-images.jianshu.io/upload_images/5076126-724447a95bb6690c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)


1. Scrapy crawlers, static page scraping, part 1: understanding response.xpath()
