Web Scraper: Sitemaps for 4 Sites

I. Scrape WeChat official account article titles, dates, and content links

{"_id":"gongzhonghao","startUrl":["https://mp.weixin.qq.com/mp/profile_ext?action=home&__biz=MzIxODUxMDM5MQ==&scene=124&#wechat_redirect"],"selectors":[{"id":"total","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.weui_msg_card:nth-of-type(n+2)","multiple":true,"delay":"1000"},{"id":"title","type":"SelectorText","parentSelectors":["total"],"selector":"h4.weui_media_title","multiple":false,"regex":"","delay":0},{"id":"date","type":"SelectorText","parentSelectors":["total"],"selector":"p.weui_media_extra_info","multiple":false,"regex":"","delay":0},{"id":"link","type":"SelectorElementAttribute","parentSelectors":["total"],"selector":"h4.weui_media_title","multiple":false,"extractAttribute":"hrefs","delay":0}]}
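
To see how the selectors in a sitemap like the one above nest, a short Python sketch can load the JSON and print the selector tree. The `SITEMAP` string is a trimmed copy of the sitemap above (only the fields the sketch reads), and `selector_tree` is a hypothetical helper name, not part of Web Scraper.

```python
import json

# Trimmed copy of the "gongzhonghao" sitemap: only id, type, and
# parentSelectors are kept, because that is all this sketch reads.
SITEMAP = """
{"_id":"gongzhonghao","selectors":[
 {"id":"total","type":"SelectorElementScroll","parentSelectors":["_root"]},
 {"id":"title","type":"SelectorText","parentSelectors":["total"]},
 {"id":"date","type":"SelectorText","parentSelectors":["total"]},
 {"id":"link","type":"SelectorElementAttribute","parentSelectors":["total"]}
]}
"""


def selector_tree(sitemap, parent="_root", depth=0):
    """Return indented 'id (type)' lines for every selector under `parent`.

    Assumes no selector lists itself (directly or indirectly) as a parent.
    """
    lines = []
    for sel in sitemap["selectors"]:
        if parent in sel["parentSelectors"]:
            lines.append("  " * depth + f"{sel['id']} ({sel['type']})")
            lines.extend(selector_tree(sitemap, sel["id"], depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(selector_tree(json.loads(SITEMAP))))
```

Here `total` is the scrolling element selector, and `title`, `date`, and `link` are read from inside each element it matches.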

II. Zhihu

1. A Zhihu big V's articles: all titles, links, upvote counts, and comment counts

{"_id":"zhihu-article","startUrl":["https://www.zhihu.com/people/zhang-jia-wei/posts?page=[1-44]"],"selectors":[{"id":"aaa","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.List-item","multiple":true,"delay":"2000"},{"id":"title","type":"SelectorLink","parentSelectors":["aaa"],"selector":"h2.ContentItem-title a","multiple":false,"delay":0},{"id":"like","type":"SelectorText","parentSelectors":["aaa"],"selector":"button.Button.VoteButton--up","multiple":false,"regex":"","delay":0},{"id":"comments","type":"SelectorText","parentSelectors":["aaa"],"selector":"button.Button.ContentItem-action:nth-of-type(1)","multiple":false,"regex":"","delay":0}]}
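
The `startUrl` above ends in `page=[1-44]`, a numeric range that stands for one start URL per page. The Python sketch below only illustrates what that range means. It is not Web Scraper's own implementation, `expand_start_url` is a hypothetical name, and it handles only the plain `[first-last]` form.

```python
import re

# Matches a numeric range written as "[1-44]".
RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_start_url(url):
    """Expand the first "[a-b]" range in `url` into one URL per number."""
    match = RANGE.search(url)
    if match is None:
        return [url]  # no range present: the URL stands for itself
    first, last = int(match.group(1)), int(match.group(2))
    return [url[:match.start()] + str(n) + url[match.end():]
            for n in range(first, last + 1)]


if __name__ == "__main__":
    urls = expand_start_url(
        "https://www.zhihu.com/people/zhang-jia-wei/posts?page=[1-44]")
    print(len(urls))
    print(urls[0])
    print(urls[-1])
```

The upper bound (44 here, 169 in the answers sitemap) has to match the number of pages the profile actually has at the time of scraping.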

2. A Zhihu big V's answers: all questions, links, upvote counts, and comment counts

{"_id":"zhihu-questions","startUrl":["https://www.zhihu.com/people/zhang-jia-wei/answers?page=[1-169]"],"selectors":[{"id":"total","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.List-item","multiple":true,"delay":"2000"},{"id":"questions","type":"SelectorLink","parentSelectors":["total"],"selector":"h2.ContentItem-title a","multiple":false,"delay":0},{"id":"likes","type":"SelectorText","parentSelectors":["total"],"selector":"button.Button.VoteButton--up","multiple":false,"regex":"","delay":0},{"id":"comments","type":"SelectorText","parentSelectors":["total"],"selector":"button.Button.ContentItem-action:nth-of-type(1)","multiple":false,"regex":"","delay":0}]}

3. Scrape a Zhihu keyword search: titles, links, upvote counts, and comment counts for all results

{"_id":"zhihu-search","startUrl":["https://www.zhihu.com/search?q=%E8%B5%9A%E9%92%B1&type=content"],"selectors":[{"id":"total","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.List div div.List-item","multiple":true,"delay":"3000"},{"id":"link","type":"SelectorLink","parentSelectors":["total"],"selector":"h2.ContentItem-title a","multiple":false,"delay":0},{"id":"likes","type":"SelectorText","parentSelectors":["total"],"selector":"button.Button.VoteButton--up","multiple":false,"regex":"","delay":0},{"id":"comments","type":"SelectorText","parentSelectors":["total"],"selector":"button.Button.ContentItem-action","multiple":false,"regex":"","delay":0}]}
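
The `q=%E8%B5%9A%E9%92%B1` part of the `startUrl` above is the search keyword 赚钱 ("make money") percent-encoded as UTF-8. To search for a different keyword, the start URL can be rebuilt along these lines. This is a minimal sketch, and `zhihu_search_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def zhihu_search_url(keyword):
    """Build a search start URL with the keyword percent-encoded as UTF-8."""
    return "https://www.zhihu.com/search?q=" + quote(keyword) + "&type=content"


if __name__ == "__main__":
    # Reproduces the startUrl used in the sitemap above.
    print(zhihu_search_url("赚钱"))
```

Paste the resulting URL into the `startUrl` array of the sitemap before importing it.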

III. Scrape Toutiao hot articles: titles, sources, comment counts, and publish times

{"_id":"toutiao","startUrl":["https://www.toutiao.com/ch/news_hot/"],"selectors":[{"id":"total","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.item-inner","multiple":true,"delay":"4000"},{"id":"link","type":"SelectorLink","parentSelectors":["total"],"selector":"a.link","multiple":false,"delay":0},{"id":"source","type":"SelectorText","parentSelectors":["total"],"selector":"a.lbtn.source","multiple":false,"regex":"","delay":0},{"id":"comments","type":"SelectorText","parentSelectors":["total"],"selector":"a.lbtn.comment","multiple":false,"regex":"","delay":0},{"id":"time","type":"SelectorText","parentSelectors":["total"],"selector":"span.lbtn","multiple":false,"regex":"","delay":0}]}

IV. Weibo

1. Scrape Weibo post content, repost links, repost counts, comment counts, like counts, and publish times

{"_id":"weibo","startUrl":["https://weibo.com/bylixiaolai?is_search=0&visible=0&is_hot=1&is_tag=0&profile_ftype=1&page=[1-60]#feedtop"],"selectors":[{"id":"total","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.WB_cardwrap.WB_feed_type:nth-of-type(n+2)","multiple":true,"delay":"1000"},{"id":"click","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.WB_cardwrap:nth-of-type(2) div.WB_text","multiple":true,"delay":"2000","clickElementSelector":"div.WB_text.W_f14 a.WB_text_opt","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"real-total","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.WB_cardwrap.WB_feed_type:nth-of-type(n+2)","multiple":true,"delay":"10000"},{"id":"content","type":"SelectorText","parentSelectors":["real-total"],"selector":"div.WB_text","multiple":false,"regex":"","delay":0},{"id":"forward","type":"SelectorLink","parentSelectors":["real-total"],"selector":"a.S_func1.W_autocut","multiple":false,"delay":0},{"id":"shares","type":"SelectorText","parentSelectors":["real-total"],"selector":"li:nth-of-type(2) em:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"comments","type":"SelectorText","parentSelectors":["real-total"],"selector":"li:nth-of-type(3) em:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"likes","type":"SelectorText","parentSelectors":["real-total"],"selector":"li:nth-of-type(4) em:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"time","type":"SelectorText","parentSelectors":["real-total"],"selector":"div.WB_detail > div.WB_from a.S_txt2:nth-of-type(1)","multiple":false,"regex":"","delay":0}]}

2. Scrape all comments on a single Weibo post

{"_id":"weibo-comment","startUrl":["https://weibo.com/1576218000/Gqjfh0VYa?filter=hot&root_comment_id=0&type=comment"],"selectors":[{"id":"scroll","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"div.list_box > div.list_ul > div.list_li:nth-of-type(1) > div.list_con > div.WB_text","multiple":true,"delay":"1000"},{"id":"click","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.list_box > div.list_ul > div.list_li > div.list_con > div.WB_text","multiple":true,"delay":"3000","clickElementSelector":"span.more_txt","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"content","type":"SelectorText","parentSelectors":["click"],"selector":"parent","multiple":false,"regex":"","delay":0}]}
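
When editing sitemaps like these by hand, it is easy to rename a selector and leave a child pointing at the old id. The Python sketch below checks that every `parentSelectors` entry refers to `_root` or to a selector defined in the same sitemap. The `SITEMAP` string is a trimmed copy of the sitemap above, and `dangling_parents` is a hypothetical helper name.

```python
import json

# Trimmed copy of the "weibo-comment" sitemap: only id and parentSelectors
# are kept, because that is all this check reads.
SITEMAP = """
{"_id":"weibo-comment","selectors":[
 {"id":"scroll","parentSelectors":["_root"]},
 {"id":"click","parentSelectors":["_root"]},
 {"id":"content","parentSelectors":["click"]}
]}
"""


def dangling_parents(sitemap):
    """Return (selector id, parent id) pairs whose parent is not defined."""
    known = {"_root"} | {sel["id"] for sel in sitemap["selectors"]}
    return [(sel["id"], parent)
            for sel in sitemap["selectors"]
            for parent in sel["parentSelectors"]
            if parent not in known]


if __name__ == "__main__":
    # An empty list means every parent reference resolves.
    print(dangling_parents(json.loads(SITEMAP)))
```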


A website where I write, and it's a lot of fun: http://www.zsxq100.com/
