[python] Fetching ratings from Douban

Purpose

Quickly look up how a movie is rated on Douban.

Approach

Request https://www.douban.com/search?q={movie_name} directly, extract the relevant content from the returned page, and print it straight to the terminal.
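Movie names often contain spaces or non-ASCII characters, so the query is best percent-encoded before it goes into the URL. A minimal sketch using only the standard library; `build_search_url` and `SEARCH_ENDPOINT` are hypothetical names for illustration and are not part of the script in this post.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL pattern described above.
SEARCH_ENDPOINT = "https://www.douban.com/search"


def build_search_url(movie_name):
    """Return the search URL with the movie name percent-encoded as UTF-8."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': movie_name})}"


if __name__ == "__main__":
    print(build_search_url("the matrix"))
```

With requests, passing `params={"q": movie_name}` to `requests.get` would achieve the same encoding without building the string by hand.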
With this step in place, the script can later be paired with Alfred for quick lookups, which would be a real treat.
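One possible route for the Alfred idea is a Script Filter, which reads a JSON object with an `items` list from the script's standard output. The sketch below is only an assumption about how that pairing could look: it takes results as (type, rating, rating count, year, name) tuples, the same order the script in this post produces, and `to_alfred_items` is a hypothetical helper name.

```python
import json


def to_alfred_items(rows):
    """Convert (type, rating, rating_count, year, name) tuples to Script Filter JSON."""
    items = []
    for kind, rating, rating_count, year, name in rows:
        items.append(
            {
                "title": f"{name} ({year})" if year else name,
                "subtitle": f"{kind} {rating} {rating_count}".strip(),
                "arg": name,  # value Alfred hands to the next workflow step
            }
        )
    # ensure_ascii=False keeps Chinese titles readable in the JSON output
    return json.dumps({"items": items}, ensure_ascii=False)


if __name__ == "__main__":
    print(to_alfred_items([("[電影]", "4.9", "144人評價", "2004", "四體")]))
```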

Installing dependencies

pip3 install beautifulsoup4 requests lxml

Sample run

~ python3 douban.py 四體
[電影]    4.9 144人評價  2004    四體
[電影]    6.3 21288人評價    2005    美國派(番外篇)4:集體露營

Open source

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
"""
Usage: python3 douban.py {movie_name}
"""
import sys

import bs4
import requests


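def get_web(url):
    """Fetch a page as text, sending a browser-like User-Agent."""
    header = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.59"
    }
    res = requests.get(url, headers=header, timeout=5)
    # Fail loudly on an HTTP error instead of parsing an error page.
    res.raise_for_status()
    return res.text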


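# NOTE: parse_city_date and temp_string are never called in this script;
# they look like leftovers from a separate weather scraper.
def parse_city_date(soup):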
    location = soup.find("div", class_="crumbs fl")

    date = soup.find("h1", class_="clearfix city")
    return (
        location.text.strip().replace("\n", "").replace(" ", ""),
        date.i.text.strip()[:16],
    )


def temp_string(high, low):
    return f"{high} / {low}" if high and low else f"{high}{low}"


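def format_infos(infos):
    # infos is the tuple returned by parse_content:
    # (type, rating, rating count, year, name)
    datas = list(infos)
    # Pad the type label to 6 characters, then join the rest with tabs.
    return "%-6s\t%s\t%s" % (datas[0], datas[1], "\t".join(datas[2:]))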


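def parse_content(e):
    """Extract (type, rating, rating count, year, name) from one result block."""

    def text_or_empty(o):
        return o.text if o else ""

    def sub_cast_year(o):
        # The year is the last "/"-separated field of the cast line.
        return o.split("/")[-1].strip() if o else ""

    def format_rating_person(s):
        # Drop the first and last characters, which wrap the rating count.
        return s[1:-1]

    kind = text_or_empty(e.h3.span)  # named kind to avoid shadowing built-in type()
    name = text_or_empty(e.h3.a)
    rating_info = e.div
    rating_nums = text_or_empty(rating_info.find("span", class_="rating_nums"))
    sub_cast = text_or_empty(rating_info.find("span", class_="subject-cast"))
    year = sub_cast_year(sub_cast)
    # The rating count sits in the span that carries no class attribute.
    rating_person_nums = format_rating_person(
        text_or_empty(rating_info.find("span", class_=None))
    )
    return kind, rating_nums, rating_person_nums, year, name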


def parse_contents(soup):
    def drop_unwanted(infos):
        # Skip group ([小組]) and diary ([日記]) hits, plus rows whose type
        # label is empty (the padded row then starts with a space).
        return [
            info
            for info in infos
            if not info.startswith(("[小組]", " ", "[日記]"))
        ]

    contents = soup.find_all("div", class_="title")
    return drop_unwanted(
        [format_infos(parse_content(content)) for content in contents]
    )


# NOTE: print_weather is never called in this script.
def print_weather(day, weather, tem, wind):
    for i in range(0, 7):
        print(f"{day[i]:<10}{tem[i]:^15}{wind[i]:<10}\t{weather[i]}")


def create_soup(movie_name):
    return bs4.BeautifulSoup(
        get_web(f"https://www.douban.com/search?q={movie_name}"), "lxml"
    )


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("please input a movie name", file=sys.stderr)
        # Exit with a non-zero status so callers can detect the usage error.
        sys.exit(1)

    movie_name = sys.argv[1]
    soup = create_soup(movie_name)
    contents = parse_contents(soup)
    print(*contents, sep="\n")
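
To try the parsing logic without hitting the network, the same kind of lookups can be run against a hand-written fragment. The HTML below is an assumption that imitates the structure the script expects (a `div` with class `title` holding an `h3` and a rating `div`); it is not copied from Douban. The sketch uses the built-in `html.parser` so no extra parser is needed, and it picks the class-less span with an explicit predicate rather than `class_=None`; `extract_fields` is a hypothetical name.

```python
import bs4

# Hypothetical markup shaped like what parse_content() expects; not real Douban output.
SAMPLE_HTML = """
<div class="title">
  <h3><span>[Movie]</span> <a href="#">Example Title</a></h3>
  <div class="rating-info">
    <span class="rating_nums">7.5</span>
    <span>(1234 ratings)</span>
    <span class="subject-cast">Original title / Some Director / 2004</span>
  </div>
</div>
"""


def has_no_class(tag):
    """True for <span> tags that carry no class attribute."""
    return tag.name == "span" and not tag.has_attr("class")


def extract_fields(title_div):
    """Pull (type, rating, rating count, year, name) out of one result block."""
    info = title_div.div
    rating = info.find("span", class_="rating_nums").text
    count = info.find(has_no_class).text[1:-1]  # drop the parentheses
    year = info.find("span", class_="subject-cast").text.split("/")[-1].strip()
    return title_div.h3.span.text, rating, count, year, title_div.h3.a.text


soup = bs4.BeautifulSoup(SAMPLE_HTML, "html.parser")
fields = [extract_fields(div) for div in soup.find_all("div", class_="title")]
print(fields)
```

If Douban changes its markup, a fixture like this makes it quick to see which lookup stopped matching.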

If you found this fun, leave a like!
