狠狠撸色综合狠狠操,亚洲精品日韩成人无套

本文可以學習到以下內(nèi)容：

免費獲取全國省份、城市編碼以及經(jīng)緯度數(shù)

使用 pandas 中的 read_sql 讀取 sqlite 中的數(shù)據(jù)

使用 pandas 中的 merge 方法合并數(shù)據(jù)

使用 groupgy+sort_values 方法實現(xiàn)統(tǒng)計各省人數(shù)并降序排列

數(shù)據(jù)及源碼地址：https://gitee.com/myrensheng/data_analysis

項目背景

“小凡，數(shù)據(jù)庫users表中有客戶的資料，我需要你統(tǒng)計一下各省份的客戶數(shù)量發(fā)給我”，經(jīng)理在早會上給每個人布置任務時說道。

"收到"，小凡一邊記錄著一邊回答到。

早會結(jié)束后，小凡接杯熱水，回到工位上，打開dataworks、jupyter、datav、quickbi等工具，開始了新一天的工作......

為什么沒有省份的數(shù)據(jù)呢？小凡看著要統(tǒng)計的數(shù)據(jù)，滿臉疑問。

本來以為是簡單的統(tǒng)計數(shù)據(jù)任務，沒想到 users 表中只有城市編碼數(shù)據(jù)，沒有省份編碼，也沒有對應的省份中文名。小凡心中頓時有種不祥的預感，在釘釘上聯(lián)系數(shù)據(jù)庫運維人員詢問情況。

運維同學說，當初在設計表的時候沒有考慮到省份，所以數(shù)據(jù)庫沒有省份字段，讓小凡自己想想辦法。

小凡也很無奈，現(xiàn)在急切需要找到一份省份編碼映射表，逛了各大論壇，找了各種博客網(wǎng)站，問了許多技術朋友

終于在高德地圖網(wǎng)站上找到了需要的數(shù)據(jù)資源：

數(shù)據(jù)已經(jīng)寫入 data.db 數(shù)據(jù)庫中的 adcode_lng_lat 表中
Excel 文件《省市adcode與經(jīng)緯度映射表.xlsx》存放在文件夾【數(shù)據(jù)加工廠】中

剩下的就交給代碼吧！

項目代碼

小凡常用的數(shù)據(jù)分析工具：

import os
import datetime
import numpy as np
import pandas as pd
from sqlalchemy import create_engine

數(shù)據(jù)放在上一級的目錄下名為 data.db 的文件

# 數(shù)據(jù)庫地址：數(shù)據(jù)庫放在上一級目錄下
db_path = os.path.join(os.path.dirname(os.getcwd()), "data.db")
engine_path = "sqlite:///" + db_path
# 創(chuàng)建數(shù)據(jù)庫引擎
engine = create_engine(engine_path)

sql = """
select * from users
"""
df = pd.read_sql(sql, engine)

用 pandas 的 head() 方法查看前5條數(shù)據(jù)：

df.head()

1.png

新增省份編碼

adcode 是城市編碼，用前2位加上0000就是省份編碼，比如：431081對應的省份編碼是430000。

在df后面新增一列省份編碼：

df = df.astype(str)
df["province_adcode"] = df["adcode"].map(lambda x:x[:2]+"0000")

獲取編碼映射數(shù)據(jù)

sql = """
select * from adcode_lng_lat
"""

adcode_lng_lat_df = pd.read_sql(sql, engine)

合并數(shù)據(jù)

result_df = pd.merge(df,adcode_lng_lat_df[["adcode","name"]].astype(str),left_on="province_adcode",right_on="adcode",how="left")

用pandas 中的 sample() 方法隨機查看10條數(shù)據(jù)：

result_df.sample(10)

2.png

統(tǒng)計省份用戶數(shù)

使用 groupgy+sort_values 方法實現(xiàn)統(tǒng)計各省人數(shù)并降序排列，代碼如下：

province_count_df = result_df.groupby(by="name").agg(
    {"user_id":"count"}
).sort_values(by="user_id",ascending=False).reset_index()

用 pandas 的 head() 方法查看前5條數(shù)據(jù)：

province_count_df.head()

3.png

使用SQL實現(xiàn)

sqlite3 的字符串拼接用"||"符號實現(xiàn)
sqlite3 字符串切割用 SUBSTRING(字符串，開始位置，結(jié)束位置)

select
    b.name
    ,count(a.user_id) as users_num
from
(
    select
        user_id,
        username,
        adcode,
        SUBSTRING(adcode, 1, 2) || '0000' as province_code
    from
        users
) as a
left join
(
    select
        adcode as province_code,
        name
    FROM
        adcode_lng_lat
) as b on a.province_code = b.province_code
group by b.name
order by count(a.user_id) desc
;

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

第九章用Python處理省份城市編碼數(shù)據(jù)

第九章用Python處理省份城市編碼數(shù)據(jù)

項目背景

項目代碼

新增省份編碼

獲取編碼映射數(shù)據(jù)

合并數(shù)據(jù)

統(tǒng)計省份用戶數(shù)

使用SQL實現(xiàn)

相關閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

第九章用Python處理省份城市編碼數(shù)據(jù)

項目背景

項目代碼

新增省份編碼

獲取編碼映射數(shù)據(jù)

合并數(shù)據(jù)

統(tǒng)計省份用戶數(shù)

使用SQL實現(xiàn)

相關閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av