39 Python使用Pandas將Excel存入MySQL
一個(gè)典型的數(shù)據(jù)處理流:
- Pandas從多方數(shù)據(jù)源讀取數(shù)據(jù),比如excel、csv、mysql、爬蟲等等
- Pandas對(duì)數(shù)據(jù)做過(guò)濾、統(tǒng)計(jì)分析
- Pandas將數(shù)據(jù)存儲(chǔ)到MySQL,用于Web頁(yè)面展示、后續(xù)的進(jìn)一步SQL分析等等
官網(wǎng)文檔: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html#pandas.DataFrame.to_sql
數(shù)據(jù)準(zhǔn)備:學(xué)生信息Excel表
import pandas as pd
df = pd.read_excel("./course_datas/c23_excel_vlookup/學(xué)生信息表.xlsx")
df.head()
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
<pre><code>.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</code></pre>
| 學(xué)號(hào) | 姓名 | 性別 | 年齡 | 籍貫 | |
|---|---|---|---|---|---|
| 0 | S001 | 怠涵 | 女 | 23 | 山東 |
| 1 | S002 | 婉清 | 女 | 25 | 河南 |
| 2 | S003 | 溪榕 | 女 | 23 | 湖北 |
| 3 | S004 | 漠涓 | 女 | 19 | 陜西 |
| 4 | S005 | 祈博 | 女 | 24 | 山東 |
# 展示索引的name
df.index.name
df.index.name = "id"
df.head()
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
<pre><code>.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</code></pre>
| 學(xué)號(hào) | 姓名 | 性別 | 年齡 | 籍貫 | |
|---|---|---|---|---|---|
| id | |||||
| 0 | S001 | 怠涵 | 女 | 23 | 山東 |
| 1 | S002 | 婉清 | 女 | 25 | 河南 |
| 2 | S003 | 溪榕 | 女 | 23 | 湖北 |
| 3 | S004 | 漠涓 | 女 | 19 | 陜西 |
| 4 | S005 | 祈博 | 女 | 24 | 山東 |
創(chuàng)建sqlalchemy對(duì)象連接MySQL
SQLAlchemy是Python中的ORM框架, Object-Relational Mapping,把關(guān)系數(shù)據(jù)庫(kù)的表結(jié)構(gòu)映射到對(duì)象上。
- 官網(wǎng):https://www.sqlalchemy.org/
- 如果sqlalchemy包不存在,用這個(gè)命令安裝:pip install sqlalchemy
- 需要安裝依賴Python庫(kù):pip install mysql-connector-python
可以直接執(zhí)行SQL語(yǔ)句
from sqlalchemy import create_engine
engine = create_engine("mysql+mysqlconnector://root:123456@127.0.0.1:3306/test", echo=False)
方法1:當(dāng)數(shù)據(jù)表不存在時(shí),每次覆蓋整個(gè)表
每次運(yùn)行會(huì)drop table,新建表
df.to_sql(name='student', con=engine, if_exists="replace")
engine.execute("show create table student").first()[1]
'CREATE TABLE `student` (\n `id` bigint(20) DEFAULT NULL,\n `學(xué)號(hào)` text,\n `姓名` text,\n `性別` text,\n `年齡` bigint(20) DEFAULT NULL,\n `籍貫` text,\n KEY `ix_student_id` (`id`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4'
print(engine.execute("show create table student").first()[1])
CREATE TABLE `student` (
`id` bigint(20) DEFAULT NULL,
`學(xué)號(hào)` text,
`姓名` text,
`性別` text,
`年齡` bigint(20) DEFAULT NULL,
`籍貫` text,
KEY `ix_student_id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
engine.execute("select count(1) from student").first()
(24,)
engine.execute("select * from student limit 5").fetchall()
[(0, 'S001', '怠涵', '女', 23, '山東'),
(1, 'S002', '婉清', '女', 25, '河南'),
(2, 'S003', '溪榕', '女', 23, '湖北'),
(3, 'S004', '漠涓', '女', 19, '陜西'),
(4, 'S005', '祈博', '女', 24, '山東')]
方法2:當(dāng)數(shù)據(jù)表存在時(shí),每次新增數(shù)據(jù)
場(chǎng)景:每天會(huì)新增一部分?jǐn)?shù)據(jù),要添加到數(shù)據(jù)表,怎么處理?
df_new = df.loc[:4, :]
df_new
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
<pre><code>.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</code></pre>
| 學(xué)號(hào) | 姓名 | 性別 | 年齡 | 籍貫 | |
|---|---|---|---|---|---|
| id | |||||
| 0 | S001 | 怠涵 | 女 | 23 | 山東 |
| 1 | S002 | 婉清 | 女 | 25 | 河南 |
| 2 | S003 | 溪榕 | 女 | 23 | 湖北 |
| 3 | S004 | 漠涓 | 女 | 19 | 陜西 |
| 4 | S005 | 祈博 | 女 | 24 | 山東 |
df_new.to_sql(name='student', con=engine, if_exists="append")
engine.execute("SELECT * FROM student where id<5 ").fetchall()
[(0, 'S001', '怠涵', '女', 23, '山東'),
(1, 'S002', '婉清', '女', 25, '河南'),
(2, 'S003', '溪榕', '女', 23, '湖北'),
(3, 'S004', '漠涓', '女', 19, '陜西'),
(4, 'S005', '祈博', '女', 24, '山東'),
(0, 'S001', '怠涵', '女', 23, '山東'),
(1, 'S002', '婉清', '女', 25, '河南'),
(2, 'S003', '溪榕', '女', 23, '湖北'),
(3, 'S004', '漠涓', '女', 19, '陜西'),
(4, 'S005', '祈博', '女', 24, '山東')]
問(wèn)題解決:先根據(jù)數(shù)據(jù)KEY刪除舊數(shù)據(jù)
df_new.index
RangeIndex(start=0, stop=5, step=1, name='id')
for id in df_new.index:
## 先刪除要新增的數(shù)據(jù)
delete_sql = f"delete from student where id={id}"
print(delete_sql)
engine.execute(delete_sql)
delete from student where id=0
delete from student where id=1
delete from student where id=2
delete from student where id=3
delete from student where id=4
engine.execute("SELECT * FROM student where id<5 ").fetchall()
[]
engine.execute("select count(1) from student").first()
(19,)
# 新增數(shù)據(jù)到表中
df_new.to_sql(name='student', con=engine, if_exists="append")
engine.execute("SELECT * FROM student where id<5 ").fetchall()
[(0, 'S001', '怠涵', '女', 23, '山東'),
(1, 'S002', '婉清', '女', 25, '河南'),
(2, 'S003', '溪榕', '女', 23, '湖北'),
(3, 'S004', '漠涓', '女', 19, '陜西'),
(4, 'S005', '祈博', '女', 24, '山東')]
engine.execute("SELECT count(1) FROM student").first()
(24,)
本文使用 文章同步助手 同步