Pandas+SQLAlchemy与数据库交互

在使用Pandas分析数据时，除了读取文本类数据，如Excel、CSV文件外，还会涉及到数据库的读取。通常的方案是Pandas+ SQLAlchemy。

以下内容是记录的一些操作记录，算是备忘。

使用sqlalchemy连接数据库

Engine 翻译过来就是引擎的意思，汽车通过引擎来驱动，而 SQLAlchemy 是通过 Engine 来驱动，Engine 维护了一个连接池（Pool）对象和方言（Dialect）。方言简单而言就是你连的到底是 MySQL 还是 Oracle 或者 PostgreSQL 还是其它数据库。

连接示例：

from sqlalchemy import create_engine

engine = create_engine('sqlite:///test.db')

# engine = create_engine("mysql+pymysql://root:root@localhost/hackathon")

from sqlalchemy import create_engine

engine = create_engine('sqlite:///test.db')
# engine = create_engine("mysql+pymysql://root:root@localhost/hackathon")

SQLAlchemy支持的数据库包括：PostgreSQL、MySQL、MariaDB、Oracle、Microsoft SQL Server、SQLite等。具体的连接方式这里不再详述。其中最容易出错的部分数据库密码存在特殊字符，需要先编码处理后再日提交。

解决方案：

import urllib.parse

urllib.parse.quote_plus("kx%jj5/g")

import urllib.parse
urllib.parse.quote_plus("kx%jj5/g")

将 DataFrame保存到数据库中

DataFrame自带to_sql方法，使用方法如下：

DataFrame.to_sql(self, name: str, con, schema=None, if_exists: str = 'fail', index: bool = True, index_label=None, chunksize=None, dtype=None, method=None)

DataFrame.to_sql(self, name: str, con, schema=None, if_exists: str = 'fail', index: bool = True, index_label=None, chunksize=None, dtype=None, method=None)

参数说明：

name：数据库表名称
con：数据库连接
schema：数据库模的schema，大部分连接的时候不需要
if_exists：可选：{‘fail’, ‘replace’, ‘append’}
index：是否将DataFrame的index列写入
index_label：index列的名称
chunksize：设置批量写入的条目数量
dtype：设置字段类型
method：插入方法，可不做设置

使用示例：

df.to_sql(con=engine, name='pair_result', if_exists='append')

df.to_sql(con=engine, name='pair_result', if_exists='append')

传统方式：通过游标方式取数据库数据

代码示例：

with engine.connect() as con:

rs = con.execute('SELECT * FROM testdb.abc')

for row in rs:

print(row)

with engine.connect() as con:
    rs = con.execute('SELECT * FROM testdb.abc')
    for row in rs:
        print(row)

将数据库的数据转化为DataFrame：

with engine.connect() as con:

rs = con.execute('SELECT * FROM testdb.abc ')

df = pd.DataFrame(rs.fetchall())

df.columns = rs.keys()

with engine.connect() as con:
    rs = con.execute('SELECT * FROM testdb.abc ')
    df = pd.DataFrame(rs.fetchall())
    df.columns = rs.keys()

简便方式：直接使用pd.read_sql()读取

from sqlalchemy import create_engine

import pandas as pd

engine = create_engine('sqlite:///test.db')

# 查询表数据

df = pd.read_sql("sub_place", engine)

print(df.head())

# 使用SQL查询数据

query = '''SELECT *

FROM sub_place

WHERE father_area_name='中国'

china_df = pd.read_sql(query, engine)

print(china_df.head())

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('sqlite:///test.db')

# 查询表数据
df = pd.read_sql("sub_place", engine)
print(df.head())

# 使用SQL查询数据
query = '''SELECT * 
            FROM sub_place 
            WHERE father_area_name='中国' 
            '''
china_df = pd.read_sql(query, engine)
print(china_df.head())

使用sqlalchemy连接数据库

将 DataFrame保存到数据库中

传统方式：通过游标方式取数据库数据

简便方式：直接使用pd.read_sql()读取

Recommend

恒大汽车恒驰7路试谍照曝光！或将于今年底试生产

Patch Gapping a Safari Type Confusion

Another Chrome and Edge exploit published online as browser makers deal with pat...

Google cuts Chrome 'patch gap' in half, from 33 to 15 days

That's correct, Safari does its image decoding via OS-level image decoding (http...

全新UI/新增多款应用/优化NOMI 蔚来发布新车机系统

销量｜零跑汽车8月交付4488台新增订单超7000台

小米汽车总部落户北京已搭建近300人团队

交警可查、即时扣分！全国28个城市上线电子驾照

Inside look at modern web browser (part 1)

About Joyk