
A Look Inside the pandasql Library

source link: https://xujiahua.github.io/posts/20200916-pandas-sql/
2020-09-16 10:48 · tag: readcode

I have always found it a pity that pandas has no native SQL support. Today I came across this library: https://github.com/yhat/pandasql

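Its approach is simple and blunt: it first stores the pandas DataFrame in a relational database (SQLite or PostgreSQL), so the result of the SQL query is really whatever the relational database returns, wrapped up as a DataFrame for presentation. The database operations go through the sqlalchemy library.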
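The round trip described above can be sketched in a few lines. This is only an illustration under stated assumptions, not pandasql's actual code: it uses the standard-library sqlite3 module rather than sqlalchemy, and the helper name query_frames and the toy numbers are made up for the example.

```python
import sqlite3

import pandas as pd


def query_frames(sql, frames):
    """Run `sql` against the DataFrames in `frames` (table name -> DataFrame).

    Hypothetical helper for illustration; not part of pandasql.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Step 1: copy every DataFrame into the database as a table.
        for name, df in frames.items():
            df.to_sql(name, conn, index=False)
        # Step 2: let the database run the query, then wrap the rows
        # it returns in a new DataFrame.
        return pd.read_sql_query(sql, conn)
    finally:
        conn.close()


# Made-up toy data, only to exercise the helper.
meat = pd.DataFrame({"year": [1944, 1945, 1946], "beef": [751.0, 713.0, 690.0]})
result = query_frames(
    "SELECT year, beef FROM meat WHERE beef > 700 ORDER BY year",
    {"meat": meat},
)
```

Note that the filtering and ordering are done entirely by SQLite; pandas is only involved when loading the table and when wrapping the result.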

What I learned: the two built-in functions locals() and globals().

The globals() method returns the dictionary of the current global symbol table. https://www.programiz.com/python-programming/methods/built-in/globals

The locals() method updates and returns a dictionary of the current local symbol table. https://www.programiz.com/python-programming/methods/built-in/locals
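To make the two definitions concrete, here is a small sketch of how the two symbol tables differ inside a function. The names show_scopes, shared and inner are hypothetical and exist only for this example.

```python
def show_scopes():
    global shared              # bind `shared` in the global symbol table
    shared = "global value"
    inner = "local value"      # an ordinary local variable
    local_table = locals()     # dictionary of this function's local names
    return {
        "inner in locals": "inner" in local_table,
        "shared in locals": "shared" in local_table,
        "shared in globals": "shared" in globals(),
        "inner in globals": "inner" in globals(),
    }


report = show_scopes()
```

The local variable shows up only in locals(), while the name declared global shows up only in globals().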

In the code below, meat is a DataFrame variable. Why can the SQL query reference this variable directly?

$ python
>>> from pandasql import sqldf, load_meat, load_births
>>> pysqldf = lambda q: sqldf(q, globals())
>>> meat = load_meat()
>>> print(pysqldf("SELECT * FROM meat LIMIT 10;").head())

Because the variable can be looked up in globals() under the name meat.

$ python
>>> a=4
>>> globals()['a']
4
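Putting the two pieces together, a library can take the dictionary returned by globals() and look up whichever names appear in the query. The sketch below shows one way this could work. It is an assumption for illustration, not pandasql's real implementation: resolve_tables is a hypothetical helper, and a plain list stands in for a DataFrame so the example needs only the standard library.

```python
import re


def resolve_tables(query, env, table_type=list):
    """Return {name: object} for identifiers in `query` that `env` binds
    to an instance of `table_type`. Hypothetical helper, for illustration."""
    # Collect every identifier-like token in the query (keywords included).
    names = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query))
    # Keep only the names bound in `env` to a table-like object.
    return {n: env[n] for n in names if isinstance(env.get(n), table_type)}


meat = [("1944-01-01", 751.0)]   # stand-in for a DataFrame
env = {"meat": meat, "a": 4}     # in real use this would be globals()
found = resolve_tables("SELECT * FROM meat LIMIT 10;", env)
```

Only meat is returned: SQL keywords such as SELECT and LIMIT are not bound in env, and a is bound but neither appears in the query nor has the expected type.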

Last modified on 2020-09-16

