Pandas read_sql with parameters
PythonSqlPandasPsycopg2Python Problem Overview
Are there any examples of how to pass parameters with an SQL query in Pandas?
In particular I'm using an SQLAlchemy engine to connect to a PostgreSQL database. So far I've found that the following works:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN %s AND %s'),
db,params=[datetime(2014,6,24,16,0),datetime(2014,6,24,17,0)],
index_col=['Timestamp'])
The Pandas documentation says that params can also be passed as a dict, but I can't seem to get this to work having tried for instance:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN :dstart AND :dfinish'),
db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
index_col=['Timestamp'])
What is the recommended way of running these types of queries from Pandas?
Python Solutions
Solution 1 - Python
The read_sql
docs say this params
argument can be a list, tuple or dict (see docs).
To pass the values in the sql query, there are different syntaxes possible: ?
, :1
, :name
, %s
, %(name)s
(see PEP249).
But not all of these possibilities are supported by all database drivers, which syntax is supported depends on the driver you are using (psycopg2
in your case I suppose).
In your second case, when using a dict, you are using 'named arguments', and according to the psycopg2
documentation, they support the %(name)s
style (and so not the :name
I suppose), see http://initd.org/psycopg/docs/usage.html#query-parameters.
So using that style should work:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
index_col=['Timestamp'])