65

I have a list of ids of rows to fetch from a database. I'm using Python and psycopg2, and my problem is how to pass those ids to SQL efficiently. If I knew the length of the list, it would be easy, because I could manually or automatically add as many "%s" placeholders to the query string as needed, but here I don't know how many I need. It is important that I select the rows using the SQL "id IN (id1, id2, ...)" construct. I know that it is possible to check the length of the list and concatenate a suitable number of "%s" placeholders into the query string, but I'm afraid that would be slow and ugly. Does anyone have an idea how to solve this? And please don't ask why I need to do it with "IN": it is a benchmark that is part of my class assignment. Thanks in advance!
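
For reference, the placeholder-concatenation approach described above might look roughly like the sketch below. The function name build_in_query and the table name my_table are hypothetical, chosen only for illustration. The empty-list check is there because SQL does not allow an empty IN () list.

```python
from typing import Sequence, Tuple


def build_in_query(ids: Sequence[int]) -> Tuple[str, tuple]:
    """Return (query, params) with one %s placeholder per id."""
    if not ids:
        raise ValueError("ids must not be empty: SQL does not allow IN ()")
    # One "%s" per value; the values themselves are never put into the string.
    placeholders = ", ".join(["%s"] * len(ids))
    # my_table is a hypothetical table name.
    query = f"SELECT * FROM my_table WHERE id IN ({placeholders});"
    return query, tuple(ids)


query, params = build_in_query([7, 8, 9])
print(query)
```

The resulting pair would then be passed as cur.execute(query, params), so the driver still handles quoting of each value.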

1
  • Are you opposed to a SQL answer? Using dynamic SQL, you can give a string of any length and have SQL correctly read it. Commented Dec 29, 2011 at 18:41

3 Answers

101

Python tuples are converted to SQL lists in psycopg2:

cur.mogrify("SELECT * FROM table WHERE column IN %s;", ((1,2,3),))

would output (as a bytes object under Python 3)

'SELECT * FROM table WHERE column IN (1, 2, 3);'

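For Python newcomers: it is important to use a tuple here, not a list, because psycopg2 adapts a list to a PostgreSQL ARRAY, which IN does not accept. The tuple of values must itself be wrapped in a one-element tuple of query arguments (note the trailing comma). Here's a second example, where rows is the result of an earlier query:

cur.mogrify("SELECT * FROM table WHERE column IN %s;",
    (tuple(row[0] for row in rows),))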
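
A minimal sketch of how this could be wrapped for real use, assuming a psycopg2 cursor is passed in. The function name fetch_by_ids and the table name my_table are hypothetical. The guard for an empty input is there because an empty tuple would render as IN (), which SQL rejects.

```python
from typing import Iterable, List


def fetch_by_ids(cur, ids: Iterable[int]) -> List[tuple]:
    """Fetch rows whose id is in ids, using psycopg2's tuple adaptation."""
    id_tuple = tuple(ids)
    if not id_tuple:
        # Skip the query entirely: "IN ()" is a syntax error.
        return []
    # The outer one-element tuple is the argument list for execute();
    # the inner tuple is the single value substituted for %s.
    cur.execute("SELECT * FROM my_table WHERE id IN %s;", (id_tuple,))
    return cur.fetchall()
```

Calling fetch_by_ids(cur, [1, 2, 3]) passes ((1, 2, 3),) as the arguments, which is the tuple-of-tuples shape discussed in the comments below.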

4 Comments

Thank you, I didn't notice that. The only issue with your solution is that you have forgotten comma in that tuple.
Note that Python LISTS will be converted to Postgres ARRAY types so I find that I frequently need to do something like (tuple(SOME_LIST),) in my cursor.execute(...) arguments. Note how we're wrapping the tuple() of the list in a single element literal tuple, so we have a tuple of tuples as shown in this example.
cur.mogrify("SELECT * FROM table WHERE column IN %s;", tuple([row[0] for row in rows])) this is unclear. It doesn't seem to relate to the previous example.
Thank you, I was pulling my hair out trying to figure out why it wasn't working when I was doing cursor.execute(sql, tuple(ids)). Makes sense that it needs to be cursor.execute(sql, (tuple(ids),))
20

This question is old, and there may be a newer one out there, but the approach my colleagues are using right now is this:

sql = "SELECT * FROM table WHERE column = ANY(%(parameter_array)s)"
cur.execute(sql,{"parameter_array": [1, 2, 3]})
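
The same idea as a small helper, sketched under the assumption that a psycopg2 cursor is passed in. The names fetch_by_ids_any and my_table are hypothetical. Because the list is sent as a single array parameter, the query text stays the same regardless of how many ids there are.

```python
from typing import Iterable, List


def fetch_by_ids_any(cur, ids: Iterable[int]) -> List[tuple]:
    """Fetch rows whose id matches any element of ids, via = ANY(array)."""
    # psycopg2 adapts a Python list to a PostgreSQL ARRAY, so the whole
    # list is bound to the single %s placeholder.
    cur.execute("SELECT * FROM my_table WHERE id = ANY(%s);", (list(ids),))
    return cur.fetchall()
```

An empty list should not hit the syntax error that an empty IN () would, though that is worth verifying against your driver and server versions.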

3 Comments

This worked for me for a special case where I had to use $1 instead of %s while the accepted answer didn't. Thanks for posting this!
Above is great, especially if you have multiple arguments to keep track of, but if there is only a single argument, a dictionary is not required:
sql = "SELECT * FROM table WHERE column = ANY(%s)"
cur.execute(sql, [[1, 2, 3]])
This worked great with the latest psycopg (psycopg3)
14

The sql module of psycopg2 (https://www.psycopg.org/docs/sql.html) can now be used to guard against errors and injection, for example:

from configparser import ConfigParser

import psycopg2
from psycopg2 import sql


def config(filename='database.ini', section='postgresql'):
    """Read connection parameters from the given section of an INI file."""
    parser = ConfigParser()
    parser.read(filename)

    # fail early if the requested section is missing
    if not parser.has_section(section):
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))

    # parser.items() yields (key, value) pairs for the section
    return dict(parser.items(section))


# config() must be defined before it is called here
params = config()
conn = psycopg2.connect(**params)
cur = conn.cursor()

ids = ['a', 'b', 'c']
sql_query = sql.SQL('SELECT * FROM {} WHERE id IN ({});').format(
    sql.Identifier('table_name'),
    # sql.Literal quotes and escapes each value
    sql.SQL(',').join(map(sql.Literal, ids))
)
print(sql_query.as_string(cur))  # for debug
cur.execute(sql_query)

Note: sql.Identifier will add quotes if needed, so this also works if you use quoted identifiers in PostgreSQL (they are required for, e.g., case-sensitive names).

Example and structure of database.ini:

[postgresql]
host=localhost
port=5432
database=postgres
user=user
password=mypass

1 Comment

+1 passing sql.SQL(',').join(map(sql.Literal, your_data)) as the format() parameter seems the most query-flexible and Pythonic approach seen here so far.
