1

I am trying to search a specific column in a MySQL database for lines that contain strings, e.g. if the strings I am searching for are ['bar', 'foo', 'dog', 'cat'], and the column has sentences, e.g. 'This is a sentence' I want to pick out all the database rows in which there is a matching string in the sentence. So if 'This is a cat' were in the column I am searching though, this should be returned.

To do this I am making a query using sqlalchemy like:

import pandas as pd
import sqlalchemy as sql

connect_string = 'some_db_address'
sql_engine = sql.create_engine(connect_string)


query = """SELECT * FROM database
           WHERE column_name LIKE '%word1%'
           AND column_name LIKE '%word2%' """

result = pd.read_sql_query(query, sql_engine)

This is giving me a ValueError: unsupported format character 'L' (0x4c) at index 87 error- any ideas why this is arising?

edit:

I am using a python fstring to make the query, like:

query = f"""SELECT * FROM database
                 WHERE column_name LIKE '%{word1}%'
                 AND column_name LIKE '%{word2}%' """
result = pd.read_sql_query(query, sql_engine)

Where the number of word variables changes, but in the above raised example word1 began with 'L', hence the error message

6
  • 1
    What is pd? Could you share more context and stacktrace? Commented Jul 24, 2020 at 20:16
  • Add more code, including what word1 and word2 are formatted to be. Commented Jul 24, 2020 at 20:36
  • 1
    It's hard to believe that the code presented here would cause that error, because there is no %L anywhere. The error is the result of internal details of your MySQL driver, which uses %-formatting internally, after escaping any parameters. Are you using some form of string formatting in your actual code to substitute word1 & word2? If so, there's your problem; you should never pass arguments to queries that way, but use your driver's placeholders instead and pass the arguments to execute separately. Commented Jul 25, 2020 at 7:38
  • @lija I've updated the question, I was using python f strings, not sure why that would make a difference though? Commented Jul 25, 2020 at 11:41
  • @merrydeath pd is pandas Commented Jul 25, 2020 at 11:44

1 Answer 1

1

I am able to confirm that your approach fails with both mysqlclient and PyMySQL. There is definitely something peculiar going on with read_sql_query when the SQL command text contains percent signs.

However, as noted in the comments, using SQL injection like that is a "Bad Idea"™ anyway. Instead, you should be using query parameters like so:

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("mysql+mysqldb://root:[email protected]:3307/mydb")

with engine.begin() as conn:
    conn.execute(sa.text("CREATE TEMPORARY TABLE tmp1 (txt varchar(50))"))
    conn.execute(
        sa.text("INSERT INTO tmp1 (txt) VALUES ('Apple Lemon'), ('Lemon')")
    )

word1 = "Lemon"
word2 = "Apple"

qry = sa.text("SELECT * FROM tmp1 WHERE txt LIKE :p1 AND txt LIKE :p2")
my_params = {"p1": f"%{word1}%", "p2": f"%{word2}%"}
result = pd.read_sql_query(qry, engine, params=my_params)

print(result)
"""console output:
           txt
0  Apple Lemon
"""
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.