Consider a declarative SQLAlchemy model with an indexed String field:

class User(Base):
    name = Column(String(100), index=True, nullable=False)

The name field is case-sensitive: the original case must be preserved, but efficient case-insensitive queries that use the index should also be supported.

What's the best way to achieve this and implement it in SQLAlchemy?

Queries can use lower() if needed

session.query(User).filter_by(name=lower('SOME_name'))

but it doesn't matter too much, as long as the solution is elegant and performant.

Queries using ILIKE and Postgres-level lower() are unacceptable due to performance requirements; they have been tested and are not fast enough on large tables for my use case.

1 Answer

Create a functional index that indexes the expression LOWER(name):

Index('idx_user_name_lower', func.lower(User.name))

With the index in place queries such as

session.query(User).filter(func.lower(User.name) == 'SOME_name'.lower())

may perform better, if LOWER(name) has high cardinality.
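As a rough end-to-end sketch of the index and query above, the following runs against an in-memory SQLite database. The `users` table name, the `id` primary key, and the SQLite engine are assumptions added only to make the example self-contained; on Postgres the same `Index` definition should produce an index on `lower(name)`.

```python
from sqlalchemy import Column, Index, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # assumed table name, not given in the question

    id = Column(Integer, primary_key=True)  # assumed primary key
    name = Column(String(100), index=True, nullable=False)


# Functional index on LOWER(name). It is associated with the users table
# through the column it references, so create_all() emits it as well.
idx = Index("idx_user_name_lower", func.lower(User.name))

# In-memory SQLite is used here purely for illustration.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([User(name="Some_Name"), User(name="other")])
    session.commit()

    # Lowercase the search term in Python and compare against LOWER(name),
    # which is the expression the functional index covers.
    match = (
        session.query(User)
        .filter(func.lower(User.name) == "SOME_name".lower())
        .one()
    )
    found_name = match.name  # the stored value keeps its original case
```

Whether the planner actually picks the index depends on the database and its statistics, so it is worth confirming with EXPLAIN on the real table.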

You could then encapsulate handling the lowercasing in a custom comparator:

from sqlalchemy import func
from sqlalchemy.ext.hybrid import Comparator, hybrid_property

# Verbatim from the documentation
class CaseInsensitiveComparator(Comparator):
    def __eq__(self, other):
        return func.lower(self.__clause_element__()) == func.lower(other)

class User(Base):
    ...
    @hybrid_property
    def name_insensitive(self):
        return self.name.lower()

    @name_insensitive.comparator
    def name_insensitive(cls):
        return CaseInsensitiveComparator(cls.name)

The comparator will apply func.lower() to both sides behind the scenes:

session.query(User).filter_by(name_insensitive='SOME_name')

is equivalent to

session.query(User).filter(func.lower(User.name) == func.lower('SOME_name'))
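
For completeness, here is a self-contained sketch of the comparator approach that can be run as is. As before, the `users` table name, the `id` column, and the in-memory SQLite engine are assumptions for illustration only; the comparator and hybrid property follow the snippet above.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.ext.hybrid import Comparator, hybrid_property
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class CaseInsensitiveComparator(Comparator):
    def __eq__(self, other):
        # Apply LOWER() to both the column and the compared value.
        return func.lower(self.__clause_element__()) == func.lower(other)


class User(Base):
    __tablename__ = "users"  # assumed table name

    id = Column(Integer, primary_key=True)  # assumed primary key
    name = Column(String(100), nullable=False)

    @hybrid_property
    def name_insensitive(self):
        # Instance level: plain Python lowercasing of the loaded value.
        return self.name.lower()

    @name_insensitive.comparator
    def name_insensitive(cls):
        # Class level: comparisons are routed through the comparator.
        return CaseInsensitiveComparator(cls.name)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="Some_Name"))
    session.commit()

    user = session.query(User).filter_by(name_insensitive="SOME_name").one()
    stored_name = user.name
    lowered_name = user.name_insensitive
```

The stored value keeps its original case, while the lookup ignores case. Note that this variant lowercases the search term in SQL rather than in Python; either way the column side is `lower(name)`, which is what the functional index covers.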

3 Comments

Thanks for the thorough reply! When you say "may perform better", can I expect performance that is on par with regular indexed querying?
Yes, it should perform as well, given that the statistics and the query are such that the index is used.
For the record, this has been supported since SQLAlchemy 0.8.0.
