1

Introduction:

I built a flask application where each user have their own database, on login their database connection is registered in SQLALCHEMY_BINDS and from here on, any database transaction is handled within their database context as shown below:

with self.database.context(bind=bind):
    results = query.all()

The above solution was possible with the help of this Question/Answer

The first problem:

Everything worked well until the number of users increased and the chances of simultaneous API calls increased.

The problem happens when a relation between 2 models is defined with lazy="select", requesting parent returns the correct parent from the correct database while requesting parent.child through marshmallow schema .dump() returns a different child from a different database (any other database that was requested at the same moment)

Reproducing the problem:

I managed to reproduce this by writing a small script to do concurrent requests from different users on different databases, and the result was that Parents are correct each from their database while children are from one of the 2 databases.

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    taskA = executor.submit(get_parents_from_db_A)
    taskB = executor.submit(get_parents_from_db_B)

Example output:
taskA results are [{"parent_A": "child_A"}, {"parent_A": "child_A"}, {"parent_A": "child_A"}]
taskB results are [{"parent_B": "child_A"}, {"parent_B": "child_A"}, {"parent_B": "child_A"}]

Example output:
taskA results are [{"parent_A": "child_B"}, {"parent_A": "child_B"}, {"parent_A": "child_B"}]
taskB results are [{"parent_B": "child_B"}, {"parent_B": "child_B"}, {"parent_B": "child_B"}]

Desired output:
taskA results are [{"parent_A": "child_A"}, {"parent_A": "child_A"}, {"parent_A": "child_A"}]
taskB results are [{"parent_B": "child_B"}, {"parent_B": "child_B"}, {"parent_B": "child_B"}]

Tested solutions:

  • calling marshmelloaw schem .dump() within context, failed.
  • changing relations definitions to lazy="subquery" also failed.
  • changing relations definitions to lazy="joined" worked.

I am sure you understand the reason I am using lazy="select" which is to request data only when needed.

The new problem:

However since lazy="joined" worked and I can accept it for now, I am facing a different issue that is:

RecursionError: maximum recursion depth exceeded in comparison

Below is an example of how models and schema are defined in my application.

class Parent(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True, nullable=False)


class Child(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True, nullable=False)

    # FKs
    parent_id = Column(Integer, ForeignKey('parent.guide'), nullable=True)
  
    # Relations
    parent = relationship("Parent", backref='child', foreign_keys=parent_id, lazy="joined")


class ChildSchema(SQLAlchemyAutoSchema):
    parent = fields.Nested(parent_schema)

    class Meta:
        model = Child
        load_instance = True
        include_relationships = False
        include_fk = False

Question:

Is it okay to keep using joined relation and try to review my FKs relations to avoid the recursion exception? Or is there anything else that I can try to keep using select and accuratetly accessing data.

I tried to give as much details as possible, please let me know if you require any clarifications.

1
  • How is get_parents_from_db_A implemented? Commented Jul 24, 2023 at 12:01

1 Answer 1

0

A thread-safe implementation of context_bind_key using thread-local data:

class MySQLAlchemy(SQLAlchemy):
    # context_bind_key = None
    _context = threading.local()

    @property
    def context_bind_key(self):
        return getattr(self._context, 'bind_key', None)

    @context_bind_key.setter
    def context_bind_key(self, value):
        self._context.bind_key = value

    ...
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you for your answer, I solved it by updating the get_bind from the original answer where info['bind_key'] = self.db.context_bind_key is making the bind shared with concurrent requests, I replaced it with info['bind_key'] = request.bind where I assign the bind to each request manually. I suppose this is close to the suggested solution, I will give a try soon and reply with results.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.