2

I want to generate a dataframe with pandas read_sql from my sqlalchemy query with a PostgreSQL's jsonb attribute to columns.

Actually this will give me my answer:

query = session.query(
    cls.id,
    cls._my_jsonb_column
).all()
pd.DataFrame.from_dict([dict(id=id_, **i) for id_,i in query])

But I would prefer to unpack the jsonb with PostgreSQL instead of in the application.

My attempt gives

query = session.query(
    cls.id,
    func.jsonb_to_record(cls._my_jsonb_column)
)
pd.read_sql(query.statement, query.session.bind)

(psycopg2.NotSupportedError) function returning record called in context that cannot accept type record

1 Answer 1

0

json_to_record (and jsonb_to_recordset) returns record as if it was the result of a SELECT query. In sqlalchemy context, it provides a selectable which can be used like a table.

So you should consider the result of func.jsonb_to_record(cls._my_jsonb_column) as a sort of table you can join to you your original table.

That is you query should look like:

jsonb_data = func.jsonb_to_record(cls._my_jsonb_column)
query = session.query(
    select(
        [cls.id, <other columns>]
    ).select_from(
        cls.join(jsonb_data, <on_clause>)
    )
)

You could even flatten your JSON data using JSON processing functions but it is not possible to be more precise without knowing the structure of the JSON data.

Alternatively, I recently released a package to easily flatten JSONB fields from a description of the json data, I would be happy to have some feedbacks: pg_jsonb_flattener

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.