Python SQL to pandas DataFrame 2

Question

pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1
                     FROM (SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
                           FROM Tab2
                           GROUP BY Col1) AS NewTab
                     JOIN Tab1 ON NewTab.NewCol = Tab1.Id
                     WHERE Tab1.Num = 1
                     ORDER BY NewCol1 DESC""", conn)

My goal is to rewrite it using only pandas methods and functions. First, I'd like to create a new column NewCol that would also contain a new column PostId, but I doubt that I should do it in two steps. Could anyone please guide me towards a solution or provide full code that I could analyze?

treskov · Accepted Answer · 2019-12-08 22:11:17Z

Do you want to rewrite this query in pandas as a single line? It can be done, but the result is hard to read. Splitting it into steps is much neater. Start with the subquery:

NewTab = Tab2.groupby('Col1').size().reset_index(name = 'NewCol1').rename(columns = {'Col1': 'NewCol'})
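To see what this step produces, here is a minimal sketch on made-up data. The contents of Tab2 are an assumption for illustration; only the column name Col1 comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name Col1 comes from the question.
Tab2 = pd.DataFrame({"Col1": [10, 10, 20, 10, 30, 20]})

# size() counts the rows in each group, like COUNT(*) ... GROUP BY Col1.
# reset_index(name=...) turns the resulting Series back into a DataFrame,
# and rename() plays the role of "Col1 AS NewCol".
NewTab = (
    Tab2.groupby("Col1")
    .size()
    .reset_index(name="NewCol1")
    .rename(columns={"Col1": "NewCol"})
)
print(NewTab)
```

With this sample, NewTab has one row per distinct Col1 value and the row count for each in NewCol1.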

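Now you can merge the two tables and keep only the rows where Num is 1. The filter has to be a separate step, because result_df does not exist until the merge has been assigned:

result_df = pd.merge(NewTab, Tab1, left_on='NewCol', right_on='Id')
result_df = result_df[result_df.Num == 1]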

Finally, sort the data frame in descending order to match ORDER BY NewCol1 DESC, and select the columns:

result_df = result_df.sort_values(by='NewCol1', ascending=False)
result_df = result_df[['Title', 'NewCol1']]
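One way to check a rewrite like this is to run both versions on the same data and compare the results. The sketch below does that with an in-memory SQLite database. The sample rows and the helper name pandas_version are assumptions made for illustration; the table names, column names, and the query itself come from the question.

```python
import sqlite3

import pandas as pd

# Hypothetical sample tables; only the column names come from the question.
Tab1 = pd.DataFrame({
    "Id": [10, 20, 30],
    "Title": ["first", "second", "third"],
    "Num": [1, 1, 2],
})
Tab2 = pd.DataFrame({"Col1": [10, 10, 20, 10, 30, 20]})


def pandas_version(Tab1, Tab2):
    """Hypothetical helper that chains the steps of the answer."""
    # Subquery: SELECT Col1 AS NewCol, COUNT(*) AS NewCol1 ... GROUP BY Col1
    NewTab = (
        Tab2.groupby("Col1")
        .size()
        .reset_index(name="NewCol1")
        .rename(columns={"Col1": "NewCol"})
    )
    # JOIN ... ON NewTab.NewCol = Tab1.Id (pd.merge defaults to an inner join)
    merged = pd.merge(NewTab, Tab1, left_on="NewCol", right_on="Id")
    # WHERE Tab1.Num = 1
    merged = merged[merged["Num"] == 1]
    # ORDER BY NewCol1 DESC, then SELECT Title, NewCol1
    merged = merged.sort_values(by="NewCol1", ascending=False)
    return merged[["Title", "NewCol1"]].reset_index(drop=True)


# Load the same data into SQLite and run the original query against it.
conn = sqlite3.connect(":memory:")
Tab1.to_sql("Tab1", conn, index=False)
Tab2.to_sql("Tab2", conn, index=False)
sql_result = pd.read_sql_query(
    """SELECT Tab1.Title, NewTab.NewCol1
       FROM (SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
             FROM Tab2 GROUP BY Col1) AS NewTab
       JOIN Tab1 ON NewTab.NewCol = Tab1.Id
       WHERE Tab1.Num = 1
       ORDER BY NewCol1 DESC""",
    conn,
)
conn.close()

pandas_result = pandas_version(Tab1, Tab2)
# Compare row by row; the sample has no tied counts, so row order is well defined.
print(pandas_result.values.tolist() == sql_result.values.tolist())
```

If the real data contains ties in NewCol1, the two versions may order the tied rows differently, so a comparison like this would need a secondary sort key.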
