pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1 FROM
                            (SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
                            FROM Tab2 GROUP BY Col1) AS NewTab
                     JOIN Tab1 ON NewTab.NewCol=Tab1.Id
                     WHERE Tab1.Num=1
                     ORDER BY NewCol1 DESC""", conn)

My goal is to rewrite it using only pandas methods and functions. As a first step, I'd like to create the new column NewCol, which would also contain a new column PostId, but I doubt that this should take two steps. Could anyone please guide me toward a solution or provide full code I could analyze?

1 Answer

Do you want to rewrite this query in pandas as a single line? It can be done, but the result is hard to read. Splitting it into steps is much neater. Start with the grouped subquery:

NewTab = Tab2.groupby('Col1').size().reset_index(name = 'NewCol1').rename(columns = {'Col1': 'NewCol'})

And now you can merge those two tables:

result_df = pd.merge(NewTab, Tab1, left_on = 'NewCol', right_on = 'Id')
result_df = result_df[result_df.Num == 1]

You can now sort the merged data frame in descending order, to match ORDER BY NewCol1 DESC, and select the columns:

result_df = result_df.sort_values(by=['NewCol1'], ascending = False)
result_df = result_df[['Title','NewCol1']]
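Putting the steps together, here is a minimal end-to-end sketch you can run. The contents of Tab1 and Tab2 below are made up purely for illustration, since the question does not show the real tables. Only the column names (Id, Title, Num, Col1) come from the query.

```python
import pandas as pd

# Hypothetical sample data; the real Tab1 and Tab2 are not shown in the question.
Tab1 = pd.DataFrame({
    "Id": [1, 2, 3],
    "Title": ["A", "B", "C"],
    "Num": [1, 1, 2],
})
Tab2 = pd.DataFrame({"Col1": [1, 1, 2, 2, 2, 3, 4]})

# Subquery: SELECT Col1 AS NewCol, COUNT(*) AS NewCol1 FROM Tab2 GROUP BY Col1
NewTab = (
    Tab2.groupby("Col1")
    .size()
    .reset_index(name="NewCol1")
    .rename(columns={"Col1": "NewCol"})
)

# JOIN Tab1 ON NewTab.NewCol = Tab1.Id (pd.merge performs an inner join by default)
result_df = pd.merge(NewTab, Tab1, left_on="NewCol", right_on="Id")

# WHERE Tab1.Num = 1
result_df = result_df[result_df["Num"] == 1]

# ORDER BY NewCol1 DESC, then keep only the selected columns
result_df = (
    result_df.sort_values(by="NewCol1", ascending=False)[["Title", "NewCol1"]]
    .reset_index(drop=True)
)

print(result_df)
```

Reassigning the result of each step, rather than using inplace = True on a filtered frame, avoids pandas' SettingWithCopyWarning and keeps every step easy to inspect.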