Split a single data frame into multiple data frames based on a column's value in pandas

Question

I have a data frame with thousands of rows in a Jupyter Notebook using pandas. I am trying to split this df into multiple dfs based on the values in a particular column. Ideally, I would like to do this without explicitly listing the different values in that column.

Col1	Col2	Col3	Col4
dat1	Val1	etc1	set1
dat2	Val2	etc2	set2
dat3	Val3	etc3	set2
dat4	Val4	etc4	set3

An example of one of the variations of my code:

NAM_df2 = NAM_df1.loc[NAM_df1["Col4"] == 'set2']

user17242583 · Accepted Answer · 2022-02-09 00:21:49Z

3

Try this:

dfs = [d for _, d in df.groupby('Col4')]
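
As a minimal sketch of what this one-liner produces, here it is applied to the sample rows from the question. The frame is rebuilt by hand for illustration, and the names first and sizes are hypothetical helpers, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["dat1", "dat2", "dat3", "dat4"],
    "Col2": ["Val1", "Val2", "Val3", "Val4"],
    "Col3": ["etc1", "etc2", "etc3", "etc4"],
    "Col4": ["set1", "set2", "set2", "set3"],
})

# One DataFrame per distinct Col4 value, in sorted key order
dfs = [d for _, d in df.groupby("Col4")]

# dfs is a plain Python list, so index into it before calling DataFrame methods
first = dfs[0]                  # rows where Col4 == "set1"
sizes = [len(d) for d in dfs]   # number of rows in each piece
```

Each element of dfs is an ordinary DataFrame, so dfs[0].head() works even though the list itself has no head method.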



4 Comments

Joshua3535Econ Over a year ago

Does this convert my single df into a list? I got an error message when I tried to view the dfs using dfs.head(): "'list' object has no attribute 'head'".

user17242583 Over a year ago

Yes, it does. Is that not what you wanted? I'm sorry, it wasn't perfectly obvious what you wanted. Please clarify :)

Joshua3535Econ Over a year ago

My apologies. Asking the question the right way sometimes seems more complicated than piecing together an answer. The goal is to take a single df and split it into multiple dfs based on the value in a particular column. In the example above, I would want to take the one df and make three dfs from it, since there are three different values in 'Col4' to group by.

user17242583 Over a year ago

That's what this code does, actually. It splits the dataframe by the unique values in Col4, so if Col4 has the values 1 2 2 2 4 5 5, dfs will contain 4 dataframes, each holding all of the rows with one particular value: the first will contain the single row where Col4 is 1, the second will contain the three rows where Col4 is 2, and so on.
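
If looking pieces up by their Col4 value is more convenient than indexing a list by position, one possible variation (not from the answer itself) is to collect the groups into a dict. A rough sketch, assuming the sample data from the question; by_set and set2_df are hypothetical names:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["dat1", "dat2", "dat3", "dat4"],
    "Col2": ["Val1", "Val2", "Val3", "Val4"],
    "Col3": ["etc1", "etc2", "etc3", "etc4"],
    "Col4": ["set1", "set2", "set2", "set3"],
})

# Map each distinct Col4 value to the DataFrame of rows that carry it
by_set = {key: group for key, group in df.groupby("Col4")}

# Retrieve one piece by its value instead of by list position
set2_df = by_set["set2"]
```

This still avoids listing the values by hand: the keys come from whatever appears in the column.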

Roppon Picha · Accepted Answer · 2022-02-09 00:59:37Z

0

What about:

df1s = [df.loc[df['Col1']==x, :] for x in df['Col1'].unique()]
df4s = [df.loc[df['Col4']==x, :] for x in df['Col4'].unique()]
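
One practical difference between this approach and groupby is the order of the resulting pieces. The sketch below uses a small made-up frame (not the question's data) whose Col4 values are deliberately out of alphabetical order; all variable names are illustrative.

```python
import pandas as pd

# Hypothetical data with Col4 values out of sorted order
df = pd.DataFrame({
    "Col1": ["dat1", "dat2", "dat3", "dat4"],
    "Col4": ["set3", "set1", "set2", "set1"],
})

# unique() keeps values in order of first appearance
by_unique = [df.loc[df["Col4"] == x, :] for x in df["Col4"].unique()]

# groupby sorts the group keys by default
by_groupby = [d for _, d in df.groupby("Col4")]

# sort=False makes groupby follow first-appearance order as well
by_groupby_unsorted = [d for _, d in df.groupby("Col4", sort=False)]

# Record which Col4 value each piece holds, in list order
order_unique = [d["Col4"].iloc[0] for d in by_unique]
order_groupby = [d["Col4"].iloc[0] for d in by_groupby]
order_unsorted = [d["Col4"].iloc[0] for d in by_groupby_unsorted]
```

Both approaches produce the same pieces here; only their position in the list differs.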

