I have a DataFrame with thousands of rows in a Jupyter Notebook, using pandas. I am trying to split this df into multiple dfs based on the values in a particular column. If there is a way to do this without explicitly listing the different values in that column, that would be great.

Col1 Col2 Col3 Col4
dat1 Val1 etc1 set1
dat2 Val2 etc2 set2
dat3 Val3 etc3 set2
dat4 Val4 etc4 set3

An example of one of the variations of my code:

NAM_df2 = NAM_df1.loc[NAM_df1["Col4"] == 'set2']

2 Answers

3

Try this:

dfs = [d for _, d in df.groupby('Col4')]
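
Note that the result is a plain list of DataFrames, so DataFrame methods such as `.head()` apply to an individual element (for example `dfs[0].head()`), not to the list itself. If it is more convenient to look pieces up by their `Col4` value, a dict can be built the same way. A minimal sketch, assuming the sample data from the question; the names `df` and `by_value` are only illustrative:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["dat1", "dat2", "dat3", "dat4"],
    "Col2": ["Val1", "Val2", "Val3", "Val4"],
    "Col3": ["etc1", "etc2", "etc3", "etc4"],
    "Col4": ["set1", "set2", "set2", "set3"],
})

# Iterating a groupby yields (key, sub-DataFrame) pairs,
# one pair per distinct value in Col4
dfs = [d for _, d in df.groupby("Col4")]

# Same split, but keyed by the Col4 value (illustrative name)
by_value = {key: d for key, d in df.groupby("Col4")}

print(len(dfs))                 # number of pieces
print(by_value["set2"].head())  # rows where Col4 == 'set2'
```

With a dict, each piece can be reached by name instead of by position in the list.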

4 Comments

Does this convert my single df into a list? I got an error message when I tried to view the result using dfs.head(): "list object has no attribute head".
Yes, it does. Is that not what you wanted? I'm sorry, it wasn't perfectly obvious what you wanted. Please clarify :)
My apologies. Asking the question the right way sometimes seems more complicated than piecing together an answer. The goal is to take a single df and split it into multiple dfs based on the value in a particular column. In the example above, I would want to take the one df and make three dfs from it, since there are three different values in Col4 to group by.
That's what this code does, actually. It splits the dataframe by all the unique values in Col4, so if Col4 has the values 1 2 2 2 4 5 5, dfs will contain 4 dataframes, each holding all of the rows with a particular value (the first will contain the one row where Col4 is 1, the second will contain the three rows where Col4 is 2, etc.).
0

What about:

df1s = [df.loc[df['Col1']==x, :] for x in df['Col1'].unique()]
df4s = [df.loc[df['Col4']==x, :] for x in df['Col4'].unique()]
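
For comparison, here is a small sketch of this approach on the question's sample data, collected into a dict so each piece can be looked up by its value (the name `pieces` is only illustrative). Two differences from `groupby` may be worth keeping in mind: `unique()` returns values in order of first appearance, whereas `groupby` sorts the keys by default, and a NaN in the column would give an empty piece here (NaN never compares equal to itself), while `groupby` skips NaN keys by default.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["dat1", "dat2", "dat3", "dat4"],
    "Col2": ["Val1", "Val2", "Val3", "Val4"],
    "Col3": ["etc1", "etc2", "etc3", "etc4"],
    "Col4": ["set1", "set2", "set2", "set3"],
})

# One boolean-mask selection per distinct value in Col4
pieces = {x: df.loc[df["Col4"] == x] for x in df["Col4"].unique()}

# Show each value and how many rows ended up in its piece
for value, piece in pieces.items():
    print(value, len(piece))
```

This scans the column once per distinct value, so on large frames with many distinct values a single `groupby` pass is likely to be cheaper.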
