Split dataframe in Pandas based on values in multiple columns
Last Updated: 31 Jul, 2023
In this article, we will see how to split a dataframe into two or more separate dataframes based on the values in its columns, using several methods in Python. We first create a dataframe to work with.
Creating a DataFrame for demonstration
Python3
# importing pandas as pd
import pandas as pd

# dictionary of lists (named `data` so that the built-in `dict` is not shadowed)
data = {'First_Name': ["Aparna", "Pankaj", "Sudhir",
                       "Geeku", "Anuj", "Aman",
                       "Madhav", "Raj", "Shruti"],
        'Last_Name': ["Pandey", "Gupta", "Mishra",
                      "Chopra", "Mishra", "Verma",
                      "Sen", "Roy", "Agarwal"],
        'Email_ID': ["[email protected]", "[email protected]",
                     "[email protected]", "[email protected]",
                     "[email protected]", "[email protected]",
                     "[email protected]", "[email protected]",
                     "[email protected]"],
        'Degree': ["MBA", "BCA", "M.Tech", "MBA", "B.Sc",
                   "B.Tech", "B.Tech", "MBA", "M.Tech"],
        'Score': [90, 40, 75, 98, 94, 90, 80, 90, 95]}

# creating dataframe
df = pd.DataFrame(data)
print(df)
Output:

Split dataframe based on values by Boolean Indexing
We can create multiple dataframes from a given dataframe with Boolean indexing: a condition on a column, written inside [], keeps only the rows for which that condition is True.
Example 1: Creating a dataframe for the students with Score >= 80
Python3
# creating a new dataframe by applying the required
# conditions in []
df1 = df[df['Score'] >= 80]
print(df1)
Output:

Example 2: Creating a dataframe for the students with Last_Name as Mishra
Python3
# Creating on the basis of Last_Name
dfname = df[df['Last_Name'] == 'Mishra']
print(dfname)
Output:

We can do the same for other columns by writing the appropriate condition.
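Conditions on several columns can also be combined in a single Boolean index. The following is a minimal sketch, assuming a reduced copy of the article's data (the Email_ID column is left out for brevity); the names `mba_high` and `mishra_or_low` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Reduced copy of the article's data (Email_ID omitted for brevity)
df = pd.DataFrame({
    'First_Name': ["Aparna", "Pankaj", "Sudhir", "Geeku", "Anuj",
                   "Aman", "Madhav", "Raj", "Shruti"],
    'Last_Name': ["Pandey", "Gupta", "Mishra", "Chopra", "Mishra",
                  "Verma", "Sen", "Roy", "Agarwal"],
    'Degree': ["MBA", "BCA", "M.Tech", "MBA", "B.Sc",
               "B.Tech", "B.Tech", "MBA", "M.Tech"],
    'Score': [90, 40, 75, 98, 94, 90, 80, 90, 95],
})

# AND: students whose Degree is MBA and whose Score is at least 90
mba_high = df[(df['Degree'] == 'MBA') & (df['Score'] >= 90)]
print(mba_high)

# OR: students whose Last_Name is Mishra or whose Score is below 50
mishra_or_low = df[(df['Last_Name'] == 'Mishra') | (df['Score'] < 50)]
print(mishra_or_low)
```

Each condition is wrapped in parentheses because `&` and `|` bind more tightly than comparison operators such as `==` and `>=` in Python.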
Split dataframe based on values by Boolean Indexing with a mask variable
Instead of writing the condition inline as in the previous method, we can store it in a mask variable (a Boolean Series) and reuse it.
Example 1: To get dataframe of students with Degree as MBA
Python3
# creating the mask variable with the appropriate condition
mask_var = df['Degree'] == 'MBA'
# creating a dataframe
df1_mask = df[mask_var]
print(df1_mask)
Output :

Example 2: To get a dataframe for the rest of the students
To get the rest of the rows as a dataframe, we can simply invert the mask variable by placing a ~ (tilde) before it.
Python3
# creating dataframe with inverted mask variable
df2_mask = df[~mask_var]
print(df2_mask)
Output :

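Because a mask and its inverse are complementary, the two resulting dataframes together contain every row of the original exactly once. The following minimal sketch checks this, assuming a reduced copy of the article's data; the names `df_mba` and `df_rest` are hypothetical.

```python
import pandas as pd

# Reduced copy of the article's data (only the columns needed here)
df = pd.DataFrame({
    'First_Name': ["Aparna", "Pankaj", "Sudhir", "Geeku", "Anuj",
                   "Aman", "Madhav", "Raj", "Shruti"],
    'Degree': ["MBA", "BCA", "M.Tech", "MBA", "B.Sc",
               "B.Tech", "B.Tech", "MBA", "M.Tech"],
})

# One Boolean Series, reused for both halves of the split
mask_var = df['Degree'] == 'MBA'

df_mba = df[mask_var]     # rows where the mask is True
df_rest = df[~mask_var]   # rows where the mask is False

# The two parts do not overlap and together cover the whole dataframe
print(len(df_mba), len(df_rest), len(df))
```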
Split dataframe based on values using the groupby() function
Using groupby() we can group the rows by the values in a specific column and then extract any single group as a separate dataframe with get_group().
Example 1: Group all Students according to their Degree and display as required
Python3
# Creating an object using groupby
grouped = df.groupby('Degree')
# the return type of the object 'grouped' is
# pandas.core.groupby.generic.DataFrameGroupBy.
# Creating a dataframe from the object using get_group().
# dataframe of students with Degree as MBA.
df_grouped = grouped.get_group('MBA')
print(df_grouped)
Output: dataframe of students with Degree as MBA

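When every group is needed rather than a single one, the groupby object can be iterated to collect all the pieces at once. This is a minimal sketch, assuming a reduced copy of the article's data; the dictionary name `parts` is hypothetical.

```python
import pandas as pd

# Reduced copy of the article's data (only the columns needed here)
df = pd.DataFrame({
    'First_Name': ["Aparna", "Pankaj", "Sudhir", "Geeku", "Anuj",
                   "Aman", "Madhav", "Raj", "Shruti"],
    'Degree': ["MBA", "BCA", "M.Tech", "MBA", "B.Sc",
               "B.Tech", "B.Tech", "MBA", "M.Tech"],
})

# Iterating over a groupby object yields (key, sub-dataframe) pairs,
# so a dict comprehension gives one dataframe per Degree value
parts = {degree: group for degree, group in df.groupby('Degree')}

for degree, group in parts.items():
    print(degree)
    print(group)
```

Each value in `parts` is an ordinary dataframe, so `parts['MBA']` holds the same rows as `grouped.get_group('MBA')`.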
Example 2: Group all Students according to their Score and display as required
Python3
# Creating another object using groupby
grouped2 = df.groupby('Score')
# the return type of the object 'grouped2' is
# pandas.core.groupby.generic.DataFrameGroupBy.
# Creating a dataframe from the object using get_group():
# dataframe of students with Score = 90.
df_grouped2 = grouped2.get_group(90)
print(df_grouped2)
Output: dataframe of students with Score = 90.
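groupby() also accepts a list of column names, in which case each group is identified by a tuple of values, one per column. The following minimal sketch, assuming a reduced copy of the article's data, selects the students with Degree as MBA and Score = 90; the names `grouped_multi` and `df_mba_90` are hypothetical.

```python
import pandas as pd

# Reduced copy of the article's data (only the columns needed here)
df = pd.DataFrame({
    'First_Name': ["Aparna", "Pankaj", "Sudhir", "Geeku", "Anuj",
                   "Aman", "Madhav", "Raj", "Shruti"],
    'Degree': ["MBA", "BCA", "M.Tech", "MBA", "B.Sc",
               "B.Tech", "B.Tech", "MBA", "M.Tech"],
    'Score': [90, 40, 75, 98, 94, 90, 80, 90, 95],
})

# Grouping on two columns: each group key is a (Degree, Score) tuple
grouped_multi = df.groupby(['Degree', 'Score'])

# Dataframe of students with Degree as MBA and Score = 90
df_mba_90 = grouped_multi.get_group(('MBA', 90))
print(df_mba_90)
```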