Split Pandas DataFrame by Rows in Python

5 Jan 2025 | 5 min read

Pandas is a powerful, open-source Python library for data manipulation and analysis. It provides data structures and functions that are well suited to working with tabular data, such as spreadsheets or SQL tables. Pandas is built on top of the NumPy library, so many of its functions come from or build on NumPy. The library is versatile and easy to use, and data scientists rely on it to work with structured data in Python.

Pandas is typically used in conjunction with other data science libraries. Data prepared with Pandas can be plotted with Matplotlib, analysed statistically with SciPy, and passed to machine learning algorithms in Scikit-learn.

The function of Pandas Library:
Visualization of data.

In code, Pandas is conventionally imported under the alias pd. The alias is not required; it simply saves typing and keeps the code clean.

There are two types of data structures provided by pandas:
Pandas Series:

A Pandas Series is a one-dimensional labelled array that can hold data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A Series is comparable to a single column in an Excel sheet. The labels need not be unique, but they must be of a hashable type.

Let's see how to create a Series in Pandas. A Series can be created from lists, dictionaries, scalar values, etc.

Output:

    Pandas Series: Series([], dtype: float64)
    Pandas Series:
    0    p
    1    a
    2    n
    3    d
    4    a
    dtype: object

Explanation: In the above code, the Pandas library is imported as pd, and the NumPy library is imported as np. A Series is created with the Series() constructor provided by Pandas. A NumPy array of characters is created with the array() function in NumPy, the array is passed to Series(), and the resulting Series is printed.

Pandas DataFrame:

Let's see how to create a DataFrame in Pandas. A DataFrame is like a table in which the values are stored in rows and columns. A DataFrame can be created by loading datasets from SQL databases, CSV files, or Excel files. It can also be created from lists, dictionaries, a list of dictionaries, etc.

Output:

    Empty DataFrame
    Columns: []
    Index: []
            0
    0    Data
    1   Frame
    2      in
    3  Pandas

Explanation: In the above code, the Pandas module is imported and the DataFrame constructor is called to create an empty DataFrame. A list is then created and passed to the DataFrame constructor, and the data is printed.

An import error often occurs when you try to import Pandas. This happens when the Pandas library is not installed or is installed improperly.
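The Series and DataFrame creation steps described above can be sketched as follows. This is a minimal reconstruction based on the outputs shown, not the article's original code: the variable names are illustrative, and the float64 dtype for the empty Series is set explicitly as an assumption, because the default dtype of an empty Series differs between Pandas versions.

```python
import numpy as np
import pandas as pd

# An empty Series. The dtype is passed explicitly (an assumption made for
# this sketch) so the result does not depend on the installed Pandas version.
empty_series = pd.Series(dtype="float64")
print("Pandas Series:", empty_series)

# A Series built from a NumPy array of characters.
letters = np.array(["p", "a", "n", "d", "a"])
series = pd.Series(letters)
print("Pandas Series:")
print(series)

# An empty DataFrame, created by calling the constructor with no data.
empty_df = pd.DataFrame()
print(empty_df)

# A DataFrame built from a plain Python list; each item becomes one row
# in a single column labelled 0.
words = ["Data", "Frame", "in", "Pandas"]
df = pd.DataFrame(words)
print(df)
```

If the import at the top fails with an import error, Pandas is most likely missing from the active environment and needs to be installed first.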
Let's see how to split a Pandas DataFrame. First, a DataFrame is created.

Output:

    Create DataFrame:
       Courses    Fee  Discount Duration
    0    Spark  22000      1000   35days
    1  PySpark  25000      2300   35days
    2   Hadoop  23000      1000   40days
    3   Python  24000      1200   30days
    4   Pandas  26000      2500   25days

Explanation: In the above code, the Pandas module is imported, and a DataFrame is created from a dictionary. This DataFrame is then split with the iloc indexer provided by Pandas.

Split Pandas DataFrame by rows using iloc[]:

The iloc indexer provided by Pandas helps in splitting a DataFrame by rows. It selects rows and columns by integer position.

Splitting DataFrame by Row:

This method returns a specific portion of the DataFrame based on row positions. Let's see how to split the DataFrame.

Output:

    Create DataFrame:
       Courses    Fee  Discount Duration
    0    Spark  22000      1000   35days
    1  PySpark  25000      2300   35days
    2   Hadoop  23000      1000   40days
    3   Python  24000      1200   30days
    4   Pandas  26000      2500   25days
    =========================
       Courses    Fee  Discount Duration
    0    Spark  22000      1000   35days
    1  PySpark  25000      2300   35days
    =========================
       Courses    Fee  Discount Duration
    2   Hadoop  23000      1000   40days
    3   Python  24000      1200   30days
    4   Pandas  26000      2500   25days

Explanation: In the above code, the Pandas module is imported and a DataFrame is created from dictionary data. With the help of iloc, the DataFrame is split by rows: the first two rows (positions 0 and 1) form one part, and the remaining rows form the other.

Split DataFrame by Columns:

A DataFrame can also be split by columns with iloc, by slicing the column positions while keeping all rows. Let's see how to split the DataFrame by columns.
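The two iloc splits discussed in this section, by rows and by columns, can be sketched together as follows. This is a minimal reconstruction that assumes the course data from the outputs and a split position of 2 for both rows and columns; the variable names are illustrative and not taken from the original code.

```python
import pandas as pd

# Course data taken from the output shown in the article.
technologies = {
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
    "Discount": [1000, 2300, 1000, 1200, 2500],
    "Duration": ["35days", "35days", "40days", "30days", "25days"],
}
df = pd.DataFrame(technologies)
print("Create DataFrame:")
print(df)

# Split by rows: positions 0 and 1 go to the first part,
# positions 2 onward go to the second part.
top_rows = df.iloc[:2]
bottom_rows = df.iloc[2:]
print("=========================")
print(top_rows)
print("=========================")
print(bottom_rows)

# Split by columns: the first slice (:) keeps every row,
# the second slice selects column positions.
left_cols = df.iloc[:, :2]
right_cols = df.iloc[:, 2:]
print("=========================")
print(left_cols)
print("=====================")
print(right_cols)
```

Because iloc works on positions rather than labels, the same slices apply even when the DataFrame has a non-default index.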
Output:

    Create DataFrame:
       Courses    Fee  Discount Duration
    0    Spark  22000      1000   35days
    1  PySpark  25000      2300   35days
    2   Hadoop  23000      1000   40days
    3   Python  24000      1200   30days
    4   Pandas  26000      2500   25days
    =========================
       Courses    Fee
    0    Spark  22000
    1  PySpark  25000
    2   Hadoop  23000
    3   Python  24000
    4   Pandas  26000
    =====================
       Discount Duration
    0      1000   35days
    1      2300   35days
    2      1000   40days
    3      1200   30days
    4      2500   25days
    =====================

Explanation: In the above code, the Pandas module is imported, and a DataFrame is created from a dictionary. The data is split into two groups of columns, and every row is kept in both groups.

Conclusion:

Splitting a DataFrame by rows is a common step in data analysis. There are various methods by which Pandas DataFrames can be split by rows in Python; the iloc indexer covered here selects the parts by position.
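As one further illustration of a row split with iloc, a DataFrame can be cut into consecutive chunks of a fixed number of rows. The sketch below is not from the original article: the helper name split_into_chunks and the chunk size of 2 are hypothetical choices made for this example.

```python
import pandas as pd


def split_into_chunks(df, chunk_size):
    """Return a list of DataFrames holding at most chunk_size rows each.

    Hypothetical helper written for this sketch; it is not a Pandas API.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Step through the row positions and slice each chunk with iloc.
    return [
        df.iloc[start:start + chunk_size]
        for start in range(0, len(df), chunk_size)
    ]


df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
})

chunks = split_into_chunks(df, 2)
for chunk in chunks:
    print(chunk)
    print("=========================")
```

With five rows and a chunk size of 2, the last chunk holds the single leftover row, and each chunk keeps the original index labels.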