How to Delete a Column/Row From a DataFrame
Before we start: This Python tutorial is a part of our series of Python Package tutorials. The steps explained ahead are related to the sample project introduced here.
You can use the drop function to delete rows and columns in a Pandas DataFrame. Let’s see how.
First, let’s load in a CSV file called Grades.csv, which includes some columns we don’t need.
The Pandas library provides us with a useful function called drop which we can utilize to get rid of the unwanted columns and/or rows in our data.
import pandas as pd
Report_Card = pd.read_csv("Grades.csv")  # load the grades into a DataFrame
Report_Card.drop("Retake", axis=1, inplace=True)  # remove the Retake column
In the above example, we provided the following arguments to the drop function:
- The name of the column to be dropped (Retake)
- An axis value of 1 to signify we want to delete a column
- An inplace value of True so that the column is deleted from the original DataFrame. Without inplace=True, drop leaves the original untouched and returns a copy of the DataFrame with the Retake column removed, which can be undesirable when working with a relatively large DataFrame.
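To make the difference between the two modes concrete, here is a minimal sketch. It assumes a small in-memory DataFrame as a stand-in for Grades.csv; apart from Retake and Lectures, the column names and all values are hypothetical, since the real file's contents are not shown in this tutorial.

```python
import pandas as pd

# Hypothetical stand-in for Grades.csv (the real file is not shown here).
Report_Card = pd.DataFrame({
    "Name": ["Ann", "Ben", "Cara"],
    "Lectures": ["Math", "German", "Art"],
    "Grade": [90, 72, 85],
    "Retake": [False, True, False],
})

# Without inplace=True, drop returns a new DataFrame; the original keeps Retake.
without_retake = Report_Card.drop("Retake", axis=1)
original_still_has_retake = "Retake" in Report_Card.columns

# columns= is an equivalent, more explicit spelling of axis=1.
same_result = Report_Card.drop(columns=["Retake"])

# With inplace=True, the original is modified and the call returns None.
result = Report_Card.drop("Retake", axis=1, inplace=True)
print(list(Report_Card.columns))
```

Because the in-place call returns None, assigning its result back to Report_Card would discard the DataFrame, which is a common slip.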
We can also use the drop method to drop any row of the DataFrame. Let’s say we want to drop the 6th row. With the default zero-based index, that means passing 5 as the argument, along with an axis value of 0, which indicates that a row is to be dropped:
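The snippet for this step is not reproduced here, so the following is a sketch of what such a call could look like. It assumes the default integer index and uses hypothetical sample data in place of Grades.csv.

```python
import pandas as pd

# Hypothetical sample data with the default integer index 0..7.
Report_Card = pd.DataFrame({
    "Name": [f"Student{i}" for i in range(8)],
    "Grade": [90, 72, 85, 60, 77, 58, 93, 81],
})

# drop matches index labels; with the default index, label 5 is the 6th row.
Report_Card.drop(5, axis=0, inplace=True)
print(Report_Card.index.tolist())  # [0, 1, 2, 3, 4, 6, 7]
```

Note that drop works on index labels rather than positions. If the DataFrame has a non-default index, Report_Card.index[5] translates the 6th position into its label.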
We can also provide an array of indices as an argument. For example, if we wanted to drop all rows that had German as the value in the “Lectures” column, and we know that the 5th, 11th, and 17th rows have German as the lecture value, we could use:
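The original snippet is missing at this point, so this is only a sketch under stated assumptions: the data is hypothetical, the index is the default zero-based one, and the 5th, 11th, and 17th rows therefore carry the labels 4, 10, and 16.

```python
import pandas as pd

# Hypothetical data: 18 rows, with German placed in the 5th, 11th and 17th rows.
lectures = ["Math"] * 18
for position in (4, 10, 16):
    lectures[position] = "German"
Report_Card = pd.DataFrame({"Lectures": lectures, "Grade": list(range(60, 78))})

# Pass a list of index labels to drop several rows in one call.
Report_Card.drop([4, 10, 16], axis=0, inplace=True)
print(len(Report_Card))  # 15
```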
This way of manually writing the indices in an array is not always possible if we don’t know the data structure beforehand. In these cases, it’s better to use a generalized approach. The following code achieves the same functionality as the previous code block that used an array:
Report_Card.drop(Report_Card.index[Report_Card["Lectures"] == "German"], axis=0, inplace=True)
Since drop accepts index labels, we used the index attribute of the Report_Card DataFrame and selected from it with a boolean Series that is True for every row containing German as the lecture value. The result is the set of labels for exactly those rows.
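Breaking the one-liner into steps can make the mechanics easier to follow. The sketch below uses hypothetical sample data, and it also shows boolean filtering as an alternative that this tutorial does not use but that yields the same rows.

```python
import pandas as pd

# Hypothetical sample data standing in for Grades.csv.
Report_Card = pd.DataFrame({
    "Name": ["Ann", "Ben", "Cara", "Dev", "Eli"],
    "Lectures": ["Math", "German", "Art", "German", "Math"],
})

# Step 1: a boolean Series, True where the lecture is German.
is_german = Report_Card["Lectures"] == "German"

# Step 2: the index labels of the matching rows (labels 1 and 3 here).
german_labels = Report_Card.index[is_german]

# Step 3: drop those labels (returning a new DataFrame in this sketch).
dropped = Report_Card.drop(german_labels, axis=0)

# Alternative: keep only the rows where the condition is False.
filtered = Report_Card[~is_german]
print(dropped["Name"].tolist())  # ['Ann', 'Cara', 'Eli']
```

The filtering form is often shorter, while the drop form reads closer to the intent of deleting rows. Which one to prefer is largely a matter of style.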
And finally, if we want to delete our entire DataFrame, we can simply use:
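The snippet itself is not shown here. A likely candidate, assuming the del statement was intended, is sketched below with a hypothetical DataFrame.

```python
import pandas as pd

# Hypothetical DataFrame standing in for the loaded grades.
Report_Card = pd.DataFrame({"Grade": [90, 72, 85]})

# del removes the name Report_Card; the object itself is freed only
# once nothing else refers to it.
del Report_Card

# Referencing the name afterwards raises NameError.
try:
    Report_Card
    still_defined = True
except NameError:
    still_defined = False
print(still_defined)  # False
```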
Python’s garbage collection will then automatically handle the deallocation of our DataFrame, provided nothing else still references it.
Now you know how to delete a row or a column in a DataFrame using Python’s Pandas library.