How to delete a column in Pandas

Understanding DataFrames in Pandas

Before we dive into the process of deleting columns, it's important to understand what we're working with. In Pandas, a DataFrame is like a table that you might create in Excel. It has rows and columns, where each row represents an entry and each column represents a particular kind of information. For example, if you had a DataFrame of pet information, rows might represent individual pets, and columns could include details like 'Name', 'Species', 'Age', etc.

Setting Up Your Environment

To start manipulating DataFrames, you'll need to have Python and Pandas installed. If you haven't done this yet, you can install Pandas using pip, Python's package installer. Simply type the following command into your terminal or command prompt:

pip install pandas

Once installed, you'll need to import the Pandas library into your Python script or notebook:

import pandas as pd

Creating a Sample DataFrame

Let's create a simple DataFrame to work with. This will help us visualize what happens when we delete a column.

import pandas as pd

# Create a simple DataFrame
data = {
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1]
}

df = pd.DataFrame(data)
print(df)

If you run this code, you'll see a table printed out with 'Name', 'Species', and 'Age' as columns.

Deleting a Column Using drop

When you want to remove a column, the drop method is your go-to tool in Pandas. It's like telling a librarian to remove a book from a shelf – you're asking Pandas to take out a whole column from your DataFrame.

Here's how you can use the drop method:

# Drop the 'Age' column
df = df.drop('Age', axis=1)
print(df)

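The axis=1 part is what tells Pandas that 'Age' is a column label rather than a row label (axis=0, the default, refers to rows). If you leave it out, Pandas looks for a row labeled 'Age' and raises a KeyError. You can also write df.drop(columns='Age'), which does the same thing and is easier to read.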
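As a minimal sketch of this point, the snippet below rebuilds the sample pet table and drops 'Age' in two ways: with axis=1 and with the columns keyword. The names with_axis and with_columns are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# Both calls return a new DataFrame without the 'Age' column
with_axis = df.drop('Age', axis=1)
with_columns = df.drop(columns='Age')

# The two results are identical
print(with_axis.equals(with_columns))

# The original df is untouched because the result was not assigned back to it
print(list(df.columns))
```

Because neither call assigns back to df, the original table still has all three columns afterwards.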

Understanding inplace Parameter

The inplace parameter is a bit like a decision to either borrow or permanently give away a book. If you set inplace=True, it means you are making changes directly to the original DataFrame – like giving the book away for good. If you leave it as inplace=False or don't include it at all, you're just borrowing the book; the original DataFrame stays unchanged, and you get a modified copy instead.

Here is an example of using inplace:

# This will modify the original DataFrame without returning a new one
df.drop('Species', axis=1, inplace=True)

Now, if you print df, you'll notice the 'Species' column is gone for good.
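To see the difference side by side, here is a small sketch on a fresh copy of the sample table. The variable names (copy_without_species, columns_before, result) are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# Without inplace, drop returns a modified copy and leaves df alone
copy_without_species = df.drop('Species', axis=1)
columns_before = list(df.columns)
print(columns_before)  # 'Species' is still in df

# With inplace=True, df itself changes and the call returns None
result = df.drop('Species', axis=1, inplace=True)
print(result)
print(list(df.columns))  # 'Species' is gone from df
```

One practical consequence: writing df = df.drop(..., inplace=True) would replace df with None, so use either assignment or inplace=True, not both.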

Deleting Multiple Columns

What if you want to remove more than one column? That's like asking to remove several books from the shelf at once. You can pass a list of column names to the drop method:

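# Rebuild the sample DataFrame, since the earlier examples already removed 'Species'
df = pd.DataFrame(data)

# Drop both 'Name' and 'Species' columns
df = df.drop(['Name', 'Species'], axis=1)
print(df)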
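If one of the names in the list does not exist, drop raises a KeyError by default. The sketch below shows that behavior and the errors='ignore' option, which skips missing labels. 'Weight' is a hypothetical column name used only to trigger the error.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# Dropping two existing columns at once
remaining = df.drop(['Name', 'Species'], axis=1)
print(list(remaining.columns))

# 'Weight' does not exist, so the default behavior is a KeyError
raised_key_error = False
try:
    df.drop(['Age', 'Weight'], axis=1)
except KeyError:
    raised_key_error = True
print(raised_key_error)

# errors='ignore' drops the labels that exist and skips the rest
safe = df.drop(['Age', 'Weight'], axis=1, errors='ignore')
print(list(safe.columns))
```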

Using del to Delete Columns

Another way to remove a column is by using the del statement. It's more like taking a book off the shelf and throwing it into a recycling bin – quick and direct.

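# Rebuild the sample DataFrame, since the earlier examples already removed 'Age'
df = pd.DataFrame(data)

# Delete the 'Age' column
del df['Age']
print(df)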

Notice that when you use del, there's no need to specify axis or inplace; the deletion is immediate and permanent.
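Here is a short sketch of what "immediate and permanent" means in practice. The name same_table is illustrative: it is a second name for the same DataFrame, not a copy, so it sees the deletion too.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# A second name pointing at the same DataFrame object (not a copy)
same_table = df

# del changes the DataFrame in place
del df['Age']
print(list(same_table.columns))  # 'Age' is gone here as well

# Deleting a column that no longer exists raises a KeyError
already_deleted = False
try:
    del df['Age']
except KeyError:
    already_deleted = True
print(already_deleted)
```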

Selecting Columns to Keep Instead of Delete

Sometimes it's easier to specify what you want to keep, rather than what you want to remove. Imagine you're packing a backpack for a hike; instead of thinking about what not to bring, you focus on the essentials to pack.

You can do this by selecting only the columns you want to keep:

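# Rebuild the sample DataFrame, since the earlier examples already removed these columns
df = pd.DataFrame(data)

# Select only the 'Name' and 'Species' columns to keep
df = df[['Name', 'Species']]
print(df)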
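A small sketch of the same idea, this time storing the result under a new name so the original stays available. The names columns_to_keep and kept are illustrative. Note that the result follows the order of the list you pass, which also makes this a handy way to reorder columns.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# List the columns to keep, in the order you want them
columns_to_keep = ['Species', 'Name']
kept = df[columns_to_keep]

print(list(kept.columns))  # same order as columns_to_keep
print(list(df.columns))    # the original df still has every column
```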

Using pop to Delete and Use a Column

The pop method is a bit like popping a balloon with a message inside it. When you pop a column from a DataFrame, you not only remove it but also get its contents returned. This can be useful if you want to use the data in that column for something else.

# Rebuild the sample DataFrame so that 'Species' is present again
df = pd.DataFrame(data)

# Pop the 'Species' column
popped_column = df.pop('Species')
print(df)  # 'Species' column is gone
print(popped_column)  # You have the contents of 'Species' column
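As one possible use of the returned data, the sketch below pops 'Age' and computes an average from it. The variable names ages and average_age are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# pop removes the column from df and hands it back as a Series
ages = df.pop('Age')
print(type(ages).__name__)  # Series
print(list(df.columns))     # 'Age' is no longer in df

# The popped Series can be used on its own
average_age = ages.mean()
print(average_age)  # 2.0
```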

Intuition and Analogies

Remember, deleting a column is like removing an entire chapter from a book. You're not just erasing a line; you're taking out all the information related to that topic. When you're working with data, think about whether that information is something you can afford to lose or if you might need it later.
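If you think you might need the data later, one simple safeguard is to keep a copy before deleting anything. This is only a sketch of that habit, not the only way to do it; the name backup is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Charlie', 'Lucy', 'Cooper'],
    'Species': ['Dog', 'Cat', 'Dog'],
    'Age': [2, 3, 1],
})

# Keep an independent copy before removing anything
backup = df.copy()

# Remove the column from the working DataFrame
df.drop('Age', axis=1, inplace=True)
print(list(df.columns))

# The column is still available in the backup and can be restored
df['Age'] = backup['Age']
print(list(df.columns))
```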

Conclusion

You've now seen several ways to declutter a DataFrame by removing unwanted columns: the drop method, the del statement, the pop method, and simply selecting the columns you wish to keep. Each is a different tool in your coding toolbox. drop returns a modified copy unless you pass inplace=True, del and pop change the DataFrame directly, and selection builds a new DataFrame from the columns you name. As you become more comfortable with these commands, managing data in Pandas becomes as intuitive as organizing a bookshelf or packing for an adventure. With practice, you'll be able to look at your data and quickly decide which columns to keep and which to let go, making your datasets more manageable and your insights clearer.