Pandas – How to Rename Column Names

You have a dataframe and want to rename its columns, perhaps because the names contain extra characters, or you want them all in lowercase, or you want to replace spaces with underscores so that you can use dot notation for column selection. Whatever the reason, let’s take a look at how to do it.

Solution –

(1) Using the df.rename() function –

Let’s create a pandas dataframe to work with.

# import pandas
import pandas as pd

# list of corporation names
corporations = ['Exxon Mobil','Walmart','Chevron','ConocoPhillips','General Electric']

# and their revenues in billion dollars
revenues = [443, 406, 263, 231, 149]

# create dataframe
df = pd.DataFrame({'$Corporation': corporations,
                 '$Revenue': revenues})
df

You can see that the column names have an extra $ character that you want to get rid of.

We can do that with df.rename. One way to use the rename function is to pass the old and new column names as key-value pairs in a dictionary to the columns parameter.

# rename column names
df.rename(columns={'$Corporation':'Corporation',
                  '$Revenue':'Revenue'})
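
Keep in mind that rename returns a new dataframe and, by default, silently skips any dictionary key that does not match an existing column. The sketch below rebuilds a two-row version of the dataframe above and uses the errors='raise' option to catch a mistyped name. The '$Profit' key is a made-up label used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'$Corporation': ['Exxon Mobil', 'Walmart'],
                   '$Revenue': [443, 406]})

# '$Profit' is a hypothetical column that does not exist in df
mapping = {'$Corporation': 'Corporation', '$Profit': 'Profit'}

# default errors='ignore': the unknown key is skipped without complaint
renamed = df.rename(columns=mapping)
renamed_cols = list(renamed.columns)

# errors='raise': the unknown key triggers a KeyError
try:
    df.rename(columns=mapping, errors='raise')
    raised = False
except KeyError:
    raised = True
```

Using errors='raise' is a cheap safeguard when the mapping is typed by hand, since a typo would otherwise leave the old name in place without any warning.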

The rename function also has an index parameter to rename the index labels.

# rename columns and indexes
index = {0:'A', 1:'B', 2:'C', 3:'D', 4:'E'}

columns = {'$Corporation':'Corporation','$Revenue':'Revenue'}

df.rename(columns=columns, index=index)

Another way to use the rename function is to pass the dictionary to it and use the axis parameter to define whether you want to rename the columns or the indexes.


# rename column names
cols = {'$Corporation':'Corporation','$Revenue':'Revenue'}

df.rename(cols, axis='columns') 

axis='columns' or axis=1 –> for renaming columns.

axis='index' or axis=0 –> for renaming indexes.

The default is axis='index' (axis=0), so a dictionary passed without the axis parameter is applied to the index labels.
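
A quick way to see the effect of the axis parameter is to pass a dictionary with and without it. This is a minimal sketch on a two-row copy of the dataframe above; the variable names by_index and by_columns are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'$Corporation': ['Exxon Mobil', 'Walmart'],
                   '$Revenue': [443, 406]})

# no axis given: the mapping is applied to the index labels
by_index = df.rename({0: 'A', 1: 'B'})

# axis='columns' (or axis=1): the mapping is applied to the column names
by_columns = df.rename({'$Revenue': 'Revenue'}, axis='columns')
```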

The rename function also has an inplace parameter to change the column names in place. The calls so far did not modify df; each one returned a new dataframe with the renamed columns. If you do not want to use the inplace parameter, you have to assign the renamed dataframe back to a variable to keep the changes.

cols = {'$Corporation':'Corporation','$Revenue':'Revenue'}

# with default inplace=False
df1 = df.rename(cols, axis='columns') 

# with inplace=True
df.rename(cols, axis='columns', inplace=True)
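
One detail that often trips people up is that a call with inplace=True returns None, so its result should not be assigned back to df. The sketch below, on a one-column stand-in dataframe, records the column names before and after to show the difference.

```python
import pandas as pd

df = pd.DataFrame({'$Revenue': [443, 406]})
cols = {'$Revenue': 'Revenue'}

# inplace=False (default): df is untouched, the result is a new dataframe
df1 = df.rename(cols, axis='columns')
before = list(df.columns)

# inplace=True: df itself is modified and the call returns None
result = df.rename(cols, axis='columns', inplace=True)
after = list(df.columns)
```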

(2) Using df.rename() with a lambda function –

You can also pass a lambda function to the columns and index parameters of the rename function. This is useful when you want to rename all of the columns in a particular way, like replacing spaces with underscores.

Here, we have some data for clothing store sales.

# read data
url = "https://raw.githubusercontent.com/bprasad26/lwd/master/data/clothing_store_sales"
df = pd.read_csv(url)
df.head()

The column names contain spaces between words, which means you cannot use dot notation for column selection; you have to use square bracket notation. To change the column names, we can pass a lambda function to rename.

# rename column names with lambda function
df.rename(columns=lambda x: x.replace(' ', '_'))

We can also chain string methods to lowercase all the column names and replace spaces with underscores in one call.

# replace spaces with underscores and lowercase the names
df.rename(columns=lambda x: x.replace(' ', '_').lower())
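
The columns parameter accepts any callable that takes a label and returns a new one, so a named function works just as well as a lambda when the cleanup grows longer. This sketch uses a small stand-in for the clothing store data; the column names and values are assumed for illustration, and clean is a hypothetical helper.

```python
import pandas as pd

# stand-in for the clothing store data; names and values are assumed
df = pd.DataFrame({'Type of Customer': ['Regular', 'Promotional'],
                   'Net Sales': [39.5, 102.4]})

def clean(name):
    """Trim outer whitespace, replace spaces with underscores, lowercase."""
    return name.strip().replace(' ', '_').lower()

cleaned = df.rename(columns=clean)

# dot notation now works for column selection
net_sales = cleaned.net_sales
```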

(3) Using df.columns –

Another way to rename columns is to use the df.columns attribute. df.columns returns the names of the columns.

df.columns

Now to change the column names all we have to do is assign the new column names in a list to df.columns.

# new column names
cols = ['Customer', 'Type_of_Customer', 'Items', 'Net_Sales',
       'Method_of_Payment', 'Gender', 'Marital_Status','Age']

df.columns = cols

When you use this method, make sure that the length of cols matches the number of columns in your dataframe; otherwise pandas raises a ValueError. Also make sure the new names are in the same order as the existing columns; otherwise the columns will be renamed incorrectly.

Another way to do the same thing is with a list comprehension.
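
Both pitfalls are easy to reproduce. The sketch below uses a two-column stand-in dataframe with assumed values: a list that is too short raises an error, while a list in the wrong order is accepted silently and mislabels the data.

```python
import pandas as pd

df = pd.DataFrame({'Net Sales': [39.5, 102.4], 'Age': [32, 36]})

# too few names: pandas refuses the assignment
try:
    df.columns = ['Net_Sales']
    error = None
except ValueError as exc:
    error = str(exc)

# wrong order: no error, but the sales figures are now labelled 'Age'
df.columns = ['Age', 'Net_Sales']
mislabelled = df['Age'].tolist()
```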

# df.columns with list comprehension
df.columns = [col.replace(' ', '_').lower() for col in df.columns]
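
As an alternative to the list comprehension, the same cleanup can be written with the vectorized string methods available on df.columns.str. This is a sketch on a stand-in dataframe with assumed column names, not a replacement for the approach above.

```python
import pandas as pd

# stand-in dataframe; column names and values are assumed
df = pd.DataFrame({'Type of Customer': ['Regular'], 'Net Sales': [39.5]})

# regex=False treats ' ' as a literal string rather than a pattern
df.columns = df.columns.str.replace(' ', '_', regex=False).str.lower()
new_names = list(df.columns)
```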

