
pandas.DataFrame.apply –
apply is available on both Series and DataFrame objects. On a Series it calls a function on each element; on a DataFrame it calls the function on each column or each row, depending on the axis argument.
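As a quick sketch of that difference, the snippet below uses two small hand-made objects (the names s, frame, doubled and col_max are made up for illustration and are not part of the wine dataset used later).

```python
import pandas as pd

# Hypothetical toy data, just to show what the function receives.
s = pd.Series([1, 2, 3])
frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Series.apply passes one element at a time to the function.
doubled = s.apply(lambda x: x * 2)

# DataFrame.apply passes one whole column (a Series) at a time by default.
col_max = frame.apply(lambda col: col.max())

print(doubled.tolist())  # [2, 4, 6]
print(col_max.tolist())  # [2, 4]
```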
Let’s read a dataset to illustrate this –
import pandas as pd
import numpy as np
url = "https://raw.githubusercontent.com/bprasad26/lwd/master/data/winequality-red.csv"
df = pd.read_csv(url)
df.head()

Apply on a series –
Let’s say that you want to recode the quality column: values below 5 become ‘low’, values of 5 and 6 become ‘medium’, and values above 6 become ‘high’.
def change_quality(value):
    # map a numeric quality score to a category label
    if value < 5:
        return 'low'
    elif value == 5 or value == 6:
        return 'medium'
    else:
        return 'high'
df['quality_cat'] = df['quality'].apply(change_quality)
df[['quality','quality_cat']].sample(10)
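If you want to try this without downloading the dataset, here is a minimal self-contained sketch. The quality Series below is a hypothetical stand-in for the real column, with integer scores chosen to hit every branch.

```python
import pandas as pd

# Hypothetical stand-in for df['quality']; the real column comes from the CSV.
quality = pd.Series([3, 4, 5, 6, 7, 8], name="quality")

def change_quality(value):
    """Map a numeric quality score to 'low', 'medium' or 'high'."""
    if value < 5:
        return "low"
    elif value == 5 or value == 6:
        return "medium"
    else:
        return "high"

# Series.apply calls change_quality once for every element.
quality_cat = quality.apply(change_quality)

print(quality_cat.tolist())
# ['low', 'low', 'medium', 'medium', 'high', 'high']
```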

Apply on a DataFrame –
When you apply a function to a DataFrame, the axis parameter controls what the function receives: each column with axis=0 (the default) or each row with axis=1.
Let’s take a small subset of the data to see what is going on: the first 5 rows of the pH and alcohol columns.
df_new = df[['pH', 'alcohol']].head()
df_new

Let’s say that we want to sum the values in each column.
df_new.apply(np.sum, axis=0)

And if you want to sum the values in each row instead, apply the function with axis=1.
df_new.apply(np.sum, axis=1)
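To make the two directions concrete, here is a small sketch on a hand-made frame. The name toy and its values are assumptions for illustration, not rows from the wine dataset. The thing to notice is the index of each result: column names for axis=0, row labels for axis=1.

```python
import numpy as np
import pandas as pd

# Hypothetical values standing in for df_new.
toy = pd.DataFrame({"pH": [3.5, 3.2, 3.3], "alcohol": [9.4, 9.8, 10.0]})

# axis=0: the function receives each column, so there is one result per column.
col_totals = toy.apply(np.sum, axis=0)

# axis=1: the function receives each row, so there is one result per row.
row_totals = toy.apply(np.sum, axis=1)

print(col_totals.index.tolist())  # ['pH', 'alcohol']
print(row_totals.index.tolist())  # [0, 1, 2]
```

For a plain sum, the built-in toy.sum(axis=0) and toy.sum(axis=1) give the same totals; apply is more useful when the function is something pandas does not already provide.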

You can also use a lambda function with apply. With axis=1, the lambda receives each row as a Series, so you can look up values by column name.
df_new.apply(lambda row: row['pH'] + row['alcohol'], axis=1)
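As a rough check of what the lambda is doing, the sketch below runs the same row-wise lambda on a hypothetical two-row frame (toy, via_apply and via_columns are made-up names) and compares it with plain column arithmetic, which produces the same values without calling a Python function per row.

```python
import pandas as pd

# Hypothetical values standing in for df_new.
toy = pd.DataFrame({"pH": [3.5, 3.2], "alcohol": [9.4, 9.8]})

# Row-wise: the lambda is called once per row, and row is a Series
# indexed by column name.
via_apply = toy.apply(lambda row: row["pH"] + row["alcohol"], axis=1)

# Column-wise arithmetic computes the same sums in one step.
via_columns = toy["pH"] + toy["alcohol"]

# Largest absolute difference between the two results.
max_diff = (via_apply - via_columns).abs().max()
print(max_diff < 1e-12)
```

For simple arithmetic like this, the column version is usually the better choice on large frames; the lambda form helps when the per-row logic cannot be written as column operations.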
