
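The skew() method in pandas calculates the skewness of each column. The result is an unbiased estimate, normalized by N-1. Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean: a positive value indicates a longer right tail, a negative value a longer left tail, and a value near zero a roughly symmetric distribution.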
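To make the "normalized by N-1" remark concrete, here is a minimal sketch that computes the adjusted Fisher-Pearson skewness by hand and compares it with pandas. The helper name adjusted_skew is hypothetical and not part of pandas; the four sample values are the 'Mar' row of the example data used later in this article.

```python
import numpy as np
import pandas as pd

def adjusted_skew(values):
    """Adjusted Fisher-Pearson skewness (hypothetical helper, not a pandas API)."""
    x = np.asarray(values, dtype=float)
    n = x.size
    dev = x - x.mean()
    m2 = np.mean(dev ** 2)  # second central moment (biased)
    m3 = np.mean(dev ** 3)  # third central moment (biased)
    g1 = m3 / m2 ** 1.5     # plain (biased) skewness
    # Small-sample correction; needs at least 3 observations
    return np.sqrt(n * (n - 1)) / (n - 2) * g1

values = [90, 50, 30, 90]  # Apple, Orange, Banana, Mango for 'Mar'
manual = adjusted_skew(values)
from_pandas = pd.Series(values).skew()
print(manual, from_pandas)  # both about -0.370370
```

The two numbers should agree up to floating-point rounding, which suggests this correction is the one pandas applies.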
Syntax –
dataframe.skew(axis, skipna, level, numeric_only, **kwargs)
axis – Axis along which to compute the skew. 0 or 'index' (the default) returns one value per column; 1 or 'columns' returns one value per row.
skipna – Exclude NA/null values when computing the result.
level – If the axis is a MultiIndex (hierarchical), compute along a particular level, collapsing into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
numeric_only – Include only float, int, boolean columns. In older pandas versions the default is None, which attempts to use everything and then falls back to numeric data only; in pandas 2.0 and later the default is False. Not implemented for Series.
**kwargs – Additional keyword arguments to be passed to the function.
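The effect of skipna and numeric_only is easier to see on a small frame. The sketch below uses an illustrative DataFrame named demo (not part of this article's data) with one missing value and one text column.

```python
import numpy as np
import pandas as pd

# Illustrative frame: column "a" has a NaN, column "label" is non-numeric
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 10.0, np.nan],
    "b": [5, 4, 3, 2, 1],
    "label": ["v", "w", "x", "y", "z"],
})

# skipna=True (the default) ignores the NaN in "a";
# numeric_only=True leaves the text column out of the result
skip = demo.skew(numeric_only=True)

# skipna=False propagates the NaN, so "a" comes back as NaN
no_skip = demo.skew(numeric_only=True, skipna=False)

print(skip)
print(no_skip)
```

Passing numeric_only=True explicitly is a cautious choice here: depending on the pandas version, leaving it out when a text column is present may raise an error instead of silently dropping that column.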
Examples –
Let’s create a dataframe to work with.
import pandas as pd
data = {'Apple':[89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34 ],
'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1. Calculate skew for each column –
By default, axis is set to axis=0 (or 'index'), which means pandas calculates the skew of each column.
df.skew()
#output
Apple -0.226154
Orange 1.681329
Banana 0.493070
Mango -0.276787
dtype: float64
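As a quick sanity check on the largest value above, the Orange column can be inspected on its own. This sketch rebuilds that column as a standalone Series (the variable name orange is just for illustration) and compares its mean with its median, since a positive skew typically goes with a mean above the median.

```python
import pandas as pd

# Orange column from the example data, rebuilt as a standalone Series
orange = pd.Series([46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
                   name="Orange")

print(orange.skew())    # Series.skew() returns a single number
print(orange.mean())    # 63.25
print(orange.median())  # 55.0
```

The two large values (110 and 120) pull the mean above the median, which fits the positive skew reported for this column.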
2. Calculate skew for each row –
To calculate the skew for each row, set the axis parameter to axis=1 or columns.
df.skew(axis=1)
#output
Jan -0.304469
Feb -0.187888
Mar -0.370370
Apr -0.534560
May -0.242693
June 0.882685
Jul -1.692635
Aug -1.780567
Sep -0.408317
Oct -0.048126
Nov 0.953203
Dec 0.732144
dtype: float64
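One way to convince yourself what axis=1 does is to compare it with the column-wise skew of the transposed frame, or with a single row selected as a Series. The sketch below rebuilds the same dataframe so that it runs on its own.

```python
import numpy as np
import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

row_skew = df.skew(axis=1)    # one value per month
via_transpose = df.T.skew()   # months become columns, then the default axis=0
jul = df.loc['Jul'].skew()    # a single row treated as a Series

print(np.allclose(row_skew, via_transpose))
print(np.isclose(jul, row_skew['Jul']))
```

Keep in mind that each row here has only four values, so the row-wise skew is a rough estimate at best.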