Pandas DataFrame skew() method with examples

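The skew() method in pandas returns the unbiased skewness of each column (or each row), normalized by N-1. Skewness measures the asymmetry of the probability distribution of a real-valued random variable about its mean: a positive value indicates a longer right tail, a negative value indicates a longer left tail, and a value of 0 indicates a symmetric distribution.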
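To make the definition concrete, the following is a minimal sketch that computes the bias-adjusted sample skewness by hand and compares it with pandas. It assumes the adjusted Fisher-Pearson form, G1 = sqrt(n(n-1)) / (n-2) * m3 / m2^1.5, where m2 and m3 are the second and third central moments. The helper name sample_skew and the sample values are hypothetical, and the sketch does not try to mirror how pandas handles constant data.

```python
import numpy as np
import pandas as pd


def sample_skew(values):
    """Adjusted Fisher-Pearson skewness (G1) of a 1-D sequence (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    n = x.size
    if n < 3:
        # The correction factor divides by (n - 2), so at least 3 values are needed.
        return float("nan")
    d = x - x.mean()          # deviations from the mean
    m2 = (d ** 2).mean()      # second central moment (biased)
    m3 = (d ** 3).mean()      # third central moment (biased)
    g1 = m3 / m2 ** 1.5       # biased skewness
    return float(np.sqrt(n * (n - 1)) / (n - 2) * g1)


values = [1, 2, 3, 10]        # hypothetical data with a long right tail
manual = sample_skew(values)
from_pandas = pd.Series(values).skew()

print(manual)
print(from_pandas)
```

Both numbers should agree up to floating-point error, and the positive sign reflects the single large value on the right.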

Syntax –

dataframe.skew(axis, skipna, level, numeric_only, **kwargs)

axis – Axis for the function to be applied on. 0 or 'index' (the default) returns one value per column; 1 or 'columns' returns one value per row.

skipna – Exclude NA/null values when computing the result. Defaults to True.

level – If the axis is a MultiIndex (hierarchical), compute along a particular level, collapsing into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby instead.

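numeric_only – Include only float, int, boolean columns. In older versions the default is None, which attempts to use everything and then falls back to numeric data only. Since pandas 2.0 the default is False, so pass numeric_only=True when the DataFrame contains non-numeric columns. Not implemented for Series.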

**kwargs – Additional keyword arguments to be passed to the function.
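As a quick illustration of the skipna and numeric_only parameters described above, here is a small sketch. The DataFrame named mixed and its values are hypothetical, chosen only so that one column contains a NaN and one column is non-numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "qty" has a missing value, "label" is non-numeric.
mixed = pd.DataFrame({
    "qty": [1.0, 2.0, np.nan, 10.0, 4.0],
    "price": [5.0, 6.0, 7.0, 8.0, 30.0],
    "label": ["a", "b", "c", "d", "e"],
})

# skipna=True is the default: the NaN in "qty" is ignored.
skipped = mixed.skew(numeric_only=True)

# skipna=False: any column containing a NaN yields NaN.
strict = mixed.skew(skipna=False, numeric_only=True)

print(skipped)
print(strict)
```

With numeric_only=True the "label" column is left out of the result, and only the "qty" entry differs between the two calls.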

Examples –

Let’s create a dataframe to work with.

import pandas as pd

data = {'Apple':[89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
       'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
       'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34 ],
       'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}

index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1 . Calculate skew for each column –

By default, axis=0 (or 'index'), which means pandas calculates the skew of each column.

df.skew()
#output
Apple    -0.226154
Orange    1.681329
Banana    0.493070
Mango    -0.276787
dtype: float64
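The column-wise result is the same as calling Series.skew() on every column separately. The sketch below checks this on two of the columns from the example data; the variable names by_method, by_apply and single are only for illustration.

```python
import pandas as pd

# Two columns taken from the example DataFrame above.
df = pd.DataFrame({
    "Apple": [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
    "Orange": [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
})

by_method = df.skew()                         # axis=0 is the default
by_apply = df.apply(lambda col: col.skew())   # Series.skew() on each column
single = df["Orange"].skew()                  # skew of one column only

print(by_method)
print(by_apply)
print(single)
```

Calling skew() on a single column is handy when only one value is needed. The Orange column is strongly right-skewed because of the two large values (110 and 120).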

2 . Calculate skew for each row –

To calculate the skew for each row, set axis=1 (or axis='columns').

df.skew(axis=1)
#output
Jan    -0.304469
Feb    -0.187888
Mar    -0.370370
Apr    -0.534560
May    -0.242693
June    0.882685
Jul    -1.692635
Aug    -1.780567
Sep    -0.408317
Oct    -0.048126
Nov     0.953203
Dec     0.732144
dtype: float64
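Two points are worth keeping in mind for row-wise skew: it gives the same numbers as transposing the DataFrame and using the default axis, and a row needs at least three non-null values, otherwise the result is NaN. The sketch below shows both on hypothetical frames named wide and narrow.

```python
import pandas as pd

# Hypothetical frames for illustration only.
wide = pd.DataFrame(
    {"a": [1, 2], "b": [2, 4], "c": [10, 5], "d": [3, 30]},
    index=["r1", "r2"],
)
narrow = wide[["a", "b"]]          # only two values per row

row_skew = wide.skew(axis=1)       # one value per row
via_transpose = wide.T.skew()      # same numbers, computed down the transposed columns
too_few = narrow.skew(axis=1)      # fewer than three values per row gives NaN

print(row_skew)
print(via_transpose)
print(too_few)
```

In the example above each month has only four values, so the row-wise numbers are defined but based on very little data.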
