Pandas DataFrame sem() method with examples

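The sem() method in pandas returns the unbiased standard error of the mean (SEM) over the requested axis: the standard deviation divided by the square root of the number of non-NA values N. By default the standard deviation is normalized by N-1; this can be changed with the ddof argument.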
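As a quick illustration of that definition, the sketch below checks that sem() agrees with the sample standard deviation divided by the square root of the number of observations. The Series values are made up for this check and are not part of the dataset used later in the article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, chosen only for illustration.
s = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

n = s.count()                            # number of non-NA observations
manual_sem = s.std(ddof=1) / np.sqrt(n)  # sample std (divisor N-1) over sqrt(N)

print(s.sem())     # SEM as computed by pandas
print(manual_sem)  # the same quantity computed by hand
```

Both printed values should agree up to floating-point rounding.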

Syntax –

dataframe.sem(axis, skipna, level, ddof, numeric_only) 

axis – The axis to compute along. 0 or 'index' (the default) returns one SEM per column; 1 or 'columns' returns one SEM per row.

skipna – Exclude NA/null values. If an entire row/column is NA, the result will be NA.

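level – If the axis is a MultiIndex (hierarchical), compute along a particular level, collapsing into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; on newer versions, group by the level first with groupby(level=...) and then call sem().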

ddof – Delta Degrees of Freedom. The divisor used in the standard deviation is N - ddof, where N is the number of non-NA elements. The default is 1.

numeric_only – Include only float, int, boolean columns. In older pandas versions the default is None, which attempts to use everything and then falls back to numeric data only; from pandas 2.0 the default is False. Not implemented for Series.
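The effect of skipna and ddof is easiest to see on a tiny frame. The following is a minimal sketch using a hypothetical two-column DataFrame (named small here, not part of the article's dataset) with one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column 'b'.
small = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0],
                      'b': [1.0, np.nan, 3.0, 5.0]})

default_sem = small.sem()             # skipna=True, ddof=1
strict_sem = small.sem(skipna=False)  # the NaN propagates, so 'b' becomes NaN
population_sem = small.sem(ddof=0)    # std uses divisor N instead of N-1

print(default_sem)
print(strict_sem)
print(population_sem)
```

With ddof=0 the divisor is larger, so the reported SEM is slightly smaller than the default for the same data.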

Examples –

Let’s create a dataframe to work with.

import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}

index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1. Calculate SEM for each column –

By default axis=0, which means pandas calculates the SEM for each column.

df.sem()
#output
Apple     6.660414
Orange    7.354781
Banana    1.701715
Mango     8.035394
dtype: float64
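Because sem() returns a Series indexed by column name, it combines easily with other column statistics. The sketch below rebuilds the same DataFrame so that it runs on its own, then places each fruit's mean next to its standard error; the summary name is just an illustrative choice.

```python
import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

# One row per fruit, with the column mean and its standard error.
summary = pd.DataFrame({'mean': df.mean(), 'sem': df.sem()})
print(summary.round(3))
```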

2. Calculate SEM for each row –

To calculate the SEM for each row, set the axis parameter to axis=1 or axis='columns'.

df.sem(axis=1)
#output
Jan     14.699065
Feb     13.936612
Mar     15.000000
Apr     22.672946
May     22.829075
June    27.748874
Jul     26.991897
Aug     23.335863
Sep     27.069355
Oct     29.908193
Nov     26.468141
Dec     23.690276
dtype: float64
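To see where one of these row values comes from, the March figure can be reproduced by hand. This is a small sketch using only the Python standard library, with the four March values copied from the DataFrame above.

```python
import math
import statistics

# March values from the DataFrame: Apple, Orange, Banana, Mango.
march = [90, 50, 30, 90]

sample_std = statistics.stdev(march)            # sample standard deviation (divisor N-1)
march_sem = sample_std / math.sqrt(len(march))  # divide by sqrt(N)

print(march_sem)  # 15.0
```

The mean is 65, the squared deviations sum to 2700, so the sample variance is 900 and the standard deviation is 30; dividing by the square root of 4 gives 15, matching the Mar entry in the output.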
