The std() method in pandas calculates the sample standard deviation over the requested axis. In statistics, the standard deviation measures how spread out the values in a data set are. It is the square root of the variance and tells you roughly how far a typical value lies from the mean.
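To make the definition concrete, here is a minimal sketch that computes the sample standard deviation by hand and compares it with std(). The scores values are made up purely for illustration and are not part of the dataset used in the examples.

```python
import math
import pandas as pd

# Hypothetical sample values, used only for illustration
scores = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

# Step 1: squared distance of every value from the mean
mean = scores.mean()
squared_deviations = (scores - mean) ** 2

# Step 2: divide by N - 1 (the pandas default) and take the square root
manual_std = math.sqrt(squared_deviations.sum() / (len(scores) - 1))

print(manual_std)
print(scores.std())  # should agree with the manual result
```

Both lines should print the same number, up to floating-point rounding.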
Examples –
Let’s create a dataset to work with.
import pandas as pd
data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1 . Calculate the standard deviation of a column –
You can calculate the standard deviation of a single column like this:
df['Apple'].std()
#output
23.072349974229617
Or you can calculate the standard deviation of all the columns at once like this:
df.std()
#output
Apple 23.072350
Orange 25.477709
Banana 5.894913
Mango 27.835420
dtype: float64
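If you only need a few of the columns, one option is to select them with a list of column names before calling std(). The sketch below rebuilds just the Apple and Mango columns from the dataset so that it runs on its own; the name subset_std is only illustrative.

```python
import pandas as pd

# Two columns copied from the dataset above so the snippet is self-contained
data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

# Passing a list of names returns a DataFrame, so std() gives one value per column
subset_std = df[['Apple', 'Mango']].std()
print(subset_std)
```

The result is a Series indexed by column name, just like the output of df.std().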
2 . Calculate the standard deviation of a row –
To calculate the standard deviation of each row, set the axis parameter to 1 or 'columns'.
df.std(axis=1)
#output
Jan 29.398129
Feb 27.873225
Mar 30.000000
Apr 45.345893
May 45.658150
June 55.497748
Jul 53.983794
Aug 46.671726
Sep 54.138711
Oct 59.816386
Nov 52.936282
Dec 47.380552
dtype: float64
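As a quick check, the sketch below confirms that axis=1 and axis='columns' give the same result, and shows one way to get the standard deviation of a single row by selecting it with loc. It rebuilds the dataset so that it runs on its own; the variable names are only illustrative.

```python
import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 140, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

# axis='columns' is the named equivalent of axis=1
by_number = df.std(axis=1)
by_name = df.std(axis='columns')
print(by_number.equals(by_name))

# Standard deviation of one row, selected by its index label
mar_std = df.loc['Mar'].std()
print(mar_std)
```

The value printed for Mar should match the Mar entry in the row-wise output.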
3 . Change degrees of freedom –
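You can change the delta degrees of freedom using the ddof parameter. The divisor used in the calculation is N - ddof, where N is the number of values. By default ddof=1, so the result is normalized by N-1 (the sample standard deviation). To normalize by N (the population standard deviation), set ddof=0.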
df.std(ddof=0)
#output
Apple 22.090093
Orange 24.393049
Banana 5.643950
Mango 26.650386
dtype: float64
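The two normalizations differ only by a constant factor of sqrt((N - 1) / N). The sketch below checks this on the Apple column. It also compares against NumPy, whose ndarray.std() defaults to ddof=0, which is a common reason pandas and NumPy results disagree. The NumPy comparison is an addition for illustration and the variable names are not from the examples above.

```python
import math
import pandas as pd

# The Apple column from the dataset above
apple = pd.Series([89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145])
n = len(apple)

sample_std = apple.std()            # ddof=1 by default, divides by N - 1
population_std = apple.std(ddof=0)  # divides by N

# Rescaling the sample value reproduces the population value
print(population_std)
print(sample_std * math.sqrt((n - 1) / n))

# NumPy arrays use ddof=0 unless told otherwise
print(apple.to_numpy().std())
```

All three printed values should agree up to floating-point rounding, and they should match the Apple entry in the ddof=0 output.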