# How to Calculate the Standard Deviation in Python

## Introduction

The standard deviation measures how much the values in a dataset vary around their mean. A low standard deviation indicates that the values cluster close to the mean (average), while a high standard deviation indicates that they are spread out over a broader range.

In statistics, two types of standard deviation are commonly used: the population standard deviation and the sample standard deviation. The population standard deviation is used when the data cover an entire population, and the sample standard deviation is used when the data are only a sample drawn from a larger population.

This article shows how to calculate the standard deviation in Python, first in pure Python and then with the statistics, numpy, and pandas libraries, each of which provides a function for the job.

## Standard Deviation Formula

The formula for calculating the population standard deviation is:

σ = sqrt[ Σ ( xi - μ )² / N ]

And for the sample standard deviation:

s = sqrt[ Σ ( xi - x̄ )² / (n - 1) ]

Where:

• xi represents each value in the dataset,
• μ is the population mean,
• x̄ is the sample mean,
• N is the size of the population,
• n is the size of the sample,
• Σ denotes summation over all values in the dataset.

The square root is used to bring the units of variance, which are squared, back to the original units of measurement.
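
As a worked example of both formulas, consider the five values 4, 2, 5, 8, 6, the same dataset used in the code examples in this article. The arithmetic below treats the values first as a full population and then as a sample:

```latex
\begin{aligned}
\mu = \bar{x} &= \frac{4 + 2 + 5 + 8 + 6}{5} = 5 \\
\sum (x_i - 5)^2 &= (-1)^2 + (-3)^2 + 0^2 + 3^2 + 1^2 = 20 \\
\sigma &= \sqrt{20 / 5} = 2 \\
s &= \sqrt{20 / (5 - 1)} = \sqrt{5} \approx 2.236
\end{aligned}
```

The sample value is larger because its divisor, n - 1, is smaller than N.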

## Calculating Standard Deviation in Python

### Using Built-in Python Functions

Standard deviation can be calculated using pure Python by following the standard deviation formula:

```python
import math

# Sample data
data = [4, 2, 5, 8, 6]

# Calculate the mean
mean = sum(data) / len(data)

# Calculate the population variance (average of squared differences from the mean)
variance = sum((xi - mean) ** 2 for xi in data) / len(data)

# Calculate the standard deviation (square root of the variance)
std_dev = math.sqrt(variance)

print("Population Standard Deviation:", std_dev)
```

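This method works, but it is more verbose than the library functions below. Because it divides by len(data), it returns the population standard deviation; dividing by len(data) - 1 instead gives the sample standard deviation.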
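
The pure-Python calculation above divides by the number of values, so it corresponds to the population formula. A minimal sketch of a helper that covers both formulas might look like the following. The function name `standard_deviation` and its `ddof` parameter are illustrative choices (the parameter name is borrowed from numpy), not part of the standard library.

```python
import math


def standard_deviation(values, ddof=0):
    """Return the standard deviation of values.

    ddof=0 divides by N (population formula);
    ddof=1 divides by n - 1 (sample formula).
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more data points than ddof")
    mean = sum(values) / n
    squared_diffs = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_diffs / (n - ddof))


data = [4, 2, 5, 8, 6]
print("Population Standard Deviation:", standard_deviation(data))     # 2.0
print("Sample Standard Deviation:", standard_deviation(data, ddof=1))  # about 2.236
```

Keeping the divisor adjustment in a single parameter avoids duplicating the whole calculation for the two cases.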

### Using the statistics Library

Python’s statistics library, which was introduced in Python 3.4, provides functions to calculate mathematical statistics of numeric data. It offers the pstdev function to calculate the population standard deviation, and the stdev function to calculate the sample standard deviation.

```python
import statistics as stats

# Sample data
data = [4, 2, 5, 8, 6]

print("Population Standard Deviation:", stats.pstdev(data))
print("Sample Standard Deviation:", stats.stdev(data))
```
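
Two details of these functions are worth a short sketch. Both accept an optional precomputed mean (`mu` for `pstdev`, `xbar` for `stdev`), and `stdev` raises `statistics.StatisticsError` when given fewer than two values, because the sample formula divides by n - 1. The variable names below are illustrative.

```python
import statistics as stats

data = [4, 2, 5, 8, 6]

# Reuse a mean that has already been computed
mean = stats.mean(data)
population_sd = stats.pstdev(data, mu=mean)
sample_sd = stats.stdev(data, xbar=mean)
print("Population Standard Deviation:", population_sd)
print("Sample Standard Deviation:", sample_sd)

# The sample standard deviation is undefined for a single value
try:
    stats.stdev([4])
    error_raised = False
except stats.StatisticsError as exc:
    error_raised = True
    print("Cannot compute sample standard deviation:", exc)
```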

### Using numpy

numpy is a Python library for numerical and scientific computing. It provides the std function to calculate the standard deviation. By default, std divides by N and therefore calculates the population standard deviation. The divisor is N - ddof, so for the sample standard deviation we need to set the ddof (Delta Degrees of Freedom) parameter to 1.

```python
import numpy as np

# Sample data
data = np.array([4, 2, 5, 8, 6])

print("Population Standard Deviation:", np.std(data))
print("Sample Standard Deviation:", np.std(data, ddof=1))
```
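
For two-dimensional data, `np.std` also takes an `axis` argument, which can be combined with `ddof`. The sketch below uses a small made-up table (the values and the name `table` are purely illustrative) to show per-column and per-row results.

```python
import numpy as np

# Hypothetical measurements: 3 rows (observations) x 2 columns (variables)
table = np.array([[4.0, 10.0],
                  [2.0, 12.0],
                  [6.0, 14.0]])

# Sample standard deviation of each column
column_sd = np.std(table, axis=0, ddof=1)
print("Per column (sample):", column_sd)   # [2. 2.]

# Population standard deviation of each row
row_sd = np.std(table, axis=1)
print("Per row (population):", row_sd)     # [3. 5. 4.]

# Without axis, the array is flattened and a single value is returned
overall_sd = np.std(table)
print("Whole array (population):", overall_sd)
```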

### Using pandas

pandas is a data manipulation and analysis library for Python that provides data structures and functions for working with structured data. The std method of a pandas Series or DataFrame computes the standard deviation. Unlike numpy, this method computes the sample standard deviation by default (ddof=1). To compute the population standard deviation, we need to set ddof to 0.

```python
import pandas as pd

# Sample data
data = pd.Series([4, 2, 5, 8, 6])

print("Population Standard Deviation:", data.std(ddof=0))
print("Sample Standard Deviation:", data.std())
```
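
On a DataFrame, `std` returns one value per column, and missing values are skipped by default. The sketch below illustrates this with a made-up two-column frame; the column names `a` and `b` and the values in `b` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [4, 2, 5, 8, 6],
    "b": [1.0, 3.0, None, 5.0, 7.0],  # one missing value, stored as NaN
})

# Sample standard deviation per column; the NaN in "b" is ignored
sample_sd = df.std()
print(sample_sd)

# Population standard deviation per column
population_sd = df.std(ddof=0)
print(population_sd)
```

Because the missing value is skipped, column `b` is computed from four values rather than five.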

## Conclusion

In this tutorial, we calculated the standard deviation in Python using pure Python and the statistics, numpy, and pandas libraries. The main point to watch is each library's default: numpy's std returns the population standard deviation, pandas' std returns the sample standard deviation, and statistics provides separate pstdev and stdev functions. The standard deviation is a key statistical measure of the variation in a dataset, and knowing which version you are computing is an essential skill for anyone working in data analysis or statistics.