How to Perform Fisher’s Exact Test in Python


Introduction

Fisher’s Exact Test is a statistical significance test used to analyze the association between two categorical variables in a 2×2 contingency table. It is particularly useful when the sample is small and the expected cell counts are too low for the chi-squared approximation to be reliable. This article walks through the steps to perform Fisher’s Exact Test in Python.

Background and Significance of Fisher’s Exact Test

Fisher’s Exact Test is used in the analysis of contingency tables. The chi-squared test relies on a large-sample approximation, so Fisher’s Exact Test is preferable for small samples where that approximation is unreliable. The test is called ‘exact’ because its p-value is computed directly from the hypergeometric distribution of the table, given its row and column totals, rather than from an approximation. It is often used for categorical data where both variables are dichotomous and the sample size is small.
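To make ‘exact’ concrete, the following is a minimal sketch that computes a two-sided p-value directly from hypergeometric probabilities using only the standard library. The function name `fisher_exact_p_two_sided`, the example table, and the small relative tolerance are illustrative assumptions, not part of any library. The two-sided rule used here (sum the probabilities of every table that is no more likely than the observed one) is the convention SciPy documents for its own implementation.

```python
from math import comb

def fisher_exact_p_two_sided(table):
    """Two-sided Fisher p-value for a 2x2 table (illustrative helper)."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d      # row totals
    col1 = a + c                   # first column total
    n = row1 + row2                # grand total
    total = comb(n, col1)          # number of ways to fill the first column

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)   # feasible top-left values
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum all tables at most as likely as the observed one
    # (the tolerance guards against floating-point ties)
    return sum(p for p in probs if p <= p_observed * (1 + 1e-7))

p = fisher_exact_p_two_sided([[8, 2], [1, 5]])
print(f"Exact two-sided p-value: {p:.4f}")  # prints 0.0350
```

No approximation to a continuous distribution is involved: the p-value is a finite sum over every table that shares the observed row and column totals.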

Understanding Fisher’s Exact Test

a. Hypotheses

The null and alternative hypotheses for Fisher’s Exact Test are as follows:

  • Null Hypothesis (H0): There is no association between the two categorical variables.
  • Alternative Hypothesis (H1): There is an association between the two categorical variables.

b. Assumptions

  • The data is categorical.
  • The sampling method is simple random sampling.
  • The data is displayed in a 2×2 contingency table.
  • The test is valid for any sample size, but it is most useful when the sample is small.
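A common rule of thumb is to prefer Fisher’s Exact Test over the chi-squared test when any expected cell count falls below 5. The sketch below checks this with `scipy.stats.contingency.expected_freq`; the table values are illustrative, and the threshold of 5 is a convention rather than a strict requirement.

```python
import numpy as np
from scipy.stats.contingency import expected_freq

# Illustrative 2x2 table of observed counts
table = np.array([[8, 2], [1, 5]])

# Expected counts under independence: row_total * column_total / grand_total
expected = expected_freq(table)
small_cells = int((expected < 5).sum())

print(expected)
print(f"Cells with expected count below 5: {small_cells}")
```

Here three of the four expected counts are below 5, so the chi-squared approximation would be questionable for this table.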

c. Applications

  • Analyzing medical clinical trial data where the sample size is small.
  • Investigating associations between two binary classifications.

Loading and Preparing Data

Before you can perform Fisher’s Exact Test, you need some data. Load it from a CSV file, an Excel workbook, a SQL database, or any other source. The pandas library is useful for loading and managing data.

Example:

import pandas as pd

# Load data from a CSV file
data = pd.read_csv('your-data-file.csv')
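Raw data usually has one row per observation, while the test needs a 2×2 table of counts. The sketch below builds that table with `pd.crosstab`. The column names `treatment` and `outcome` and the values in the DataFrame are hypothetical, so substitute the two binary columns from your own data.

```python
import pandas as pd

# Hypothetical data: one row per patient (column names are assumptions)
df = pd.DataFrame({
    "treatment": ["A", "A", "A", "B", "B", "B", "A", "B"],
    "outcome": ["success", "success", "failure", "failure",
                "failure", "success", "success", "failure"],
})

# Count every combination of the two binary variables
contingency = pd.crosstab(df["treatment"], df["outcome"])
print(contingency)

# Plain 2x2 array of counts, ready to pass to a statistical test
table = contingency.to_numpy()
```

Note that `pd.crosstab` sorts the row and column labels, so check which cell is which before interpreting an odds ratio.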

Performing Fisher’s Exact Test in Python

a. Using scipy.stats

The scipy library provides the fisher_exact function for performing Fisher’s Exact Test.

from scipy.stats import fisher_exact

# Contingency table
# [[a, b],
#  [c, d]]

table = [[8, 2], [1, 5]]

# Perform Fisher's Exact Test
odds_ratio, p_value = fisher_exact(table)

# Output the results
print(f"Odds Ratio: {odds_ratio}")
print(f"P-value: {p_value}")
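By default `fisher_exact` performs a two-sided test. If you have a directional hypothesis, the `alternative` parameter accepts `'less'` or `'greater'` as well. The sketch below runs all three options on the same table; the dictionary name `p_values` is just an illustrative choice.

```python
from scipy.stats import fisher_exact

table = [[8, 2], [1, 5]]

# Collect the p-value for each form of the alternative hypothesis
p_values = {}
for alternative in ("two-sided", "greater", "less"):
    _, p = fisher_exact(table, alternative=alternative)
    p_values[alternative] = p
    print(f"{alternative}: p = {p:.4f}")
```

Choose the alternative before looking at the data; picking the direction that gives the smallest p-value afterwards invalidates the test.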

b. Interpreting the Results

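The odds ratio describes the strength and direction of the association; a value of 1 corresponds to no association. The p-value is the probability, assuming the null hypothesis is true, of observing a table at least as extreme as the one you observed. If the p-value is below your chosen significance level, commonly 0.05, you can reject the null hypothesis and conclude that there is a statistically significant association between the two categorical variables.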
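The odds ratio returned by `fisher_exact` is the sample odds ratio of the table, (a × d) / (b × c). The short sketch below computes it by hand for an illustrative table and compares it with SciPy’s value; the variable names are arbitrary.

```python
from scipy.stats import fisher_exact

table = [[8, 2], [1, 5]]
(a, b), (c, d) = table

# Sample odds ratio: odds of column 1 in row 1 divided by odds in row 2
manual_odds_ratio = (a * d) / (b * c)

scipy_odds_ratio, p_value = fisher_exact(table)

print(f"Manual odds ratio: {manual_odds_ratio}")
print(f"SciPy odds ratio:  {scipy_odds_ratio}")
```

An odds ratio of 20 means the odds of the first-column outcome are 20 times higher in the first row than in the second. With counts this small, the estimate is very imprecise, so read it alongside the p-value rather than on its own.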

Practical Example

Let’s consider a practical example where you have data on the success and failure rates of two different treatments for a medical condition.

from scipy.stats import fisher_exact

# Sample data: success and failure of treatments
# [[Treatment1_success, Treatment1_failure],
#  [Treatment2_success, Treatment2_failure]]

data = [[10, 6], [2, 12]]

# Perform Fisher's Exact Test
odds_ratio, p_value = fisher_exact(data)

# Output the results
print(f"Odds Ratio: {odds_ratio}")
print(f"P-value: {p_value}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis - There is a significant association between the treatment types and success rates.")
else:
    print("Fail to reject the null hypothesis - There is insufficient evidence of an association between the treatment types and success rates.")
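A p-value alone does not show how large the difference is, so it can help to report the observed success rate for each treatment alongside the test result. The sketch below does this for the same table; the variable name `success_rates` is an illustrative choice.

```python
# Same layout as above: [[T1_success, T1_failure], [T2_success, T2_failure]]
data = [[10, 6], [2, 12]]

# Proportion of successes within each treatment group
success_rates = [row[0] / sum(row) for row in data]

for i, rate in enumerate(success_rates, start=1):
    print(f"Treatment {i} success rate: {rate:.1%}")
```

For this table the rates are 62.5% and 14.3%, which gives the reader a sense of the practical size of the difference.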

Conclusion

Fisher’s Exact Test is an essential statistical test for analyzing small datasets and determining the significance of associations between two categorical variables in 2×2 contingency tables. Python, with its scipy library, provides an efficient and user-friendly way to perform Fisher’s Exact Test. This test is particularly useful in fields such as medical research, where researchers often work with small sample sizes. When interpreting the results, it is crucial to consider the context of your data and the assumptions of Fisher’s Exact Test.
