Fisher’s Exact Test is a statistical procedure used to analyze contingency tables. It is commonly applied when you want to assess the association between two categorical variables, and is particularly useful when the sample sizes are small.
In this article, we’ll provide a comprehensive guide on how to perform Fisher’s Exact Test in the R programming language, beginning with a theoretical overview of the test, and then providing detailed examples using R code.
Introduction to Fisher’s Exact Test
Fisher’s Exact Test is named after Ronald Fisher, who developed the method. It’s a statistical significance test used in the analysis of 2×2 contingency tables. The test is an alternative to the Chi-Square Test and is preferred when the sample sizes are small or when the expected frequencies in any of the cells of a 2×2 table are below 5.
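Suppose you have two categorical variables with two levels each. You want to investigate whether there’s an association between these variables. You can represent the observed counts in a 2×2 table, with the two groups as rows, the two categories as columns, and each cell holding the number of observations in that combination.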
Here, the test aims to examine whether there’s a non-random association between the two groups and the two categories.
Fisher’s Exact Test calculates the exact probability of observing the given arrangement in a 2×2 table, or one that is more extreme, with the row and column totals held fixed and assuming the null hypothesis that the row and column variables are independent. Unlike the Chi-Square Test, which relies on a large-sample approximation, Fisher’s Exact Test computes this probability exactly, making it suitable for small sample sizes.
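To make the "exact probability" concrete: with the row and column totals held fixed, the count in the top-left cell follows a hypergeometric distribution under the null hypothesis. The sketch below uses base R's dhyper to rebuild the two-sided p-value by summing the probabilities of every possible table that is no more likely than the observed one. The counts are those of the treatment example in this article, and the variable names (tab, support, p_manual) are illustrative choices, not part of any API.

```r
# Counts from the treatment example in this article
# (rows: Success, Failure; columns: Treatment, Placebo)
tab <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)

x <- tab[1, 1]        # observed top-left count
m <- sum(tab[1, ])    # first row total
n <- sum(tab[2, ])    # second row total
k <- sum(tab[, 1])    # first column total

# All top-left counts that are possible given the fixed margins
support <- max(0, k - n):min(k, m)

# Hypergeometric probability of each possible table
probs <- dhyper(support, m, n, k)
p_obs <- probs[support == x]

# Two-sided p-value: total probability of the tables that are
# no more likely than the observed one
# (the small tolerance guards against floating-point ties)
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

p_manual
fisher.test(tab)$p.value
```

The two printed values should agree up to floating-point error, which shows that the test involves no large-sample approximation.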
Performing Fisher’s Exact Test in R
Now, let’s go through the step-by-step process of performing Fisher’s Exact Test in R.
Suppose you are studying the effectiveness of a new treatment for a disease. You have two groups: one that received the new treatment and one that received a placebo. The results are categorized into success and failure. In the treatment group there were 10 successes and 5 failures; in the placebo group there were 2 successes and 8 failures.
Step 1: Load Data
First, you need to input the data into R using the matrix function:
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")
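Before running the test, it can be useful to check whether the low-expected-count condition mentioned earlier actually applies. The sketch below computes the expected count of each cell under independence (row total times column total, divided by the grand total) with base R only. The name expected is an illustrative choice.

```r
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")

# Expected count in each cell under independence:
# row total * column total / grand total
expected <- outer(rowSums(data), colSums(data)) / sum(data)
expected

# TRUE if at least one expected count is below 5
any(expected < 5)
```

For these counts the Success/Placebo cell has an expected count of 4.8, so the usual rule of thumb points toward Fisher’s Exact Test rather than the Chi-Square Test.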
Step 2: Perform Fisher’s Exact Test
Use the fisher.test function to conduct Fisher’s Exact Test:
result <- fisher.test(data)
Step 3: View Results
Print the results to see the output:
print(result)
The output will include the p-value, which helps you determine if there’s a significant association between the two variables.
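Besides printing the whole result, you can pull out individual pieces. The object returned by fisher.test is a list, and for a 2×2 table it contains the components p.value, estimate, and conf.int. A minimal sketch using the same counts:

```r
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")

result <- fisher.test(data)

result$p.value    # two-sided p-value
result$estimate   # conditional maximum likelihood estimate of the odds ratio
result$conf.int   # confidence interval for the odds ratio (95% by default)
```

Note that the reported odds ratio is a conditional maximum likelihood estimate, so it will generally differ somewhat from the simple cross-product ratio (10 × 8) / (2 × 5) = 8.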
Step 4: Interpret Results
If the p-value is less than the significance level (commonly 0.05), you can reject the null hypothesis, indicating that there’s a statistically significant association between the variables.
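The decision rule above can be sketched in a few lines. The names alpha and reject_null are illustrative, and 0.05 is simply the conventional threshold, which should be chosen before looking at the data.

```r
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")

result <- fisher.test(data)
alpha <- 0.05  # significance level

reject_null <- result$p.value < alpha

if (reject_null) {
  message("Reject the null hypothesis of independence (p = ",
          signif(result$p.value, 3), ")")
} else {
  message("Fail to reject the null hypothesis (p = ",
          signif(result$p.value, 3), ")")
}
```

For the example counts the two-sided p-value is roughly 0.041, so the null hypothesis of independence is rejected at the 0.05 level but would not be at the 0.01 level.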
You can specify an alternative hypothesis using the alternative argument. The options are "two.sided" (the default), "less", and "greater":
result <- fisher.test(data, alternative = "greater")
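For a 2×2 table, the alternatives refer to the odds ratio: "greater" tests whether the true odds ratio exceeds 1, and "less" whether it is below 1. With the table laid out as in this example, an odds ratio above 1 means the odds of success are higher under treatment than under placebo. The sketch below runs all three alternatives side by side; the names alternatives and p_values are illustrative.

```r
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")

alternatives <- c("two.sided", "less", "greater")

# One p-value per alternative hypothesis, named after the alternative
p_values <- sapply(alternatives, function(alt) {
  fisher.test(data, alternative = alt)$p.value
})

p_values
```

A one-sided test is only appropriate when the direction of the effect was specified before the data were collected.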
Fisher’s Exact Test in R also provides a confidence interval for the odds ratio. By default, a 95% confidence interval is calculated, but you can adjust this with the conf.level argument:
result <- fisher.test(data, conf.level = 0.99)
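To see the effect of conf.level, the sketch below compares the default 95% interval with a 99% interval for the same counts. The names ci_95 and ci_99 are illustrative.

```r
data <- matrix(c(10, 2, 5, 8), nrow = 2, byrow = TRUE)
colnames(data) <- c("Treatment", "Placebo")
rownames(data) <- c("Success", "Failure")

ci_95 <- fisher.test(data)$conf.int                     # default level
ci_99 <- fisher.test(data, conf.level = 0.99)$conf.int  # wider interval

ci_95
ci_99

# The level used is stored as an attribute of the interval
attr(ci_99, "conf.level")
```

Changing conf.level affects only the interval; the p-value and the odds ratio estimate stay the same.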
Fisher’s Exact Test is a robust statistical method for analyzing 2×2 contingency tables, especially when dealing with small sample sizes or low expected frequencies. Its application in R is facilitated by the built-in fisher.test function, enabling an easy and precise way to assess the association between two categorical variables.
By understanding the theoretical underpinnings and applying the test through the step-by-step guide provided, researchers and analysts can utilize Fisher’s Exact Test in various fields, including medicine, social sciences, ecology, and more.