Nested Analysis of Variance (ANOVA) is a statistical technique for partitioning variance in a hierarchical (nested) experimental design. A factor is nested when each of its levels occurs within only one level of a higher-level factor, for example, patients within a single hospital, or students within a single class.
Performing a Nested ANOVA in R can appear daunting, but it is manageable if approached systematically. This guide walks through the process, from understanding the basic assumptions to interpreting the results.
Understanding Nested ANOVA
Before diving into the steps for performing a Nested ANOVA in R, it is crucial to understand the assumptions:
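- Normal Distribution: The residuals (the dependent variable within each group, after removing the group mean) should be approximately normally distributed.
- Independence: Observations should be independent once the grouping is accounted for. Measurements from the same nested unit (for example, the same patient) are expected to be correlated, which is why that unit enters the model as a random effect.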
- Homogeneity of Variances: The variance of the dependent variable within each group should be roughly equal.
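As a compact reference, the two-level model behind these assumptions can be written as below. The notation is introduced here for illustration and is not part of the R code: $y_{ijk}$ is measurement $k$ on patient $j$ in hospital $i$, $\alpha_i$ is the fixed effect of hospital $i$, and $b_{j(i)}$ is the random effect of patient $j$ nested within hospital $i$.

```latex
y_{ijk} = \mu + \alpha_i + b_{j(i)} + \varepsilon_{ijk},
\qquad b_{j(i)} \sim \mathcal{N}(0, \sigma_b^2),
\qquad \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

Under this formulation the normality and equal-variance assumptions apply to $\varepsilon_{ijk}$, and the correlation between measurements on the same patient is carried by $b_{j(i)}$.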
Preparing Data
To perform a Nested ANOVA in R, you need data that fits the nested model structure. The data should be in long format, where each row represents an observation and each column represents a variable.
Here’s an example data frame:
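# Create example data frame: 3 hospitals, 3 patients per hospital,
# 2 measurements per patient (patient labels restart within each hospital)
data <- data.frame(
  Hospital = factor(rep(c("H1", "H2", "H3"), each = 6)),
  Patient = factor(rep(rep(c("P1", "P2", "P3"), each = 2), times = 3)),
  Measurement = c(10, 20, 10, 15, 20, 12, 30, 25, 20, 27, 35, 23, 5, 8, 5, 7, 6, 9)
)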
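A quick structural check can confirm that a data set really is nested before any model is fitted. The sketch below uses a small hypothetical data frame (`toy`, with invented values and column names that mirror the example) and cross-tabulates hospital against a hospital-specific patient ID. In a nested layout, every patient ID has measurements in exactly one hospital.

```r
# Hypothetical toy data (values invented for illustration only)
toy <- data.frame(
  Hospital = factor(rep(c("H1", "H2", "H3"), each = 6)),
  Patient = factor(rep(rep(c("P1", "P2", "P3"), each = 2), times = 3)),
  Measurement = c(10, 11, 14, 15, 12, 12, 25, 27, 30, 31, 22, 23, 5, 6, 8, 9, 7, 6)
)

# Patient labels restart in every hospital, so build a unique ID first
toy$PatientID <- interaction(toy$Hospital, toy$Patient, drop = TRUE)

# Rows: hospitals, columns: patient IDs, cells: number of measurements
counts <- xtabs(~ Hospital + PatientID, data = toy)
print(counts)

# Nested layout: each patient ID appears in exactly one hospital
hospitals_per_patient <- colSums(counts > 0)
is_nested <- all(hospitals_per_patient == 1)
print(is_nested)
```

If `is_nested` were `FALSE`, at least one patient ID would occur in more than one hospital, which points to a crossed rather than a nested design (or to patient labels that are not unique).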
Installing and Loading Necessary Packages
You will need the nlme package for running the Nested ANOVA, and the ggplot2 package for visualization.
Install the packages with:
install.packages("nlme")
install.packages("ggplot2")
Then load the packages:
library(nlme)
library(ggplot2)
Data Exploration
Before proceeding with the Nested ANOVA, it is a good idea to perform some exploratory data analysis.
# Summary statistics
summary(data)
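# Checking for normality of the residuals around each hospital mean
# (the pooled raw values mix hospitals with different means)
shapiro.test(residuals(lm(Measurement ~ Hospital, data = data)))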
# Checking for homogeneity of variances
bartlett.test(Measurement ~ Hospital, data = data)
Performing Nested ANOVA
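To perform a Nested ANOVA in R using the lme function from the nlme package, treat Hospital as a fixed effect and patients (nested within hospitals) as a random effect. Hospital should not also appear in the random part, since its effect is already estimated as a fixed effect:
# Patient labels repeat across hospitals, so build an ID that is unique per hospital-patient pair
data$PatientID <- interaction(data$Hospital, data$Patient, drop = TRUE)
nested_model <- lme(Measurement ~ Hospital, random = ~ 1 | PatientID, data = data)
summary(nested_model)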
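Beyond the fixed-effect table, a fitted model of this kind also carries the variance components, which is often the main reason for using a nested design. The following is a minimal sketch on hypothetical data (`toy` and its values are invented for illustration). It assumes a single random intercept per patient and uses `VarCorr()` from nlme to read off the between-patient and residual variances.

```r
library(nlme)

# Hypothetical toy data (values invented for illustration only)
toy <- data.frame(
  Hospital = factor(rep(c("H1", "H2", "H3"), each = 6)),
  Patient = factor(rep(rep(c("P1", "P2", "P3"), each = 2), times = 3)),
  Measurement = c(10, 11, 14, 15, 12, 12, 25, 27, 30, 31, 22, 23, 5, 6, 8, 9, 7, 6)
)
toy$PatientID <- interaction(toy$Hospital, toy$Patient, drop = TRUE)

# Fixed hospital effect, random intercept for each patient
fit <- lme(Measurement ~ Hospital, random = ~ 1 | PatientID, data = toy)

# VarCorr() returns a character matrix; convert the Variance column to numbers
vc <- VarCorr(fit)
patient_var <- as.numeric(vc["(Intercept)", "Variance"])
residual_var <- as.numeric(vc["Residual", "Variance"])

# Share of the within-hospital variance that sits between patients
patient_share <- patient_var / (patient_var + residual_var)
print(round(c(patient = patient_var, residual = residual_var, share = patient_share), 3))
```

A share close to 1 suggests that most of the variation left after accounting for hospitals is between patients, while a share close to 0 suggests it is mostly measurement-to-measurement noise.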
Post-Hoc Analysis
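Post-hoc tests are useful when the Nested ANOVA indicates a significant Hospital effect: they show which pairs of hospitals differ from each other.
# Performing Tukey's HSD
# TukeyHSD() cannot be applied to an aov fit that contains an Error() term,
# so average to one value per patient and fit a one-way model on those means
patient_means <- aggregate(Measurement ~ Hospital + Patient, data = data, FUN = mean)
patient_means$Hospital <- factor(patient_means$Hospital)
hospital_aov <- aov(Measurement ~ Hospital, data = patient_means)
summary(hospital_aov)
tukey_results <- TukeyHSD(hospital_aov, "Hospital")
print(tukey_results)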
Interpreting the Results
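Interpretation focuses on the p-values obtained for each term in the model. A p-value less than 0.05 is conventionally taken to indicate a statistically significant effect.
Hospital: If the p-value for the overall Hospital effect is less than 0.05, the mean measurement differs between at least two hospitals. Note that summary(nested_model) reports each hospital relative to the reference hospital (H1), while anova(nested_model) gives the overall test for Hospital.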
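In a balanced nested design, the Hospital effect is judged against the variation between patients within hospitals, not against the measurement-level residual. The sketch below makes that F ratio explicit using base R on hypothetical data (`toy` and its values are invented for illustration). It assumes a balanced layout with patients as a random factor.

```r
# Hypothetical toy data (values invented for illustration only)
toy <- data.frame(
  Hospital = factor(rep(c("H1", "H2", "H3"), each = 6)),
  Patient = factor(rep(rep(c("P1", "P2", "P3"), each = 2), times = 3)),
  Measurement = c(10, 11, 14, 15, 12, 12, 25, 27, 30, 31, 22, 23, 5, 6, 8, 9, 7, 6)
)
toy$PatientID <- interaction(toy$Hospital, toy$Patient, drop = TRUE)

# Sequential sums of squares: Hospital first, then patients within hospitals
tab <- anova(lm(Measurement ~ Hospital + PatientID, data = toy))
ms <- setNames(tab[["Mean Sq"]], rownames(tab))
df <- setNames(tab[["Df"]], rownames(tab))

# The F column printed by anova() divides by the residual mean square.
# For the nested test, divide the Hospital mean square by the patient mean square instead.
f_hospital <- ms[["Hospital"]] / ms[["PatientID"]]
p_hospital <- pf(f_hospital, df[["Hospital"]], df[["PatientID"]], lower.tail = FALSE)

print(c(F = f_hospital, p = p_hospital))
```

The denominator degrees of freedom here come from the number of patients rather than the number of measurements, which is why adding more measurements per patient does little to sharpen the comparison between hospitals.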
Visualizing Results
The results can be visualized using ggplot2:
ggplot(data, aes(x=Hospital, y=Measurement)) +
geom_boxplot(aes(fill = Hospital)) +
geom_jitter(width = 0.1) +
labs(title = "Nested ANOVA Results",
x = "Hospital",
y = "Measurement")

Conclusion
Nested ANOVA shows how much of the variation in a dependent variable is attributable to each level of a hierarchy. This guide covered the key steps for performing one in R: understanding the assumptions, preparing the data, running the analysis, performing post-hoc tests, interpreting the results, and visualizing the outcome.
It is important to use Nested ANOVA appropriately, adhering to its assumptions and interpreting its results with care. This ensures that your conclusions are both meaningful and valid.