Forest plots, also known as blobbograms, are graphical displays that show the estimated treatment effects from multiple quantitative studies addressing the same question, together with the uncertainty around each estimate. They are used primarily in meta-analyses, which statistically combine the results of several independent studies to identify common trends. This article is a step-by-step guide to creating a forest plot in the R programming environment.

## Setting Up Your R Environment

Several R packages can create forest plots. The ‘meta’ and ‘metafor’ packages are widely used because they provide functions both to conduct meta-analyses and to plot the results. Both packages define a ‘forest’ function; this tutorial fits its model with metafor, so the plotting calls use metafor’s arguments.

To install these packages, you can use the install.packages() function in R:

```
install.packages("meta")
install.packages("metafor")
```

Once installed, you need to load the packages into your R environment:

```
library(meta)
library(metafor)
```

## The Dataset

In this tutorial, we’ll use the ‘dat.bcg’ dataset, which is available once metafor is loaded (recent metafor releases obtain it from the companion ‘metadat’ package). It contains the results of 13 studies that investigated the effectiveness of the BCG vaccine against tuberculosis.

Let’s load and explore the data:

```
data(dat.bcg)
head(dat.bcg)
```

This dataset includes several columns, but we will focus on ‘tpos’, ‘tneg’, ‘cpos’, ‘cneg’, and ‘alloc’. The first four hold the cells of each study’s 2×2 table: the number of tuberculosis-positive and tuberculosis-negative subjects in the treated (vaccinated) group, followed by the same two counts for the control group. Note that these are cell counts, not group totals. The ‘alloc’ column records how subjects were allocated to the groups (random, alternate, or systematic).
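Because the group sizes are not stored directly, it can help to derive them before going further. The following is a minimal sketch, assuming the count columns are named `tpos`, `tneg`, `cpos` and `cneg` as in the metafor documentation; the new column names (`n_treated`, `risk_treated`, and so on) are made up for illustration.

```r
library(metafor)  # makes dat.bcg available (via metadat in recent releases)

dat <- dat.bcg

# Group sizes are the sum of the positive and negative counts in each arm.
dat$n_treated <- dat$tpos + dat$tneg
dat$n_control <- dat$cpos + dat$cneg

# Raw tuberculosis risk in each arm, per study.
dat$risk_treated <- dat$tpos / dat$n_treated
dat$risk_control <- dat$cpos / dat$n_control

# Inspect the structure and the derived columns.
str(dat.bcg)
print(head(dat[, c("author", "year", "alloc", "n_treated", "n_control",
                   "risk_treated", "risk_control")]))
```

Comparing `risk_treated` with `risk_control` row by row gives a first impression of the direction of the effect before any model is fitted.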

## Preparing the Data for Meta-analysis

Before you create a forest plot, you first need to perform a meta-analysis. In this case, we’ll fit a random-effects model. The ‘escalc’ function computes the effect size (outcome measure) and its sampling variance for each study, and the ‘rma’ function fits the random-effects model, using restricted maximum likelihood (REML) estimation by default.

```
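# bi and di take the negative counts; n1i/n2i would expect group totals instead
dat.bcg <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
# Fit a random-effects model to the effect sizes (yi) and their variances (vi)
res <- rma(yi, vi, data=dat.bcg)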
```

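In the ‘escalc’ function, the measure argument is set to “RR”, which requests the risk ratio. Note that ‘escalc’ stores the natural logarithm of the risk ratio in the ‘yi’ column and its sampling variance in ‘vi’, because the analysis is carried out on the log scale. The risk ratio is a suitable effect size measure here because each study records a binary outcome (tuberculosis or no tuberculosis) in two groups.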
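To make the log risk ratio concrete, the calculation for a single study can be written out in base R. This is a sketch of the standard large-sample formulas, not metafor’s internal code, and the four counts are illustrative values assumed for the example.

```r
# Illustrative 2x2 counts for a single trial (assumed values).
tpos <- 4;  tneg <- 119   # vaccinated: with / without tuberculosis
cpos <- 11; cneg <- 128   # control:    with / without tuberculosis

n_treated <- tpos + tneg
n_control <- cpos + cneg

# Risk ratio and its natural logarithm.
rr     <- (tpos / n_treated) / (cpos / n_control)
log_rr <- log(rr)

# Large-sample sampling variance of the log risk ratio.
var_log_rr <- 1 / tpos - 1 / n_treated + 1 / cpos - 1 / n_control

# 95% confidence interval, computed on the log scale and back-transformed.
ci_log <- log_rr + c(-1, 1) * qnorm(0.975) * sqrt(var_log_rr)
ci_rr  <- exp(ci_log)

print(round(c(rr = rr, log_rr = log_rr, var = var_log_rr), 4))
print(round(ci_rr, 2))
```

A negative log risk ratio corresponds to a risk ratio below 1, meaning a lower risk in the vaccinated group.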

## Creating the Forest Plot

Once the meta-analysis is complete, we can create the forest plot with the ‘forest’ function. Because ‘res’ is a model object fitted by ‘rma’, the call is handled by the forest method from the ‘metafor’ package:

`forest(res)`

The basic forest function produces a simple forest plot with one row per study on the y-axis and the effect size on the x-axis. Because ‘escalc’ computes log risk ratios, the x-axis is on the log risk ratio scale by default; passing `atransf=exp` relabels it in terms of risk ratios. In each row, the box marks the study’s estimated effect size, and the lines extending from the box (known as whiskers) show its confidence interval. A diamond at the bottom of the plot shows the pooled effect size across all studies, with its width spanning the confidence interval of the pooled estimate.
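A slightly more readable version of the basic plot can be sketched as follows. It assumes a reasonably recent metafor release in which `forest()` accepts `header`, and it uses a separate object `dat` (a name chosen here) so that the original dataset is left untouched.

```r
library(metafor)

# Log risk ratios and sampling variances from the 2x2 cell counts.
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

# Random-effects model; slab supplies the study labels shown in the plot.
res <- rma(yi, vi, data = dat, slab = paste(author, year, sep = ", "))

# atransf = exp relabels the axis as risk ratios;
# header = TRUE adds column headings above the labels and annotations.
metafor::forest(res, atransf = exp, header = TRUE)
```

Calling `metafor::forest()` explicitly avoids any ambiguity when the meta package is loaded in the same session.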

## Customizing the Forest Plot

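The basic forest plot may suffice for some, but you can enhance it for better visualization. Because ‘res’ was fitted with metafor, the arguments that apply are those of metafor’s forest function. (The ‘sortvar’ argument, which you may see in other tutorials, belongs to the forest function for objects created by the ‘meta’ package and is not an argument of metafor’s function.)

The ‘order’ argument controls the order in which studies are presented. It accepts keywords such as "obs", which sorts the studies by their observed effect sizes:

`forest(res, order="obs")`

In recent versions of metafor, ‘order’ can also be a variable, for example to sort the studies by publication year:

`forest(res, order=dat.bcg$year)`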

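The ‘alim’ argument sets the limits of the x-axis itself, on the scale of the analysis (log risk ratios here). The ‘xlim’ argument is different: it sets the horizontal limits of the whole plotting region, including the study labels on the left and the annotations on the right, so it usually needs to be considerably wider than the axis:

`forest(res, alim=c(-3, 2), xlim=c(-8, 6))`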
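Several of these options are often combined in one call. The sketch below is one possible combination rather than a recommended default; the tick positions, plot limits, and axis label are choices made for this example, and `header` assumes a reasonably recent metafor release.

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat, slab = paste(author, year, sep = ", "))

metafor::forest(
  res,
  order   = "obs",                     # sort studies by observed effect size
  atransf = exp,                       # label the axis as risk ratios
  at      = log(c(0.05, 0.25, 1, 4)),  # tick positions, given on the log scale
  xlim    = c(-16, 6),                 # width of the whole plot, labels included
  xlab    = "Risk ratio (log scale)",  # axis title
  header  = TRUE                       # column headings
)
```

If the study labels or annotations are cut off or crowded, widening `xlim` is usually the first thing to try.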

## Interpreting the Forest Plot

Reading a forest plot involves examining both the individual study effects and the overall combined effect. The box size reflects the weight of the study in the meta-analysis. A larger box indicates a higher weight, which usually results from a more precise estimate (a smaller sampling variance), typically owing to a larger sample size.

If the confidence interval for a study crosses the line of no effect, the study’s effect is not statistically significant at the corresponding level. In this case the line of no effect sits at a log risk ratio of 0, which is a risk ratio of 1 when the axis is back-transformed. The overall effect (represented by the diamond at the bottom) is considered significant if its confidence interval, shown by the width of the diamond, does not cross the line of no effect.
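The same reading can be checked numerically. The following sketch recomputes the per-study 95% intervals by hand and pulls the pooled estimate and study weights from the fitted model; variable names such as `crosses_null` and `pooled_significant` are made up for this example.

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat)

# Pooled estimate and 95% CI, back-transformed to the risk ratio scale.
pooled <- predict(res, transf = exp)
pooled_significant <- pooled$ci.lb > 1 || pooled$ci.ub < 1

# Per-study 95% CIs on the log scale; an interval containing 0
# crosses the line of no effect.
yi <- as.numeric(dat$yi)
half_width <- qnorm(0.975) * sqrt(as.numeric(dat$vi))
crosses_null <- (yi - half_width) < 0 & (yi + half_width) > 0

# Study weights in percent; these determine the box sizes in the plot.
w <- weights(res)

print(data.frame(study = paste(dat$author, dat$year),
                 weight = round(as.numeric(w), 1),
                 crosses_null = crosses_null))
print(pooled)
print(pooled_significant)
```

Studies flagged in `crosses_null` are the ones whose whiskers overlap the vertical reference line in the plot.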

## Conclusion

In this tutorial, we’ve walked through the process of creating a forest plot in R: installing and loading the necessary packages, preparing the data, conducting a meta-analysis, and generating and customizing the plot.

Understanding how to create and interpret forest plots is crucial in evidence-based medicine and many other fields that rely on meta-analysis. However, remember that while forest plots can visually simplify complex data, careful interpretation is always necessary.