The Beta distribution is a continuous probability distribution defined on the interval [0, 1] and governed by two positive shape parameters, α and β. It is widely used in Bayesian statistics and, more generally, to model random variables with a bounded range, such as proportions or percentages.
This guide walks step by step through plotting a Beta distribution in R, a programming language widely used in data analysis and statistics. It covers plotting the probability density function (PDF) and the cumulative distribution function (CDF), generating Beta-distributed data, and creating a histogram with a density line.
Step 1: Basic Plotting of Beta Distribution
R has built-in functions for the Beta distribution:
dbeta(x, shape1, shape2, ncp = 0, log = FALSE): Returns the density.
pbeta(q, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE): Returns the distribution function.
qbeta(p, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE): Returns the quantile function.
rbeta(n, shape1, shape2, ncp = 0): Generates random deviates.
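Here, shape1 and shape2 are the α and β parameters of the Beta distribution, respectively, and ncp is the non-centrality parameter. It defaults to 0, which gives the ordinary (central) Beta distribution used throughout this guide.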
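As a quick illustration of how the four functions relate to each other, here is a minimal sketch. The evaluation point 0.2 and the seed are arbitrary choices for this example, and the parameters are redefined so the block runs on its own.

```r
# Illustrative parameters (the same Beta(2, 5) used in the steps below)
alpha <- 2
beta <- 5

# Density at x = 0.2
d <- dbeta(0.2, alpha, beta)

# Cumulative probability P(X <= 0.2)
p <- pbeta(0.2, alpha, beta)

# qbeta() inverts pbeta(): the quantile at probability p recovers 0.2
q <- qbeta(p, alpha, beta)

# Five random draws; each lies between 0 and 1
set.seed(1)  # arbitrary seed, for reproducibility only
r <- rbeta(5, alpha, beta)

print(c(density = d, cdf = p, quantile = q))
print(r)
```

Note that the density value can exceed 1; only the CDF value is a probability.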
First, let’s look at how you can plot the density function of the Beta distribution. We will use the curve() function in R, which draws a curve corresponding to a function over an interval.
# Setting up the parameters
alpha <- 2
beta <- 5
# Plotting the beta density
curve(dbeta(x, alpha, beta), from = 0, to = 1, ylab = "Density", xlab = "x",
      main = paste0("Density of Beta(", alpha, ", ", beta, ")"))

In this code, we define the parameters of our Beta distribution (alpha and beta), then use the curve() function to draw the Beta distribution’s density function from 0 to 1. curve() evaluates the expression dbeta(x, alpha, beta) over a grid of x values in that range, and dbeta() returns the density at each of them. The ylab, xlab, and main arguments of curve() set the y-axis label, the x-axis label, and the plot title, respectively.
Step 2: Plotting the Cumulative Distribution Function (CDF)
The cumulative distribution function (CDF) of a random variable is defined as the probability that the variable takes a value less than or equal to a certain value.
The Beta distribution’s CDF can be plotted in the same way as the PDF, using the pbeta() function:
# Plotting the cumulative distribution function
curve(pbeta(x, alpha, beta), from = 0, to = 1, ylab = "CDF", xlab = "x",
      main = paste0("CDF of Beta(", alpha, ", ", beta, ")"))

Here, pbeta() is the function that provides the cumulative distribution function of the Beta distribution.
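Beyond plotting, the CDF gives interval probabilities directly. The sketch below computes the probability that a Beta(2, 5) variable falls between 0.2 and 0.5 (interval endpoints chosen arbitrarily for illustration) and cross-checks it by numerically integrating the density.

```r
# Parameters redefined so the block runs on its own
alpha <- 2
beta <- 5

# P(0.2 < X <= 0.5) as a difference of two CDF values
p_interval <- pbeta(0.5, alpha, beta) - pbeta(0.2, alpha, beta)

# Cross-check: integrate the density over the same interval
p_numeric <- integrate(dbeta, lower = 0.2, upper = 0.5,
                       shape1 = alpha, shape2 = beta)$value

print(c(from_cdf = p_interval, from_integral = p_numeric))
```

The two values should agree up to the numerical tolerance of integrate().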
Step 3: Generating Beta-Distributed Random Numbers
To generate beta-distributed random numbers, we use the rbeta() function. For example, let’s generate 10,000 random numbers from the Beta distribution with parameters α = 2 and β = 5:
# Generate beta-distributed random numbers
set.seed(123) # for reproducible results
beta_random <- rbeta(10000, alpha, beta)
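One simple sanity check on the generated sample is to compare its mean and variance with the theoretical values, α/(α+β) and αβ/((α+β)²(α+β+1)). The sketch below regenerates the sample so it runs on its own; the names theo_mean and theo_var are introduced here for illustration.

```r
# Regenerate the sample so the block is self-contained
alpha <- 2
beta <- 5
set.seed(123)
beta_random <- rbeta(10000, alpha, beta)

# Theoretical mean and variance of Beta(alpha, beta)
theo_mean <- alpha / (alpha + beta)
theo_var <- (alpha * beta) / ((alpha + beta)^2 * (alpha + beta + 1))

# Sample counterparts; with 10,000 draws these should be close
print(c(sample_mean = mean(beta_random), theoretical_mean = theo_mean))
print(c(sample_var = var(beta_random), theoretical_var = theo_var))
```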
Step 4: Creating a Histogram with a Density Line
After generating beta-distributed random numbers, we can create a histogram to observe the distribution of these numbers:
# Create a histogram
hist(beta_random, prob=TRUE, breaks=40, main="Histogram with density line", xlab="x", ylab="Density")
# Add a density line
lines(density(beta_random), col="red", lwd=2)

In the code above, the hist() function creates a histogram, and the lines() function adds a density line. The prob=TRUE argument rescales the bars to densities instead of counts, so that the total area of the histogram is 1 and the bars are on the same scale as the density line. The breaks=40 argument suggests the number of bins; R treats a single number as a suggestion and may choose a nearby count so that the break points fall on round values.
The density() function computes a kernel density estimate from the data, and lines() adds this estimated density to the plot.
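To see what the density scale and the breaks argument do in practice, the histogram can be computed without drawing it. This is a minimal sketch that regenerates the sample and inspects the object returned by hist(); the variable names h, n_bins, and total_area are introduced here for illustration.

```r
# Regenerate the sample so the block is self-contained
alpha <- 2
beta <- 5
set.seed(123)
beta_random <- rbeta(10000, alpha, beta)

# Compute the histogram without plotting it
h <- hist(beta_random, breaks = 40, plot = FALSE)

# Number of bins R actually used (breaks = 40 is only a suggestion)
n_bins <- length(h$breaks) - 1

# On the density scale, bar heights times bar widths sum to 1
total_area <- sum(h$density * diff(h$breaks))

print(n_bins)
print(total_area)
```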
Step 5: Overlaying the Theoretical Density
Finally, you might want to compare the histogram and the estimated density with the theoretical density of the Beta distribution:
# Overlay the theoretical density
curve(dbeta(x, alpha, beta), add=TRUE, col="blue", lwd=2)

The curve() function here adds the theoretical density to the existing plot because of the add=TRUE argument.
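With two lines on the same histogram, a legend helps tell them apart. The following sketch repeats Steps 3 to 5 in one self-contained block and adds a legend() call; the legend position and labels are choices made for this example.

```r
# Regenerate the sample so the block is self-contained
alpha <- 2
beta <- 5
set.seed(123)
beta_random <- rbeta(10000, alpha, beta)

# Histogram on the density scale
hist(beta_random, prob = TRUE, breaks = 40,
     main = "Histogram with estimated and theoretical density",
     xlab = "x", ylab = "Density")

# Kernel density estimate from the sample (red)
lines(density(beta_random), col = "red", lwd = 2)

# Theoretical Beta(2, 5) density (blue)
curve(dbeta(x, alpha, beta), add = TRUE, col = "blue", lwd = 2)

# Legend identifying the two lines
legend("topright",
       legend = c("Estimated density", "Theoretical density"),
       col = c("red", "blue"), lwd = 2, bty = "n")
```

With a sample this large, the two lines should lie close together, with the largest differences expected near the boundaries at 0 and 1, where kernel estimates tend to be less accurate.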
Conclusion
In this guide, you’ve learned how to generate and plot a Beta distribution in R, including how to generate beta-distributed random numbers, plot the probability density function and cumulative distribution function, create a histogram, and overlay the theoretical and estimated densities. With this knowledge, you should be well-equipped to work with Beta distributions in R.
Remember, the shapes of the Beta distribution can vary widely depending on the values of α and β parameters. Therefore, when using the Beta distribution in practice, be sure to choose these parameters carefully based on the characteristics of your specific data and analysis.
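To get a feel for how much the shape changes, the sketch below draws the density for several parameter pairs on one plot. The pairs, colors, and grid are illustrative choices, not recommendations for any particular data set.

```r
# Illustrative (alpha, beta) pairs
params <- list(c(0.5, 0.5), c(1, 1), c(2, 5), c(5, 2), c(5, 5))
cols <- c("black", "red", "blue", "darkgreen", "purple")

# Grid that avoids the endpoints, where some densities are unbounded
x <- seq(0.01, 0.99, length.out = 200)

# One column of density values per parameter pair
dens <- sapply(params, function(p) dbeta(x, p[1], p[2]))

# Draw all curves on a single plot
matplot(x, dens, type = "l", lty = 1, lwd = 2, col = cols,
        xlab = "x", ylab = "Density",
        main = "Beta densities for different shape parameters")

# Label each curve with its parameters
legend("top",
       legend = sapply(params, function(p) sprintf("Beta(%g, %g)", p[1], p[2])),
       col = cols, lty = 1, lwd = 2, bty = "n")
```

Beta(0.5, 0.5) is U-shaped, Beta(1, 1) is the uniform distribution, Beta(2, 5) and Beta(5, 2) are mirror images skewed in opposite directions, and Beta(5, 5) is symmetric around 0.5.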