# How to Calculate Skewness & Kurtosis in R

Skewness and kurtosis are crucial statistical concepts that help us understand the shape and nature of the distribution of our data. Skewness indicates the asymmetry of data around its mean, while kurtosis measures the “tailedness” of the distribution. In this article, we’ll demonstrate how to calculate these measures using R.

## Installing Required Package

We’ll use the ‘moments’ package in R for our computations. If you haven’t installed this package, run the first command below once in your R console. The second command loads the package and must be run in every new R session:

install.packages("moments")

library(moments)

Now we’re set to start using the ‘moments’ package.

## Creating Sample Data

For this demonstration, we’ll create a simple dataset called ‘data_sample’:

set.seed(123) # Setting seed for reproducibility
data_sample <- rnorm(1000)

In this example, ‘rnorm()’ generates random numbers from a normal distribution. With its default arguments it draws from the standard normal distribution (mean 0, standard deviation 1). We’re creating 1000 of these numbers and storing them in ‘data_sample’.
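As a small illustration of those defaults, the sketch below spells out the ‘mean’ and ‘sd’ arguments of ‘rnorm()’ explicitly and confirms that the result is the same sample. The variable names ‘default_sample’, ‘explicit_sample’ and ‘shifted_sample’ are hypothetical and used only here.

```r
# Same seed + same parameters should reproduce the same draws
set.seed(123)
default_sample <- rnorm(1000)                     # relies on mean = 0, sd = 1 defaults

set.seed(123)
explicit_sample <- rnorm(1000, mean = 0, sd = 1)  # defaults written out

print(identical(default_sample, explicit_sample))

# A non-standard normal sample, for comparison (illustrative values)
set.seed(123)
shifted_sample <- rnorm(1000, mean = 10, sd = 2)
print(c(mean = mean(shifted_sample), sd = sd(shifted_sample)))
```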

## Calculating Skewness

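To compute the skewness of ‘data_sample’, we can use the ‘skewness()’ function in the ‘moments’ package. A value near 0 indicates a roughly symmetric distribution, a positive value indicates a longer right tail, and a negative value indicates a longer left tail: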

data_sample_skewness <- skewness(data_sample)
print(data_sample_skewness)
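To see what this number represents, the following sketch computes a moment-based skewness by hand in base R: the third central moment divided by the second central moment raised to the power 3/2. The helper ‘manual_skewness’ and the exponential sample ‘skewed_sample’ are hypothetical names introduced for illustration, and the divide-by-n moments used here are an assumption about how ‘skewness()’ estimates the value.

```r
# Hypothetical helper: moment-based skewness using divide-by-n central moments
manual_skewness <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  centered <- x - mean(x)
  m2 <- mean(centered^2)     # second central moment
  m3 <- mean(centered^3)     # third central moment
  m3 / m2^(3 / 2)
}

set.seed(123)
normal_sample <- rnorm(1000)  # symmetric: skewness should be close to 0
skewed_sample <- rexp(1000)   # long right tail: skewness should be clearly positive

print(manual_skewness(normal_sample))
print(manual_skewness(skewed_sample))
```

Mirroring a sample (multiplying it by -1) flips the sign of its skewness, which is a quick way to check that the sign convention is understood correctly.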

## Calculating Kurtosis

Similar to skewness, we can compute kurtosis using the ‘kurtosis()’ function from the ‘moments’ package:

data_sample_kurtosis <- kurtosis(data_sample)
print(data_sample_kurtosis)

It’s important to remember that the ‘kurtosis()’ function in the ‘moments’ package returns Pearson’s kurtosis, not excess kurtosis. Under this definition, a perfect normal distribution has a kurtosis of 3. To obtain excess kurtosis (Fisher’s definition, under which a normal distribution scores 0), subtract 3 from the result.
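The relationship between the two conventions can be sketched in base R. The helper ‘manual_kurtosis’ below is a hypothetical name; it computes Pearson’s kurtosis as the fourth central moment divided by the squared second central moment, using divide-by-n moments (an assumption for this illustration), and then subtracts 3 to get the excess kurtosis.

```r
# Hypothetical helper: Pearson kurtosis (fourth central moment over squared variance)
manual_kurtosis <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  centered <- x - mean(x)
  m2 <- mean(centered^2)     # second central moment
  m4 <- mean(centered^4)     # fourth central moment
  m4 / m2^2
}

set.seed(123)
normal_sample <- rnorm(1000)

pearson_kurtosis <- manual_kurtosis(normal_sample)
excess_kurtosis <- pearson_kurtosis - 3  # Fisher's (excess) convention

print(pearson_kurtosis)  # expected to be near 3 for normal data
print(excess_kurtosis)   # expected to be near 0 for normal data
```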

## Visualizing Skewness and Kurtosis

It’s often helpful to visualize your data distribution. You can use histograms and density plots for this purpose. Below is how to create a histogram and density plot for ‘data_sample’:

# Creating a Histogram
hist(data_sample, main="Histogram of Data Sample", xlab="Data", border="blue", col="green", xlim=c(-4, 4))

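# Creating a Density Plot
data_sample_density <- density(data_sample) # compute the density estimate once
plot(data_sample_density, main="Density Plot of Data Sample", xlab="Data", ylab="Density", col="blue")
polygon(data_sample_density, col="pink", border="blue") # fill the area under the curve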

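By examining these plots, you can gain a visual understanding of the symmetry and tail weight of your data distribution, which complements your numerical skewness and kurtosis measures. A longer tail on one side suggests non-zero skewness, while tails that are heavier or lighter than those of a normal curve suggest a kurtosis above or below 3.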
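One way to make that comparison easier is to draw a normal reference curve on top of the histogram. The sketch below is one possible approach, not the only one: it plots the histogram on the density scale and overlays a normal density that uses the sample’s own mean and standard deviation. The names ‘normal_sample’ and ‘hist_info’ and the choice of 30 breaks are illustrative assumptions.

```r
set.seed(123)
normal_sample <- rnorm(1000)

# Plot on the density scale so the histogram and the curve share a y-axis
hist_info <- hist(normal_sample, freq = FALSE, breaks = 30,
                  main = "Histogram with Normal Reference Curve", xlab = "Data")

# Reference curve: normal density with the sample's own mean and sd
curve(dnorm(x, mean = mean(normal_sample), sd = sd(normal_sample)),
      add = TRUE, col = "blue", lwd = 2)
```

Bars that rise above the curve far from the centre point to heavier tails, and bars that lean to one side point to skewness.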

## Conclusion

R provides a wealth of tools for data analysis, including the ability to calculate skewness and kurtosis easily using the ‘moments’ package. Understanding these metrics can provide valuable insights into the nature of your data distribution, assist in outlier detection, and inform your data-driven decisions. This guide should help you calculate and interpret skewness and kurtosis in R with your data.
