The F1 score is a widely used metric for evaluating classification models, and it is especially useful when the classes are imbalanced. It is the harmonic mean of precision and recall: F1 = 2 × (precision × recall) / (precision + recall). Precision is the number of correct positive predictions divided by the total number of positive predictions the classifier returns, and recall is the number of correct positive predictions divided by the number of all actual positives (every sample that should have been identified as positive).
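As a quick sketch in R, with illustrative (made-up) precision and recall values:

```
# F1 is the harmonic mean of precision and recall
precision <- 0.9   # illustrative value
recall    <- 0.6   # illustrative value
f1 <- 2 * precision * recall / (precision + recall)
f1  # 0.72
```

Note that the harmonic mean punishes imbalance between the two: a model with precision 1.0 but recall near 0 still gets an F1 near 0.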

This article will guide you on how to calculate the F1 score in R.

## Installing Required Packages

We’ll be using a few packages in our demonstration, namely ‘caret’ and ‘MLmetrics’. The ‘caret’ package (short for *Classification And REgression Training*) provides functions for training and plotting a wide variety of classification and regression models. On the other hand, the ‘MLmetrics’ package offers machine learning evaluation metrics, including F1 Score.

You can install these packages using the `install.packages()` function:

```
install.packages("caret")
install.packages("MLmetrics")
```

After installation, load the packages into your R environment using the `library()` function:

```
library(caret)
library(MLmetrics)
```

## Data Preparation

Before calculating the F1 score, you’ll need to have a dataset to work with. For our purposes, we’ll use the iris dataset that comes with R.

The iris dataset consists of 150 observations on the sepal length, sepal width, petal length, and petal width for three species of iris flowers.

`data("iris")`

Let’s make the problem binary to simplify it, so we’re just distinguishing between setosa and non-setosa. This is common in imbalanced classification tasks, where you’re often distinguishing between the “normal” class and the “anomalous” class.

`iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))`

Wrapping the result in `factor()` matters here: `ifelse()` returns a character vector, and caret's `train()` expects a factor outcome for classification.
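A quick check confirms the imbalance we have just created; restated standalone from a fresh copy of the dataset:

```
data("iris")
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))
table(iris$Species)
# non-setosa     setosa
#        100         50
```

With 50 positives against 100 negatives, accuracy alone would be misleading, which is exactly the situation where the F1 score is informative.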

## Splitting the Data

Before training our model, we should split our data into a training set and a test set. This lets us evaluate the model's performance on unseen data. We'll use the `createDataPartition()` function from the caret package, which produces a stratified split that preserves the class proportions.

```
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p=0.8, list=FALSE)
trainSet <- iris[trainIndex,]
testSet <- iris[-trainIndex,]
```

In this case, we’re using 80% of the data for training and the remaining 20% for testing.

## Model Training

Next, we'll use the `train()` function from the caret package to train a model. For the sake of simplicity, we're training a logistic regression model, but the process is the same for other models.

`model <- train(Species~., data=trainSet, method="glm", family="binomial")`
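Under the hood, `method = "glm"` with `family = "binomial"` delegates to base R's `glm()`. A minimal standalone sketch of the same fit, on the full recoded dataset rather than the training split; note that a warning about fitted probabilities of 0 or 1 is expected here, because setosa is linearly separable from the other species:

```
data("iris")
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))

# glm models the probability of the second factor level ("setosa" here,
# since factor levels sort alphabetically: "non-setosa" < "setosa")
base_model <- glm(Species ~ ., data = iris, family = binomial)
```

caret adds resampling, tuning, and a uniform `predict()` interface on top of this call, which is why we use it in the tutorial.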

## Making Predictions

Now that we have a trained model, we can use it to make predictions on our test set:

`predictions <- predict(model, newdata=testSet)`

## Calculating the F1 Score

Finally, we can calculate the F1 score using the `F1_Score()` function from the MLmetrics package. It takes the true classes and the predicted classes; the optional `positive` argument specifies which class to treat as the positive class.

```
f1_score <- F1_Score(y_true = testSet$Species, y_pred = predictions, positive = "setosa")
print(f1_score)
```

Passing `positive = "setosa"` explicitly is good practice: in a binary task the F1 score depends on which class you designate as positive.

## Conclusion

In this tutorial, we walked through the process of calculating the F1 score in R: installing and loading the necessary packages, preparing the data, splitting it into a training set and a test set, training a model, making predictions on the test set, and finally computing the F1 score.

Remember that the F1 score is just one of many metrics that can be used to evaluate a model’s performance, and it’s not always the best one. It’s especially useful in situations with imbalanced classes, but in other scenarios, metrics like accuracy or AUC-ROC might be more appropriate. Always consider the specifics of your problem when choosing evaluation metrics.