What is the F1 Score in Machine Learning?

So far we have covered the confusion matrix and precision and recall. In this post we will learn about the F1 score and how to calculate it in Python.

1. Confusion Matrix – How to plot and Interpret Confusion Matrix.

2. What is Precision, Recall and the Trade-off?

F1 Score –

The F1 score combines precision and recall into a single metric: it is the harmonic mean of the two. Because the harmonic mean is pulled toward the smaller of its inputs, a classifier only gets a high F1 score if both precision and recall are high.
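To make the harmonic mean concrete, here is a minimal sketch in plain Python. The helper name `f1_from_precision_recall` and the example values are illustrative assumptions, not part of scikit-learn or of the dataset used later in this post.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Balanced case: both metrics are high, so F1 is high too.
balanced = f1_from_precision_recall(0.9, 0.9)

# Lopsided case: the low recall drags F1 well below the arithmetic mean (0.55).
lopsided = f1_from_precision_recall(1.0, 0.1)

print(round(balanced, 3), round(lopsided, 3))
```

The lopsided case is the point of the metric: perfect precision cannot compensate for poor recall.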

Calculate F1 score in Python –

Let’s read a dataset.

import pandas as pd
import numpy as np

# read data
url = "https://raw.githubusercontent.com/bprasad26/lwd/master/data/breast_cancer.csv"
df = pd.read_csv(url)
df.head()
values = {"B": 0, "M": 1}
df["diagnosis"] = df["diagnosis"].map(values)
df["diagnosis"].value_counts(normalize=True).round(2)

Here we have data about breast cancer patients, in which 37% of the patients are sick (malignant, encoded as 1) and 63% are healthy (benign, encoded as 0). Our job is to build a model that predicts as accurately as possible which patients are sick and which are healthy.

Train a Model –

from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# split the data into training and test set
X = df.drop("diagnosis", axis=1).copy()
y = df["diagnosis"].copy()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=26
)

# build a pipeline: impute missing values, then fit a random forest classifier
clf = make_pipeline(
    SimpleImputer(strategy="mean"), RandomForestClassifier(random_state=26)
)

# train the classifier on training data
clf.fit(X_train, y_train)

# make predictions on test data
pred = clf.predict(X_test)

Calculate F1 Score –

from sklearn.metrics import f1_score

# F1 score for the positive class (1 = sick)
score = f1_score(y_test, pred)
print(score)
# output - 0.9565217391304347
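For binary labels, the same number can be worked out by hand from the counts of true positives, false positives and false negatives. The sketch below uses a small made-up set of labels (`y_true` and `y_hat` are hypothetical, not the model's predictions from above) so that it runs without downloading anything.

```python
# Hypothetical labels for illustration only: 1 = sick, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Count the three outcomes that matter for the positive class.
tp = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 0)

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Passing the same two lists to `f1_score` should give the same value, since with its default settings it reports the F1 score of the class labelled 1.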
