In our previous post, we covered the decision tree classifier in detail. In this post, we will learn how to train a decision tree regressor in sklearn.
Decision Tree Regressor –
Decision tree regression works much like decision tree classification. However, instead of reducing Gini impurity or entropy, each potential split is scored by default on how much it reduces the mean squared error (MSE) of the target values.
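To make the idea of scoring a split by MSE reduction concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation. The helper names `mse` and `split_score` and the toy values in `x` and `y` are made up for illustration, and the sketch assumes a single feature with candidate thresholds placed midway between consecutive values.

```python
def mse(values):
    """MSE obtained when every value is predicted by the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_score(x, y, threshold):
    """Weighted MSE of the two child nodes created by x <= threshold."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    n = len(y)
    return len(left) / n * mse(left) + len(right) / n * mse(right)


# Hypothetical toy data: one feature and a target with two clear groups.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 5.0, 5.2, 4.9]

parent_mse = mse(y)

# Candidate thresholds: midpoints between consecutive feature values.
thresholds = [(a + b) / 2 for a, b in zip(x, x[1:])]

# The best split is the one with the lowest weighted child MSE.
best_threshold = min(thresholds, key=lambda t: split_score(x, y, t))
reduction = parent_mse - split_score(x, y, best_threshold)

print(best_threshold, reduction)
```

On this toy data the chosen threshold falls between the two groups of target values, because that split leaves each child node with nearly constant targets.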
How to train a Decision Tree Regressor in Sklearn?
Let’s load a dataset first. We will use the California housing dataset, which is available through scikit-learn.
import pandas as pd
import numpy as np
from sklearn import datasets

housing = datasets.fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = housing.target
X.head()
Now split the data into a training and test set.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Next, we will train a Decision tree regressor in scikit-learn and measure the error.
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# create a decision tree regressor model
tree_reg = DecisionTreeRegressor(random_state=42)

# train it on the training data
tree_reg.fit(X_train, y_train)

# make predictions on the test set
y_pred = tree_reg.predict(X_test)

# measure error
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
# output
0.7056736565432687
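For readers who want to see what the error measurement computes, the following sketch works out the RMSE by hand on a few made-up numbers. The lists `y_true_toy` and `y_pred_toy` are hypothetical and unrelated to the housing data. The calculation is the same one that `mean_squared_error` followed by `np.sqrt` performs: average the squared differences, then take the square root.

```python
import math

# Hypothetical toy values, only to illustrate the arithmetic.
y_true_toy = [3.0, 2.5, 4.0]
y_pred_toy = [2.5, 2.5, 5.0]

# Squared difference between each true value and its prediction.
squared_errors = [(t - p) ** 2 for t, p in zip(y_true_toy, y_pred_toy)]

# MSE is the mean of the squared errors; RMSE is its square root.
mse_toy = sum(squared_errors) / len(squared_errors)
rmse_toy = math.sqrt(mse_toy)

print(mse_toy, rmse_toy)
```

Because of the square root, the RMSE is expressed in the same units as the target, which makes it easier to interpret than the MSE.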