Machine Learning Model Interpretation

Original Source Here

Machine Learning Model Interpretation

Using Skater to built ML visualization

Tree(Source: By Author)

Interpreting a machine learning model is a difficult task because we need to understand how a model works in the backend, what all parameters the model uses, and how the model is generating the prediction. There are different python libraries that we can use to create machine learning model visualizations and analyze who the model is working.

Staker is an open-source python library that enables machine learning model interpretations for different types of black-box models. It helps us create different types of visualization, making it easier to understand how a model is working.

In this article, we will explore Skater and what are its different functionalities. Let’s get started…

Installing required libraries

We will start by installing a skater using pip installation. The command given below will install the skater using pip.

!pip install -U skater

Importing required libraries

The next step will be importing the required libraries. To interpret the model using Skater we first need to create a model.

%matplotlib inline 
import matplotlib.pyplot 
import matplotlib.pyplot as plt 
import numpy as np 
from sklearn.model_selection 
import train_test_split from sklearn.ensemble 
import RandomForestClassifier 
from sklearn import datasets 
from sklearn import svm
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel
from skater.core.global_interpretation.tree_surrogate 
import TreeSurrogate
from skater.util.dataops import show_in_notebook

Creating Model

We will create a Random Forest Classifier and use the IRIS dataset.

iris = datasets.load_iris()
digits = datasets.load_digits()
X = iris.data
y = iris.target
clf = RandomForestClassifier(random_state=0, n_jobs=-1)

xtrain, xtest, ytrain, ytest = train_test_split(X,y,test_size=0.2, random_state=0) clf = clf.fit(xtrain, ytrain)y_pred=clf.predict(xtest)
prob=clf.predict_proba(xtest)from skater.core.explanations import Interpretation
from skater.model import InMemoryModelfrom skater.core.global_interpretation.tree_surrogate import TreeSurrogate
from skater.util.dataops import show_in_notebookinterpreter = Interpretation(
        training_data=xtrain, training_labels=ytrain, feature_names=iris.feature_names
    )
pyint_model = InMemoryModel(
            clf.predict_proba,
            examples=xtrain,
            target_names=iris.target_names,
            unique_values=np.unique(ytrain).tolist(),
            feature_names=iris.feature_names,
        )

Creating Visualizations

We will start by creating different visualizations that will help us analyze how the model we have created is working.

Partial dependence plot

This plot shows us how a particular feature affects the model’s prediction.

interpreter.partial_dependence.plot_partial_dependence(
   ['sepal length (cm)'] , pyint_model, n_jobs=-1, progressbar=False, grid_resolution=30, with_variance=True,figsize = (10, 5)
)

PDP Plot(Source: By Author)

2. Feature importance

In this graph, we will analyze the importance of features in the model that we have created.

plots = interpreter.feature_importance.plot_feature_importance(pyint_model, ascending=True, progressbar=True,
                                n_jobs=-1)

Feature Importance(Source: By Author)

3. Surrogate tree

It is a pictorial representation of the random forest model that we have created. At each step, it is showing the Gini index value, class, etc.

surrogate_explainer = interpreter.tree_surrogate(oracle=pyint_model, seed=5)
surrogate_explainer.fit(xtrain, ytrain)
surrogate_explainer.plot_global_decisions(show_img=True)

Surrogate Tree(Source: By Author)

This is how we can use Skater to create different graphs that help us analyze how a model is performing. Go ahead try this with different datasets and let me know your comments in the response section.

This article is in collaboration with Piyush Ingale.

Before You Go

Thanks for reading! If you want to get in touch with me, feel free to reach me on hmix13@gmail.com or my LinkedIn Profile. You can view my Github profile for different data science projects and packages tutorials. Also, feel free to explore my profile and read different articles I have written related to Data Science.

AI/ML

Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot

via WordPress https://ramseyelbasheer.io/2021/04/28/machine-learning-model-interpretation/