Explainable AI is a key concept in machine learning/AI: it is about explaining why a model makes the predictions it does, and it helps us understand how good a model really is. In this blog, we cover how you can use a game-theory-based method called Shapley values to explain what's happening inside an ML model.
Let's assume you are tasked with developing an ML model to predict credit card defaulters. You clean and transform the data at your disposal and properly cross-validate your results. However, the C-suite isn't impressed: although your scores (precision/recall/F1) are great, they have no way of figuring out why the model flags someone as a potential defaulter. You can show them the feature importance scores (which operate at a global level), yet something more is desired. In other words, what is required to convince the stakeholders is an explanation of why the ML model makes the predictions it does. This would increase their trust in the model, and this process of providing explanations is called model explainability.
Explainability is especially important if one is working in regulated sectors like healthcare, trade, etc. In such domains, data science teams not only work on understanding the data and building models but also have to explain why those models made the decisions they did.
So, what can you do now? For higher interpretability, you could use some variant of a linear model, which would let you explain individual predictions as well. But that comes at the cost of performance. You still want to retain the performance of the complex model while giving up as little interpretability as possible. This is where certain concepts borrowed from the field of game theory come in. Let's understand them in the following sections.
Imagine you have differently skilled workers collaborating for some collective reward. How should that reward be divided fairly among them? This is the kind of question game theory tries to answer, and one possible solution is to calculate the marginal contribution of every worker.
Before diving into the mathematics of these marginal contributions, let's consider three workers A, B & C working together on a project to develop a web application. Our task is to find the marginal contribution of every worker in order to compensate everyone fairly. The fair compensation can be derived by calculating each worker's marginal contribution, a.k.a. payoff, whose formula is given below.
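In standard notation, the Shapley value (the fair payoff) of a worker i is:

$$\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$

where the sum runs over every subset S of workers that does not contain i.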
Let's break down the formula first. Here, the (collaborative) game is the development of the web app and N is the set of workers, i.e. {A,B,C}. The payoff function is denoted by v(S), which gives us the payoff for any subset S of workers. For now, we want to understand how much A should be paid, and to do so you decide to use the formula above.
Therefore, N = {A,B,C} and i = A. The formula above can be rearranged into the following form (an explanation is given below).
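Grouping the factorials into a binomial coefficient gives the equivalent form:

$$\phi_i(v) \;=\; \frac{1}{|N|} \sum_{S \subseteq N \setminus \{i\}} \binom{|N|-1}{|S|}^{-1}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$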
Now let's turn our attention to the difference term on the right-hand side, v(S∪{i}) − v(S) (don't forget the summation around it). What it is telling us is:
Possible subsets without A = {Φ, {B}, {C}, {B,C}}. Reminder: the number of possible subsets of a set with n elements is 2ⁿ, and here n = 2 since A is excluded, giving 4 subsets. It might look like we are not focusing on order, i.e. we are not concerned with the order in which B & C started their work. However, this should not matter: from A's perspective it is irrelevant whether B or C started working first. So you can evaluate the payoff function with and without A and track how much was contributed once A came into the picture.
The payoff function v(S) is nothing but the function learned by the model from the data. The difference v(S∪{i}) − v(S) can be denoted by Δv, and we will have four such values, one for each of the four subsets. Consider the subset {B}: we get Δv_{B,A}, which tells us how much A is contributing to the work given that only B has worked on it so far.
This step tells us to add these differences up after scaling them. The scaling term is the inverse binomial coefficient that multiplies each difference in the rearranged formula above.
Why is scaling needed, you might wonder? It averages out the effect of the rest of the team members for every subset size while extracting the marginal value of A within each subset. The coefficient counts the number of possible combinations of each subset size drawn from the set excluding worker A. For the subset {B}, Δv_{B,A} gets a scale value of 1/2.
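Concretely, with |N| − 1 = 2 workers other than A, the scale value for each subset is:

$$\binom{2}{0}^{-1} = 1 \;\text{ for } \varnothing, \qquad \binom{2}{1}^{-1} = \tfrac{1}{2} \;\text{ for } \{B\} \text{ and } \{C\}, \qquad \binom{2}{2}^{-1} = 1 \;\text{ for } \{B,C\}$$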
There is one final scaling factor, and that is |N|, the total number of workers, i.e. 3. Dividing by it averages out the effect of the group size (the number of workers). In this way, you finally get the marginal contribution, or the Shapley value, of worker A.
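To make the whole calculation concrete, here is a minimal Python sketch of the exact Shapley value computation described above. The payoff numbers assigned to each coalition are made up purely for illustration:

```python
from itertools import combinations
from math import comb

# Hypothetical payoffs v(S) for every coalition of workers (illustrative numbers only)
payoff = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

def shapley_value(player, players, v):
    """Exact Shapley value of `player` given payoff function `v` over all coalitions."""
    others = [p for p in players if p != player]
    n = len(players)
    total = 0.0
    # Sum the scaled marginal contribution of `player` over every subset S excluding them
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = frozenset(subset)
            marginal = v[s | {player}] - v[s]      # Δv for this subset
            total += marginal / comb(n - 1, size)  # scale by 1 / C(|N|-1, |S|)
    return total / n                               # final averaging over group size |N|

for worker in "ABC":
    print(worker, shapley_value(worker, "ABC", payoff))
```

With these made-up payoffs, the script gives A a Shapley value of 20, B a value of 30 and C a value of 40; the three add up to v({A,B,C}) = 90, which is exactly the efficiency property you would expect from a fair split.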
How is all of this transferable to the ML domain? It turns out the workers are nothing but the features you feed to the model, and the payoff calculated for every subset of features comes from the function the model learned from the data during the training phase. To understand this better, let us take the Adult Census Income dataset from the UCI repository. All the information about its attributes is explained on their website.
The task is to predict whether income exceeds $50K/year based on the census attributes (a binary classification scenario). If income exceeds the threshold, the record is labeled 1, else 0.
We have fitted a gradient boosting model using the LightGBM library. The results (precision, recall & F1 score) for both classes are shown below.
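A minimal sketch of this setup, assuming the preprocessed copy of the dataset bundled with the shap library (`shap.datasets.adult()`), an arbitrary 80/20 split, and default LightGBM hyperparameters; the exact preprocessing and split used in the original may differ:

```python
import shap
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load a preprocessed copy of the Adult Census Income data bundled with the shap library
X, y = shap.datasets.adult()

# Hold out a test set (split ratio and seed are arbitrary choices here)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a gradient boosting classifier with default hyperparameters
model = lgb.LGBMClassifier()
model.fit(X_train, y_train)

# Precision, recall and F1 score for both classes
print(classification_report(y_test, model.predict(X_test)))
```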
Now, using the SHAP library, we will see how to generate Shapley values and explain the model's predictions for our problem. It is important to remember that this library will give us approximate values and not exact values. Since we have used a tree-based model, we will be using the TreeSHAP implementation for our purpose.
We start by initializing an explainer object with TreeSHAP over the model and then generating Shapley values for our target set.
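A sketch of that step, continuing from the training snippet above (`model` is the fitted classifier, `X_test` the held-out features):

```python
# Initialize a TreeSHAP explainer over the trained LightGBM model
explainer = shap.TreeExplainer(model)

# Shapley values for the target set; depending on the SHAP version, a binary classifier
# returns a list with one array per class, which is what the shap_values[1] indexing
# used below assumes
shap_values = explainer.shap_values(X_test)
```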
The force_plot() method helps us visualize the impact of different features on the prediction. We will look at one record (X_test.iloc[0,:]) and take its corresponding Shapley values (shap_values[1][0,:]) to generate the plot shown below.
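The corresponding call might look like this (assuming the list-of-arrays format above, with index 1 for the positive class):

```python
shap.initjs()  # load the JavaScript needed to render force plots in a notebook

# Force plot for the first test record, using its positive-class Shapley values
shap.force_plot(explainer.expected_value[1],
                shap_values[1][0, :],
                X_test.iloc[0, :])
```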
Here, the value in bold (−2.28) is the model's prediction on the log-odds scale. It is important to keep in mind that LightGBM trees are built on the log-odds scale and are only transformed to probabilities for predict_proba(). A negative base value simply means that we are more likely to get a 0 than a 1. The features important in making the prediction are colored red and blue, with red ones pushing the model score higher and blue ones pushing it lower. The features located close to the red/blue boundary are the ones with the higher impact, which is proportional to the size of their color bar.
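As a quick sanity check on the scale, the log-odds score can be mapped back to a probability with the sigmoid function:

$$p \;=\; \frac{1}{1 + e^{2.28}} \;\approx\; 0.09$$

i.e. the model assigns roughly a 9% probability to the positive class for this record, which is consistent with predicting 0.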
Now, let's check how Shapley values are distributed across different feature values. Consider the image below. It shows the summary plot, where the features (Relationship, Age, etc.) are listed on the Y-axis with their values color-coded (red = high and blue = low), and their respective Shapley values are on the X-axis. A high Shapley value means the feature is contributing more towards our event of interest, and vice versa. If we consider the feature Capital Gain, we can infer that high values for it are generally associated with positive classifications. You might also spot a bias in the feature Sex, where the value of 1 (Male) corresponds more to positive events.
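The summary plot itself can be generated with a single call (again using the positive-class array):

```python
# Beeswarm-style summary of positive-class Shapley values across the whole test set
shap.summary_plot(shap_values[1], X_test)
```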
Like anything else, Shapley values aren't perfect. Some of their noteworthy shortcomings are discussed below:
Overall, Shapley values are immensely valuable when trying to explain ML models, as long as you keep all their limitations in mind. They're not perfect, but they work great when applied correctly.