Classifier interpretation

Interpreting classifiers generally happens in two steps:

  1. Calculate SHAP values using cli_explain.py

  2. Use the calculated SHAP values to interpret classifiers’ local and global decisions

cli_explain.py is parametrized by config.yml. While the configuration is exhaustively described in the configuration file itself, here we recapitulate some key takeaways. The tool requires two inputs: the dataset to explain and the trained model.
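
For orientation, a config.yml might look like the following sketch. The option names are the ones discussed in this document; the input and model path keys are hypothetical placeholders, and the shipped configuration file remains the authoritative reference for structure and defaults.

    # Hypothetical sketch of config.yml; consult the shipped file for
    # the authoritative option set and defaults.
    input_path: data/prepared_dataset.h5     # hypothetical key and path
    model_path: models/random_forest.joblib  # hypothetical key and path
    output_path: output/explainability.h5
    columns_to_ignore: []          # must match classifier training
    sample_size: 10000             # negative value = whole dataset
    indices: null                  # optional list of indices to interpret
    feature_perturbation: tree_path_dependent
    model_output: raw
    approximate: false
    calculate_shap_values: true
    calculate_interaction_values: false
    save_values: true
    save_targets: true
    save_hashes: true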

Input format

Please note that the input dataset in hdf format should correspond to a dataset produced by the data preparation pipeline. Specifically, it has to be the output of the records selection, feature scaling, or feature selection step. Furthermore, the model must have been trained on data with the same features as this dataset.
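
A quick sanity check along these lines can catch feature mismatches early (a sketch; the paths, the single-key hdf layout, and the use of scikit-learn are assumptions):

    import joblib
    import pandas as pd

    # Hypothetical paths; the real ones depend on your pipeline output.
    # Assumes the hdf file stores a single dataframe.
    data = pd.read_hdf("data/prepared_dataset.h5")
    model = joblib.load("models/random_forest.joblib")

    # scikit-learn estimators fitted on a DataFrame (sklearn >= 1.0)
    # remember the training feature names, so we can compare directly.
    assert list(model.feature_names_in_) == list(data.columns)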

Supported models

The tool, cli_explain.py, is a simple wrapper around the SHAP library. Currently, only TreeSHAP is supported, and it has been tested only with Random Forest.
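
Conceptually, the wrapper does little more than the following (a minimal sketch of the public SHAP API; model is the trained classifier and X the selected data):

    import shap

    # TreeSHAP explainer for a tree ensemble such as a Random Forest.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)      # one array per class
    expected_values = explainer.expected_value  # per-class base values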

Configuration options

Some configuration options and their interactions are explained below.

It is usually desirable not to use the whole dataset. There are two ways to restrict it:

  • for global interpretability use sample_size of, for example, 10 000 samples; a negative value is interpreted as the whole dataset

  • for local interpretability use indices to specify the list of indices to interpret

These two approaches can also be combined: first the indices are used to select the data, and then the sample is drawn from it (see the sketch below).
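
In code, the selection logic corresponds roughly to this sketch (illustrative pandas; whether indices are treated as labels or positions is an assumption here):

    # `indices` and `sample_size` come from config.yml.
    if indices is not None:
        data = data.loc[indices]   # local: restrict to the given rows first
    if sample_size >= 0:
        data = data.sample(min(sample_size, len(data)))  # global: then sample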

Three additional options can be specified; they are explained in the TreeSHAP documentation:

  • feature_perturbation

  • model_output

  • approximate

Please note that when the interventional method is used for feature_perturbation, only 10 000 samples are used as background data even when sample_size is not specified! This is done so that the results stay consistent when different indices are used over the same dataset.
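
With the interventional setting, the behaviour corresponds roughly to this sketch (the 10 000-sample cap is described above; the sampling details and fixed seed are illustrative):

    import shap

    # A fixed-size background keeps results comparable across runs
    # that use different `indices` over the same dataset.
    background = data.sample(min(10_000, len(data)), random_state=0)
    explainer = shap.TreeExplainer(
        model,
        data=background,
        feature_perturbation="interventional",
        model_output="raw",  # illustrative; see the TreeSHAP documentation
    )
    shap_values = explainer.shap_values(X, approximate=False)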

The columns_to_ignore option specifies which columns to ignore during the calculation. Its value must match the one used during classifier training.
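
Its effect corresponds to dropping these columns before the calculation (illustrative):

    # Must mirror the training setup exactly.
    X = data.drop(columns=columns_to_ignore)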

Calculation speed and parallelization

Calculating SHAP values is generally a time-consuming process. Thus, as mentioned above, it is recommended not to use the whole dataset. It is also beneficial not to calculate interaction values when they are not needed (usually only SHAP values are desired).

The calculation of SHAP values can also be parallelized by creating multiple configurations with different indices and then combining the calculated values afterwards.
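
For example, after one run per index chunk, the partial outputs can be concatenated along the row axis (a sketch; the file names are hypothetical, and the key matches the output format described below):

    import pandas as pd

    parts = ["explain_part0.h5", "explain_part1.h5", "explain_part2.h5"]
    shap_values = pd.concat(pd.read_hdf(path, key="shap_values") for path in parts)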

Output format

The output of cli_explain.py is a single .h5 (hdf) file specified by output_path. It contains up to six keys, depending on the configuration:

  • shap_values - dataframe of shap values if calculate_shap_values was set to True

  • shap_interaction_values - dataframe of shap interaction values if calculate_interaction_values was set to True

  • values - dataframe of values used if save_values was set to True

  • shap_expected_values - dataframe of expected values, one per class

  • targets - dataframe of targets used if save_targets was set to True

  • hashes - dataframe of hashes used if save_hashes was set to True

Using calculated values

The calculated values can then be loaded in a Jupyter notebook and used with the SHAP library to explain the classifier's decisions. For loading the values we provide the function androidcrypto.analysis.explainability.api.load_explainability_output. For an example of using these values, see the accompanying notebook.
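
A minimal usage sketch (the loader's exact return type is not documented here, so the dictionary-style access below is an assumption; summary_plot is standard SHAP API):

    import shap
    from androidcrypto.analysis.explainability.api import load_explainability_output

    # Assumed: the loader returns the dataframes stored under the keys
    # listed in the output format section; the exact interface may differ.
    output = load_explainability_output("output/explainability.h5")

    # Global interpretation, e.g. a beeswarm-style summary plot.
    shap.summary_plot(output["shap_values"].values, output["values"])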