Exact and Consistent Interpretation for Piecewise Linear Neural Networks: A Closed Form Solution

Authors :
Chu, Lingyang
Hu, Xia
Hu, Juhua
Wang, Lanjun
Pei, Jian
Publication Year :
2018

Abstract

Strong intelligent machines powered by deep neural networks are increasingly deployed as black boxes to make decisions in risk-sensitive domains, such as finance and medicine. To reduce potential risk and build trust with users, it is critical to interpret how such machines make their decisions. Existing works interpret a pre-trained neural network by analyzing hidden neurons, mimicking pre-trained models, or approximating local predictions. However, these methods do not guarantee the exactness and consistency of their interpretations. In this paper, we propose an elegant closed form solution named $OpenBox$ to compute exact and consistent interpretations for the family of Piecewise Linear Neural Networks (PLNN). The major idea is to first transform a PLNN into a mathematically equivalent set of linear classifiers, then interpret each linear classifier by the features that dominate its prediction. We further apply $OpenBox$ to demonstrate the effectiveness of non-negative and sparse constraints on improving the interpretability of PLNNs. Extensive experiments on both synthetic and real-world data sets clearly demonstrate the exactness and consistency of our interpretation.

Comment: KDD 2018
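The central idea of the abstract, that a PLNN is equivalent to a set of linear classifiers, can be illustrated with a minimal sketch. It assumes a small fully connected ReLU network with random weights; the function names `local_linear_map` and `forward`, the layer sizes, and the ranking of features by absolute weight are illustrative assumptions, not the paper's implementation. Within the region where the ReLU on/off pattern of an input stays fixed, the network collapses to a single affine map, which the sketch recovers in closed form.

```python
import numpy as np

def forward(weights, biases, x):
    """Plain forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

def local_linear_map(weights, biases, x):
    """Return (W_eff, b_eff) such that forward(x) == W_eff @ x + b_eff
    for every input sharing x's ReLU activation pattern."""
    W_eff = np.eye(x.shape[0])
    b_eff = np.zeros(x.shape[0])
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ h + b
        # Compose the affine map of this layer with the map so far.
        W_eff = W @ W_eff
        b_eff = W @ b_eff + b
        if i < len(weights) - 1:
            # With a fixed on/off pattern, ReLU is a diagonal 0/1 matrix.
            mask = (z > 0).astype(z.dtype)
            W_eff = mask[:, None] * W_eff
            b_eff = mask * b_eff
            h = mask * z
        else:
            h = z
    return W_eff, b_eff

# Hypothetical toy network: 3 inputs, hidden layers of 5 and 4 units, 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(5, 3)), rng.normal(size=(4, 5)), rng.normal(size=(2, 4))]
biases = [rng.normal(size=5), rng.normal(size=4), rng.normal(size=2)]
x = rng.normal(size=3)

W_eff, b_eff = local_linear_map(weights, biases, x)
logits = forward(weights, biases, x)

# The local linear classifier reproduces the network output exactly (up to rounding).
matches = np.allclose(W_eff @ x + b_eff, logits)

# Illustrative only: rank input features by absolute weight for output 0.
feature_ranking = np.argsort(-np.abs(W_eff[0]))
```

Every input with the same activation pattern yields the same `W_eff` and `b_eff`, which is the sense in which such an interpretation can be consistent across nearby inputs.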

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1802.06259
Document Type :
Working Paper