Marrying Fairness and Explainability in Supervised Learning

Authors :: Grabowicz, Przemyslaw
Perello, Nicholas
Mishra, Aarshee
Source :: 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22)
Publication Year :: 2022
Abstract: Machine learning algorithms that aid human decision-making may inadvertently discriminate against certain protected groups. We formalize direct discrimination as a direct causal effect of the protected attributes on the decisions, and induced discrimination as a change in the causal influence of non-protected features associated with the protected attributes. Measurements of marginal direct effect (MDE) and SHapley Additive exPlanations (SHAP) reveal that state-of-the-art fair learning methods can induce discrimination via association or reverse discrimination in synthetic and real-world datasets. To inhibit discrimination in algorithmic systems, we propose to nullify the influence of the protected attribute on the output of the system while preserving the influence of the remaining features. We introduce and study post-processing methods that achieve these objectives, finding that they yield relatively high model accuracy, prevent direct discrimination, and diminish various disparity measures, e.g., demographic disparity.
Comment: 17 pages, 16 figures. Section 4.3 updated

Subjects :: Computer Science - Machine Learning
Computer Science - Computers and Society

Database :: arXiv
Journal :: 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22)
Publication Type :: Report
Accession number :: edsarx.2204.02947
Document Type :: Working Paper
Full Text :: https://doi.org/10.1145/3531146.3533236