
Mitigating selection bias in counterfactual prediction through self-supervised domain embedding learning with virtual samples.

Authors :
Zhu, Qianyang
Sun, Heyuan
Yang, Bo
Source :
Applied Intelligence; Apr2024, Vol. 54 Issue 8, p6529-6542, 14p
Publication Year :
2024

Abstract

Treatment effect estimation (TEE) is widely adopted in various domains such as machine learning, advertising and marketing, and medicine. During TEE, there normally exists selection bias in counterfactual prediction, which results in different distributions of covariates between the treated and control groups. One important challenge in TEE is to mitigate the impact of selection bias, which has attracted a lot of research in recent years. To address this challenge, existing neural network-based methods generally aim to minimize the distribution differences using integral probability metrics. However, minimizing the distribution differences may inadvertently remove outcome-related information during the balancing procedure, which has a negative impact on the accuracy of TEE. In this paper, we propose a novel self-supervised learning approach to conduct TEE. Rather than minimizing the distribution differences, we first introduce the concept of virtual samples, which have the same covariates as observed samples but different treatments. In this way, we aim to simulate the scenario where each sample receives both treatment and control. Next, we propose a self-supervised domain embedding learning (SDEL) approach to conduct TEE. In SDEL, we propose to learn both treated and control embeddings for observed and virtual samples, thereby learning the effects of different treatments. To the best of our knowledge, we are the first to introduce the concept of virtual samples and the first to conduct embedding learning in TEE. Building upon SDEL, we propose a feature extraction counterfactual regression network (FE-CFR), in which we propose a feature extraction module (FEM) to estimate the importance of different covariates. Compared with existing TEE methods, our proposed self-supervised learning approach could improve the accuracy of TEE. Extensive experiments have been conducted on benchmark datasets for TEE, and the results demonstrate that our proposed approach outperforms the compared baseline approaches. [ABSTRACT FROM AUTHOR]
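
As an illustration of the virtual-sample idea described in the abstract, the following minimal sketch pairs each observed sample with a copy that keeps the covariates and flips the treatment. It is not the authors' implementation: the binary treatment coding (0 = control, 1 = treated), the function name `make_virtual_samples`, and the toy data are all assumptions made for this example. Outcomes are deliberately left out, since the outcome of a virtual sample is the unobserved counterfactual.

```python
from typing import List, Tuple

# Hypothetical representation: (covariates, treatment), treatment in {0, 1}.
Sample = Tuple[List[float], int]


def make_virtual_samples(observed: List[Sample]) -> List[Sample]:
    """Return one virtual sample per observed sample.

    Each virtual sample copies the covariates of its observed sample and
    carries the opposite (assumed binary) treatment. No outcome is attached,
    because the counterfactual outcome is not observed.
    """
    return [(list(x), 1 - t) for x, t in observed]


# Toy data (invented for illustration only).
observed = [([0.2, 1.5], 1), ([0.9, -0.3], 0), ([1.1, 0.4], 1)]
virtual = make_virtual_samples(observed)

# Together, observed and virtual samples cover both treatments per covariate vector.
both_arms = observed + virtual
```

In this sketch, `both_arms` contains every covariate vector once under treatment and once under control, which mirrors the scenario the abstract describes, where each sample is treated as if it received both. How the embeddings for these samples are learned in SDEL is not specified in the abstract and is not shown here.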

Details

Language :
English
ISSN :
0924669X
Volume :
54
Issue :
8
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
177897442
Full Text :
https://doi.org/10.1007/s10489-024-05518-7