51. Solving the missing at random problem in semi‐supervised learning: An inverse probability weighting method.
- Author
- Su, Jin; Zhang, Shuyi; and Zhou, Yong
- Subjects
- *SAMPLE size (Statistics), *PROBABILITY theory, *DENSITY
- Abstract
We propose an estimator for the population mean $\theta_0 = \mathbb{E}(Y)$ under the semi‐supervised learning setting with the Missing at Random (MAR) assumption. This setting assumes that the probability of observing $Y$, denoted by $\pi_M^{\ast}$, depends on the total sample size $M$ and satisfies $\pi_M^{\ast} = o(1)$. To efficiently estimate $\theta_0$, we introduce an adaptive estimator based on inverse probability weighting and cross‐fitting. Theoretical analysis reveals that our proposed estimator is consistent and efficient, with a convergence rate of $\sqrt{M \pi_M^{\ast}}$, slower than the typical $\sqrt{M}$ rate, due to the diminishing proportion of labelled data as the sample size $M$ increases in the semi‐supervised setting. We also prove the consistency of inverse probability weighting (IPW)–Nadaraya–Watson density function estimators. Extensive simulations and an application to the Los Angeles homeless data validate the effectiveness of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
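
The abstract combines inverse probability weighting with cross‐fitting to estimate a mean when only a small fraction of outcomes is labelled. The sketch below illustrates that general idea only; it is not the authors' estimator. It assumes a generic normalised (Hájek‐style) IPW mean, a simple binned estimate of the labelling probability fitted on the held‐out fold, and a hypothetical simulated data set in which the true mean is 1 and labels are more likely for larger covariate values. The labelling rate here is a fixed small number rather than one that shrinks with $M$. All function names and parameters are illustrative.

```python
import random


def simulate(m, seed=0):
    """Hypothetical data: X ~ U(0,1), Y = 2X + noise, so E(Y) = 1.

    Y is observed (R = 1) with probability 0.02 + 0.08 * X, which depends
    only on X (Missing at Random) and is small on average.
    """
    rng = random.Random(seed)
    xs, ys, rs = [], [], []
    for _ in range(m):
        x = rng.random()
        r = 1 if rng.random() < 0.02 + 0.08 * x else 0
        y = 2.0 * x + rng.gauss(0.0, 0.5) if r else None  # unlabelled Y is missing
        xs.append(x)
        ys.append(y)
        rs.append(r)
    return xs, ys, rs


def fit_binned_propensity(xs, rs, n_bins=10):
    """Estimate P(R = 1 | X) by the labelled fraction in equal-width bins on [0, 1)."""
    counts = [0] * n_bins
    labelled = [0] * n_bins
    for x, r in zip(xs, rs):
        b = min(int(x * n_bins), n_bins - 1)
        counts[b] += 1
        labelled[b] += r
    overall = sum(rs) / len(rs)
    # Fall back to the overall labelled fraction when a bin has no labels.
    probs = [labelled[b] / counts[b] if labelled[b] > 0 else overall
             for b in range(n_bins)]
    return lambda x: probs[min(int(x * n_bins), n_bins - 1)]


def cross_fitted_ipw_mean(xs, ys, rs, n_folds=2, n_bins=10):
    """Normalised IPW mean; each fold is weighted by a propensity fitted on the other folds."""
    n = len(xs)
    num = 0.0
    den = 0.0
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        pi_hat = fit_binned_propensity([xs[i] for i in train],
                                       [rs[i] for i in train], n_bins)
        for i in range(k, n, n_folds):  # indices belonging to fold k
            if rs[i]:
                w = 1.0 / pi_hat(xs[i])
                num += w * ys[i]
                den += w
    return num / den


xs, ys, rs = simulate(20000)
labelled_ys = [y for y, r in zip(ys, rs) if r]
naive_estimate = sum(labelled_ys) / len(labelled_ys)  # ignores the missingness mechanism
ipw_estimate = cross_fitted_ipw_mean(xs, ys, rs)
print("labelled fraction:", len(labelled_ys) / len(ys))
print("naive labelled mean:", naive_estimate)
print("cross-fitted IPW mean:", ipw_estimate)
```

In this simulated setup the naive mean of the labelled outcomes is pulled upward because larger covariate values are labelled more often, while the weighted estimate should land close to the true value of 1. The binned propensity model is a stand‐in chosen for brevity; any estimator of the labelling probability could take its place.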