Title: Imputation through random forests for missing survey data
Speaker: David Haziza, University of Ottawa
Co-authors: Mehdi Dagdoug (University of Ottawa) and Camelia Goga (Université de Bourgogne Franche-Comté)
Date: Friday, March 3, 2023
Time: 3:30 – 4:30 p.m. (coffee starting at 3:00 p.m.)
Location: HP 4351 (MacPhail Room)

Abstract: Item nonresponse in surveys is usually handled through some form of single imputation. Random forests provide flexible tools for obtaining a set of imputed values. As nonparametric methods, random forests can capture nonlinear trends in the data and tend to be robust to the omission of interactions or of predictors accounting for curvature. We lay out a set of sufficient conditions for establishing the L2-consistency of an imputed estimator based on random forests. We investigate the performance of variance estimators that account for both sampling and nonresponse. We present results from a simulation study that assesses the proposed methods in terms of bias, efficiency, and coverage rate of normal-based confidence intervals.