Design-based predictive inference for Survey Sampling
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: cross-validation, model-assisted, probability sampling, Rao-Blackwellization
Session: IPS 1005 - Building Bridges Between (Official) Statistics and Machine Learning Methodology
Wednesday 8 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
The growing availability of auxiliary data makes machine learning models increasingly appealing in Official Statistics, including for survey sampling estimation. This raises the question of how to conduct inference. The standard approach in Official Statistics has long been design-based inference.
We propose design-based predictive inference for finite populations, in which the models are learned from a probability sample and the uncertainty of the estimates is evaluated with respect to the sampling design. This means that all variables, both auxiliary and target, are treated as constants: the only randomness comes from the random selection of the sample.
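To make the design-based viewpoint concrete, the following is a minimal sketch and not the authors' method. It assumes a synthetic finite population, simple random sampling without replacement, and a plain least-squares line as the learned model. The names `fit_line` and `predicted_total` are hypothetical helpers introduced only for illustration. The population values are generated once and then held fixed; only the sample is redrawn, so the reported bias and spread are taken over the sampling design alone.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def predicted_total(x, y, sample):
    """Learn the model on the sampled units, then total the observed y
    for sampled units and the predicted y for all other units."""
    a, b = fit_line([x[i] for i in sample], [y[i] for i in sample])
    in_sample = set(sample)
    return sum(y[i] if i in in_sample else a + b * x[i] for i in range(len(x)))


# Fixed finite population (assumption: synthetic data, generated once).
# From here on, x and y are constants, not random variables.
pop_rng = random.Random(0)
N, n, draws = 500, 50, 2000
x = [pop_rng.uniform(0, 10) for _ in range(N)]
y = [2.0 + 3.0 * xi + pop_rng.gauss(0, 2) for xi in x]
true_total = sum(y)

# The only randomness: which units the design selects
# (assumption: simple random sampling without replacement).
design_rng = random.Random(1)
estimates = [
    predicted_total(x, y, design_rng.sample(range(N), n)) for _ in range(draws)
]

design_bias = statistics.fmean(estimates) - true_total
design_sd = statistics.pstdev(estimates)
print(f"true total: {true_total:.1f}")
print(f"design bias: {design_bias:.1f}, design sd: {design_sd:.1f}")
```

In practice only one sample is available, so the design-based uncertainty has to be estimated from that sample rather than simulated as here; the simulation only illustrates what quantity is being targeted.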
Unlike the design-based model-assisted approach, this approach does not rely on asymptotic properties of the models involved. Another important difference is that it is suitable for inference on individual-level predictions, not only population-level predictions.