Variance of the generalized regression estimator under measurement error
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Session: CPS 26 - Measurement Error, Uncertainty, and Estimation Methods in Survey Statistics
Monday 6 October 4 p.m. - 5 p.m. (Europe/Amsterdam)
Abstract
Official statistics published by national statistical institutes are predominantly based on sample surveys in combination with the generalized regression (GREG) estimator. The GREG estimator improves the precision of the sample estimates by using auxiliary information whose distribution in the population is known. This is achieved by calibrating the design weights, defined as the inverse of the inclusion probabilities of the sample design, such that the weighted sample totals of the auxiliary variables are exactly equal to the corresponding population totals. In most cases the population totals of these auxiliary variables are assumed to be known without error, since they are derived from registers. One exception is two-phase sampling, where estimates of population totals based on a large first-phase sample are used in the weighting scheme of the estimates based on the second-phase sample. The variance of the GREG estimator under two-phase sampling accounts for the additional uncertainty of using first-phase sample estimates in the weighting scheme of the second-phase estimates. In all other situations the variance of the GREG estimator assumes that the population totals used in the weighting scheme are observed without error.
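The calibration step described above can be sketched in a few lines. This is a minimal illustration of linear (GREG-type) calibration on simulated data, not the production weighting scheme of any statistical institute; the function name `greg_weights`, the simulated sample and the population totals `t_x` are all hypothetical.

```python
import numpy as np


def greg_weights(d, X, t_x):
    """Linear calibration: w_i = d_i * (1 + x_i' lam), so that X' w = t_x."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    t_x = np.asarray(t_x, dtype=float)
    t_ht = X.T @ d                    # Horvitz-Thompson estimates of the totals
    M = X.T @ (d[:, None] * X)        # sum over the sample of d_i * x_i x_i'
    lam = np.linalg.solve(M, t_x - t_ht)
    return d * (1.0 + X @ lam)


# Hypothetical sample: an intercept and one auxiliary variable (e.g. age).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(20, 60, n)])
d = np.full(n, 50.0)                          # design weights, 1 / inclusion probability
t_x = np.array([10_500.0, 420_000.0])         # assumed known population totals
y = 2.0 + 0.5 * X[:, 1] + rng.normal(0, 3, n)  # simulated target variable

w = greg_weights(d, X, t_x)
greg_total = w @ y                            # GREG estimate of the total of y
```

After calibration the weighted sample totals `X.T @ w` reproduce `t_x` up to floating-point error, which is the property the paragraph above describes. If `t_x` itself carries error, that error passes directly into `w` and hence into `greg_total`.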
Besides two-phase sampling, there are other situations where the population totals used in the weighting scheme of the GREG estimator contain uncertainty that is ignored in the standard variance approximation of the GREG estimator. Statistics Netherlands uses a structural time series model (STM) for the production of official monthly labour force figures. Quarterly and annual figures are based on the GREG estimator. To enforce numerical consistency between monthly, quarterly and annual figures, the weighting scheme of the quarterly and annual figures contains a table that is based on the published monthly labour force figures. The additional uncertainty of using estimates of population totals derived from a time series model is ignored in the variance of the GREG estimator. Since this uncertainty is not related to the probability structure of the sample design, an approach similar to the variance under two-phase sampling is not applicable.
In this paper a new variance approximation for the GREG estimator is proposed that accounts for the additional uncertainty of using population totals in the weighting scheme that are observed with measurement error. The method is illustrated with an application to the quarterly labour force figures of the Dutch Labour Force Survey. It is shown how the variance of the monthly labour force figures, estimated with an STM, can be incorporated in the proposed variance approximation of the quarterly GREG estimates. Results will be compared with the standard approach where this additional uncertainty is ignored.
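To give a rough sense of why ignoring this uncertainty can understate the variance, consider a simplified setting. This is not the approximation proposed in the paper, which the abstract does not specify. It assumes that the totals used in the weighting scheme equal the true totals plus a zero-mean error that is independent of the sample, and it treats the regression coefficients as fixed, as in the usual first-order linearization. All symbols are introduced here for illustration only.

```latex
\hat{t}_{y}^{\,\mathrm{GREG}}
  = \hat{t}_{y}^{\,\mathrm{HT}}
  + \mathbf{B}^{\top}\bigl(\tilde{\mathbf{t}}_{x} - \hat{\mathbf{t}}_{x}^{\,\mathrm{HT}}\bigr),
\qquad
\tilde{\mathbf{t}}_{x} = \mathbf{t}_{x} + \mathbf{e},
\quad \mathrm{E}(\mathbf{e}) = \mathbf{0},
\quad \mathrm{Cov}(\mathbf{e}) = \boldsymbol{\Sigma}_{e}

\mathrm{Var}\bigl(\hat{t}_{y}^{\,\mathrm{GREG}}\bigr)
  \approx
  \underbrace{\mathrm{Var}\bigl(\hat{t}_{y}^{\,\mathrm{HT}} - \mathbf{B}^{\top}\hat{\mathbf{t}}_{x}^{\,\mathrm{HT}}\bigr)}_{\text{standard GREG variance}}
  + \underbrace{\mathbf{B}^{\top}\boldsymbol{\Sigma}_{e}\,\mathbf{B}}_{\text{contribution of error in the totals}}
```

Under these assumptions the second term is non-negative, so the standard variance is a lower bound. When the totals are themselves estimated from the same survey data, the independence assumption need not hold and a cross-covariance term would also have to be considered.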