Early estimates for short-term business statistics with ML models for statistical units prediction
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: business-survey, machine learning, nowcasting, timeliness
Session: IPS 799 - Real-World Machine Learning Applications in Official Statistics
Thursday 9 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
Timeliness is one of the main concerns regarding survey-based official statistics, despite their quality and the well-established methodological support that design-based inference provides for assessing their accuracy. Data collection, data editing, and the estimation of population aggregates usually take too long for increasingly demanding user needs. Resorting to new data sources (mainly digital) is frequently proposed as a potential solution, even at the cost of accuracy and, thus, reliability. We suggest using the predictive power of statistical learning models on survey data to reconstruct a synthetic version of the microdata while these production phases are still being executed, so that the usual aggregation and estimation procedures can be applied to it, producing early and timely estimates of the population aggregates of interest. We call this early imputation. We present ongoing proofs of concept using the Spanish Industrial Turnover Index survey, which improve timeliness by several weeks. We also discuss related issues such as uncertainty estimation and the potential reduction of response burden.
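The early-imputation idea can be illustrated with a minimal sketch. It is not the models used in the proofs of concept, which the abstract does not specify. As a stand-in for a statistical learning model, it fits a one-parameter ratio model (current turnover over previous-period turnover) on the units that have already responded, predicts the missing units to complete a synthetic microdata set, and then applies an ordinary weighted total. The `Unit` fields, the function names, the design weights, and the toy figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    unit_id: str
    weight: float              # design weight (hypothetical values)
    prev_turnover: float       # turnover reported in the previous period
    turnover: Optional[float]  # current turnover; None if not yet collected


def fit_growth_ratio(units: List[Unit]) -> float:
    """Fit a ratio model on the units that have already responded.

    Stand-in for a richer statistical learning model: a single growth
    ratio, current turnover divided by previous-period turnover.
    """
    responded = [u for u in units if u.turnover is not None]
    numerator = sum(u.turnover for u in responded)
    denominator = sum(u.prev_turnover for u in responded)
    return numerator / denominator


def early_impute(units: List[Unit]) -> List[Unit]:
    """Return synthetic microdata: observed values kept, missing ones predicted."""
    ratio = fit_growth_ratio(units)
    return [
        u if u.turnover is not None
        else Unit(u.unit_id, u.weight, u.prev_turnover, ratio * u.prev_turnover)
        for u in units
    ]


def weighted_total(units: List[Unit]) -> float:
    """Design-weighted total, applied to the completed microdata."""
    return sum(u.weight * u.turnover for u in units)


# Toy sample: units A and B have responded, C and D are still outstanding.
sample = [
    Unit("A", 10.0, 100.0, 110.0),
    Unit("B", 5.0, 200.0, 210.0),
    Unit("C", 8.0, 50.0, None),
    Unit("D", 2.0, 400.0, None),
]

synthetic = early_impute(sample)
early_estimate = weighted_total(synthetic)
print(f"Early estimate of total turnover: {early_estimate:.1f}")
```

In a realistic setting the ratio model would be replaced by a model trained on past survey rounds and auxiliary variables, and the early estimate would be reported together with an uncertainty measure, which this sketch does not attempt.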