65th ISI World Statistics Congress 2025

Integrating Data Science into Official Statistics

Conference: 65th ISI World Statistics Congress 2025

Format: IPS Abstract - WSC 2025

Keywords: alternative data sources, data science, official statistics

Session: IPS 734 - Data Science and Official Statistics: Toward a New Culture

Monday 6 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)

Abstract

The advantages, disadvantages, limitations, and requirements of integrating alternative data sources with probability sample surveys are under debate in national and international statistical systems worldwide. The temptation to replace rigorous but costly data collection approaches with “smarter” ones is growing. However, the reliability of statistics produced by processing alternative data sources must be evaluated. In this work, we analyze the relationships among data science, new data sources, machine learning, citizen science, smart statistics, and official statistics, as well as the role of probability sample surveys.
We show that processing satellite data with parametric and machine learning classifiers does not always yield accurate statistics in complex landscapes, and that machine learning classifiers do not systematically outperform parametric ones. Moreover, data collected through probability samples play a crucial role: when official statistics are to be produced, they should not be replaced by data collected by citizens without clear and strict guidelines.