Small area estimation with geospatial data. Lessons from collaborative work with the World Bank and the UK Office for National Statistics
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: machine learning, official statistics, poverty, remote sensing data
Session: IPS 885 - Bridging Academia and Official Statistics: Examples of European Experience
Tuesday 7 October 2 p.m. - 3:40 p.m. (Europe/Amsterdam)
Abstract
Model-based survey estimation is commonly implemented with the aid of population census data. In most developed countries, censuses are usually conducted every ten years. However, censuses are much less frequent in many countries in the Global South. Advances in the availability and processing of geospatial data have renewed interest in their use as predictors in model-based estimation. Geospatial data have been used in small area poverty mapping in countries that lack frequent census data collection. Although geospatial data act only as proxies for household characteristics, the results from using them are encouraging. Small area estimates using geospatial data are well correlated with design-unbiased direct estimates and with “gold standard” model-based estimates that use up-to-date census data. In addition, geospatial data offer an approach to estimation in off-census years.
We start by presenting poverty mapping results from collaborative work with the World Bank in Mozambique. Estimates using geospatial data are compared against estimates produced with the most recent census, from 2017 (the gold standard), and with the older 2007 census. The geospatial-based estimates track the gold standard results well, whereas the estimates based on the 2007 census data do not. The application in Mozambique illustrates the importance of model building and selection when using geospatial data, and the potential pitfalls of using old census data.
In collaboration with the UK Office for National Statistics, we transfer our experience from working with geospatial data in Mozambique to the UK context. We produce research estimates of income deprivation for middle super output areas and local authority districts using data from the UK Family Resources Survey (FRS) integrated with geospatial data. These estimates are compared with estimates obtained using the Empirical Best Predictor (EBP) with the latest UK census microdata. We critically assess the results from the two case studies. Our findings shed light on the merits of using alternative data sources in producing official statistics.