Using geospatial gridded data to produce small area population estimates in England and Wales
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: geospatial, population, small area estimation
Session: IPS 961 - Use of Geospatial Methods for Small Area Population Estimation
Tuesday 7 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
The Office for National Statistics (ONS) is transforming how we produce population and migration statistics for England and Wales by expanding the data sources and methods used to produce outputs. The aim is to achieve more frequent, timely and inclusive statistics about the population and its characteristics. In this talk, we give an overview of how geospatial methods and data have been considered to produce small area population estimates.
The use of geospatial methods and data for small area population estimation has been developed by several organisations such as WorldPop. Broadly, there are two geospatial methods: top-down and bottom-up. Top-down methods take known, high-quality aggregated population estimates and disaggregate them to small area geographies using geospatial data. In contrast, bottom-up methods take a sample of population data at the target small area level, model this against geospatial data, and then use the fitted models to predict populations for the out-of-sample small areas. These predictions can then be calibrated to known, high-level population totals.
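The two approaches can be illustrated with a minimal Python sketch. It is not the ONS or WorldPop implementation: the function names, the cell identifiers and the figures are hypothetical, and the single-covariate ratio model is only a stand-in for whatever model would be fitted in practice. The top-down function shares a known total across cells in proportion to a geospatial weight; the bottom-up functions fit a rate on sampled cells, predict for out-of-sample cells, and rescale the predictions to a known total.

```python
def disaggregate_top_down(area_total, cell_weights):
    """Share a known area total across cells in proportion to non-negative weights."""
    weight_sum = sum(cell_weights.values())
    if weight_sum <= 0:
        raise ValueError("cell weights must sum to a positive value")
    return {cell: area_total * w / weight_sum for cell, w in cell_weights.items()}


def fit_ratio_model(sample):
    """Fit population per unit of covariate from (covariate, population) pairs.

    A deliberately simple stand-in (assumption) for a real statistical model.
    """
    covariate_sum = sum(x for x, _ in sample)
    if covariate_sum <= 0:
        raise ValueError("sampled covariate values must sum to a positive value")
    return sum(pop for _, pop in sample) / covariate_sum


def predict_bottom_up(rate, cell_covariates):
    """Predict population for out-of-sample cells from their covariate values."""
    return {cell: rate * x for cell, x in cell_covariates.items()}


def calibrate(predictions, known_total):
    """Rescale predictions so that they sum to a known higher-level total."""
    predicted_sum = sum(predictions.values())
    if predicted_sum <= 0:
        raise ValueError("predictions must sum to a positive value")
    return {cell: p * known_total / predicted_sum for cell, p in predictions.items()}


# Top-down: split a hypothetical area total of 1200 using address counts as weights.
top_down = disaggregate_top_down(1200, {"a": 10, "b": 30, "c": 0})

# Bottom-up: hypothetical sample of (address count, population) pairs.
rate = fit_ratio_model([(10, 25), (20, 45), (30, 80)])
raw = predict_bottom_up(rate, {"d": 40, "e": 60})
calibrated = calibrate(raw, known_total=300)

print(top_down)
print(calibrated)
```

Calibration here is a single proportional scaling factor, which preserves the relative sizes of the predictions while forcing their sum to match the known total.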
For both approaches, we make use of various types of geospatial information, including publicly available satellite imagery that measures night-time lights radiance and classifies land cover and land use, as well as data that captures transport network availability and accessibility. We also make use of the rich administrative data available in the ONS, including the admin-based population, housing and address data. On their own, these data sources do not have sufficient coverage for producing statistical outputs; however, they show significant promise for use in geospatial modelling approaches.
The richness of these data sources allows us to summarise data at varying geographical scales, from established statistical boundaries (Lower layer Super Output Areas, LSOAs) down to 100 m grid squares. Part of this work was to investigate the potential benefit of producing population statistics at 100 m grid squares rather than directly at the target small area boundary. A challenge with modelling at boundary level is that boundaries like LSOAs are defined by population size rather than physical size. As a result, the physical size of these areas varies, and they are much larger than the typical grid square used for population estimation. Summarising geospatial data at these levels of geography may fail to capture the detail in the data and risks the granular nuances being averaged out. In contrast, gridding geospatial information is expected to better capture the properties of the data, allowing stronger relationships between geospatial information and population to be established.
Furthermore, creating gridded estimates offers flexibility, because grid squares can serve as building blocks that aggregate up to a range of desired levels of geography. This talk summarises our work to date on these geospatial approaches and, importantly, considers whether they produce estimates of sufficient quality compared with the current methods used for small area population estimation in England and Wales.
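The building-block idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the method used in the work: the function name, cell identifiers, area codes and estimates are all hypothetical, and each grid square is assumed to be assigned to exactly one target area (for example by its centroid), which sidesteps the question of squares that straddle a boundary.

```python
from collections import defaultdict


def aggregate_cells(cell_estimates, cell_to_area):
    """Sum gridded estimates up to the areas given by a cell-to-area lookup.

    Raises KeyError if a cell has no entry in the lookup.
    """
    totals = defaultdict(float)
    for cell, estimate in cell_estimates.items():
        totals[cell_to_area[cell]] += estimate
    return dict(totals)


# Hypothetical gridded estimates for four 100 m squares.
cell_estimates = {"g1": 12.5, "g2": 30.0, "g3": 7.5, "g4": 50.0}

# The same cells can be aggregated to different geographies by swapping the lookup.
to_small_area = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}
to_large_area = {"g1": "W1", "g2": "W1", "g3": "W1", "g4": "W1"}

print(aggregate_cells(cell_estimates, to_small_area))
print(aggregate_cells(cell_estimates, to_large_area))
```

Because only the lookup changes between the two calls, the same gridded estimates can feed any geography for which a cell assignment exists.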