Cost-optimal sampling in augmented surveys using reinforcement learning
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Keywords: reinforcement learning, sampling design
Session: CPS 24 - Small Area Estimation and Spatio-Temporal Modelling
Monday 6 October 5:10 p.m. - 6:10 p.m. (Europe/Amsterdam)
Abstract
Survey costs are driven mainly by the number of enumerators and the distance they travel to reach the survey locations (including the associated costs of overnight stays, per diems, etc.). In this paper, we reframe survey sampling as a reinforcement learning task: enumerators act as agents that move across a country's map, using actual travelling times from routing services, and the objective is to optimise the accuracy of small-area poverty estimates augmented with satellite imagery under a fixed budget. We compare our approach with other state-of-the-art sampling approaches such as stratified two-stage cluster sampling. As a real-world evaluation setup, we use the two census periods 2013 and 2023 in Senegal. We observe improvements in accuracy across a range of key indicators, both in-sample and out-of-sample. We believe this approach can help statistical offices around the world get more value for money and thus improve both their statistical products and their accountability.
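To make the budget-constrained framing concrete, the following is a minimal toy sketch and not the method used in the paper, which the abstract does not specify. It assumes a single enumerator, a small hand-written travel-time matrix in place of routing-service data, a fixed hypothetical accuracy gain per surveyed site as the reward, and tabular Q-learning as a stand-in learning algorithm. All names and numbers (TRAVEL, GAIN, BUDGET, train, greedy_route) are illustrative assumptions.

```python
import random
from collections import defaultdict

# Hypothetical travel times (hours) between a depot (index 0) and four sites.
TRAVEL = [
    [0, 2, 4, 3, 6],
    [2, 0, 2, 4, 5],
    [4, 2, 0, 3, 2],
    [3, 4, 3, 0, 4],
    [6, 5, 2, 4, 0],
]
# Hypothetical accuracy gain from surveying each site (the depot yields none).
GAIN = [0.0, 1.0, 3.0, 1.5, 2.5]
BUDGET = 8  # total travel hours available (assumed)


def feasible_actions(loc, visited, budget):
    """Unvisited sites still reachable within the remaining budget."""
    return [j for j in range(len(TRAVEL))
            if j not in visited and TRAVEL[loc][j] <= budget]


def train(episodes=20000, alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Tabular Q-learning over states (location, visited set, budget left)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        loc, visited, budget = 0, frozenset([0]), BUDGET
        while True:
            actions = feasible_actions(loc, visited, budget)
            if not actions:
                break  # budget exhausted or nothing left to visit
            state = (loc, visited, budget)
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                a = max(actions, key=lambda j: q[(state, j)])  # exploit
            nxt_visited = visited | {a}
            nxt_budget = budget - TRAVEL[loc][a]
            nxt_state = (a, nxt_visited, nxt_budget)
            nxt_actions = feasible_actions(a, nxt_visited, nxt_budget)
            best_next = max((q[(nxt_state, j)] for j in nxt_actions),
                            default=0.0)
            target = GAIN[a] + gamma * best_next
            q[(state, a)] += alpha * (target - q[(state, a)])
            loc, visited, budget = a, nxt_visited, nxt_budget
    return q


def greedy_route(q):
    """Follow the learned Q-values greedily from the depot."""
    loc, visited, budget = 0, frozenset([0]), BUDGET
    route, total_gain = [0], 0.0
    while True:
        actions = feasible_actions(loc, visited, budget)
        if not actions:
            break
        state = (loc, visited, budget)
        a = max(actions, key=lambda j: q[(state, j)])
        budget -= TRAVEL[loc][a]
        visited = visited | {a}
        total_gain += GAIN[a]
        route.append(a)
        loc = a
    return route, total_gain, BUDGET - budget


if __name__ == "__main__":
    route, gain, cost = greedy_route(train())
    print("route:", route, "gain:", gain, "hours used:", cost)
```

In this toy setting the reward is a fixed number per site. In a setup like the one described above, it would presumably be replaced by some measure of improvement in the small-area estimates, and the travel matrix by routing-service travel times. The state here includes the remaining budget so that the agent can only choose sites it can still afford to reach.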