65th ISI World Statistics Congress 2025

Computationally efficient spatio-temporal disease mapping for massive data

Author

Duncan Lee

Co-author

  • Craig Anderson
  • Qianruo Zhang

Conference

65th ISI World Statistics Congress 2025

Format: IPS Abstract - WSC 2025

Keywords: big data, disease mapping, spatio-temporal analysis

Session: IPS 809 - New Avenues in Disease Mapping

Wednesday 8 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)

Abstract

Spatio-temporal disease modelling is most often based on Bayesian hierarchical models, in which the spatio-temporal structure in the data is captured by sets of random effects. These random effects are typically modelled by Gaussian Markov random fields, for example conditional autoregressive models in space and autoregressive models in time. The simplest and most parsimonious approach is the separable model originally proposed by Knorr-Held in 1998, which assumes that disease risk can be modelled by spatial and temporal main effects. This model is computationally efficient to fit because it contains only N + K random effects, where N is the number of time periods and K is the number of spatial areal units. However, the cost of this efficiency is a lack of flexibility, because every areal unit is assumed to have the same temporal trend in disease risk, which is most often unreasonable. The model has therefore been extended in many different ways to incorporate non-separable structures in space and time, but these extensions typically include at least N x K random effects and are hence computationally much more demanding to fit. Furthermore, when the data sets are massive, these more flexible models may not be feasible to fit at all. This talk therefore proposes a computationally efficient model and fitting strategy that yields a non-separable spatio-temporal structure but is much faster to fit than existing models, and is therefore feasible for modelling massive spatio-temporal data sets. The model is applied to a new motivating study of asthma, coronary heart disease, and chronic obstructive pulmonary disease prevalences at the Lower Super Output Area level in England (K=32,754) for N=13 years.
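
To give a sense of scale for the N + K versus N x K comparison, the short Python sketch below counts the random effects implied by the study dimensions quoted in the abstract (K=32,754 areas, N=13 years). It is an illustration only: the function name `random_effect_counts` is hypothetical, the non-separable figure is the lower bound of N x K stated above, and nothing here represents the model or fitting strategy proposed in the talk.

```python
def random_effect_counts(n_periods: int, n_areas: int) -> tuple[int, int]:
    """Return (separable, non_separable_minimum) random-effect counts.

    separable: one temporal main effect per period plus one spatial
        main effect per area, i.e. N + K.
    non_separable_minimum: one effect per area-period pair, i.e. N x K,
        which the abstract gives as a typical lower bound.
    """
    separable = n_periods + n_areas
    non_separable_minimum = n_periods * n_areas
    return separable, non_separable_minimum


# Dimensions quoted in the abstract for the England study.
N_YEARS = 13
K_AREAS = 32_754

separable, non_separable = random_effect_counts(N_YEARS, K_AREAS)
print(f"Separable model (N + K): {separable:,} random effects")
print(f"Non-separable models (at least N x K): {non_separable:,} random effects")
print(f"Ratio: about {non_separable / separable:.1f} times as many")
```

The count alone understates the difference in practice, since the cost of fitting generally grows faster than linearly in the number of random effects, but it shows why the separable model is so much cheaper at this scale.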