65th ISI World Statistics Congress 2025

A Picture is Worth a Thousand Definitions: Validating Company Data with Satellite Images and Street View

Format: CPS Abstract - WSC 2025

Keywords: satellite imagery

Session: CPS 72 - Enhancing Data Quality and Analysis through Spatial and Geolocation Techniques

Monday 6 October 4 p.m. - 5 p.m. (Europe/Amsterdam)

Abstract

The statistics department of the Bundesbank provides high-quality company data to analysts, researchers and the public. This data is sourced from various channels and requires thorough processing, including several quality checks, to ensure consistent and reliable quality. The accuracy of a company’s sectoral and regional classification is crucial for meaningful analyses and valid conclusions. Moreover, a questionable sector code might point to other incorrect company data.

Together with Darmstadt Technical University, the Research and Data Service Centre of the Bundesbank explores the potential of replacing or augmenting manual quality checks with automated techniques and data sources such as satellite images. By combining data containing several representations of a company’s activities, which might include text descriptions and images of its products, production facilities and office locations, we expect to gain valuable information for determining its economic sector more accurately.

The overarching objective of this research is to validate the contextual information contained in the company data provided by the Bundesbank against images from Google Street View and satellites by applying multimodal natural language processing (multimodal NLP). This method allows textual information from the company data to be processed in combination with information derived from images.