A Picture is Worth a Thousand Definitions: Validating Company Data with Satellite Images and Street View
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Keywords: satellite imagery
Session: CPS 72 - Enhancing Data Quality and Analysis through Spatial and Geolocation Techniques
Monday 6 October 4 p.m. - 5 p.m. (Europe/Amsterdam)
Abstract
The statistics department of the Bundesbank provides high-quality company data to analysts, researchers and the public. This data is sourced from various channels and requires thorough processing to ensure consistent and reliable quality. To achieve the desired level of quality, several checks are conducted. The accuracy of a company's sectoral and regional classification is crucial for meaningful analyses and valid conclusions. Moreover, a questionable sector code might point to other incorrect company data.
Together with Darmstadt Technical University, the Research and Data Service Centre of the Bundesbank explores the potential of substituting or augmenting manual quality checks by leveraging data sources, such as satellite images, and automated techniques. By combining data containing several representations of a company’s activities, which might include text descriptions and images of its products, production facilities and office locations, we expect to obtain valuable information for determining its economic sector more accurately.
The overarching objective of this research is to validate the contextual information contained in the company data provided by the Bundesbank with images from Google Street View and satellites by applying multimodal natural language processing (multimodal NLP). This method allows textual company data to be processed in combination with information derived from images.