Machine learning for anomaly detection in business administrative data for statistical purposes in Australia
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: statistical, anomaly-detection, machine learning
Session: IPS 799 - Real-World Machine Learning Applications in Official Statistics
Thursday 9 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
The Australian Bureau of Statistics (ABS) is assessing the use of machine learning to identify anomalous data in business administrative datasets used for statistical purposes.
Unsupervised methods can identify unexpected anomalies, which is useful for new or evolving datasets where little is known about what an anomaly looks like. The unsupervised methods considered in this work produce anomaly scores that can be combined with significance measures to better target manual review. Patterns in the detected anomalies can also inform the design of edit rules. This work also considers several supervised methods that predict a correct value, which is then used to create significance scores. Several challenges are explored to improve our understanding of how to assess the performance of methods that detect unexpected anomalies, how to use an ensemble of methods, and the situations in which these methods outperform traditional approaches.
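As a rough illustration of combining an anomaly score with a significance measure to prioritise manual review, the sketch below uses a simple median/MAD deviation score as a stand-in for an unsupervised method. It is not the ABS approach: the scoring rule, the significance measure (each unit's share of the total), the function names, and the turnover values are all assumptions made up for this example.

```python
from statistics import median


def robust_anomaly_scores(values):
    """Stand-in unsupervised score: absolute deviation from the median, scaled by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = mad if mad > 0 else 1.0  # avoid division by zero for constant data
    return [abs(v - med) / scale for v in values]


def significance(values):
    """Hypothetical significance measure: each unit's share of the absolute total."""
    total = sum(abs(v) for v in values)
    return [abs(v) / total if total else 0.0 for v in values]


def review_priority(values, top_n=3):
    """Rank records by anomaly score times significance; return the top indices."""
    scores = robust_anomaly_scores(values)
    weights = significance(values)
    combined = [s * w for s, w in zip(scores, weights)]
    order = sorted(range(len(values)), key=lambda i: combined[i], reverse=True)
    return order[:top_n]


# Made-up reported turnover values; the record at index 4 is an obvious outlier.
turnover = [102.0, 98.0, 105.0, 99.0, 1500.0, 101.0, 97.0]
print(review_priority(turnover, top_n=2))
```

Multiplying the two quantities means a record is sent for review only when it is both unusual and large enough to affect the estimate. In practice, the anomaly score could come from any unsupervised method, and the significance measure would depend on the statistical output concerned.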