The Integrated Economic Accounts in R: The Role of Open-Source Technology in Modernising and Enhancing Official Statistics Workflows
Conference
64th ISI World Statistics Congress
Format: IPS Abstract
Keywords: automation, official-statistics, open-source
Session: IPS 241 - Rethinking data governance in official statistics: the central banks’ experience
Monday 17 July 2 p.m. - 3:40 p.m. (Canada/Eastern)
Abstract
This presentation describes the predominantly open-source toolchains and workflows developed by the Economic Statistics Department (ESD) of the South African Reserve Bank (SARB) to source, compile, and disseminate integrated economic accounts (IEA) statistics. The impetus for developing a code-based system came in 2017, when it was realised that the spreadsheet-based approach to compiling sectoral account statistics was extremely labour-intensive and inherently fragile, as most calculations were link-based. Additionally, the size and dimensions of the sheets had become unmanageable, and there was no native capability for applying version control.
The consequent move to an internally developed, code-based, open-source IEA solution has resulted in statistics production workflows that are replicable, transparent, and flexible. We note that such solutions provide a degree of customisation that makes it possible to align with (and react to) stakeholder requirements more effectively. Moreover, we believe that the overall cost of producing sectoral account statistics, even when considering the development cycle and maintenance associated with open-source solutions, would be lower than that of a software ecosystem consisting entirely of proprietary products. Another benefit has been the alignment of interest and responsibility among the ESD staff who develop and maintain the software solution, an element that is frequently missing when software solutions are outsourced or acquired off the shelf.
We also describe the challenges involved in adopting open-source tools in a central bank environment, where security and stability are of utmost importance. We have faced significant administrative barriers in our journey, partly due to the blurring of the line between what is considered the “traditional” IT domain and the business domain, but also due to the enterprise architectural design of the institution. Skills shortages are a further challenge. The traditional avenues of recruitment have not provided the expected outcomes, and the focus has therefore turned inward, to target a small group of individuals with an interest in software development and/or existing programming skills. Lastly, we discuss change management, an element that is often overlooked when discussing the disruptive nature of technology, and how an aversion or refusal to adopt new technology should be approached.
Looking forward, we aim to migrate some of our on-premises infrastructure to a multi-cloud environment. We believe this initiative will deliver additional cost savings, quicker development cycles (through automated build and deploy functions, so that data scientists can focus only on writing code), and a more stable, reproducible environment. It will also enable quick access to cloud-based big data tools and other services that are not easily available through on-premises infrastructure.