Enhancing Public Confidence in Analytic Quality and Privacy Protection for Public-Use Statistics

Organiser

John Lamont Eltinge

Participants

Dr John Lamont Eltinge (Chair)

Dr Lilli Japec (Presenter/Speaker)

Methods for communicating and enhancing public confidence in official statistics - some examples

Mr Keven Bosa (Presenter/Speaker)

Use of random forests to improve small area estimation

Dr Lars Vilhuber (Presenter/Speaker)

Reproducibility and transparency in academia, and implications for statistical agencies

Mr Marcel Matthew van Kints (Presenter/Speaker)

Maintaining public trust while increasing the use of integrated data for policy development and evaluation, and research: Experiences from Australia

Ruobin Gong (Presenter/Speaker)

Balancing usability and privacy in statistical data dissemination: Some normative and pragmatic considerations

Conference

65th ISI World Statistics Congress

Category: International Association of Survey Statisticians (IASS)

Proposal Description

Major changes in data collection (including declining response rates for sample surveys and the increasing availability of non-survey data sources), in methodology for modeling and integrating multiple data sources, and in stakeholder expectations have led to a deep reconsideration of the analytic quality and privacy protection of public-use statistics. Issues of analytic quality include the bias and variance of high-profile estimators, and validation and sensitivity analysis for related models. Privacy protection concerns include identification risk and attribute risk for individuals represented in survey or non-survey data, and also require consideration of the stakeholder utility of privacy-protected data releases. This session reviews a range of practical concepts and methods to enhance public confidence in the analytic quality and privacy protection features of public-use statistics. The scope includes both standard published tables, graphs and maps that display confidentiality-protected statistical estimates, and microdata disseminated through tiered-access procedures.

Paper 1 gives an overview of the impact of the new survey landscape on the methods and strategies that national statistical institutes (NSIs) use to strengthen public confidence in official statistics. It places special emphasis on communication issues and practical examples. The remaining papers provide in-depth coverage of specific dimensions considered in the introductory overview.

Paper 2 considers methods to preserve public trust while increasing the volume of disseminated information. Combining administrative and survey datasets, and making them available to policy analysts and researchers, enables new insights into complex economic and social policy questions. However, the use of these data poses new privacy, ethics and trust questions for NSIs. This paper examines integrated data through the lens of maintaining public trust and reflects on the experiences in Australia over recent years.

Paper 3 explores practical approaches to issues of transparency and reproducibility in statistical work by academic and governmental groups. Based on experience from reviewing and approving 2,500 replication packages for publication in economics journals, the lessons learned are mapped into a more general statistical publication process, with special reference to non-public data and other secrets.

Paper 4 discusses enhancement of stakeholder confidence by balancing utility and privacy protection in the dissemination of statistical information. It examines the adoption of differential privacy, a class of formal privacy definitions, as the disclosure limitation standard for public-use statistics. It sketches an imperfect mapping between the explicitly formulated and quantified dimensions of privacy protection and the aspects of data utility that encompass both the scientific and the social values served by the dissemination of usable and reliable statistical information.

Paper 5 highlights methods to improve the quality of estimates produced for small geographical areas, which are of crucial importance for many statistical-information stakeholders. The use of random forests is proposed to modify customary Fay-Herriot (F-H) area-level models, with emphasis on cases involving small sample sizes in survey domains and departures from standard linearity assumptions. Properties of the proposed method are assessed through a simulation study, which can help both technical specialists and general stakeholders understand the nuances and trade-offs in data quality that are inherent in this work.
