Can sensitive data be FAIR? The Annodata Framework
Conference
64th ISI World Statistics Congress
Format: IPS Abstract
Keywords: metadata, microdata
Session: IPS 245 - The present and future of access to granular administrative data
Wednesday 19 July 2 p.m. - 3:40 p.m. (Canada/Eastern)
Abstract
Empirical data are used across all fields of social science research and policymaking. This research increasingly relies on (linked) microdata, which are often sensitive or confidential. As granularity and linkage increase, data protection and privacy issues become more complex, and access to such data is often restricted. ‘Restricted access’ covers a multitude of arrangements: there are different types of access regimes, and access itself may be determined by some local combination of legal prerequisites, purpose, nationality, skills, funding, sponsors or timing.
The FAIR principles capture best practices for research data use, and organizations publish a great deal of information describing their data access procedures. However, this information is invariably unstructured text, and interpreting it requires an understanding of the local context. Without a formal, machine-readable framework to describe access regimes consistently, the findability and accessibility of the data are substantially limited. We establish a metadata framework (‘annodata’) to formalise the description of access and use regimes for microdata.
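To illustrate what a machine-readable description of an access regime could look like, the following is a minimal sketch in Python. It is not the annodata schema: the class `AccessRegime`, all of its field names, the example values and the helper functions are hypothetical and chosen only to mirror the access determinants mentioned above (legal prerequisites, purpose, nationality, skills).

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AccessRegime:
    """Hypothetical structured record of one dataset's access conditions."""
    dataset_id: str
    access_mode: str                 # assumed vocabulary, e.g. "onsite", "secure_remote"
    legal_basis: str                 # free-text pointer to the governing rule
    eligible_purposes: list = field(default_factory=list)
    eligible_countries: list = field(default_factory=list)  # empty = no restriction
    requires_training: bool = False


def to_json(regime: AccessRegime) -> str:
    """Serialise the record so that catalogues or journals could index it."""
    return json.dumps(asdict(regime), sort_keys=True)


def is_eligible(regime: AccessRegime, purpose: str, country: str, trained: bool) -> bool:
    """Check a prospective user against the declared conditions."""
    if purpose not in regime.eligible_purposes:
        return False
    if regime.eligible_countries and country not in regime.eligible_countries:
        return False
    if regime.requires_training and not trained:
        return False
    return True


# Illustrative values only; they do not describe any real dataset.
example = AccessRegime(
    dataset_id="example-001",
    access_mode="secure_remote",
    legal_basis="national statistics act (placeholder)",
    eligible_purposes=["academic_research"],
    eligible_countries=["DE", "NL"],
    requires_training=True,
)
```

With such a record, a question like "can a trained researcher based in DE use this dataset for academic research?" becomes a call to `is_eligible(example, "academic_research", "DE", True)` rather than a reading of unstructured text.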
The annodata framework benefits empirical research by improving reproducibility: it renders the microdata used in publications findable and accessible. Data users benefit from a level playing field with clear data usage terms and efficient data access. Data owners benefit from reduced redundancy in data governance processes and from clear compliance with legal and audit norms. Finally, journals and researchers can easily identify the steps necessary for reproducibility.