Revolutionising personal data

Our research into experimental data is helping decision-making for national and international policymakers and improving secure data access worldwide.


Integration of administrative and population data sources such as medical records and household composition is pivotal for policymaking.

Authorised use of linked whole population data, though potentially extremely powerful, has been limited in many settings by informational gaps, data protection regulations and privacy laws.  

Our work included developing a set of measures enabling enhanced, legally compliant and secure access to sensitive personal data for professional training purposes and evidence-based policymaking. This enables training the next generation of professional information and data scientists.

Our underpinning research into experimental data has informed evidence-based policy by UK government agencies, police forces and the National Health Service.

Administrative Data Research Scotland (ADR Scotland)

ADR Scotland is a partnership combining data specialists in the Scottish Government with academic researchers' expertise. Together, we are transforming how public sector data in Scotland is curated, accessed, and explored to deliver its full potential for policymakers and the public.  

Professor Chris Dibben is Co-Director for ADR and Director for SCADR.

Below are some highlights of our work on sensitive personal data and enhancing data infrastructure.

Through ADR Scotland, we have developed the legal concept of functional anonymisation. A dataset containing personal information is treated as legally anonymous if it is 'not reasonably likely' that a person's identity could be deduced from the data. This redefinition means that very detailed information can be released to analysts legally.

The concept of functional anonymisation has been used as the core legal justification for releasing data for analysis in Wales, England and Scotland. Organisations that use shared data to generate impact, such as the UK Anonymisation Network and the EU-funded 'Data Pitch', rely on the concept of functional anonymisation.

SafePod network

SafePods are a novel concept of ‘embassy’ safe data infrastructure that we have developed. SafePods are located away from main data centres but provide the same controlled environment, allowing the data to be treated as functionally anonymous.

As SafePods are rolled out across the country, travel times are significantly reduced and data accessibility is increased, making more data analysis possible.

'Embassy safe research spaces' have been developed for national organisations such as the UK Office for National Statistics (ONS) and the Welsh Secure Anonymised Information Databank. This has led the Economic and Social Research Council (ESRC) UK Data Service to significantly extend the locations where sensitive data can be accessed.

You can find out more about the SafePod network on the Scottish Centre for Administrative Data Research (SCADR) website.


During the COVID-19 pandemic, it was paramount for government and health agencies to have detailed information quickly, such as who was living together. 

We developed tools to enable administrative data research, such as linking medical data to property locations. This gave a better understanding of settings such as care homes and households, which supported vital COVID-19 research and informed the Government.

With ADR Scotland, we developed new methods for assembling data so it can be used to explore key characteristics of populations. For example, we developed the CHI-UPRN Residential Linkage (CURL) tool. This enabled people to be grouped into 'households' and the nature of each household to be understood, such as whether it is a care home, information that is not recorded in UK administrative data. The entire Scottish population was probabilistically linked to their exact residence.
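As a rough illustration of the grouping step only, the sketch below assumes each person has already been linked to a property identifier and then groups people who share one. The record layout, the identifiers and the function name are hypothetical. It does not reproduce the CURL tool or its probabilistic linkage.

```python
from collections import defaultdict

# Hypothetical linked records: (person identifier, property identifier).
# All identifiers and values are invented for illustration only.
linked_records = [
    ("person-01", "uprn-A"),
    ("person-02", "uprn-A"),
    ("person-03", "uprn-B"),
    ("person-04", "uprn-C"),
    ("person-05", "uprn-C"),
    ("person-06", "uprn-C"),
]


def group_into_households(records):
    """Group person identifiers by the property they are linked to."""
    households = defaultdict(list)
    for person_id, property_id in records:
        households[property_id].append(person_id)
    return dict(households)


households = group_into_households(linked_records)

# Household size is one simple characteristic that falls out of the grouping.
household_sizes = {prop: len(people) for prop, people in households.items()}
print(household_sizes)
```

Once people are grouped by residence, characteristics of each group, such as its size, can be derived without anyone having recorded 'household' as a field in the source data.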

The ADR Scotland measure of households, enabled by the CURL tool, meant that Public Health Scotland could provide Scottish Ministers with information on transmission from hospitals to care homes. The CURL tool was also used to inform the key Scottish Government report 'Discharges from NHS Scotland Hospitals to Care Homes'. 

Professor Chris Dibben was part of the Scottish Government’s initial COVID-19 Taskforce. 


We have developed state-of-the-art 'administrative data enhancements' that have led to a wider range of linked personal administrative data being made available. Secure and private by design, these innovations have been critical to the development of increased national-level provision of data in Scotland.  

These 'administrative data enhancements' are also being used in the wider UK and internationally, by organisations such as the UK Office for National Statistics (ONS) and Statistics Canada.

Some examples of our underpinning research, led by Professor Dibben, include:

eDatashield

It is crucial to combine datasets to enable comparative studies, though legal restrictions often limit data sharing across borders.

To address this, software and a statistical process using the eDatashield protocol were developed, allowing remote, non-disclosive analyses of sensitive data. The protocol exchanges summary statistics between agencies, enabling exact statistical results.
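To show why exchanging summary statistics can still give exact results, here is a minimal sketch, not the eDatashield protocol itself. It assumes two agencies each hold part of a dataset and share only aggregate sums, from which a least-squares line is fitted. The function names and data are invented for illustration, and the sketch performs no disclosure checking.

```python
def site_summary(xs, ys):
    """Return the aggregate sums a site would share instead of its records."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_xx": sum(x * x for x in xs),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }


def combine(summaries):
    """Add the per-site aggregates together, key by key."""
    keys = summaries[0].keys()
    return {k: sum(s[k] for s in summaries) for k in keys}


def slope_and_intercept(s):
    """Least-squares line y = a + b*x computed from aggregate sums only."""
    b = (s["n"] * s["sum_xy"] - s["sum_x"] * s["sum_y"]) / (
        s["n"] * s["sum_xx"] - s["sum_x"] ** 2
    )
    a = (s["sum_y"] - b * s["sum_x"]) / s["n"]
    return b, a


# Invented data held by two separate agencies.
site_a_x, site_a_y = [1, 2, 3, 4], [2, 4, 5, 8]
site_b_x, site_b_y = [5, 6, 7], [9, 13, 14]

# Only the summaries cross the boundary between agencies.
combined = combine([site_summary(site_a_x, site_a_y),
                    site_summary(site_b_x, site_b_y)])
federated = slope_and_intercept(combined)

# For comparison: the same fit on the pooled individual-level data.
pooled = slope_and_intercept(
    site_summary(site_a_x + site_b_x, site_a_y + site_b_y)
)
print(federated, pooled)
```

The two fits agree because the sums needed for the regression are additive across sites, so no individual-level record has to leave either agency.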

This approach has been adopted across the UK, facilitating comparative research that was previously restricted. It permitted comprehensive research across all three UK Census Longitudinal Studies. For example, the Scottish Government's Glasgow Centre for Population Health used it in 2016 to compare ill-health in Glasgow with similar places in England.

You can find out more about eDatashield on the Scottish Centre for Administrative Data Research (SCADR) website.


Synthpop

Synthpop allows the widespread release of otherwise sensitive data. It mimics the real data and preserves the relationships between variables but is safe to release because the data is ‘artificial’. Creating the Synthpop software package involved resolving a number of significant methodological issues, including developing novel methods for estimating the utility and privacy of the synthetic data.

The 'Synthpop' package considerably simplified the production of safe, high-utility synthetic versions of otherwise sensitive private data. It was made available to practitioners in 2014 and has been downloaded over 23,000 times across 129 different countries.
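The idea of artificial data that keeps the relationships between variables can be illustrated with a toy sketch. It is written in Python and is not the Synthpop package, which is written for R. It assumes one simple approach: generate the first variable by resampling its observed values, then generate the second from the values seen alongside it. The function name and the data are invented.

```python
import random
from collections import defaultdict


def synthesise(records, n, seed=0):
    """Draw n synthetic (first, second) pairs from a list of real pairs.

    The first variable is sampled from its observed values; the second is
    sampled from the values observed alongside that first value.
    """
    rng = random.Random(seed)
    first_values = [first for first, _ in records]
    second_given_first = defaultdict(list)
    for first, second in records:
        second_given_first[first].append(second)

    synthetic = []
    for _ in range(n):
        first = rng.choice(first_values)
        second = rng.choice(second_given_first[first])
        synthetic.append((first, second))
    return synthetic


# Invented "real" data: (age band, employment status).
real = [
    ("16-24", "student"), ("16-24", "employed"),
    ("25-64", "employed"), ("25-64", "employed"), ("25-64", "unemployed"),
    ("65+", "retired"), ("65+", "retired"),
]

synthetic = synthesise(real, n=10)
print(synthetic)
```

Because this toy version resamples observed values directly, it gives no assurance about disclosure risk. It only shows how a relationship between two variables can carry over into generated records.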

Users of Synthpop include:

  • Institute for Employment Research, Germany
  • Labor Dynamics Institute, Cornell University
  • Open Source Policy Center, American Enterprise Institute, USA

It is also used in the private sector, as the simplicity of Synthpop and the quality of the data generated make it a great resource for industry.

You can find out more about Synthpop on the website.