Synthetic data may be one way the healthcare sector can fulfil the great potential of data sharing. The Novo Nordisk Foundation is supporting a new research project that is working on a method of using original data to generate synthetic data without compromising data security.

Denmark and the other Nordic countries have some of the best and most complete health data in the world. These data have considerable potential to enable the healthcare sector to detect diseases early, improve diagnosis and create individually tailored treatment. However, this potential cannot be realized easily because the compiled health data are very difficult to share and therefore difficult to use for purposes such as research across areas and national borders.

There is good reason to restrict the sharing of health data, as they are inherently personal and thus sensitive. However, the inability to share data makes it harder for the healthcare sector to find new treatment options by analysing large quantities of health data, for example in collaboration with the other Nordic countries. A new research project based at the University of Copenhagen will explore a possible solution to this problem by developing and refining a method that can use original data to generate synthetic data sets. The Novo Nordisk Foundation is supporting the project with a grant of DKK 7.5 million.

“Data must be used for the purpose for which they are collected. This is a good starting-point, because we need to know what our data is used for. Further, the healthcare sector urgently needs to develop new solutions, but this requires sharing data more flexibly. Synthetic data can help meet this need because they are based on original data cleared of any details that could be traced back to the original data and thereby the people who provided them,” says Henning Langberg, Professor, Department of Public Health, University of Copenhagen, the recipient of the grant from the Foundation.

Open-source access will ensure quality

The project, called Synthetic Health and Research Data (SHARED), is a proof-of-concept project intended to show that original data can be transformed into synthetic data in a way that makes it impossible to trace the data back to their sources. The synthetic data are created by running an original data set through a mathematical program that adds noise to the data set, so that the synthetic data cannot be attributed to specific individuals while maintaining a dispersion and context that make them statistically valid. This enables data to be shared without compromising data security.
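As a rough illustration of the noise idea described above, the Python sketch below perturbs a column of made-up measurements with Gaussian noise and compares the mean and spread before and after. This is not the SHARED method, which the text does not specify: the function name add_noise, the scale parameter and the blood-pressure-like values are all assumptions made for the example, and simple noise addition of this kind does not by itself guarantee that records cannot be traced back to individuals.

```python
import random
import statistics


def add_noise(values, scale=0.1, seed=0):
    """Return a noisy copy of values.

    Hypothetical helper: the noise is Gaussian with a standard deviation
    equal to `scale` times the sample standard deviation of the input.
    """
    rng = random.Random(seed)
    sd = statistics.stdev(values)
    return [v + rng.gauss(0.0, scale * sd) for v in values]


# Made-up "original" data: 1000 values resembling systolic blood pressure.
source_rng = random.Random(42)
original = [source_rng.gauss(120.0, 15.0) for _ in range(1000)]

synthetic = add_noise(original)

# Individual values change, while the mean and dispersion stay close.
print("mean :", round(statistics.mean(original), 1), round(statistics.mean(synthetic), 1))
print("stdev:", round(statistics.stdev(original), 1), round(statistics.stdev(synthetic), 1))
```

A larger scale hides individual values better but distorts the statistics more, which is the trade-off the project has to balance.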

“An elaborate and secure model capable of generating synthetic data can help to harness the great potential inherent in deriving new contexts from our common health data in a safe and secure way. The results of the project can influence both disease prevention and treatment, not only in Denmark’s healthcare sector but throughout the Nordic countries,” says Niels-Henrik von Holstein-Rathlou, Head of Biomed, Novo Nordisk Foundation.

Together with partners in Finland, Henning Langberg will work to develop a mathematical method that can transform the original data into synthetic data. The methods and models developed will then be run through a test battery that measures how closely the synthetic data resemble the original data.
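One check such a test battery might include is whether a relationship between two variables survives in the synthetic data. The Python sketch below compares the Pearson correlation between two made-up columns in an "original" and a "synthetic" data set. The contents of the project's actual test battery are not described here, so the function names, the age and blood-pressure columns and the noisy stand-in for synthetic data are all assumptions.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_gap(original, synthetic):
    """Absolute difference in correlation between two (x, y) column pairs."""
    return abs(pearson(*original) - pearson(*synthetic))


# Made-up original data: blood pressure rises with age, plus random variation.
rng = random.Random(1)
age = [rng.uniform(20, 80) for _ in range(500)]
bp = [90 + 0.5 * a + rng.gauss(0, 8) for a in age]

# Stand-in for synthetic data: the original columns with small added noise.
syn_age = [a + rng.gauss(0, 2) for a in age]
syn_bp = [b + rng.gauss(0, 2) for b in bp]

gap = correlation_gap((age, bp), (syn_age, syn_bp))
print(f"correlation gap: {gap:.3f}")
```

A gap close to zero suggests that the context between the two variables has been preserved; a full battery would repeat checks of this kind across many variables and statistics.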

“Our major challenge is to include as many parameters as possible in the synthetic dataset without losing the contexts between data. In addition, it is important for us to have an open-source approach to developing the method so that the academic community can ask relevant questions about the method during the project. This is essential when working in such a sensitive and regulated area as health data,” explains Henning Langberg.

About the Novo Nordisk Foundation

The Novo Nordisk Foundation is an independent Danish foundation with corporate interests. It has two objectives: 1) to provide a stable basis for the commercial and research activities of the companies in the Novo Group; and 2) to support scientific, humanitarian and social causes.

The vision of the Foundation is to contribute significantly to research and development that improves the lives of people and the sustainability of society. Since 2010, the Foundation has donated more than DKK 20 billion (€2.7 billion), primarily for research at public institutions and hospitals in Denmark and the other Nordic countries as well as research-based treatment and prevention of diabetes. Read more at www.novonordiskfoundation.com.

Further information

Christian Mostrup Scheel, Senior Press Officer, phone: +45 3067 4805, cims@novo.dk

Henning Langberg, Professor, University of Copenhagen, phone: +45 2612 7913, langberg@sund.ku.dk