The collection, linking and use of data in biomedical research and health care: ethical issues


Published 03/02/2015

Since the last decade of the 20th century, developments in biotechnologies, health care systems and computing have led to a dramatic growth in the volume and variety of data about people’s health and biology.

More data are being generated than ever before, including:

  • Electronic medical records
  • Genome sequences
  • A wide variety of biomarkers
  • Body and brain scans
  • Data from clinical trials or observational studies
  • Lifestyle information collected directly by individuals

Advances in data science mean that there are now also more ways to collect, manage, link and analyse health and biological data in order to generate information for research and other purposes.

A new attitude towards data

Developments in data science that allow researchers to manipulate and ‘mine’ huge data sets have led to the emergence of a new attitude towards data, where it is seen as a valuable resource that may be re-used, linked, combined and analysed indefinitely, for a variety of purposes.

There are significant opportunities for using data to improve medical practice, produce more efficient services, generate new knowledge and drive innovation.

Data initiatives

The focus of the report is ‘data initiatives’, which we define as projects involving one or both of the following practices:

  • Where data collected or produced in one context, or for one purpose, are re-used in another context or for another purpose. This may result in the data taking on a different meaning and significance. For example, biomarker data may be used to inform someone’s treatment, but may also be used for the development of therapies, the allocation of resources, or the planning of services, moving between health care, research, financial and administrative contexts.
  • Where data from one source are linked with data from a different source or many different sources. For example, where data from a disease registry are linked to data about the location of discharges of environmental pollutants to examine or monitor any link between them.

Data initiatives exist at different scales. They may be large – at the scale of a national biobank, health system or international research collaboration – or small – on the scale of an individual research project.