There is a growing number of human data catalogues. As the European Health Data Space (EHDS) is developing its vision for a joint European descriptive metadata catalogue to help enable secondary use of the data in these catalogues, the experience of developing the ConcePTION data catalogue can be of great relevance. A recent opinion review shares several recommendations on what is needed to make human data catalogues work better together.
Human data catalogues are like libraries for researchers: they make it easier to find, share and analyse data on human subjects for scientific and medical studies in a standardised manner. This supports continued progress across many fields of scientific and medical research. In recent years, there has been a push to make research data more “FAIR” (Findable, Accessible, Interoperable, and Reusable) in order to boost scientific progress. However, different fields have their own specific data needs, and with so many data catalogues around, navigating them can become very complicated.
The ConcePTION data catalogue is itself a collection of different data sources in different data banks, hosted by different organisations operating in different fields. It is a good example of the complexities involved in streamlining information across data providers and data catalogues. Drawing on their experiences in ConcePTION and other projects, the authors of the opinion review provide a list of recommendations.
Among their recommendations, the authors suggest letting motivated catalogue teams work independently and encouraging their cooperation through some basic standardisation. This could involve clear rules for sharing data, using unique identifiers to link records between catalogues, defining common data elements, setting up systems for data sharing, and keeping clear records of data sources and contributors. They also recommend creating spaces where catalogue developers can collaborate and share resources. This can be done through international networks like OpenAIRE and the Research Data Alliance, as well as domain-specific groups like BBMRI and ELIXIR.
“We need a more cohesive and collaborative approach to human subject data cataloguing. By harmonizing underlying concepts, encouraging autonomy within catalogue teams, and investing in standardization efforts and cross-domain interactions, the research community can create a more integrated and accessible ecosystem for human subject data, where strengths and limitations of data for secondary use can be assessed. It’s a lot of work but will have great benefits in the end, as more robust evidence can be generated, which yields more benefits, perhaps especially to patients, and in the case of ConcePTION to pregnant and breastfeeding women,” says Rosa Gini, head of the Pharmacoepidemiology Unit of ARS Toscana, Italy, co-lead of ConcePTION’s work package 7, and one of the authors of the opinion review.
By Anna Holm Bodin
Swertz M, Enckevort E van, Oliveira JL, Fortier I, Bergeron J, Thurin NH, et al. Towards an Interoperable Ecosystem of Research Cohort and Real-world Data Catalogues Enabling Multi-center Studies. Yearb Med Inform. 2022 Aug;31(1):262–72.