FRDN highlights from IDW 2023
FRDN members who attended International Data Week 2023: A Festival of Data (23–26 October 2023, Salzburg) share their highlights.
Ziad Choueiki, UGent
The session that stood out the most for me was by Zefan Zheng, a PhD candidate at the Max Planck Institute for Empirical Aesthetics, Germany. In his presentation "Making research FAIR from the start - a researcher's perspective", he touched upon the still prevailing culture of emphasizing publications over all other research outputs, such as pre-registrations, data, code, and protocols. He argued that when the focus is on open science rather than on the publication, the benefits are far-reaching. The PhD researcher will have a range of outputs, besides the final publication, that are described, open, citable, creditable and reusable (Figure 1). This is in stark contrast to the classic approach, where only the published paper is in the spotlight (Figure 2). From a wider scientific perspective, this FAIR approach recognizes all elements of the research journey, which allows a quantitative comparison of open science engagement at the institutional level.
I loved this presentation because it resonates with my own attempts at conveying the benefits of open science and FAIR to the researchers at UGent. The benefit of making research FAIR from the start is that you are likely to have more outputs than with a traditional research workflow. A publication is not always guaranteed, but the components of a potential publication (i.e., pre-registrations, data, code, and protocols) can be created while conducting the research. Many junior researchers graduate without a publication, which might lead to challenges in securing future employment. However, with proper implementation of the FAIR research workflow, any aspiring researcher is given the opportunity to show their academic discipline and rigor, since every element of their research is properly described, open, citable, creditable and reusable.
IDW2023 was my first conference as a Data Steward. After two years on the job and interacting with the wider Flemish RDM landscape via the Knowledge Hub and the FRDN network days, I was quite pleased to realize that the knowledge and practices circulating in Flanders are on a par with the wider discipline, if not ahead of the curve in some respects. Granted, this is my own personal opinion, but it felt refreshing that in the sessions I attended I was always nodding my head in recognition of something that we already have or do, or raising my hand to add a snippet of information or a comment.
Figure 1: Proposed FAIR research workflow*
Figure 2: Traditional research workflow*
*Reference: Zheng, Z., & Chen, X. (2022). The pitfalls of traditional workflows – with a silver lining. DataCite. https://doi.org/10.5438/8TA1-ER95
Willeke De Haan, KU Leuven
While the description of data is becoming more and more standardized thanks to the many (community) standards and repositories available, there are few standardized ways to describe samples from different disciplines. The session "Describing Chemical, Physical and Biological samples digitally: Seeking harmonization" by the Research Data Alliance showed the need for more harmonized sample description. One example was microplastic samples from oceans, which were described by the volume of water filtered, by the surface of collection or by the volume of filtered plastics, making the studies hard to compare. Another example, from chemistry, showed that the age of samples and their storage conditions are often not recorded, even though a chemical stored at room temperature for a couple of months may be very different from a freshly prepared sample. The session presented some standards and initiatives and proposed an RDA Working Group project to develop best practices for sample data model specifications.
Maybe we should also discuss in our community whether our institutes have guidelines for the description of samples, and how these are aligned with each other and with international standards. With the biobank we have regulations for human material, but I’m not sure about all the other biological and chemical samples that are collected and used.
Other sessions that stood out for me were the Active Data Management Plans and Data Management Plans sessions, which discussed the future of data management plans (DMPs). Currently, in most institutes the DMP is just a document generated at the start of the project that neither evolves nor is monitored. But there are many initiatives to make the DMP a document that evolves throughout the project lifecycle, ensuring that data is appropriately managed, archived, preserved and available for re-use. Some institutes where the DMP is already more integrated with the data flow presented their approach, along with their challenges and future plans. To make the DMP a really useful document for our researchers, we should keep an eye on these initiatives and start to move from the PDF downloaded from DMPonline to a more active DMP.
Dieuwertje Bloemen, KU Leuven on FAIR, GORC, TRUST, CARE and other acronyms
As we all know, there are many acronyms circulating in the landscape of RDM (ooh, another one), and that was no different at this conference. We are all familiar with the FAIR principles, and some might already know or have heard of the TRUST and CARE principles. But GORC was a relatively new one to me.
As someone who works with KU Leuven’s institutional data repository, RDR, on a daily basis, I have found the TRUST principles (Transparency, Responsibility, User focus, Sustainability, and Technology) to be a great guide for further developing the repository and its documentation and policies. The principles are similar to the FAIR principles, but rather than describing best practices for research objects, they describe best practices for repository infrastructures and how these can improve their set-up and documentation. The CARE principles (Collective benefit, Authority to control, Responsibility, and Ethics), on the other hand, have a similar focus to the FAIR principles, and can even be considered an extension of them when the research objects have to do with Indigenous data and their governance.
The GORC acronym rang a bell for me, but I didn’t know the details, so I decided to join the RDA session on GORC (Global Open Research Commons). The GORC RDA WG aims to create a framework of recommendations and to create a roadmap to implement commons on an institutional, national, cross-national or global level in such a way that these commons initiatives have an interoperable common core. As they say: “The purpose of the GORC-WG International Model (IM) is to provide a framework and common language to stakeholders around the world who are committed to developing interoperable research services for the public good. The target audience for the model is anyone that is involved in the planning, development, operation, funding or use of a research commons” (The Global Open Research Commons International Model, Version 1).
The model outlines the essential parts of a commons, which include infrastructure and standards, but also human capacity and governance structures, to name a few. The overall parts that they deem essential in the GORC model are shown in the figure below. The white elements are the human/social elements, the bottom blue elements represent what people interact with, and the central dark blue element represents the importance of standards at the core of any commons. All of these are categories within which the GORC model defines a series of attributes that are important to the establishment or improvement of a commons. As the chairs of the WG, who were also the chairs of the RDA session, noted, the GORC model can be used by anyone, from policy makers to data stewards to researchers, to gain insight into the current state of affairs and the work that could still lie ahead to improve the landscape of research data management best practices, infrastructure and human resources.
Other fun acronyms worth mentioning that were used throughout the conference, though not all are currently internationally endorsed:
- RDA IGs, WGs and BoFs: Research Data Alliance Interest Groups, Working Groups, and Birds of a Feather
- CODATA: Committee on Data of the International Science Council (ISC)
- WDS: World Data System
- CDIF: Cross-Domain Interoperability Framework
- PARIS principles: Processable, Askable, Reliable, Incorporable, and Suppliable
- VALID framework: Veracity, Agency, Longevity, and Integrity in Datasets
Relevant sessions at the IDW23 conference related to CARE, TRUST and GORC were:
- "Beyond FAIR: Reusing Chemical Data Across-disciplines with CARE, TRUST, and Openness"
- "Indigenous Data Sovereignty, FAIR and CARE principles in practice, case studies of implementation"
- "Global open research commons: what it takes and the road to get there. Reflections and discussion on creating a commons attribute model and a commons integration roadmap"
Veerle Van den Eynden, KU Leuven
I was impressed by the presentation and poster "A data-driven approach to improve distributed FAIR data in a cross disciplinary organization", on how the Helmholtz Association in Germany has built a pilot dashboard to monitor open and FAIR research data in their organisation. The dashboard shows datasets published alongside journal publications. The process behind the dashboard starts with harvesting literature metadata from the libraries of their research centers. Via Scholix links and the data repositories used by their researchers (known from surveys they run), they identify data publications linked to those journal publications. The F-UJI framework is then used for automated FAIR assessment of the data found. Knowing how difficult it is to build a picture of all the datasets our researchers publish in data repositories worldwide, it is worth looking at this methodology to see what we can learn from it.
To me, the most visually appealing poster at the conference was "Building the Data Stewardship Profession at UCL", by James Wilson. The content is excellent too, showcasing the diversity of services, governance and projects provided by the Centre for Advanced Research Computing (ARC) at University College London. The ARC data stewards support researchers in producing FAIR research data and also conduct their own research into data-intensive methods. This poster can inspire us all in developing services and tools in our institutions, and in being creative when showcasing our work.