The longest-running longitudinal survey and analysis on open data
Open data saves lives. The global pandemic has highlighted, more than anything before it, the importance of data sharing in solving the big challenges of our time. COVID-19 data may be the most visualized data in history and it was made publicly available on a daily basis to people all over the world. The urgent need to better understand and treat the virus in 2020 brought unprecedented collective and collaborative action from all research stakeholders on an international scale to bring down barriers to research and speed up analysis and testing. These efforts, combined with support from governments and industry, resulted in not one but many vaccines made available by the end of the year. This gives us a glimpse of what incredible research outcomes are possible when we start with collaboration to address a common threat. Imagine how much more we could do, how many more lives we could save, if research data were routinely made open and shared. So, why isn’t data sharing the norm? The answers lie in the harmony needed between policies, infrastructure, and practices.
Over the course of the six years we’ve been running the State of Open Data survey, we’ve had over 21,000 responses from researchers from 192 countries, providing detailed and prolonged insight into their motivations, challenges, perceptions, and behaviors toward open data.
This year, the survey continued to monitor the levels of data sharing and usage, as it has since the outset in 2016, and also focused on a few key topics, including what motivates researchers to share data and the perceived discoverability and credibility of data shared openly.
4TU.ResearchData is an international data and software repository, run by a consortium of technical universities in the Netherlands, that holds 8,000+ science, engineering and design datasets.
Whilst the technology underpinning 4TU.ResearchData is provided by Figshare, a team of dedicated staff members are responsible for managing and maintaining various aspects of the data repository, highlighting the importance of human infrastructure to support researchers with data publication.
As the world has risen to the challenge of the COVID-19 pandemic, researchers and the public alike have developed a greater appreciation for accurate and reliable open data sources. From the National Institutes of Health’s Open-Access Data and Computational Resources to Address COVID-19 to the local data sources that inform our nightly news updates, open data have become a more important force in our lives than ever before.
The Sustainable Digital Scholarship service was launched at Oxford in February 2021 to offer support and guidance to researchers, provide access to a managed repository for storing research outputs, and showcase digital research projects. Projects are predominantly connected with the field of Digital Humanities; however, our support is by no means limited to one discipline. The primary aim of the service, as the name suggests, is to ensure research data is sustainable. What we mean by that very much aligns with the FAIR principles.
Scholarly publishers have a fundamental duty to uphold research quality, from editorial expertise to managing the peer review process. Research data is a growing part of Springer Nature’s policies, systems and workflows and a key component of the ambition that research outputs should be openly available and reproducible. In order to uphold the quality of data alongside that of the related literature, we are building on the specialist support developed for data articles, developing processes more widely applicable across our journals.
The Open Data movement has slowly grown with 2,700+ repositories available in Re3data. Approximately 1,500 are in the life sciences and 68 are COVID-related entries. So, where do we stand on sharing data since the COVID-19 pandemic arrived in 2020? This is the sixth State of Open Data report that Digital Science has published and this year’s survey results and analysis for 2021 reveal shifts in how life science researchers are viewing open data.
Since J-STAGE launched in 1999, we have invested significant resources to keep it up to global e-journal standards and good practice, adding new features that span manuscript submission, the peer review process, and the dissemination of content. Over these 20 years, however, the scholarly publishing environment has evolved so rapidly that we felt the need to revisit J-STAGE policies and operations in order to adapt to the changing standards and practice.
Creating a data repository was part of the action plans reflecting recommendations in these three areas.
The following are a few common problems or challenges that participants in this year’s State of Open Data survey said they faced when sharing datasets. We have provided some tips and examples of how to overcome these challenges based on our experiences at the University of Pretoria.
The 2021 State of Open Data survey provides valuable insights into data sharing globally. Though it can’t capture what researchers everywhere think of data sharing, this survey of nearly 4,500 researchers offers helpful perspectives, some reasons to be hopeful, and some key takeaways that can support discussions on how open data can help validate research and combat scientific misinformation.
The decision to share data and the mechanisms necessary to support sharing don’t exist in a vacuum. In many ways, the problems of how to share data are reflective of both the culture of science and of current logistical challenges playing out across research globally. How can we move to a more open world?
Making data open is not of itself a panacea for public support but it can certainly help
Prof Ginny Barbour
Queensland University of Technology