The role of funders in publishing and sharing research data, publications, and conference materials

August 3, 2022

Mark Hahnel

In this webinar, hear from Figshare Founder and CEO Mark Hahnel on the significant role funders play in enabling their researchers, intramural and extramural, to make their research outputs citable, shareable, and discoverable. We'll discuss some of the use cases other funders and government agencies have for using Figshare for data, Green Open Access publications, conference outputs like presentations and posters, and more.

Transcript

Please note that the transcript was generated with software and may not be entirely correct.

0:05

Hi, everyone. Thanks for coming today to this webinar, titled The Role of Funders in Publishing and Sharing Research Data, Publications, and Conference Materials. My name is Megan Hardeman.

0:16

I'm the Product Marketing Manager at Figshare, and I'm joined by Mark Hahnel, the CEO and founder of Figshare, who will be delivering the webinar today. A few pieces of information before I hand over to Mark: we are recording this webinar, and we'll send it around to all of the registrants afterward.

0:36

And there will be some time at the end for questions. There's question functionality in GoToWebinar, and there's also a chat section, so you can put your question in either place, and we'll get to them at the end. So, please do feel free to ask questions.

0:54

All right, I'll hand over to Mark.

‍

0:58

Thank you, Megan. Hi, everybody.

‍

1:03

Hopefully you can see my screen, and you can hear me loud and clear. I'm sure Megan will let me know if that is not the case.

‍

1:15

Thank you. OK, so as Megan mentioned, this talk is going to be about the role of funders in publishing and sharing of research data, publications, and conference materials.

1:26

There's gonna be a larger focus on the data side of things, because that's where we started. But I also wanted to share a few examples of different organizations and how they're making this content available.

1:42

And as I mentioned, this was where we started.

‍

1:45

So for those who don't know, we have a generalist repository platform, which we internally call figshare.com. Figshare is a cloud-based platform for storing, sharing, and discovering research outputs.

2:00

So, way back when, 10 years ago... We've been going for 10 years, so we've got 10 years of experience in the space. When we started,

2:08

it was originally built for me to share my own research outputs. Over time, we've seen research get more computational. When I talk about data, I'm talking about files, right? What do you do with the files that are products of your research? If you're ever talking to a researcher and they're saying, "I don't have data," you can say, "Well, what are the files you produce?" More often than not, they do have data. It just might be a Gaffey map, or a spinning-dinosaur 3D file, or a Jupyter notebook, as we can see here. And they may not consider that data, because it's not a dataset.

2:47

And so, with this talk today, I'd like to talk through a few of the incentives in this space: a few of the reasons why I think funders play a big role, where I think there's a really huge opportunity for funders to play more of a role, and how we at Figshare can help with that.

3:05

But also the broader Digital Science thought process around all of this.

3:12

So, over the last 10 years, we've really moved on from "here is the publication, and here is the citation count that it has, or the impact factor of the journal that it's published in," which obviously isn't a measure of the actual individual article itself.

3:31

You may have an article published in a very high impact factor journal that actually has no impact and is never reused, and therefore the impact factor isn't a great measure of that. What we think of at Digital Science is more of this.

3:47

What is the broader landscape, from individual papers and individual metrics to all of the research outputs, and all of the different ways to measure the impact that they're having?

4:00

And another area that we'll be starting to look into is what one of our sister organizations, Ripeta, is doing with data management plans. So watch this space.

4:12

There's going to be a lot of stuff coming out with regards to how researchers are tracking or making their data available, and what the data accessibility statements within articles are saying.

4:27

So, lots of thought around the bigger picture around datasets.

4:31

So, you know, why is data becoming an issue, or a prominent thought piece, for funders? We saw this earlier in the year, from Nature.

4:44

The NIH has issued a seismic mandate, and there was lots of conversation after this publication about how much of a mandate it is, and all of those things.

4:55

And this is to do with the fact that the NIH is saying that, as of the 23rd of January 2023, if we fund you, not only should you be making all of your papers openly available, you should also be making all of the files that back up that paper, the datasets, publicly available.

5:15

This isn't a new thing. Powered by Jisc in the UK, there's something called Sherpa Juliet. Luckily, it's got a fantastic name, so it's very easy to find Sherpa Juliet. But here you can see I've just filtered on

5:32

funders that require data archiving. They have an awful lot of funders in the catalog that suggest data archiving or recommend data archiving. But there are 54 around the world now that do require data archiving, and this is a global policy change. In the last 10 years that we've been working as Figshare, we've seen it go from pretty much 0 to 54.

5:59

If we look at the number of papers in the published literature that come out year on year and link to dataset DOIs in repositories like the free figshare.com, or Zenodo, or Dryad, or friends, the other generalist repositories, then we see

6:18

not only that the number of links is growing exponentially, year on year, but also the funding behind those papers. This is some data from our sister company, Dimensions.

6:28

You can see that number one was the NSF. Number two is the NSFC in China.

6:41

And so you can see it's a global thing, if we look at the sorting based on the number of publications that are linking to each of these generalist repositories. And this is just a snapshot, right? This is just a little view. It's not taking into account everything that goes on in GenBank or all of these other subject-specific repositories. So there's real growth in the amount of data that's being made available. There's real growth in the number of policies that are coming out from different funders, and so

7:09

one thing we need to think about is, how do we solve this logistically?

‍

7:13

The other thing to think about, and not to get away from, is why data and sharing of data is such a monumental and hugely important shift in research. This is for any treatment, but this is just referencing the idea, obviously, through COVID.

7:31

We've seen this happening in real terms: when laboratories are collaborating, they can get farther faster, if you're looking for a vaccine.

7:41

Then, if these laboratories are building on top of the work that's gone before, we can move further faster. So, this works in all aspects of research. It's just that COVID is a very good example: to get a vaccine,

7:53

we needed to build on top of what had come before. We don't want to break these links in any way, so we don't want to have "my data is available upon request." We don't want to have the paper behind a paywall. That all slows everything down, and so the efficiency of research can be greatly improved by sharing all of the research outputs openly. So there's a more holistic view for all researchers who are working in a field.

8:18

The second thing, which has been a fantastic change in the world, is opening research outputs and having new insights on top of them.

8:30

So last year we saw AlphaFold, which is an AI project from DeepMind, the Google-owned AI company.

8:43

They released an update on protein structures, predicted protein structures.

8:51

That has been a long-researched area in the molecular biology field, and last week they shared all of the structures that the AI has generated to predict protein structures, and you can see the actual dramatic effect here.

9:10

So they've shared these all openly. These can be used by everyone.

‍

9:14

This is new data built on top of old data. Previously, experimental biology, through the Protein Data Bank, had come up with 190,000 structures.

9:27

They previously had an AlphaFold release that got it to a million structures, and then overnight they've gone from a million to 200 million structures, all made openly available.

9:39

And so they've gone 200-fold in terms of the number of protein structures that we can predict, based on taking data that was openly available from ..., and the Protein Data Bank.

9:54

And using the machines to look for the patterns. I think this is really what's going to drive the biggest change in research in the next 50 years: making data available, having machines, sprinkling some AI or machine learning on top of it, and finding new insights that machines are better at looking for than humans.

10:17

So, in terms of thinking about how this has evolved over time, and all of the different excitement, we do a report every year called The State of Open Data. You can Google "the state of open data" and find it. This is a survey of researchers, along with

10:33

thoughts from around the world and from some senior stakeholders, to understand what they think are the important things, whether it's in the academic publishing world, the funder world, the academic institution world, or from the researcher point of view.

10:48

And what we found in the State of Open Data last year is that there's more concern about sharing datasets than ever before.

10:54

Of course, there's this push towards more data being made openly available, and the obvious benefits of it.

11:01

But there's also more concern about people misinterpreting data, and things like fake news coming about, and people using information for their own objectives, as we saw a lot during COVID. On the plus side of things,

11:17

In terms of the researcher's point of view, we know that linking papers to their supporting data in a repository is associated with a 25% increase in citations. This was a study on half a million papers, so it's no small n. So there are advantages for researchers in sharing their data.

11:37

And we, as organizations working in the field, can show them this and show that it can be good for their career. But if we look at these datasets, this is the State of Open Data, which we've been running for a few years now. It's the longest longitudinal study in the space.

11:55

If we look at the general trends.

‍

11:57

So you can see that in 2018 we had people who were concerned about datasets being misused, and that's grown year on year.

12:06

So the problem is getting worse: 43% in 2021.

12:12

Then there are some of the other things that we see, the concerns or blockers in people sharing their data.

12:21

People never have enough time; that's been consistent. I don't know how we can improve that for researchers. But, you know, among the other things is "I don't know if I have permission from my funder or my institute."

12:30

So if you are a funder, you need to be explicit, because still a quarter of researchers don't know if they're allowed to do this.

12:38

A lot of people are unsure about copyright and data licensing. This is a new area for a lot of researchers, so they don't know who to go to. They come to Figshare. They go to the publishers and say, "I'm not sure what I'm supposed to be doing about data licensing." And, obviously, there's always "not receiving appropriate credit or acknowledgement,"

12:58

even though I just highlighted that making your data available is associated with having more impact for your research, which obviously, we know, is the lifeblood of a researcher's career.

13:10

So in order to better work with organizations who have experience in this space, Figshare has for a while now been building organizational data repositories.

13:26

And this is because a generalist repository is good if you have no other option. The figshare.coms or the Zenodos of the world are where academics can come, upload some files, add some metadata, make them publicly available, and get a DOI.

13:43

But, you know, there are six million outputs on Figshare. We can't handhold everybody through that, because the sustainability model doesn't work that way.

13:52

So, how can we get it so that more researchers have more handholding through their dataset publications? Here's just a snapshot of some of the universities that we've been working with. We've worked with over 100 universities around the world to provide a data repository.

14:09

It's all on their own domain, and it's all branded their way. You can see you've got Newcastle in the UK, Denmark here, Carnegie Mellon over in America, and ZivaHub is the University of Cape Town data repository.

14:22

So we provide white-label repositories that tick all the boxes for these organizations in terms of the expectations on the funder side of things.

14:34

This is another question from the State of Open Data, where we asked: should funders make sharing of research data part of the requirements for awarding grants? Nearly 70% said yes. And then: should funders withhold funding from, or penalize in other ways, researchers who cannot share their data? Two thirds said yes. So two thirds of your researchers are saying you should be having a mandate, because we want to even the playing field, right? I don't want to share my data if my peers don't have to share their data. It gives them this unfair advantage, that they'll have more research data to play with, and we're all competing for grants at the end of the day.

15:10

So what happens if you have a mandate from a research funder, but no place for researchers to make their data available? We work with hundreds of universities in providing data repositories, and if you're lucky enough to be at one of these universities, then you have a place where you can make your data available,

15:29

and usually the support staff to help you make your data available. But what about everybody else?

15:35

So, a great example of this is in North America.

‍

15:39

There is a federal requirement that all data has to be made openly available where it can be. Obviously there are caveats on whether all data can be made available. The European Commission has a great line: as open as possible, as closed as necessary.

15:56

So, in the repositories we provide, we offer a lot of functionality, such as content only being available to certain IP ranges,

16:03

or data being open as of a certain point of time in the future, so you have embargoed content. Here, you can see, it lives on a .gov domain. So, obviously, there's all of the added jumping through hoops associated with hosting at that level.
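The two access controls mentioned here, IP-range restrictions and embargo dates, can be illustrated with a minimal sketch. This is not how Figshare implements them; the function name `can_access` and its parameters are hypothetical, and the rule that an embargo is checked before the IP range is an assumption made for the example.

```python
from datetime import date
from ipaddress import ip_address, ip_network


def can_access(client_ip, today, embargo_until=None, allowed_networks=None):
    """Hypothetical check: may this client download the content today?"""
    # Embargoed content stays closed until the embargo date is reached.
    if embargo_until is not None and today < embargo_until:
        return False
    # With no IP restriction configured, the content is open to everyone.
    if not allowed_networks:
        return True
    # Otherwise the client address must fall inside one of the allowed ranges.
    address = ip_address(client_ip)
    return any(address in ip_network(network) for network in allowed_networks)


# Example: content restricted to one (made-up) institutional network.
print(can_access("10.1.2.3", date(2022, 8, 3), allowed_networks=["10.0.0.0/8"]))
```

A real repository would also need to decide what happens to metadata during an embargo, for example whether the record is visible while the files are not.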

‍

16:21

And so, this is where Figshare spends a lot of its time: building repository functionality to help institutions, funders, and publishers comply with these policies, with that goal in mind, because this is very important for shifting the needle on the efficiency of research.

16:40

And so, when I say jumping through hoops, there's a couple of options.

‍

16:45

You can build your own repository, or you can have an organization like Figshare or others

16:52

give you some policy-compliant infrastructure in order to take care of the basic publication of these outputs. You turn it on; it's a turnkey platform.

17:07

Things like ISO 27001 certification are something that Figshare has.

17:14

Obviously we have the Pandemic Action Recovery Plan that goes along with that, which I wish we didn't need to implement, because obviously we were affected directly by the pandemic, but it's good to see that these different policies exist for a reason.

17:31

The other things we talk about are things like single sign-on and integrations, and web accessibility rules: WCAG 2.1 AA, Section 508, and EN 301 549 are global policies around web accessibility. So when we say open content, it is open for everybody.

17:53

And obviously, there's a few other things, like, we make sure we play well in the space.

‍

17:58

And so, if you build something yourself,

18:04

you are going to need to plug into all of the existing great work done by other communities, like Make Data Count and like DataCite. And so we help do that. We're also very focused on region-specific storage, as there are a lot of data sovereignty rules that we need to comply with.

18:23

So the platforms we build also have an API. On the left-hand side here, you can see our API documentation. This means that you can pull all of the metadata and all of the files out at any point. You can computationally talk to your repository, so you can set it up so that it publishes data automatically.
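As a rough illustration of "computationally talking to your repository," here is a minimal sketch using only the Python standard library. The base URL, the `page` and `page_size` parameters, and the response keys (`title`, `doi`, `files`, `name`) are assumptions made for this example and should be checked against the API documentation before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the public API; confirm in the API documentation.
API_BASE = "https://api.figshare.com/v2"


def articles_url(page=1, page_size=10):
    """Build the URL for one page of public records."""
    query = urlencode({"page": page, "page_size": page_size})
    return f"{API_BASE}/articles?{query}"


def fetch_json(url):
    """Fetch a URL and decode its JSON body (this performs a network call)."""
    with urlopen(url) as response:
        return json.load(response)


def summarize(record):
    """Keep the fields a downstream system might want from one record.

    The keys are assumed, so missing ones fall back to empty values.
    """
    return {
        "title": record.get("title", ""),
        "doi": record.get("doi", ""),
        "file_names": [f.get("name", "") for f in record.get("files", [])],
    }


print(articles_url(page=2, page_size=5))
```

Calling `fetch_json(articles_url())` and passing each returned record to `summarize` would give a simple export loop; authentication and rate limits are left out of the sketch.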

‍

18:44

You can also pull out the metadata in certain ways, through the ...

‍

18:49

EMH feed, which deals with a lot of different metadata standards. So you can pull all of the metadata into other systems.
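The transcript is garbled at this point, but the feed described is most likely OAI-PMH, the standard protocol for harvesting repository metadata; that reading is an assumption. Under that assumption, a minimal sketch of building a `ListRecords` request and reading Dublin Core titles from a response might look like this. The endpoint `https://repository.example.org/oai` is a placeholder, and the sample XML is hand-written for the example, not real repository output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of Dublin Core elements, as used by the oai_dc metadata format.
DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def titles_from_response(xml_text):
    """Return every Dublin Core title found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_TITLE)]


# Hand-written sample response, trimmed to the parts the parser needs.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.org/oai"))
print(titles_from_response(SAMPLE))
```

A full harvester would also follow the protocol's resumption tokens to page through large result sets, which this sketch leaves out.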

‍

18:57

And obviously there are things like the statistics, which you can pull out and pull into other organizational infrastructure.

19:04

You might want to know what cities are downloading your chemistry data this month, things like this.

19:13

Then on the right-hand side here, you can see the breakdown of the different parts of the infrastructure.

19:19

We do have a large tech stack, and this is why we have a large development team: you need things to just work, you need to make sure that your content is persistent, and that all of the different subjects and sections are functional all the time. So, obviously, we have standard SLAs with uptime and things like this.

19:40

In terms of the integration with the ecosystem of research, this is one of the interesting ones: we use hard linking. One of the metadata fields you can fill in

19:57

when you're publishing your research is a hard link to funding IDs. One of our sister organizations, Dimensions, has all of the funding information for the large funders in the world, and the majority of small ones, too.

20:12

And so, you know, researchers can link through and hard link to their grants.

20:18

This is really important for me, I feel, because when we talk about how we're going to get researchers to comply with policies and mandates over time, the question is: how are you going to check this?

20:31

And so, I think, we'll get to a nice point in the next five years, which is, you have a data management plan. You said you were going to make your data available.

‍

20:40

If we search on that grant code, we can see that you've made some papers openly available, but you haven't made any of the data openly available. And we'd like you to do so. Here's a place where you can make your data available. And I'll come back to this in a minute, because I think that the grant information is really a key part.
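The check described here, searching on a grant code and finding open papers but no open data, can be sketched in a few lines. The record layout (`grant_code`, `kind`, `is_open`) and the grant codes are hypothetical, chosen only to make the logic concrete; a real check would draw on repository and publication metadata.

```python
from collections import defaultdict


def grants_missing_data(outputs):
    """Return grant codes that have an open paper but no open dataset."""
    kinds_by_grant = defaultdict(set)
    for output in outputs:
        # Only openly available outputs count towards compliance.
        if output["is_open"]:
            kinds_by_grant[output["grant_code"]].add(output["kind"])
    return sorted(
        grant
        for grant, kinds in kinds_by_grant.items()
        if "paper" in kinds and "dataset" not in kinds
    )


# Made-up example records for two grants.
outputs = [
    {"grant_code": "GRANT-001", "kind": "paper", "is_open": True},
    {"grant_code": "GRANT-001", "kind": "dataset", "is_open": True},
    {"grant_code": "GRANT-002", "kind": "paper", "is_open": True},
    {"grant_code": "GRANT-002", "kind": "dataset", "is_open": False},
]
print(grants_missing_data(outputs))
```

The sketch only works if outputs carry a grant code in the first place, which is why the hard link to grant information matters so much for this kind of follow-up.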

‍

20:56

The funders should be pushing their researchers, to make their data openly available, and link it to that grant information.

‍

21:06

In terms of DOIs, digital object identifiers, these are the things that researchers are familiar with in terms of citing persistent links.

21:16

And the go-to place, the go-to community, in the academic data publishing world is DataCite. You might be familiar with Crossref for papers; DataCite is the community for data.

21:30

If you're a funder and you want to support them... I'm on the board, and I wouldn't be doing my job as a board member if I didn't promote them. But we use them, they're a not-for-profit, and they have some good open infrastructure. We plug into it in order to provide DOIs, so that the Department of Homeland Security has their own DOIs. If you don't have a DataCite membership, we can provide it too.

21:57

It's one of the simple ways in which plugging into existing infrastructure, instead of having to build it yourself, is a great idea.

22:06

Then there are other things that come along day by day. We actively develop our products.

22:12

It's an actively moving space, right? So, Google Dataset Search: if you're a funder and you tell your researchers to make their data openly available through your platform, you want to make sure it's indexed in all of the right places as well. So, as opposed to having to do that yourself, we do it out of the box for Google Dataset Search and a lot of other organizational searches as well.
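Dataset search engines such as Google Dataset Search read structured metadata embedded in a dataset's landing page, typically schema.org `Dataset` markup in JSON-LD. The sketch below shows what generating such markup could look like. It is an illustration, not a description of Figshare's implementation; the function name, the example DOI, and the funder name are made up.

```python
import json


def dataset_jsonld(title, description, doi, license_url, funder_name):
    """Build schema.org Dataset markup as a plain dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": title,
        "description": description,
        # A DOI is exposed as a resolvable https://doi.org/ link.
        "identifier": f"https://doi.org/{doi}",
        "license": license_url,
        "funder": {"@type": "Organization", "name": funder_name},
    }


markup = dataset_jsonld(
    title="Example dataset",
    description="Measurements supporting an example paper.",
    doi="10.1234/example.5678",  # made-up DOI for illustration
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    funder_name="Example Science Fund",  # made-up funder
)
# The JSON string would be placed in a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2))
```

Including the funder in the markup is one way the grant and funder links discussed in this talk can travel with the dataset into external search indexes.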

‍

22:35

So, I mentioned the DHS, and I mentioned our institutional repositories, but we also have a lot of other funders who are working with us in different ways in order to make certain types of content openly available.

22:48

And I just wanted to walk you through one of them, and how this all worked, in terms of metadata quality.

‍

22:56

Because I think, if we want to get the AlphaFolds of this world and the DeepMinds of this world querying your data, so that your funded research can have more bang for its buck, which I think is a big motivator for funders, metadata quality is something that I think is the next wave of things to improve.

23:17

So here's some data

23:20

that I pulled just this week in terms of grant information collection on non-traditional research outputs. On the left-hand side you can see a subsection of the free figshare.com

23:36

datasets, outputs that researchers have made available through figshare.com. So on the left-hand side here you can see the number of outputs.

23:47

Then the second column is the number of those outputs that have put something into the metadata field for funding information.

23:58

Right, which is about 10%.

‍

24:00

And then there's the number of research outputs that have put grant codes into those funding information boxes, which is about 1%, which isn't great. You know, 99% of those outputs do not have a grant code that we can link to.
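The two percentages quoted here are simple coverage ratios: the share of outputs with anything in the funding field, and the share with a linkable grant code. A minimal sketch of that calculation follows; the field names `funding` and `grant_code` and the sample of 100 records are invented to mirror the roughly 10% and 1% figures, not taken from the actual data.

```python
def coverage(records):
    """Return the share of records with funding text and with a grant code."""
    total = len(records)
    if total == 0:
        return {"funding_info": 0.0, "grant_code": 0.0}
    with_funding = sum(1 for record in records if record.get("funding"))
    with_grant = sum(1 for record in records if record.get("grant_code"))
    return {
        "funding_info": with_funding / total,
        "grant_code": with_grant / total,
    }


# Invented sample: 100 outputs, 10 with funding text, 1 of those with a code.
sample = [{"funding": "", "grant_code": ""} for _ in range(90)]
sample += [{"funding": "Example Science Fund", "grant_code": ""} for _ in range(9)]
sample += [{"funding": "Example Science Fund", "grant_code": "GRANT-001"}]
print(coverage(sample))
```

Tracking these two ratios per repository over time is one way to see whether curation or funder guidance is actually improving grant linking.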

‍

24:20

And so, my thinking was that by providing infrastructure for organizations around the world, whether it's the University of Arizona or University College London here, where I am, to handhold their own repository, it's their repository. They have librarians.

‍

24:44

Can we improve the quality of metadata?

‍

24:46

I think metadata quality is often improved for those repositories, but what we don't find is that the funder information, the grant info, is any better. If we look at a similar-size subset of the outputs in funder and institutional repositories powered by Figshare, we see a similar story: still about 1% actually link to the grant codes.

25:13

And so, how can we fix this? And is it fixable?

‍

25:17

The thing I want to highlight here is that when we say we work with hundreds of organizations, not all of them have the same funding bandwidth for people to handhold researchers through these publication processes, right? So, I mentioned Carnegie Mellon before. I'm sure that if I just looked at the Carnegie Mellon numbers, since they have a large support team, they will be putting in grant info a lot more than some of the repositories that don't have the support teams.

25:47

So, I wanted to highlight one way in which we looked at doing this. This was a pilot for the NIH.

25:54

Figshare is now working with the NIH as part of its Generalist Repository Ecosystem Initiative, where they are providing funding for the generalist repositories to improve interoperability between those repositories, so we don't have silos. So it's fantastic that the NIH is pushing for this.

26:17

But previously, a couple of years ago, we did a pilot.

‍

26:19

They wanted to look at what to offer if they were going to mandate that researchers make their data available.

26:24

Obviously, if there is a subject-specific repository, like GenBank, make your data available there.

26:33

But if you don't have a home for a dataset, if there isn't a subject-specific repository, they wanted to be able to say: you could make it available here, and you will get a better level of support than something like ... or Figshare, that free Figshare where you can upload files and make content available but there's no one helping you through it.

26:54

And so, this was why they wanted to create this repository. Obviously, FAIR (findable, accessible, interoperable, and reusable) data was a big thing.

27:05

So were usage metrics tracked openly through things like Make Data Count, and assigning an NIH-specific DOI. But they also wanted the expert guidance, some metadata improvement, and documentation with custom NIH metadata.

27:20

Obviously, they had specific metadata that they wanted to be capturing, and so this curation of content meant that researchers would submit to the NIH pilot repository, and we would have people on our team, with library experience in the data space, who would check the metadata based on the NIH requirements. And then it'd be made publicly available in the repository with a DOI, et cetera.

27:51

And so, for their specific review process, you can configure this.

‍

27:55

Obviously, different funders will have different needs, but, you know, these were the checks that our team applied.

28:02

So: files match the description, there's a descriptive title, and you can read through all of these, a license has been applied,

28:12

funding information is specified and linked, and related publications are linked.
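The review checks listed above can be sketched as an automated pre-check that runs before a human curator looks at a submission. This is a hypothetical illustration, not the pilot's actual workflow: the field names and the four-word threshold for a "descriptive" title are assumptions, and whether the files match the description still needs a person to judge.

```python
def review(record):
    """Return the list of checklist items a submission fails."""
    problems = []
    # Assumption: a title of fewer than four words is treated as too vague.
    if len(record.get("title", "").split()) < 4:
        problems.append("title is not descriptive")
    if not record.get("description"):
        problems.append("description is missing")
    if not record.get("license"):
        problems.append("no license applied")
    # Funding must be both specified (free text) and linked (grant code).
    if not record.get("funding") or not record.get("grant_code"):
        problems.append("funding information is not specified and linked")
    if not record.get("related_publications"):
        problems.append("related publications are not linked")
    return problems


# Made-up submission that passes every automated check.
submission = {
    "title": "Example measurements for a hypothetical study",
    "description": "Raw and processed measurements with a README.",
    "license": "CC0",
    "funding": "Example Science Fund",
    "grant_code": "GRANT-001",
    "related_publications": ["https://doi.org/10.1234/example.5678"],
}
print(review(submission))
```

A non-empty result would go back to the submitting author as the starting point for the conversation about revisions.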

‍

28:18

So this was a conversation with the submitting authors to make any revisions to improve FAIRness, and providing the training and guidance on best practices for data sharing and using the repository. And so this is what we did in terms of subject-specific repository and encouraging it.

28:36

Sorry, in terms of subject-specific, repository-specific metadata.

28:42

So obviously, on the generalist free figshare.com, you wouldn't put in "select an IC," because I might be a researcher in Morocco who's funded by the Moroccan science fund, wondering what's an IC. But if you are funded by the NIH, you are probably familiar with your funding IC.

29:01

And so what we found from that, on the left-hand side here, is that of the datasets made available, all of them had funding information, and nearly all of them had hard links to grant information as well.

29:15

And we see this as well with another thing we've been working on. We have a subsection of the repository, the free

29:23

figshare.com generalist repository. We have a subsection for when you have a large dataset. Obviously, on the generalist repository we can't support one-terabyte files, because it wouldn't fit well with the free sustainability model.

29:42

But if you are an organization and you have an organizational repository powered by Figshare, the limit itself is individual files up to five terabytes. So you can store huge amounts of data; the system itself will actually support it.

29:56

So we have been piloting for the last year ... Plus, which is for when you have large data and it doesn't have a home: you don't have a university repository, or your funder doesn't give you a place to publish it.

30:08

You can submit it to ... Plus. It's a paid service, but along with hosting the data, we will also curate it. And here you can see as well that the share of outputs with funding info and the share of outputs with grant info are much higher than the percentages that we see in the generalist repositories.

30:29

The other thing that we found as well is that there was a good mix of extramural and intramural research being published. If you are interested in knowing more of the details on the NIH Figshare pilot, if you Google that, you can find them; otherwise, we can send them around afterwards.

30:46

And we saw that, just in the short time it was live, we had 22 to 28 institutes, centers, and offices publishing through the repository. So it was relevant for all of the different areas.

31:01

We could have done more promotion in different areas as well, to make sure that the uptake would have continued.

31:06

And the other thing that we found came from comparing the files to research published at a similar time on the generalist figshare.com. We found that there was a need for bigger file sizes.

31:20

So the files were six times the size, they got three times as many views, and 1.4 times as many downloads.

31:28

My hypothesis for this, which has to be backed up, is that the curation led to better metadata, which may make things more discoverable on average. You know, more is not always better, but titles that are more descriptive and descriptions that are more descriptive generally meant that things were more discoverable, and people go to find data through Google for the large part, right?

31:53

Or the search engine of their choice. And so making it more discoverable that way really helped the reuse, which I think is a big thing for funders: making sure that the data is being reused more, and then being able to track that reuse.

32:09

Another good consequence is this idea about where we can fill the gap,

32:15

which is the "unsure about copyright and data licensing." If you say to a researcher, "Hey, do you want to make your data available as CC BY-NC 4.0? We would recommend that you use a more liberal license, like maybe CC zero ....",

32:31

they'd think you were talking nonsense to them. They probably wouldn't understand, because they're focused on their research. And so you can see why a third of researchers are still unsure about this, the same as it was four years ago.

32:43

And so by having somebody who can help have a conversation and curate the data, we find that once the researcher has made data available once, they know what they're doing, they'll do it again and again and again, and they don't need to be told twice about licensing.

32:57

So that's a really encouraging thing.

‍

33:00

So in terms of the data side of things, I'm going to touch on a couple of other areas.

‍

33:05

Just to wrap this up: for the future of FAIR open data, if I was a funder, I'd be wanting to support customizable review and curation workflows for more discoverable, re-usable, and trackable open data. So please get in touch if you'd like to have a conversation about that. We'd be happy to advise, even if you're building it yourself, on the standards to adhere to, the organizations to interact with, the DataCites of this world, et cetera, et cetera.

‍

33:32

Because ultimately, we want more efficient research.

‍

33:37

We want faster discovery.

‍

33:40

.., obviously, is the obvious one there.

‍

33:43

Moving further, faster. But also we want to just, you know, move into the next paradigm of research, which is the humans and the machines working together to find outcomes that are not discoverable by the humans or the machines on their own, like alcohol.

‍

34:00

So, when we talk about open data for government agencies and funders, the ones we work with, what we offer is out-of-the-box repository infrastructure to get the organization very far very fast, policy and standards compliant.

‍

34:14

And then it's customisable for community needs. If you need custom metadata for a subject-specific repository for your research, if you're involved in just one or several fields of research that you need custom metadata for, or there are specific file types that you want to be made available, we can help support that.

‍

34:35

So, I mention that again: tracking impact across all research outputs. And this is really the question, as a funder.

‍

34:43

How can you be aware of all of the outputs that you fund?

‍

34:49

Both internally and externally, intramural and extramural, but also within the organization. And so we work with a couple of different funders to focus on some of these areas.

‍

35:01

Yeah.

‍

35:02

One that is really interesting for me is the Wellcome Trust, who have research outputs that they previously just, you know, put as a PDF in a blog post or something like that.

‍

35:15

And it's areas of research that have impact.

‍

35:20

And so here you can see on their search page, you can search the content by Altmetric Attention Score, which is tracking that DOI around the web. You see it's got a high attention score here.

‍

35:32

So it's being picked up in lots of non-traditional areas, such as Wikipedia and social media networks, and you can see where it's being talked about. There's a good use case here, in that citation isn't the only metric for a lot of research.

‍

35:50

If you've written a policy paper for nurses and you want to see how well it's being picked up, you can search across nurses' Twitter accounts and things like this to see what's happening.

‍

35:59

So we'd like to see that happening here. But you can also see that the article processing charge spend file has had six citations. So if they had just released it on a blog post, they wouldn't be able to see where people are talking about their research in the traditional literature as well. So, you can click through and see the traditional papers that have cited this particular dataset.

‍

36:22

In terms of the content that they are sharing, it's more than their own research and their own outputs.

‍

36:27

They also have things like, you know, Co-creating Africa-UK Research Management Solutions. These white papers, Accelerating Open Access and Plan S, the final project report.

‍

36:44

The Society Publishers Accelerating Open Access and Plan S presentations. So as well as the white paper and the project report, what are the presentations, who has downloaded them, and where is it being picked up?

‍

36:58

But you can see here that this white paper, again, has had some good impact: lots of views and downloads, some citations, and an Altmetric score.

‍

37:09

You can click through on that Altmetric score and dig into what I was just saying about who is talking about it.

‍

37:17

This stuff is published by you as a funder, where you might want to be able to start those conversations, right? So this was a Wellcome open access policy review. And you can see the different places where it's being blogged about, and some book reviews. So you might want to talk to those authors.

‍

37:35

You can reach out to different people to find experts who have an opinion on the content that you're publishing.

‍

37:45

And if we talk about open access, this is the last area.

‍

37:49

I found it really, really interesting to see in those 10 years that Figshare has been going.

‍

37:56

We've also seen, in another area of the industry, the move from just 40%. Sorry.

‍

38:05

That's 20% of publications being open access, to over 50% in 2020. So these are research articles, traditional publications, that are made openly available to everybody. That has been a big driver for this more equitable research.

‍

38:23

More people having access to the content, and building on top of it.

‍

38:27

The other side to it though, is a lot of this is gold open access where you have.

‍

38:36

Researchers paying to publish. The funders are usually the ones who are paying to publish, which is a model that ends up with ultimately more open access. So people like me, who are not at an academic institution anymore, can read the research, which is a fantastic move forward.

‍

38:54

So I'm a big proponent of open access publishing. And then this purple line here is green open access, which is: I've had to publish behind a paywall, but as a researcher, I can make the author copy of my paper available.

‍

39:12

And so, whilst it's not grown at the same rate, it also does contribute a large proportion of this open access publishing.

‍

39:22

And so, as a funder, you might want to have a copy of all of the papers that you funded as an institution, as a research group, as somebody who was interested in the open research outputs, and not everybody has a place to do this.

‍

39:38

And so we do provide repositories for organizations to have a copy of every paper that's coming out of their organization. This is an example from the Francis Crick Institute, just up the road from me in London, where you can see they have a copy of all of the green and all of the gold open access papers in one place. So they can see all of the outputs. You click through, you see the metadata, you can cite it, and then you can pop it full page and read it in whatever screen reader you're using.

‍

40:06

So, it's a very powerful tool for getting to 100% green open access, or providing researchers with a way to get to 100% green open access. So, 100% open access for all of the research. So that's all of the different examples I wanted to talk about today. As I mentioned, if anybody has any questions, I'm happy to answer them now. If anybody wants to follow up and hear more, feel free to e-mail me at mark@figshare.com. And as I said, whether it's about getting a repository for your own systems.

‍

40:35

Whether it's data or anything else, or whether it's about getting more information about who you should be talking to locally. We work globally. We work with organizations in North America, we work with organizations in Continental Europe, with organizations in Australia. So, we work with a lot of the different organizations and the infrastructure that you should be talking to. So, that's everything from me.

‍

41:02

I'll stop showing my screen.

‍

41:04

And I'm happy to answer any questions if there are any.

‍

41:09

OK.

‍

41:11

Yes, so if you do have any questions, please pop them in the chat, or the question box area.

‍

41:19

One thing I might mention, which also gives people a chance to type their questions out, is that Mark mentioned the State of Open Data 2021 survey and report.

‍

41:32

The 2022 survey just closed and we've seen some of the sort of key takeaways.

‍

41:41

There's going to be quite a lot on data management plans.

‍

41:46

And survey respondents' attitudes towards data management plans and the roles that funders play, sort of off the back of the NIH policy going into effect in January.

‍

42:00

Sort of, what's the vibe of researchers when it comes to data management plans now?

‍

42:06

And what kind of areas can help them meet those requirements? So, keep your eyes peeled end of September for this year's survey and report to come out.

‍

42:20

Very much looking forward to Just kidding.

‍

42:31

Millimeter sinews training, See if there's any questions that have come through.

‍

42:41

Yeah.

‍

42:42

Yeah, there's a question around PIDs.

‍

42:46

So what would incentivize funders to provide PIDs for their funding?

‍

42:52

Yeah, for their funding and the grants. At Figshare, we have a list of our organizational beliefs, because we do want to drive change in the space, and one of those is persistent identifiers for everything. So PIDs for everything.

‍

43:15

And we tend to move as an organization with the infrastructure once it has picked up to a point.

‍

43:21

So we have, obviously, for different research outputs, different persistent identifiers: DOIs for datasets. We can also support Handles, so you don't have to duplicate ... of DOIs for green open access.

‍

43:38

We have persistent identifiers in ORCIDs, and ROR is the latest one that we are supporting.

‍

43:50

We're adding support for this, which is for research organizations, so you can get to a point where you get a graph of: this person, with this funding, did this. And Dimensions, which is a sister organization, has persistent identifiers for the funding information.

‍

44:09

But, I think it needs to come from the funders, and they need to be thinking about what is it that they want to be tracking.

‍

44:16

You can go further down the line. I think we'll get to the point where you have persistent identifiers for machines and things like this. But if you look at the big persistent identifiers, the big things that would be in that graph, what would you want to know about, right?

‍

44:31

It would be: which person, from which organization, with which funding, did which research? And what was the impact of it?

‍

44:40

So those are the things I think we need persistent identifiers globally for. And the other one I was going to mention is on a project level or on a group level; we're seeing persistent identifiers coming through there as well.

‍

44:53

So I think the answer to that question is, as a funder, what are the questions we want to answer? And what do we need to do in order to make that happen? And so persistent identifiers for funding is an essential one.

‍

45:10

Here, there's a question around data produced in disciplines where, I mean, you sort of touched on this at the beginning, where maybe data isn't the right term, or maybe the researchers in that area don't often see their

‍

45:32

outputs as data.

‍

45:34

And so there's a question around, is it likely that these disciplines don't produce data?

‍

45:42

And maybe building on that, sort of what kind of role might funders have in educating, if they are funding research in these fields? How do they educate around what data is, and data sharing?

‍

45:57

Yeah, I mean, I think the big gaps here are around the education for researchers. Funders are going to have to do a lot more, I think, educating researchers that there is an expectation that they don't just publish the papers or the monographs. The question is this.

‍

46:10

What are the products of the money we spent? Basically, I'm a funder, I gave you a million dollars, you gave me this paper, lovely. That is never going to go away.

‍

46:19

That is the, the story of what you did, but I want to see the outputs.

‍

46:25

I want to see the dance recital. You know, it's a visual thing as opposed to a text document thing.

‍

46:33

I think people producing files in a format, and publishing them in the format that they were generated in, is something that lowers the accessibility level. You know, it used to be that you had to convert files to publish in certain papers, from something to a PNG, and you are taking data away.

‍

46:54

You were basically taking information out of those files.

‍

46:57

And I think that's the thing that we want to move away from, by supporting workflows that allow researchers to publish things in the format they were created in. That's why we do the previews of the file formats: we accept any file format and preview it in the browser.

‍

47:12

And just explaining to researchers that that is the expectation: that we paid for you to publish research.

‍

47:21

And research means so many things.

‍

47:24

The caveat on the other side of it is, you might have so much more impact.

‍

47:29

With these different types of files that you produce. I don't think there is an academic field that doesn't produce files.

‍

47:37

I've thought about this a lot over the last decade, and you see some surprises as well in the amount of it. Obviously, there's the life sciences.

‍

47:46

We focus a lot of development on working on, you know, correct ways to open data. So, "this dataset isn't available, but you can request access to it": we support that for the more medical life sciences. But you also see the amount of social sciences and humanities data, telling stories with different types of files, you know, whether it's maps or anything else.

‍

48:14

There's a lot of it.

‍

48:17

Thank you.

‍

48:19

I don't think there are any other questions. As Mark said, if anything pops into your head, feel free to get in touch. And we'll put our contact details in the follow-up e-mail with the webinar recording as well.

‍

48:30

So, that just leaves me to say thank you so much for coming today to the webinar.

‍

48:34

And thank you very much to Mark for delivering it, and I hope everyone has a great rest of the day.

‍

48:39

Thank you.

‍
