EOSC EDEN: Adding Forever to FAIR

Blog Matthew Addis

Arkivum is excited to be a member of the recently launched EOSC EDEN project. The 8M€ project, which is funded by the European Union, started on 1 January 2025 and seeks to advance digital curation and preservation across the European scientific research community. Arkivum is one of multiple organisations from across Europe that are collaborating in EDEN to help ensure that Europe’s valuable scientific research outputs remain accessible and usable for generations to come.

EOSC: The European Open Science Cloud 

Before looking at EDEN, let’s take a step back and look at EOSC, which is the European Open Science Cloud.  As stated by the EOSC Association, “the ambition of the European Open Science Cloud, known as EOSC, is to develop a ‘Web of FAIR Data and Services’ for science in Europe.”  The key concept here is FAIR.  FAIR stands for Findable, Accessible, Interoperable and Reusable and is a set of principles that helps to ensure data can be found, trusted, used and reused – whenever needed and by whoever needs it – including by machines as well as by humans. As you can imagine, achieving the FAIR principles includes many challenges, not least around metadata, persistent identifiers, discovery, ease of access, good data management, funding and more.   

FAIR Today Doesn’t Mean FAIR Forever. 

But that’s just the start. If data is FAIR today, that doesn’t mean it will automatically be FAIR forever. FAIR has a time dimension. Over time, FAIRness erodes and decays if data and services aren’t sustainable and if data isn’t properly preserved. Formats become obsolete, links and references break, services fail, bad actors hit services with cyber-attacks, disasters happen, money runs out, data is lost. And even if the data isn’t physically lost, it might as well be if it can no longer be found, or understood, or trusted. The Digital Preservation Coalition (DPC) looked at what is needed to make data FAIR over the long term in EOSC in the splendid FAIR Forever project.

TRUST Your Repository.

Long-term access to FAIR data is where Trustworthy Digital Repositories (TDRs) come in. To ensure data remains FAIR, it needs to live somewhere where it is properly looked after and is made available to the communities who use it. This is the role of a data repository. Just as data has its FAIR principles, so too do the repositories that hold this data. They need to support Transparency, Responsibility, User focus, Sustainability and Technology – otherwise known as TRUST. Repositories can be national or international services, they can be discipline specific, they can be small or large – but they all have a job to do and that’s to be custodians of data for future generations. That’s not an easy task in today’s volatile and uncertain digital world!

The Role of EDEN.   

Long Term Digital Preservation (LTDP) is a key element of providing a Trustworthy Digital Repository. This is where EDEN fits in. EDEN looks to establish common standards for digital preservation and data curation across EOSC.

Digital preservation is a well-established discipline. For example, it’s over 25 years since the first drafts of the Open Archival Information System (OAIS) reference model were released. Last year saw the third iteration of OAIS as an ISO standard. The FAIR principles have been around for over a decade. The need for research data to be FAIR and to be held in repositories that follow TRUST is clear.

So what’s the problem?  Just combine FAIR, TRUST and LTDP and you’re good to go?  

If only it was that simple!    

On the one hand, good practice for doing LTDP is increasingly well understood (the NDSA Levels of Preservation and DPC RAM are two of my favourite examples). We’ve got certification of Trustworthy Digital Repositories through CoreTrustSeal (CTS), which includes digital preservation in its core requirements. There’s more on the way with the development of levels of curation and preservation in CTS to help ensure the approach taken by a TDR is proportionate and cost effective for a given community, the type of content it holds, and the importance and value of continued access to that content. That’s all positive.

On the other hand, TDRs inevitably suffer from what is pejoratively called ‘garbage in, garbage out’. If the quality of data going into a repository isn’t what it needs to be, then the repository will have a hard time preserving and managing that data over time. The repository’s ‘designated community’ of users may not have what they need to subsequently understand, use and trust that data in the future. And if the repository keeps the ‘wrong data’, then the content in the repository may be incomplete, it may not have the value it should, or the repository could spend time and effort on holding data that no one ever wants or uses. Decisions on data quality, value and use also change over time. Data needs to be appraised when it is first acquired and then regularly re-appraised afterwards.

Repositories and the types of data they hold are numerous. At the time of writing, there are over 3000 repositories listed in re3data, which is a Registry of Research Data Repositories. They cover different subjects, content types, disciplines, geographies and institutions. Yet there are very few practical standards for exchanging data in a common way for the purposes of long-term archiving and preservation. In the OAIS model, this is the world of information packages (SIPs, AIPs, DIPs). This lack of common standards hampers ease of ingest, i.e. getting data into repositories and archives. It hampers ease of transfer, for example migration and exchange of data between repositories. And it hampers ‘supply chains’ of specialist services and providers that provide ‘back end’ preservation and archiving capabilities used by repositories.
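
To make the idea of an information package a little more concrete, here is a minimal sketch in Python of one ingredient that package approaches commonly share: a manifest of checksums that lets the receiving service confirm that what arrived is what was sent. This is purely illustrative and is not an EDEN or OAIS specification. The function names (build_manifest, verify_manifest) and the package layout are assumptions made up for this example.

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to the package) to its SHA-256 checksum."""
    manifest = {}
    for path in sorted(package_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(package_dir).as_posix()] = digest
    return manifest


def verify_manifest(package_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the listed files that are missing or whose checksum has changed."""
    current = build_manifest(package_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The sender would call build_manifest before transfer and ship the result alongside the data; the receiver would call verify_manifest on arrival and reject or query the package if the returned list is not empty. A real package specification would also need to cover metadata, structure and identifiers, which this sketch leaves out.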

That last point about supply chains and preservation and archiving services is important. Many repositories are not conceived or operated with long term preservation in mind. They may not be equipped, or even consider it in their remit, to ensure data they hold will be accessible and usable over timescales of a decade or longer. Different approaches can be used to address this problem. These include allowing researchers to find and use services that do provide the necessary archiving and preservation capabilities, making it easy for repositories to embed or connect to suitable archiving and preservation services in order to extend their own capabilities, or enabling easy transfer of data to other better equipped repositories or to specialist long-term archiving and preservation services when the time comes. These all involve the need to better understand, find, select and use appropriate archiving and preservation services within EOSC.

All Good Things Come in Threes. 

This leads to three areas that will be tackled by EDEN: 

  • Create a framework and practices for identifying candidates for long-term digital preservation based on data quality, benefit and use, including re-appraisal over time.  This is about selection, appraisal, data lifecycle management and more generally assessing the FAIRness of data over time.  
  • Define standards and protocols for the exchange of data between services for LTDP, including the need for interoperability, package specifications and APIs.  This is about a common approach to the transfer of data, for example sending it to and from preservation and archiving services so that LTDP can be better embedded and adopted in EOSC.   
  • Establish attributes and registries that can be used to describe the LTDP capabilities of repositories, archives and other services.  This allows discovery and assessment of LTDP capabilities, for example when looking for repositories for depositing data or looking for services that can add LTDP.   EDEN will provide reference implementations for a range of LTDP services. 
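
As a rough illustration of the third area, a registry of LTDP capabilities could be as simple as a list of services, each described by a set of attributes, plus a way to find the services that offer everything a depositor needs. The sketch below is hypothetical throughout: the ServiceRecord class, the find_services function, the service names and the capability labels are all invented for this example and are not attributes defined by EDEN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    """One registry entry: a service and the LTDP capabilities it declares."""
    name: str
    capabilities: frozenset[str]


def find_services(registry: list[ServiceRecord], required: set[str]) -> list[str]:
    """Return the names of services that declare every required capability."""
    needed = frozenset(required)
    return [record.name for record in registry if needed <= record.capabilities]


# Hypothetical registry contents, for illustration only.
REGISTRY = [
    ServiceRecord("Repository A", frozenset({"fixity-checking"})),
    ServiceRecord("Archive B", frozenset({"fixity-checking", "format-migration"})),
]
```

With this registry, find_services(REGISTRY, {"fixity-checking", "format-migration"}) would return only "Archive B", while asking for "fixity-checking" alone would return both entries. The hard part in practice is agreeing the attribute vocabulary, not the lookup.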

 The above are largely about ensuring that the right data is in the right place at the right time and is looked after in the right way so that it remains FAIR over its whole lifetime.  And quite right too! 

The targets above are just some of the outputs of EDEN. Other areas of the project are looking at discipline specific requirements, accelerating uptake of LTDP in EOSC through the production of toolkits, providing training and support, and establishing a network of LTDP services that covers repositories, archives and service providers.

Next Steps 

The EDEN project has only just started, but work is already well underway. Arkivum is focussed on some of the more technical aspects of EDEN such as systems requirements and specifications, which will then lead into development of reference implementations and demonstrating how LTDP services can be delivered into EOSC.

Rather than go into detail right now, I’ll blog more about EDEN and our role as we go along.  Not least, this post is probably long enough as it is, although perhaps not unreasonably so given that EDEN is an 8M€ project and will take place over the next three years so there’s plenty to cover.   

There’s also no shortage of terminology to get to grips with (FAIR, TRUST, TDR, OAIS, LTDP, EOSC and EDEN are just a start), so hopefully this blog has laid some of the groundwork for future posts.

It will be fascinating to see how the EDEN project develops and the impact it makes.  It will also be a great learning and collaboration opportunity to work with so many fantastic partners.  There is always something new to learn in the world of digital preservation.  Understanding and demonstrating how LTDP can be applied to good effect in the European Open Science Cloud will be no exception!  

Footnotes 

I did say that this post was probably long enough already, but just to make doubly sure here are a few more points!

  • I’ve used the term repository a lot in the blog above, to cover the holding of data for both the short and long term. Inside EDEN, there is a more nuanced distinction. Trustworthy repositories are places to deposit and access content over the relatively short term, e.g. a few years or maybe a decade. They may or may not include digital preservation capabilities. Trustworthy data archives, however, are places to hold content for the long term: they do provide LTDP services, and they conform to the OAIS model. In the text above, I use repository to cover both – partly for simplicity and partly because end users just want places to put their data, to find other people’s data, and to know that the data will be looked after and remain FAIR for however long is needed.
  • EDEN is one of two new projects funded by the European Union that are collaborating on digital preservation and trusted digital repositories for EOSC.   The other project is called FIDELIS.   FIDELIS will establish a European network of FAIR-enabling trustworthy digital repositories that will live on after both projects finish.  The collaboration between EDEN and FIDELIS is probably worth a separate blog post of its own. 
  • This is the second EOSC project that Arkivum has been involved in. Between 2020 and 2022, Arkivum along with Google Cloud Platform completed all three stages of the ARCHIVER project. This project sought to design, develop and test petabyte scale digital preservation solutions with scientific research organisations including CERN, EMBL-EBI, PIC and DESY.   ARCHIVER was recognised in 2022 by receiving the Digital Preservation Coalition (DPC) award for Collaboration and Co-operation (sponsored by the International Council on Archives (ICA)).   CERN led the ARCHIVER project and is now a partner in EDEN alongside Arkivum.   

EOSC EDEN (Grant Agreement number: 10118801) is funded by the European Union. Views and opinions expressed are, however, those of the author only and do not necessarily reflect those of the European Union or the Agency. Neither the European Union nor the granting authority can be held responsible for them.

Matthew Addis

Matthew is CTO and Founder of Arkivum, responsible for technical strategy. Matthew previously worked at the University of Southampton IT Innovation Centre. Over the last fifteen years, Matthew has worked with a wide range of organisations in the UK, Europe and US on solving the challenges of long-term data retention and access.
