Arkivum 101 Series: Preservation Versus…

By Tom Lynam

A common question we get asked at Arkivum is how digital archiving and preservation differ from other typical storage approaches. In this article I’m going to explore some of the common approaches we’re asked to differentiate a solution like Arkivum’s from.

 

How is digital preservation different?

At a high level, the simple answer is that a digital preservation solution is purpose built to actively maintain data that needs to be retained for a long period of time. It ensures that the data is accessible, available and usable for as long as it is needed. In addition, for organisations working in GxP environments, I would argue that the other solutions outlined below are neither compliant with nor aligned to the ALCOA+ principles for retaining digital content over longer periods of time.

The other solutions I will discuss in this post have not been designed to archive or preserve data; they have been built for other purposes, which I will outline below. This is not a post to critique other technologies, but instead to highlight what they have been designed for and to identify typical gaps in long-term retention capability.

There is also no definitive point in time that the requirement for preservation kicks in, as it will differ by organisation and the detail behind the digital content that they need to retain. I’d highly recommend looking at a risk-based approach to assess your data and understand whether or not a preservation solution is right for your organisation. I’m not going to cover this in any more detail here, but this webinar recording is a good starting point if you’re interested in finding out more about this topic.

 

File Sharing Platforms (e.g. SharePoint)

A common approach we see (often used as a stop gap) is to store data within a file sharing platform such as SharePoint or Dropbox. These are designed, as the name suggests, to organise and share data which is in use day to day. For long-term retention, they leave several typical gaps:

  • Perhaps the most critical omission is digital preservation capability. Primarily, this means automatically maintaining long-term preservation copies of every record. Within a file sharing platform this process would need to be managed manually, as would staying up to date with digital preservation good practice.
  • While many of these systems are built on large cloud infrastructure which provides inbuilt integrity checking and stores data in multiple locations, there is often a lack of transparency on how the data is stored. Additionally, there is limited reporting for users to evidence that integrity is being maintained throughout the data lifecycle. For example, within SharePoint it’s hard to view checksums or obtain verification that a file hasn’t changed while stored.
  • For GxP compliant organisations, it is often impossible or extremely difficult/costly to validate these systems, as they are not purpose built for retaining GxP data.
  • In some instances these systems will also modify files after they are first uploaded, changing property information inside the file.
  • Non-GxP systems such as file sharing platforms are also likely to update frequently and automatically, without prior notice or the opportunity to validate in advance.
  • Finally, it is also important to note that consideration must be given to the capture of audit trails and metadata, ensuring that these systems meet organisational requirements.

The majority of file sharing sites have not been built with GxP focused organisations in mind, and hence not only lack a lot of specific preservation capability, but also standard functionality expected by the industry.
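
To make the point about checksums concrete, here is a minimal sketch of what evidencing integrity can look like: record a checksum for each file at ingest, then re-hash later and report anything that has changed or gone missing. The function names (`sha256_of`, `record_fixity`, `verify_fixity`) and the in-memory manifest are assumptions made for illustration; this is not how SharePoint, Arkivum or any other product implements integrity checking.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest mapping each file path to its checksum at ingest."""
    return {str(p): sha256_of(p) for p in paths}


def verify_fixity(manifest):
    """Re-hash every file and report 'ok', 'changed' or 'missing'."""
    report = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.exists():
            report[name] = "missing"
        elif sha256_of(path) == expected:
            report[name] = "ok"
        else:
            report[name] = "changed"
    return report


if __name__ == "__main__":
    # Hypothetical demo: a record is ingested, then altered while stored.
    with tempfile.TemporaryDirectory() as tmp:
        record = Path(tmp) / "record.txt"
        record.write_bytes(b"original content")
        manifest = record_fixity([record])
        print(verify_fixity(manifest))  # file is unchanged at this point
        record.write_bytes(b"silently modified")
        print(verify_fixity(manifest))  # file no longer matches the manifest
```

In practice the manifest and each verification run would themselves need to be stored and audit-trailed, so that the organisation can show a file has not changed over its whole lifecycle.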

 

Retaining within a Source System

It is often tempting to leave data within an existing source system before moving it into the archive (and often these systems provide an ‘archive’ option). This is potentially misleading, as archival in this instance usually means revoking access and locking down the system.

These systems therefore have no inbuilt preservation capabilities, such as automatically maintaining long-term preservation copies of every record (as mentioned above). Successful digital preservation (i.e. guaranteeing legibility and use) is not simply a technology challenge; it needs ongoing expertise and supporting processes to achieve.

Like file sharing sites, many source systems are built on the same cloud providers, which provide integrity checks and store data in multiple locations. They also face the same challenges: a lack of transparency, and difficulty in evidencing data integrity checks and data health within the systems.

In our experience it is more cost-effective to consolidate long-term data into a single repository instead of retaining data within multiple source systems. In addition, a consolidated archive is much more accessible to stakeholders, which not only makes it easier to demonstrate compliance, but also makes it possible to unlock future value from long-term data (and in turn avoid vendor lock-in later down the line).

It is also worth considering the benefits of migrating and archiving data as close as possible to the end of a project, while all those familiar with the data are still with the organisation and the project. The migration process is then smoother, with any issues that arise more easily resolved. In some instances where third parties have been involved, leaving it too late can make it impossible to rectify data integrity issues years later.

A final thought on leaving data within source systems is validation. While these systems will likely have been validated to ensure they were fit for their original purpose, they won’t have been validated to archive and preserve data (or to have the preservation capability referenced previously). Unless the system has been validated against these requirements and capabilities, it is not a validated archive/preservation system.

 

Document or Quality Management Systems (DMS/QMS)

DMS and QMS systems present similar challenges to those I have already addressed above, although there are some unique considerations worth discussing here. These systems are often specifically designed for GxP compliant organisations, and so have been validated and provide more of the functionality that is expected by the industry.

In some instances, if correctly validated as an archive, they may even be fit for purpose for retaining data for several years. What they tend to lack (as with all of the other approaches) is the digital preservation capability to maintain data over the long term. As noted at the beginning of this article, whether that matters will depend on the requirements of the organisation and the specifics of the dataset that needs to be retained.

For organisations where data does need to be kept for decades, it is important to consider, qualify and validate solutions to ensure that they have (as already stated) digital preservation capability, i.e. identification of file formats, automatically maintaining long-term preservation copies of every record, supporting processes and expertise etc.
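
As an illustration of the first capability in that list, identification of file formats, the sketch below guesses a format from the leading "magic" bytes of a file's content rather than trusting its file extension. The short signature table and the `identify_format` function are assumptions made for this example; dedicated format identification tools rely on much larger, maintained signature registries and more sophisticated matching.

```python
# Illustrative signatures only: a real registry is far larger and
# also matches on offsets, trailers and container contents.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data: bytes) -> str:
    """Return a format label based on the leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    # Hypothetical inputs: the first bytes of three files.
    print(identify_format(b"%PDF-1.7\n..."))
    print(identify_format(b"PK\x03\x04rest-of-archive"))
    print(identify_format(b"plain text has no signature"))
```

Knowing the true format of each record is what allows a preservation system to decide when a long-term preservation copy in a more sustainable format is needed.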

As mentioned (although very much worth reiterating!), digital preservation is something which must be actively maintained over the entire lifecycle of the data, including staying up to date with the latest good practice guidance and adjusting accordingly.

These systems also fall into the same trap of struggling to report on and evidence data integrity over the entire lifecycle of the data.

 

Backup Systems

A final comparison to cover today is backup systems. Although this is a less common question, I think it is good to address the difference between backup and preservation, as it is something that can cause confusion.

There are several different approaches to backing up data, but generally speaking their primary purpose is to support an organisation with disaster recovery. In the case of a disaster, an organisation is able to revert to a previous backup and minimise the amount of lost data and disruption to day-to-day operations.

Backup systems therefore do not have archiving or preservation capability (e.g. data integrity checks, preservation copies, audit trails, metadata etc.) and are focused on backing up systems like those mentioned in the previous sections.

 

Fit for purpose digital preservation for long-term data

Digital archiving and preservation solutions are purpose built to ensure data is accessible, available and legible for as long as it needs to be retained (and in some cases forever). When comparing them to other systems which store or even claim to archive data, it’s important to weigh the requirements of your organisation against the preservation capability of these systems.

Once your requirements and risk appetite are fully understood, it is both possible and important to assess and validate prospective suppliers against these requirements.

I’ve not mentioned it previously within this article, but consideration must also be given to inspection readiness. Much of what I have discussed directly supports inspection readiness and preparedness; your solution for long term retention should allow inspectors to access content directly and make it easy to show that data integrity is being maintained in line with regulations and guidance such as the ALCOA+ principles.

In summary, suppliers and solutions not focused on long term data preservation will struggle to meet these requirements, and organisations using these solutions to retain data are ultimately taking a higher risk approach to long term compliance and alignment to ALCOA+.

 

If you’re interested in reading more about this topic, I’ve listed some other resources below which might be of interest:

  • Arkivum 101 series post on Safeguarding and Preservation.
  • eBook: A guide to archiving and preserving essential GCP records and data.
  • eBook: A guide to assessing third-party archiving solutions.

Tom Lynam

Tom is the Marketing Director at Arkivum. He joined the business in January 2020 tasked with driving new business growth and building the brand into new sectors such as Pharmaceutical and Life Sciences. He has over 12 years’ experience in several diverse marketing leadership roles across technology and professional services organisations.
