As a digital preservation provider, we are often asked about alternative tools for safeguarding pharmaceutical documentation. One we’re asked about particularly often is Dropbox.
Today I wanted to take the opportunity to address why this isn’t an appropriate archiving solution, especially in relation to eTMF documentation, GCP and similar regulated data.
First and foremost, the primary function of Dropbox is to aid collaboration. It is a repository for content authoring, review, approval, collaboration, and day-to-day management of active content. As such, different authors, project managers, stakeholders and organisations can access permitted files, make amendments, download, upload and more.
But it has not been designed to archive and preserve data, particularly the type of data stored by pharmaceutical and life sciences organisations which is often heavily regulated.
Below I’ll explore some of the key capabilities that Dropbox does not offer when compared to a purpose-built digital archive.
1 – An ‘archive’ folder isn’t an archive
Firstly, let’s distinguish between a digital archive and a folder within a repository (such as Dropbox) titled ‘archive’.
As mentioned above, Dropbox has been designed as a collaboration tool for active or live content used on a day-to-day basis. It does not offer the capability that should be inherent within any digital archive. By comparison a digital archive is a separate repository designed to safeguard and provide guaranteed access to the data, for as long as it needs to be retained. Some of the features of an archive should include:
- Data stored in multiple different geographic locations, ensuring that if anything were to happen to it or the hardware storing it, it can be easily retrieved and duplicated again.
- Checks on all data entering the archive in the first place – if bad data goes into an archive or archive folder then when it comes time to access that data it will still be bad. A digital archive will automatically check every file upon entry into the archive to ensure that it is not corrupted and has the right associated metadata (we’ll cover metadata in a later section).
- Automated checks on the integrity of every file throughout the retention period to identify any loss or corruption of the data. Corrupted data can be easily deleted and replaced with uncorrupted versions of the same file from the second or even third storage location.
- Secure, managed access that is restricted to those who should have it.
These are just a handful of features of a digital archive but are absolutely essential to guaranteeing the long-term safeguarding and protection of highly regulated pharma data such as the eTMF and other clinical/GCP data.
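To make the integrity checks described above more concrete, here is a minimal sketch in Python. It assumes that a SHA-256 checksum was recorded for each file when it entered the archive and that the replicas are ordinary files in separate locations. The function names `sha256_of` and `check_and_repair` are hypothetical and purely illustrative; they do not describe how any particular archive product works.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_repair(replicas: list[Path], expected: str) -> list[Path]:
    """Compare every replica with the checksum recorded at ingest.

    Missing or corrupted replicas are overwritten from an intact copy.
    Returns the replicas that had to be repaired.
    """
    healthy = [p for p in replicas if p.exists() and sha256_of(p) == expected]
    if not healthy:
        raise RuntimeError("no intact replica left; manual recovery needed")
    damaged = [p for p in replicas if p not in healthy]
    for path in damaged:
        shutil.copyfile(healthy[0], path)
    return damaged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Three directories stand in for three storage locations.
        sites = [Path(tmp) / name for name in ("site_a", "site_b", "site_c")]
        for site in sites:
            site.mkdir()
        replicas = [site / "protocol.pdf" for site in sites]
        for replica in replicas:
            replica.write_bytes(b"signed protocol v1")
        recorded = sha256_of(replicas[0])  # checksum captured at ingest

        replicas[1].write_bytes(b"signed protocol v?")  # simulate corruption
        repaired = check_and_repair(replicas, recorded)
        print([p.parent.name for p in repaired])  # ['site_b']
```

In practice a check like this would run on a schedule across the whole archive, and each repair would itself be logged.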
I’ll now explore some of the additional features that a compliant, inspection-ready archiving and digital preservation solution must have.
2 – Digital preservation to guarantee readability and usability
Digital preservation is the process of guaranteeing that data can not only be accessed, but is also readable and usable whenever it is accessed in the future. As software and hardware evolve, it is important to ensure that each file is maintained in a format that can be read and used by whatever device is used to access it in the future.
Dropbox has no preservation capability because of its focus on live data. What you upload will be the same version you access in 10 years’ time (or however long it is) and with that comes the possibility that your hardware and/or software will not be able to successfully read it. In short, the old format could be obsolete or the file corrupted.
A digital preservation solution will ensure that every file is automatically maintained in a preservation format, in addition to the original copy. This maintains the integrity of the original file in line with many regulations, in addition to guaranteeing future readability of that document.
Digital preservation is crucial to ensuring that your files and documents remain both readable and usable for the entire retention period set by regulators. For example, in January 2022 EU Regulation 536/2014 comes into effect, stipulating a minimum retention period of 25 years for the eTMF. Having a digital preservation solution with supporting processes in place will be essential for compliance.
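As a rough illustration of keeping a preservation copy alongside the untouched original, the sketch below plans which files need one. The format table `PRESERVATION_TARGETS`, the `PreservationPlan` class and the `plan_for` function are all assumptions made for this example; a real solution would rely on a maintained format registry and would perform and validate the actual conversion.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Illustrative policy only: source formats mapped to an assumed preservation format.
PRESERVATION_TARGETS = {
    ".doc": ".pdf",   # for example, rendered to PDF/A
    ".docx": ".pdf",
    ".bmp": ".tiff",
}


@dataclass(frozen=True)
class PreservationPlan:
    original: str                      # always retained unchanged
    preservation_copy: Optional[str]   # None when no extra copy is planned


def plan_for(filename: str) -> PreservationPlan:
    """Decide whether a file needs a preservation copy under the policy above."""
    path = PurePosixPath(filename)
    target = PRESERVATION_TARGETS.get(path.suffix.lower())
    if target is None:
        return PreservationPlan(filename, None)
    return PreservationPlan(filename, str(path.with_suffix(target)))


if __name__ == "__main__":
    print(plan_for("visit_report.DOCX"))
    print(plan_for("scan.tiff"))
```

The point of the design is that the original is never replaced: the plan only ever adds a second rendition, which is what keeps the original's integrity intact.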
3 – Leveraging metadata for easy and quick search
A tool such as Dropbox doesn’t support the same search functionality as a digital archive.
A major component of this is metadata – or more plainly – data about your data. This information helps to categorise files by providing easy to find tags, which can be searched for across datasets that can include tens, if not hundreds of thousands of files.
To illustrate metadata, let’s take a photograph as an example. An organisation can store a digital copy of a photo and have accompanying details to search by, including when it was taken, who by, where it was captured and perhaps even some details about what it is of. Years into the future, someone can easily find relevant photos by searching for photos from:
- A particular photographer
- A specific location
- A range of different dates
- Different subject matter and so on…
This is achieved in a digital archive via metadata extraction, which is applied to every document uploaded to the archive. This means ensuring that every file has the following metadata associated with it:
- Descriptive metadata (title, author, date of publication, description etc.)
- Technical metadata which describes technical properties of a digital file or the particular hardware and software (where the data resides, the structure of the data etc.)
So, for example in the context of the eTMF and clinical trial data, it should be a simple process to quickly find a specific document requested during an inspection or valuable information related to the repurposing of a drug… all because of metadata.
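The idea can be sketched in a few lines of Python. This is a toy, in-memory example under stated assumptions: the required fields in `REQUIRED_FIELDS`, the `MetadataIndex` class and the sample documents are all hypothetical, and a real archive would use a proper search index rather than a list.

```python
from dataclasses import dataclass
from datetime import date

REQUIRED_FIELDS = ("title", "author", "created")  # assumed ingest policy


@dataclass(frozen=True)
class Record:
    file_id: str
    metadata: dict


class MetadataIndex:
    def __init__(self) -> None:
        self._records: list[Record] = []

    def ingest(self, file_id: str, metadata: dict) -> None:
        """Reject files whose required descriptive metadata is missing."""
        missing = [f for f in REQUIRED_FIELDS if not metadata.get(f)]
        if missing:
            raise ValueError(f"{file_id}: missing metadata {missing}")
        self._records.append(Record(file_id, dict(metadata)))

    def search(self, created_from=None, created_to=None, **equals) -> list[str]:
        """Return ids of files matching exact field values and a date range."""
        hits = []
        for record in self._records:
            meta = record.metadata
            if any(meta.get(key) != value for key, value in equals.items()):
                continue
            if created_from and meta["created"] < created_from:
                continue
            if created_to and meta["created"] > created_to:
                continue
            hits.append(record.file_id)
        return hits


if __name__ == "__main__":
    index = MetadataIndex()
    index.ingest("doc-001", {"title": "Monitoring visit report",
                             "author": "A. Jones",
                             "created": date(2019, 3, 4),
                             "site": "Leeds"})
    index.ingest("doc-002", {"title": "Signed protocol amendment",
                             "author": "B. Patel",
                             "created": date(2020, 7, 1),
                             "site": "Leeds"})
    print(index.search(site="Leeds", created_from=date(2020, 1, 1)))  # ['doc-002']
```

Because incomplete records are refused at ingest, every file in the index is guaranteed to be findable by the required fields later on.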
By comparison, while Dropbox captures some limited metadata, it can be:
- Incomplete, because it does not require certain metadata to be attached to individual files with each upload
- Limited in scope, because it does not capture a full range of descriptive and technical metadata
- Or even potentially misleading by, for example, confusing the creation date of a physical document with that of its digital copy.
Using Dropbox to archive this valuable data runs the risk that it cannot be quickly found when it is needed the most.
4 – Capturing the complete audit trail
Audit trails provide a complete history of access and amendments made to each document, and they are a critical component of data integrity. A complete audit trail of documentation and data is a prerequisite for several regulations, including FDA 21 CFR 11.10 and ICH E6 (R2), and failure to comply can result in severe fines, delays to manufacturing and reputational damage.
“Use of secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records. Record changes shall not obscure previously recorded information. Such audit trail documentation shall be retained for a period at least as long as that required for the subject electronic records and shall be available for agency review and copying,” [FDA 21 CFR 11.10(e)].
Dropbox’s audit trail capabilities are limited to logging “user and administrator log-on and file sharing activities for Dropbox Business users”. These logs can be made available to administrators in “industry standard read-only format”.
However, these features are not enough for a complete audit trail which should include details such as:
- Who has logged into the system and when.
- At the individual file level, a record of access, downloads and modifications, including changes to the metadata and retention rules.
- A log of the reason for any change to a file or its associated metadata.
- Continuity of the audit trail regardless of any changes to the file (for example across preservation and original copies).
- Simple and complete reporting on the audit trail.
Again, using Dropbox can be a high-risk approach when seeking to achieve a compliant archive. While it captures a limited audit trail, it does not provide the level of functionality and detail required by many regulators.
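To show what a file-level audit trail with reasons for change might look like, here is a minimal sketch. It uses hash chaining (each entry stores the hash of the previous one) as one possible way to make tampering detectable; this is an illustrative technique, not something the regulations above prescribe, and the `AuditTrail` class and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log in which each entry includes the hash of the previous one."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def record(self, user: str, action: str, file_id: str, reason: str = "") -> dict:
        """Append a time-stamped entry and return a copy of it."""
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "file_id": file_id,
            "reason": reason,
            "previous_hash": self._entries[-1]["hash"] if self._entries else "",
        }
        entry = {**body, "hash": self._digest(body)}
        self._entries.append(entry)
        return dict(entry)

    def history(self, file_id: str) -> list[dict]:
        """Return copies of all entries for one file, oldest first."""
        return [dict(e) for e in self._entries if e["file_id"] == file_id]

    def verify(self) -> bool:
        """Recompute every hash; False means an entry was altered or removed."""
        previous = ""
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["previous_hash"] != previous or self._digest(body) != entry["hash"]:
                return False
            previous = entry["hash"]
        return True


if __name__ == "__main__":
    trail = AuditTrail()
    trail.record("alice", "download", "doc-001")
    trail.record("bob", "metadata_change", "doc-001", reason="corrected site name")
    print([e["action"] for e in trail.history("doc-001")])
    print(trail.verify())  # True
```

New entries never overwrite old ones, so earlier information is not obscured, and `history` gives the simple per-file report an inspector would ask for.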
A fit-for-purpose digital archive
While using a tool like Dropbox may seem the easier and cheaper way to store pharma and life sciences data such as the eTMF in the short term, it is simply not fit for the purpose of long-term archiving and preservation of data.
To ensure that your data is safe, accessible, findable, readable, and compliant for the entire retention period, you must have the right archiving and preservation technology, processes and expertise in place.
I hope that this article has helped to highlight some of the key differences between a tool such as Dropbox and a dedicated archiving and preservation solution. If you would like to speak to one of our team about how we can help archive your data, you can contact us here. Our vendor neutral archive provides peace of mind for our regulated customers, with no hidden fees and no data lock-in.
Also, feel free to subscribe to our emails in the footer of this page to ensure you don’t miss any of our content on all things digital preservation.