Challenges with metadata - simplifying complexity for Life Sciences data - Arkivum

Archiving & Preservation / 21 Sep, 2018

Challenges with metadata – simplifying complexity for Life Sciences data

In the previous blog in this series, we explored what metadata is and how it adds context to unstructured data or files. It adds value where data cannot otherwise be searched and governed, or where you are not even sure what the data contains.

In this article, I will further explore potential challenges when dealing with metadata.

Differing semantics and ontology

When you are dealing with large amounts of data stored across departments, subsidiaries or even data input systems, there is bound to be a disconnect between how different parties or systems describe the same piece of data.

Take a very simple example from patient data: patient gender can be recorded in a number of different ways. M and F, Male and Female, or 1 and 2 are just a few of the possible variations. The field itself could also be labelled Gender, Sex, Male or Female, and so on.
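The variations described above can be reconciled with a simple lookup before records are combined. The following is a minimal sketch, not a definitive implementation: the table names, the canonical label "sex" and the reading of 1 as male and 2 as female are all assumptions made for illustration.

```python
# Hypothetical lookup tables. Mapping 1 -> male and 2 -> female is an
# assumption; a real source system must be checked before relying on it.
FIELD_ALIASES = {"gender": "sex", "sex": "sex"}
VALUE_ALIASES = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
}


def normalise_record(record: dict) -> dict:
    """Return a copy of record with gender/sex fields mapped to one vocabulary."""
    out = {}
    for key, value in record.items():
        canonical_key = FIELD_ALIASES.get(key.strip().lower())
        if canonical_key is None:
            # Not a field we know how to normalise: pass it through unchanged.
            out[key] = value
            continue
        token = str(value).strip().lower()
        # Unrecognised values are flagged rather than guessed.
        out[canonical_key] = VALUE_ALIASES.get(token, "unknown")
    return out


print(normalise_record({"Gender": "M", "patient_id": 7}))
print(normalise_record({"Sex": 2}))
```

Flagging unrecognised values as "unknown" rather than guessing keeps the combined data honest about what could not be mapped.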

Within a business scenario, the way that a finance manager describes profit vs loss is going to be vastly different to how a salesperson describes it. The challenge comes when you try to put the data together: if the syntax and descriptions are not standardised, the data cannot be reliably combined.

Data is often organised in a schema, described as the skeleton structure that represents the logical view of an entire database. It defines how the data is organised and how the items within it relate to one another.

However, especially in regulated markets such as Life Sciences and Financial Services, you need a way of linking schemas together, which can be a particular challenge when the contents of the data are not standardised.
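One common way to link schemas is a crosswalk: a table that maps each source system's field names onto a shared set of names. The sketch below assumes two hypothetical source systems ("lims" and "edc") with invented field names; it only illustrates the idea and does not describe any particular product.

```python
# Hypothetical crosswalks: source field name -> common field name.
CROSSWALKS = {
    "lims": {"subj_id": "subject_id", "Sex": "sex", "collected": "collection_date"},
    "edc": {"PatientID": "subject_id", "Gender": "sex", "VisitDate": "collection_date"},
}


def to_common(source: str, record: dict) -> dict:
    """Rename a record's fields to the common schema, refusing unmapped fields."""
    mapping = CROSSWALKS[source]
    unmapped = sorted(set(record) - set(mapping))
    if unmapped:
        # Fail loudly so a non-standard field is never silently dropped.
        raise KeyError(f"{source}: no mapping for fields {unmapped}")
    common = {mapping[key]: value for key, value in record.items()}
    common["source_system"] = source  # keep provenance alongside the data
    return common


combined = [
    to_common("lims", {"subj_id": "S1", "Sex": "F"}),
    to_common("edc", {"PatientID": "S1", "Gender": "Female"}),
]
print(combined)
```

Note that the crosswalk aligns field names only; the values (F versus Female here) still need their own normalisation step.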

The solution in the industry

The solution to this is commonly PCDM (Portland Common Data Model). PCDM allows you to work across schemas, apply business rules consistently, and link everything through a single data model by providing common semantics and ontology. So when you want to capture data across your whole lab, you can still ingest everything regardless of source and link it up to get the full picture through a single source of truth.
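PCDM itself is a small model built around three kinds of resource (collections, objects and files) and the relationships between them (a collection or object has members; an object has files). As a rough illustration of how content from different sources can hang off one structure, here is a minimal sketch. The Python class names, the metadata keys and the sample titles are hypothetical, and this is not a description of how any particular platform implements PCDM.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Union


@dataclass
class PcdmFile:
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class PcdmObject:
    title: str
    metadata: Dict[str, str] = field(default_factory=dict)
    has_member: List["PcdmObject"] = field(default_factory=list)
    has_file: List[PcdmFile] = field(default_factory=list)


@dataclass
class PcdmCollection:
    title: str
    has_member: List[Union["PcdmCollection", PcdmObject]] = field(default_factory=list)


def iter_files(node: Union[PcdmCollection, PcdmObject]) -> Iterator[PcdmFile]:
    """Walk the membership tree depth-first and yield every file beneath node."""
    if isinstance(node, PcdmObject):
        yield from node.has_file
    for member in node.has_member:
        yield from iter_files(member)


# Two objects from different (hypothetical) sources, linked under one collection.
lab = PcdmCollection("Lab archive")
assay = PcdmObject("Assay run", {"source": "instrument"})
assay.has_file.append(PcdmFile("plate_readings.csv"))
notebook = PcdmObject("Notebook entry", {"source": "eln"})
notebook.has_file.append(PcdmFile("entry.pdf"))
lab.has_member.extend([assay, notebook])

print([f.name for f in iter_files(lab)])
```

Because every source is expressed in the same three resource types, one traversal reaches all of the content regardless of where it originated.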

This approach allows you to be more systematic and intelligent in the management of your data, allowing you to scale over time and keep all of your content in one place.

Our VP of Product Sinéad McKeown explains PCDM further in this short video:

Examples of use

A good use case for the PCDM structure within the Life Sciences sector is when an organisation needs to migrate Trial Master Files (TMF) from an Electronic Content Management system to an archive. To do this effectively and create a searchable archive, you need to define the metadata in a standardised, consistent form. This can be achieved using the PCDM standard.

Arkivum’s platform allows our customers to govern and get new insights into their valuable data, ensuring it is usable, searchable and governed in one place, with full support for PCDM.


Daniel Hickmore
