Looking to save time in compliant PDF preparation?
Join our session with practical examples
on 5 December

Creating the right environment for structured and unstructured data to co-exist

Reading Time: 4 minutes

There’s a lot of talk about structured information and how it will solve all our problems with regulatory submissions.

If you look at initiatives such as IDMP and EMA’s Digital Application Dataset Integration Project (DADI) or further back, FDA’s Structured Product Labeling, there are good examples of health authorities getting at least some information into a structured format. 

But, as anyone who works in the pharma industry knows, products are often registered globally and many regulators are a long way from structured information. In addition, a lot of the information is way too complex to put entirely into a structured format, such as clinical summaries and clinical trial reports, which contain details that can’t be so neatly contained. 

So, while I joke that my regulatory affairs colleagues, my boss and the regulators think those of us in regulatory operations are a bunch of IT nerds managing “one-click” submissions using seamless automation, we are still spending a lot of time doing manual data entry in old-fashioned tools.

Is that dream still a long way off, and will it ever be realized? 

I have to admit, when I first started in the industry after studying medical informatics, I found it somewhat demoralizing that what we were still essentially processing paper – because that’s what an eCTD is really, an electronic representation of paper with page breaks and margins.

We’ve moved from paper to simple electronic processing – using Word and PDF – to where we currently are, with digital presentation of the information in structured databases, utilizing certain data models and ontologies (such as IDMP and master data management programs at large pharma companies). 

Using innovation to tackle manual processes

Where I believe we can make improvements is in tackling the manual processes that continue to bog down regulatory functions.

For example, you might start with structured data, such as stability data, which is then transferred into a Word document (now no longer structured) and then the report is manually typed into a database by a regulatory expert (to make it structured).

We can eliminate many of these manual processes using natural language processing and other AI-enabled tools to automate the transfer from structured to unstructured and back to structured.

The technology landscape to enable this more seamless process and address the problem of duplicate workflows is rich, with a wealth of innovations already in use and many more emerging.

Some technologies currently in use include structured authoring, database markup languages designed to define and document database structures, cloud native platforms that simplify the management and storage of data and natural language processing.

Next generation technologies – as highlighted by analyst firms such as Gartner – will enable more predictive approaches to managing data.

These include hyperautomation to streamline the integration of systems and natural language generation, which translates structured data into human-readable text, allowing companies to rapidly generate data-driven narratives – not only in Regulatory, but as a holistic approach in the Pharmaceutical industry. 

We know that unstructured data is here to stay – at least for the foreseeable future – and that submissions, even with eCTD 4.0, will still be based on PDF documents, despite a push toward metadata and structure information.

So regulatory operations departments will still have to create, process and track eCTDs. But it doesn’t have to be as hard and manual as it currently is

And, perhaps one day, reg ops professionals will be able to simply work our IT magic, press a button to send submissions and spend our days at the beach. 

About the Author: Timm Pauli is Vice President and Head of R&D Informatics at PharmaLex, where he combines his informatics and systems expertise with deep knowledge of regulatory affairs and regulatory operations.

Did you enjoy this article?
Find more in our free LinkedIn newsletter.