Automating The "Work" In Paperwork: The Role of AI in Clinical & Regulatory Documentation

March 21, 2023

We sat down with representatives from clinical operations, regulatory operations and a technology provider to discuss the challenges Life Sciences companies face in document management. Luis Aguilar, VP, Business and Clinical Operations at Candel Therapeutics, and Dr. Joerg Stueben, Head of Regulatory Information Management and Senior Expert at Boehringer Ingelheim, covered some of the major obstacles their organizations experience when it comes to ever-changing requirements, and provided their perspectives on emerging best-practice trends. Sriram Parthasarathy, Chief Product Officer at Adlib Software, rounded out the discussion with his insights into the role AI is beginning to play in the Life Sciences field.

"Every Month Matters!"

- Dr. Joerg Stueben,
Head of RIM and Senior Expert, Boehringer Ingelheim

You can read the interview transcript below, or watch the full interview here.


What are the current inefficiencies associated with data and document management during clinical trials?

Luis Aguilar, VP, Business and Clinical Operations, Candel Therapeutics:

We have a Phase I study, a couple of Phase II studies in immuno-oncology, and we've finished kicking off another Phase III. It's a very large pipeline for our company size. We've gone through a lot of pain, like migration from legacy infrastructure, and much of it has been in progress over the last 20 years.

We moved from what were academic-based databases and entries to what people consider “paper” CRFs, although everything is digital. It required us to have a secondary group of clinical data specialists to ensure the data entry was verified and validated.

Now we've moved into adopting other technologies:

  • EDCs, for the new studies that we have right now;
  • Exploring partner utilization, like CROs, not for what you'd call the typical items but for select aspects, whether it be SAP verification or endpoint adjudication committees;
  • eTMF system;
  • CTMS.

This really compiled everything that was being done manually to allow us to have the scope we have now. Still, it requires quite a bit of effort and we still have a CRA Clin Ops group that we drive for some of the studies. But as we look towards the next phase, we'll expand our utilization of the CROs past what we're doing now - a consultative approach - into some actual work and extension into some sites.

On an ongoing basis, we have the compliance issues of making sure we have GDP training, GCP training and familiarity with the ICH guidelines. We have to do that internally for our inspection readiness. But we're also pushing that out to sites, as we're finding more and more issues in our queries around GDP items. We're not requiring them to have certification on GDP, but we do want to make sure that they have as many access lines as they can to the guidelines.


What are the complexities of systems and processes in managing global regulatory operations?

Dr. Joerg Stueben, Head of Regulatory Information Management and Senior Expert, Boehringer Ingelheim:

We are working in various, totally different therapeutic areas. In the past we used to have a very rigid regimen for setting up such projects, and we are realizing that the requirements are changing. The increased complexity can best be addressed by having a more flexible approach. So we are now working in Asset Teams, where the individual functions that contribute to the development path organize themselves.

From that perspective, we are much better at addressing critical processes and understanding where certain things are needed. And by becoming more flexible, we could reduce buffers, streamline our work and bring our medication much quicker to the patients.


What are some of the new technologies making their way onto the market to help address these issues?

Sriram Parthasarathy, Chief Product Officer, Adlib Software:

As Luis pointed out, a typical trial is done across many sites. Each site scans thousands of documents, and they could upload them in many different ways. They can also upload them in many different formats, almost a hundred different file formats.

The challenge is:

  • Some of these documents could be digital documents, like Office documents;
  • Some of these could be scanned images, like fax documents or even printed documents;
  • Someone has to go through the process of OCR, extracting the text and making those documents searchable, which is a very important challenge.

Secondly, to submit to the FDA, all these documents have to be transformed into a very specific format with custom headers, custom footers and watermarks to create the documents of record.

And lastly, once we have these documents, they have to be filed in eTMF and eCTD, which have very specific taxonomies. For example, eTMF, the electronic Trial Master File, has a hierarchical structure with top-level zones, second-level sections and third-level artifacts. Reading thousands of documents, identifying which folder each belongs in, classifying them and copying them is a manual task that takes a significant amount of time and is prone to errors.

That is where automatic classification using AI becomes very important: it can read all these documents at scale, automatically identify which category each one belongs to, and automatically copy those documents to the right folder. For eTMF submission this is very, very important, and it helps the filing proceed faster.
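To make the zone, section and artifact hierarchy concrete, here is a minimal Python sketch of how a classified document could be mapped to a folder. Everything in it is an assumption for illustration: the `TmfLocation` type, the trigger phrases and the taxonomy labels are hypothetical, and the simple phrase lookup merely stands in for the AI classification described above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class TmfLocation:
    """One position in the three-level hierarchy: zone > section > artifact."""
    zone: str
    section: str
    artifact: str

    def folder(self, root: str = "eTMF") -> PurePosixPath:
        # Build the target folder path from the three taxonomy levels.
        return PurePosixPath(root) / self.zone / self.section / self.artifact


# Hypothetical trigger phrases and taxonomy labels, for illustration only.
RULES = {
    "curriculum vitae": TmfLocation("Site Management", "Site Set-up", "Investigator CV"),
    "monitoring visit": TmfLocation("Site Management", "Site Monitoring", "Visit Report"),
    "protocol amendment": TmfLocation("Central Trial Documents", "Protocol", "Protocol Amendment"),
}


def classify(text: str) -> Optional[TmfLocation]:
    """Return the first matching location, or None if nothing matches."""
    lowered = text.lower()
    for phrase, location in RULES.items():
        if phrase in lowered:
            return location
    return None


if __name__ == "__main__":
    location = classify("Curriculum Vitae of the principal investigator")
    print(location.folder() if location else "unclassified")
```

A production system would replace the phrase lookup with a trained model, but the output of either is the same kind of object: a three-level location that determines where the document is filed.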


What are some of the changes taking place in the regulatory operations today?

Dr. Joerg Stueben, Head of Regulatory Information Management and Senior Expert, Boehringer Ingelheim:

If you look into the pandemic situation, particularly on the submission of the COVID vaccines, we saw something like a rolling submission where information and documentation were sent to the agencies before the entire package was ready.

By doing so, these medications or vaccines could be approved very quickly. So from my perspective, that would be an ideal target operating model going forward.

Another thing which I would like to see more of is inter-agency cooperation. There are certain projects running where different agencies are participating by exchanging their information and taking a cumulative work sample. The idea is to upload documentation in a cloud and then decide which agency should review those, because as a global company you would like to submit everywhere.

I'm involved a lot in the discussions of IDMP. When we looked into the current processes, we saw that the same information is reviewed by the company and the agency again and again. I think it's a lot of wasted time. If we could combine those review instances, this would be a much quicker process.

No rigid submission windows: We see this in some areas where you can only submit in certain timeframes. I mentioned agile teams beforehand, so why can we not submit everywhere when we are ready? If you have to work towards a timeline which the agency sets and you miss it, you basically lose a month, and not just we as a company, but all of the patients lose a month, right? We could be quicker by being more flexible.

And finally, delivering efficiency against structured formats. If you think of the IDMP standards, the tendency appears to be to submit not documents anymore but information, data, content, which is more or less structured. It is usually submitted in certain formats, which are not necessarily human-readable but machine-readable, and can then be translated, so to speak, back to something an assessor can read.

So why cannot, for example, an assessor or an agency review this structured information? We already submit this structured information after we have submitted documents. This sort of inefficiency exists in the current process.

We have a recent case where, in the US, the FDA manages substances in the new substance system, GSRS, the Global Substance Registration System. This information is basically taken by the European Medicines Agency and imported into their substance management system, and of course the other way around. From there, it's populated into systems like CTMS, the Clinical Trial Management System. For some of us, it's surprising to see somebody entering something in the US and it ending up in the European trial system.

It's great, it's fantastic. It saves a lot of time. But we also must be aware that the information which we enter in the first place must be absolutely correct and of super quality because it goes through a flow of information. The agency calls it interoperability. There's a huge benefit in this and of course we must be aware of quality and data aspects, which we must address.


In your experience, what are some of the best practices of managing clinical trials more efficiently?

Luis Aguilar, VP, Business and Clinical Operations, Candel Therapeutics:

There’s segmentation between regulatory, quality, our data management group and then the Clin Ops group, and it's very important for us to work in parallel. I mean, everything has a cost. If you want to add a data field to the EDC, there's a cost to that because you have to figure out how to retrospectively get that data with a bunch of inquiries and spend more time and resources. It's a question of: is it a “nice to have” or do you need to have it?

Sometimes to speed up deployment into the field, you can do a Note to File as a clarification memo on your next amendment. You can do a Letter of Amendment to anticipate you're going to have a future amendment, but change a little bit of what was in the existing amendment to just speed the process. This allows us to be a little quicker on the data items we need to gather and the data which we thought we might need to gather, but we don't.

"We spend a lot of time influencing strategies and strategic interactions with trial sites instead of only training on the guidelines and the EDC formats. We're actually working not just on gathering the data from the study, but capturing mindshare."

- Luis Aguilar,

VP, Business and Clinical Operations, Candel Therapeutics


And that's a different approach! It's beyond what you'd necessarily expect from a vendor CRO. But how do you change the paradigm a little bit? That is a lot of what we're focusing on right now to start moving faster. I think, to Joerg's point, the agile approach is getting the groups to work in real time. When I'm talking about groups, I'm talking about regulatory, data management, quality and also Clin Ops.


How are technology providers adapting to the new requirements in clinical and regulatory operations?

Sriram Parthasarathy, Chief Product Officer, Adlib Software:

One interesting challenge we have seen when talking to a number of publishers is in the process of submitting documents to the FDA. In a typical trial, they submit more than 20,000 documents. It's a very manual process where they go through each document, identify which text is to be hyperlinked as per the specific rules from the FDA, and then hyperlink it to the right document. Imagine going through all the documents and adding 20,000 hyperlinks manually. That's a very challenging task. And the problem is that sometimes the link is broken, or they may have inserted a link to the wrong document. Or maybe after they inserted the link, the target document got deleted and no longer exists.

This is where automation becomes significantly more effective: going through the list of documents, going through each document, identifying which text is to be hyperlinked, identifying which document to hyperlink to, and inserting the hyperlinks while following the exact rules the FDA has for hyperlinking. It reduces the time that publishers have to spend creating those hyperlinks manually. It also eliminates manual errors, so that no document is left with incorrect hyperlinks. This is going to be extremely important for all the publishers who spend a significant amount of time reviewing and publishing the documents for the submission.
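The two checks described here, spotting text that should become a hyperlink and catching links whose target is missing, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the FDA's hyperlinking rules or any vendor's implementation: the `Link` type, the reference pattern (mentions such as "Table 14.2" or "Section 5") and the file names are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: cross-references such as "Table 14.2" or "Section 5".
REFERENCE_PATTERN = re.compile(r"\b(?:Table|Figure|Section)\s+\d+(?:\.\d+)*")


@dataclass(frozen=True)
class Link:
    source: str  # document that contains the hyperlink
    text: str    # the hyperlinked text
    target: str  # document the hyperlink points to


def find_link_candidates(text):
    """Return the pieces of text that look like cross-references."""
    return REFERENCE_PATTERN.findall(text)


def find_broken_links(links, existing_documents):
    """Return every link whose target is not among the existing documents."""
    existing = set(existing_documents)
    return [link for link in links if link.target not in existing]


if __name__ == "__main__":
    print(find_link_candidates("See Table 14.2 and Section 5 for details."))
    links = [
        Link("summary.pdf", "Table 14.2", "tables.pdf"),
        Link("summary.pdf", "Section 5", "deleted-report.pdf"),
    ]
    for broken in find_broken_links(links, ["summary.pdf", "tables.pdf"]):
        print(f"{broken.source}: '{broken.text}' points to missing {broken.target}")
```

Running the broken-link check again just before publishing would also catch the case mentioned above, where a target document is deleted after the link was inserted.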


What do some of the emerging changes in clinical operations look like?

Luis Aguilar, VP, Business and Clinical Operations, Candel Therapeutics:

One of the things, and it may be a new norm, is that we find a great shortage of resources. Not only that we face, but also the sites that we're working with. Research groups and teams have just been completely decimated.

Obviously, our clinical trial is useless without getting the data. Sometimes you're behind on some data entry and there are no extra hours in the day for these people. We're trying to figure out a way to hybrid some of those data entries where we can scope it out and just do it in a separate database for our own analysis later. In a way, what we're trying to do is to take some of the burden off of the site.

Submission-ready formatting is what we're trying to work with vendors on. In some situations there are teams or software providers or CROs that have a faster ability to do this. We’re being selective about what we choose to use these external partners for. Like Joerg said, how can we get this to patients faster? Because months matter.

The automation of hyperlinks is a key item, because it's not a simple process of just creating a hyperlink and assuming it'll work. How can one detect this across thousands and thousands of documents? Intelligent hyperlink detection will surely simplify the process. Very often you submit, have the submission kicked back, and delay your drugs by quite a bit of time.


What is your perspective on the emerging changes within the global regulatory operations processes?

Dr. Joerg Stueben, Head of Regulatory Information Management and Senior Expert, Boehringer Ingelheim:

Luis has said it very nicely: Every Month Matters. I think our common goal must be to be as quick as possible. And while I think that the future really is in data, I'm very well aware that for several years to come we still need to work with documents. This will not be an abrupt change, but a gradual change.

In terms of data, one thing is to use standardized terms. This seems to be so easily said, but if you look carefully at what you do in your companies, you may find that in medical and in development and in operations they use different terms for the same thing. So harmonize it. I think the IDMP standards are a very good way of going in that direction.

Secondly, I would like to use automation. We heard about automated hyperlinking and hyperlink checks. We also do formatting checks with the intention of being “First Time Right”. In former times, the documents came from the authors and, as Sriram said, we were going through lots of checks, sometimes in various circles, sending them back to the authors. A better thing is to have automated tools and check the documents right where they are created, so when we get them, they are ready. “First Time Right” for us is a very important thing because it saves a lot of time. We have lots of scientists, even Ph.D. scientists, doing [the document verification] work. Is it not a waste if they spend the time creating documents, formatting documents, doing quality checks? They should do the work they are actually paid for, which is the scientific expertise they bring into the entire process. We would like to see artificial intelligence being used wherever it is possible to make things easier. That includes robotic process automation for processes which can be automated.

"Is it not a waste if [Ph.D. Scientists] spend their time creating documents, formatting documents, doing quality checks? They should do the work they are actually paid for, which is bringing scientific expertise into the entire process."

- Dr. Joerg Stueben, 

Head of Regulatory Information Management and Senior Expert, Boehringer Ingelheim


From a predictive analytics perspective, where is the need? Can we forecast when certain things make sense and when they don't? For example, can we review the first results of a clinical trial and validate whether it makes sense to go that route or not?

Can we analyze the ideas or the questions which authorities ask us after we have submitted? By looking into the data which are publicly available from other companies, or looking into our past documentation, we can understand where our documentation is not ideal, so we can correct this before we submit. All of this is to prevent questions from the authorities, again with the aim of becoming quicker in bringing medication to the patients.

And lastly, Follow The Sun. I think many large companies do this. If you have areas and colleagues working in different time zones, submissions can be handed over from time zone to time zone for other teams to work on. This approach makes the publishing process much quicker.


What is the role of Artificial Intelligence in clinical and regulatory document automation?

Sriram Parthasarathy, Chief Product Officer, Adlib Software:

Some of the challenges that Luis and Joerg were talking about are very relevant. The key is finding opportunities in this space to make it easier to conduct the trial by reducing all the manual work that currently goes on.

That's an opportunity that we always look forward to automating, so that critical resources are spending time on intelligent activities, not manually doing specific tasks. One example I gave was hyperlink automation. Another example is, if you look at the EDM submission, clinical research coordinators from a site or multiple sites may upload thousands of documents into a folder or via a portal, or send them in by email. After that, a clinical research associate takes those documents and typically spends anywhere from 8 to 10 minutes reading each document to identify what category it belongs to, and then copies it to the right folder. That is a very manual task, but there is an opportunity to automate it so that we can reduce manual work.

This is where we are thinking: can we read those documents at scale, understand the layout of each document, understand what data is in each document, and based on that have the AI system correctly classify [these documents] to the right zone, the right section and the right artifact? Not only can it classify, but the AI system can also copy those documents to the right folder within the eTMF. And importantly, sometimes the system may not know which category a document belongs to, so it can flag the document for the user to review. The user can take a look at the automatically classified documents, and verify and confirm that this is the right structure or change the structure, so the system can learn and get better and better and better.
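The review loop described here, where confident classifications are filed automatically and uncertain ones are flagged for a person, could look roughly like the Python sketch below. It is only an illustration under assumptions: the `Prediction` type, the 0.9 threshold, the file names and the category strings are hypothetical, and the confidence scores are taken as given from whatever classifier is in use.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    document: str
    category: str      # hypothetical "zone/section/artifact" label
    confidence: float  # 0.0 to 1.0, as reported by the classifier


def route(predictions, threshold=0.9):
    """Split predictions into auto-filed ones and ones flagged for review."""
    auto_filed, needs_review = [], []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            auto_filed.append(prediction)
        else:
            needs_review.append(prediction)
    return auto_filed, needs_review


def record_correction(corrections, prediction, confirmed_category):
    """Keep the reviewer's decision so it can be used as training data later."""
    corrections.append((prediction.document, prediction.category, confirmed_category))
    return corrections


if __name__ == "__main__":
    batch = [
        Prediction("scan_001.pdf", "Site Management/Site Set-up/Investigator CV", 0.97),
        Prediction("scan_002.pdf", "Site Management/Site Monitoring/Visit Report", 0.55),
    ]
    filed, flagged = route(batch)
    print(f"auto-filed: {len(filed)}, flagged for review: {len(flagged)}")
```

The threshold is the design lever: raising it sends more documents to a reviewer and lowers the risk of misfiling, while the recorded corrections are what would let a system "learn and get better" over time.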

The advantage of this process is, instead of doing all that manual processing and classification work, clinical associates take a minute or two to review what has already been done and confirm that it is done correctly. This significantly improves their technical efficiency, reduces manual effort and completely eliminates or reduces manual errors.

This enables faster submission and, as Joerg and Luis are talking about, speeds up the trial significantly.

Watch the full interview here.
