Clinical Data Review

Introduction: What is clinical data review?

Clinical data review describes the process of validating clinical trial data to ensure it meets the principles of ALCOA+ set forth by the FDA:

- Attributable
- Legible
- Contemporaneous
- Original
- Accurate
- Complete
- Consistent
- Enduring
- Available

These principles are applied to data originating from various sources, whether paper-based, eSource (from electronic sources such as eCOA and EDC tools), hybrid, or other sources such as medical images and laboratory equipment read-outs. The purpose of clinical data review is to ensure that trial data is accurate, complete, and consistent. This ensures that any conclusions drawn from the data – which can go on to inform patient care guidelines or help get new drugs approved – are scientifically sound and valid. Clinical data review increasingly happens in real-time, which also has implications for safety monitoring and adverse event reporting.

Thus, data review in clinical trials serves to ensure patient safety, transparency, research integrity, regulatory compliance, and scientific validity. In the midst of accelerating technological adoption in the clinical research space, clinical data review practices are changing rapidly. In this article, we will discuss clinical data review from a perspective of past, present, and future, highlighting the ways in which data review has evolved up to today's current landscape, and then looking at emerging trends shaping the future directions of clinical data review. First, let's briefly review clinical data management, health data standards and formats, and the regulations and data privacy considerations that are relevant when dealing with personal health information.

What is clinical data management?

Clinical data management, or CDM, refers to the entire lifecycle of data that is used in or generated as part of a clinical trial. For a given trial, CDM procedures are usually outlined in a data management plan (DMP), which is typically overseen by a clinical data manager or CDM team. We have written extensively on this topic, so check out the following articles for a deeper dive into data management in clinical trials:

Clinical Trial Data Management | Power

Clinical Data Management (CDM) and Clinical Data Services | Power

Data standards and formats

In an effort to facilitate data sharing and the extraction of advanced insights from multi-modal health data, standard healthcare data formats have been put forth, for example by HL7 and CDISC. Unfortunately, we are currently still far from international standards for healthcare data, which hinders smooth clinical data integration – especially in international studies – and in general limits the ease with which health data from various sources can be processed and analyzed. With the emergence of big data and the explosion in the capture and availability of health-related data worldwide, there is increasing interest in standardizing health data formats to allow for high-resolution insights to be gained from diverse datasets.

For more information on clinical data standard formats, see: Clinical Data Integration | Power.

Health Level Seven (HL7) and Fast Healthcare Interoperability Resources (FHIR)

Two of the most important standards when it comes to health data are Health Level Seven (HL7) and Fast Healthcare Interoperability Resources (FHIR). HL7 provides a framework for the exchange and processing of digital health information. FHIR, developed by HL7, is a standard specifically designed to facilitate the sharing of electronic health records.
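
To make this concrete, here is a minimal sketch of a FHIR R4 Patient resource built in Python. The field names (resourceType, name, birthDate) follow the published FHIR Patient schema, while the identifier and demographic values are invented for illustration:

```python
import json

# A minimal FHIR R4 Patient resource, built as a plain dict.
# Field names follow the FHIR Patient schema; the values are made up.
patient = {
    "resourceType": "Patient",
    "id": "example-001",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1985-04-12",
}

# Serializing to JSON yields a payload that FHIR-capable systems
# (EHRs, EDC tools) can exchange over a REST API.
payload = json.dumps(patient)
print(payload)
```

Because every FHIR resource declares its `resourceType` and uses a shared schema, two systems that both speak FHIR can exchange patient records without custom field mappings.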

Follow these links for more information on HL7 and FHIR.

Clinical Data Interchange Standards Consortium

The Clinical Data Interchange Standards Consortium – CDISC – is a global non-profit headquartered in the US that aims to enhance trial quality and clarity by establishing high-quality standards for clinical research data management. Its Clinical Data Acquisition Standards Harmonization (CDASH) model standardizes data collection, while its Study Data Tabulation Model (SDTM) defines a format for organizing clinical and healthcare data; together, these standards are designed to optimize the traceability and transparency of study data for regulatory bodies and trial data reviewers.

Data integrity and data privacy regulations

Briefly, upholding data integrity is essential in clinical research. This involves ensuring the accuracy, completeness, and consistency of data, complying with applicable legal regulations, and adhering to patient confidentiality and data privacy regulations. The primary regulatory body in the US is the FDA, which has adopted parts of the ICH Good Clinical Practice (GCP) guidelines into its regulatory frameworks. Clinical research is also subject to regulations governing the use of protected health information (PHI), namely the HIPAA Privacy Rule and the HITECH Act. Clinical trial data is considered confidential. For more information, see the following article: Data Privacy in Clinical Trials: Standards, Definitions, and Best Practices | Power.

The clinical data review process

Regardless of the specific workflow and data management tools chosen by the sponsor, the data review process should be outlined explicitly in the data management plan (DMP) and/or as part of the trial protocol. Since every trial is unique, the data review process will be tailored to the particular case report forms (CRFs), data types, data collection methods, and software tools used.

Clinical Data Review: The Past

Traditionally, clinical data review involved manual examination and analysis of paper-based, handwritten case report forms (CRFs) and other paper study documents and records. Monitors or clinical research associates would visit sites and meticulously review collected data, comparing it against predetermined criteria and checking for accuracy, completeness, and consistency. This process was time-consuming, labor-intensive, and highly prone to human error. Additionally, dealing with large volumes of paper-based data made it relatively challenging to analyze and interpret findings efficiently.

As the use of electronic clinical trial management systems (CTMS) and other software tools increased, hand-recorded/paper source data was transcribed into an electronic database, adding a potential source of error in the transcription process. A common framework for data review is known as source data verification (SDV), which refers to the (typically exhaustive) review of source data, i.e., checking each data point against its original source document. However, it became increasingly clear that SDV is not only time- and resource-intensive, representing a sub-optimal use of limited resources, but that exhaustive SDV also appeared to be unnecessary in many cases.
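
The core of SDV can be illustrated with a toy sketch that compares a transcribed EDC record field by field against its source values; the record structures and field names below are invented for illustration:

```python
def verify_against_source(source: dict, edc_record: dict) -> list:
    """Compare a transcribed EDC record against its source document,
    returning a discrepancy note for each mismatched or missing field."""
    discrepancies = []
    for field, source_value in source.items():
        edc_value = edc_record.get(field)
        if edc_value is None:
            discrepancies.append(f"{field}: missing in EDC")
        elif edc_value != source_value:
            discrepancies.append(
                f"{field}: source={source_value!r}, EDC={edc_value!r}"
            )
    return discrepancies

# Hypothetical records: a paper source value mis-transcribed into the EDC.
source = {"systolic_bp": 128, "visit_date": "2023-05-02"}
edc = {"systolic_bp": 182, "visit_date": "2023-05-02"}
print(verify_against_source(source, edc))
# → ["systolic_bp: source=128, EDC=182"]
```

Doing this for every data point in a trial is exactly the exhaustive, labor-intensive exercise that risk-based approaches now try to limit to critical data.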

In a 2020 publication in Pharmaceutical Outsourcing, the authors wrote of the shift away from traditional data review frameworks:

“...traditional paper-based clinical data review models, where questions are defined at the study level and developed into study-level data listings and reports, are not fit for purpose, because by the time reports are developed the trial has moved on and the questions have changed.”[1]

As we will see in the next section, traditional data review methods such as SDV have been slowly but consistently replaced by other, more dynamic forms of data review, often tied closely to newer clinical trial monitoring models such as real-time monitoring and risk-based monitoring.

Clinical Data Review: The Present

The current landscape of clinical data review is predominantly driven by electronic systems that facilitate the collection, storage, analysis, and visualization of study data. Data is increasingly captured directly in electronic format (skipping the manual collection of data and its subsequent transcription), known as direct data capture (DDC) or electronic data capture (EDC). EDC platforms enable real-time access to electronic source data and documents, originating from wearable devices, different types of data collection tools, and from multiple sites across different geographic locations. Data can be securely (and automatically) transmitted electronically between sites, investigators, CROs, sponsors, and other stakeholders, reducing potentially significant time delays and inaccuracies/errors associated with manual processes. A clinical data management system (CDMS) is now a commonplace term and an integral part of data workflows in clinical trials.

Integration of clinical data review within centralized and risk-based monitoring

Nowadays, data review is more of a continual process that is closely intertwined with the broader study monitoring operations. With multi-source data available in near real-time and in electronic format in a unified database, automated alerts, validation checks, field constraints, and other aids can be programmed into the database to alert monitors to discrepant or missing data, or to potential adverse events indicated by outlier data for health endpoints. These technological features have streamlined many aspects of data review and opened up new possibilities.
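
As a concrete illustration, the range constraints and required-field checks described above can be expressed as simple programmable rules that raise data queries automatically; the field names and thresholds here are hypothetical, not taken from any real protocol:

```python
# Illustrative edit checks of the kind programmed into a CDMS.
# Field names and plausibility ranges are hypothetical examples.
CHECKS = [
    ("heart_rate",
     lambda v: v is not None and 30 <= v <= 220,
     "heart_rate missing or outside plausible range (30-220 bpm)"),
    ("visit_date",
     lambda v: v is not None,
     "visit_date is required"),
]

def run_edit_checks(record: dict) -> list:
    """Return a data query message for every failed check."""
    return [msg for field, ok, msg in CHECKS if not ok(record.get(field))]

# An implausible reading and a missing required field both raise queries.
queries = run_edit_checks({"heart_rate": 250})
print(queries)
```

In a live system, each query would typically be routed to the site or a central monitor for resolution, with the response captured in the audit trail.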

From a regulatory perspective, the emphasis has also shifted away from exhaustive/comprehensive data review (of all study data) to focusing on critical data. The prioritization of verifying data that is most critical to patient safety and trial quality is reflected in the trend toward risk-based monitoring strategies. Continuous remote monitoring systems can provide nearly immediate insights into patient safety or protocol compliance, allowing for prompt intervention and more adaptive and flexible trial management.

Data analytics techniques have also found a place in data review and study monitoring, with advanced statistical methods enabling researchers to explore complex relationships within datasets quickly. Different data visualization tools can be used to set up interactive dashboards that allow central monitors to explore trends, outliers, and other features of the data in near-real-time, and with visual aids to further facilitate the aspect of manual oversight.

The motivations underlying and driving these shifts toward real-time, unified data are captured well in the following quote from the same Pharmaceutical Outsourcing publication quoted above:

“It has never been more important than it is today to be able to comprehensively but easily review the clinical data collected during a clinical trial. Clinical review demands now require data to be analyzed and interrogated immediately to provide actionable intelligence to a variety of audiences. These audiences are assessing many targets with the data, including medical monitoring to assess subject safety, reviewing trial progress, making protocol decisions (e.g., dose escalation), analyzing data anomalies, risk-based monitoring and/or reviewing overall trial governance within a rapidly evolving clinical trial space.”[1]

Process enhancements enabled by AI and ML

At the same time, recent years have seen an even steeper shift as artificial intelligence (AI) and machine learning (ML) technologies and tools have been adopted in clinical research. Many clinical management software providers now incorporate AI or ML algorithms in some way or another. AI algorithms can be set up to automate repetitive tasks like anomaly detection or adverse event identification, by accurately and rapidly analyzing large volumes of both structured and unstructured data. ML models have been used to leverage historical trial information to predict outcomes, and to detect patterns that may have implications for future studies or even patient safety in the ongoing trial.
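
As a toy stand-in for such anomaly detection, the sketch below flags outlier lab values with the classic Tukey (interquartile range) rule; real systems would typically use trained ML models over richer features, and the readings here are invented:

```python
import statistics

def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (the Tukey rule).
    A simple statistical stand-in for ML-based anomaly detection."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Hypothetical lab readings; the 9.8 value stands out from the rest.
readings = [4.1, 4.3, 4.0, 4.2, 4.4, 9.8, 4.1]
print(flag_outliers(readings))
```

Whether rule-based or model-based, the output is the same in spirit: a short list of suspicious values surfaced for a human reviewer, rather than a full manual pass over every data point.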

We have written about the emerging uses of both AI and ML in clinical research; check out the following articles if you’re interested in learning more.

AI in Clinical Trials: How Are AI and ML Impacting the Clinical Trial Process? | Power

Machine Learning Clinical Trials | Power

Evidently, the evolution of clinical data review is not going to stop here. Keeping with the topic of emerging trends such as AI and ML, we share some thoughts about the future directions of clinical data review in the next section.

Clinical Data Review: The Future

The future landscape of clinical data review is likely to continue to be further shaped by a few trends which have already begun to emerge:

1. Increasing process automation using AI and ML

As artificial intelligence and machine learning algorithms and tools continue to be refined and become more powerful, clinical data review is likely to see increased implementation of intelligent solutions. These will leverage AI's capabilities to glean insights and perform advanced data analytics functions, freeing up the human role to be more supervisory and action-oriented.

2. Improved data integration

Improvements in data standardization, particularly across borders, will facilitate data integration from varied sources such as EHRs, real-world data (RWD), wearables, and clinical trial data. In combination with advanced analytics and AI, the ever-increasing amounts of data will enable broader and higher-resolution insights into health trends and patient outcomes, with the potential to promote leaps forward in advanced healthcare models such as personalized medicine.

3. Enhanced regulatory definitions and oversight

The prevalence of electronic source data is nothing new; the FDA has published guidance on electronic records and electronic signatures (21 CFR Part 11), and on the use of computerized systems in clinical trials. However, topics such as standardization of health data formats and the ethical use of AI in processing health data and even in influencing patient care decisions are quickly becoming relevant, and regulations have not caught up.

Regulations for standardizing health data formats may help to further streamline the clinical data review process by facilitating the analysis of data from disparate sources, which is relevant as the use of remote/hybrid/decentralized trials, wearable and connected devices, and RWD increases. Audit trails are already part of many clinical data management workflows, but as their use becomes more important for proving regulatory compliance in complex data workflows, standardizing data and audit trail formats would streamline regulatory oversight as well as internal monitoring.
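
A minimal sketch of what an audit trail entry might capture (who changed what, when, and why) is shown below; 21 CFR Part 11 expects computerized systems to maintain such trails, but the schema here is purely illustrative, not a regulatory format:

```python
from datetime import datetime, timezone

audit_trail = []

def record_change(user, field, old, new, reason):
    """Append an audit entry recording who changed what, when, and why.
    Illustrative schema only; not a regulatory or standard format."""
    audit_trail.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "field": field,
        "old_value": old,
        "new_value": new,
        "reason": reason,
    })

# A hypothetical correction of a transcription error.
record_change("jsmith", "systolic_bp", 182, 128,
              "transcription error corrected against source")
print(audit_trail[0]["field"], audit_trail[0]["reason"])
```

The key design property is that entries are only ever appended, never edited or deleted, so the full history of every data point remains reconstructable for monitors and auditors.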

With artificial intelligence beginning to play an increasingly prominent role in data processing, it’s likely that rules will be set forth to define and constrain its use, particularly in regard to preventing undue harm to patients or their unethical treatment. Data security standards will also likely undergo a facelift as interoperability and data integration increase, meaning that sensitive health data will be shared more than ever.


Clinical data review has evolved significantly over the past decades, and the changes are only accelerating with the increased adoption of new technologies and approaches. Traditional manual data review processes have given way to electronic systems, advanced analytics tools, and real-time monitoring, transforming the way in which investigators and data managers collect, analyze, and interpret clinical trial data. The integration of increasing amounts of data coming from varied sources, such as EHRs, wearables, and mobile apps, enables a more comprehensive analysis of patient outcomes and a more holistic view of patients' health conditions beyond trial settings. However, disparate data sources mean different data formats, and enhanced data standardization could further expand the possibilities for advanced analytics.

Emerging technologies such as artificial intelligence (AI) and machine learning (ML) hold tremendous potential in aspects such as automating data processing tasks, detecting patterns or anomalies in large and varied datasets, and enhancing predictive capabilities. In the near future, these tools are predicted to further streamline data review processes, improving data accuracy and providing deeper insights into patient safety and treatment outcomes.

As these technologies continue to evolve and mature, clinical data review is likely to continue to become more efficient, accurate, and insightful. Researchers will be able to benefit from enhanced decision-making processes based on real-time information and powerful predictive models. However, it's imperative that ethical considerations be prioritized in order to ensure that patients' data, privacy, safety, and well-being are held to the same high standards as in the current clinical research landscape.