Clinical Trial Data Collection: An Overview of Methods and Important Considerations

Brief introduction to data collection in clinical trials

Data collection is a crucial component of clinical trials, as clinical trial data forms the basis for the subsequent scientific analysis that informs decision-making about the study drug (or therapy or device). In clinical trials, various types of clinical data are systematically collected from study participants. This information helps researchers analyze the effects of the treatment being studied and draw meaningful conclusions about its safety and/or efficacy. However, extensive planning and organization are required to set the trial up to collect the right data, in the right way, so that scientifically valid conclusions can be drawn.

In this article, we provide an overview of clinical trial data collection. We begin with what constitutes clinical trial data and why it must be collected accurately and systematically, an effort that is usually organized under a clinical data management plan. We then touch upon the data privacy regulations relevant to clinical data management, and finally summarize the methods typically used for collecting clinical trial data.

How is clinical trial data managed?

Clinical data collection begins long before any actual data comes in. There are various aspects to be taken into account, including not only the collection but also the organization, storage, and processing of data. Each person who will be involved in clinical data collection or handling should have clearly defined roles and responsibilities, and permissions need to be set up for each person on any software tools that will be used. Data security, validation/verification, and data protection laws are also important factors. Essentially, the entire framework to be used for the clinical data collection needs to be set up before the trial begins, which is usually the responsibility of a designated clinical data manager, or clinical data management team. All of these considerations are usually thought through and compiled in a clinical data management plan (DMP). We have written an extensive article covering clinical trial data management, which you can check out for a more thorough exploration of this topic.
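To make the idea of per-person roles and software permissions more concrete, here is a minimal Python sketch. The role names, the actions, and the mapping between them are hypothetical and purely illustrative; they are not taken from any particular data management system, and a real trial would define them in its own data management plan.

```python
from enum import Enum


class Action(Enum):
    ENTER_DATA = "enter_data"
    RAISE_QUERY = "raise_query"
    RESOLVE_QUERY = "resolve_query"
    LOCK_DATABASE = "lock_database"


# Hypothetical role-to-permission mapping (an assumption for illustration).
ROLE_PERMISSIONS = {
    "site_coordinator": {Action.ENTER_DATA, Action.RESOLVE_QUERY},
    "monitor": {Action.RAISE_QUERY},
    "data_manager": {Action.RAISE_QUERY, Action.LOCK_DATABASE},
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True if the role has been granted the action.

    Unknown roles get no permissions at all (deny by default).
    """
    return action in ROLE_PERMISSIONS.get(role, set())


# Example: a monitor may raise queries but may not lock the database.
monitor_can_query = is_allowed("monitor", Action.RAISE_QUERY)
monitor_can_lock = is_allowed("monitor", Action.LOCK_DATABASE)
```

Denying by default means that anyone whose role has not been explicitly set up cannot touch the data, which is in keeping with defining the whole framework before the trial begins.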

Examples of clinical data

There are various types of clinical data that may be captured and recorded during a clinical trial, such as:

  • Patients’ basic and demographic information (name, age, sex, location, ethnicity, etc.)
  • Patients’ health information, including health records/history, diagnoses, past treatments, etc.
  • Informed consent forms (ICFs)
  • Results of laboratory tests, medical exams, etc., which could include simple measurements (blood pressure, pulse rate) or more detailed analyses such as levels of a certain biomarker in the blood
  • Medical images, such as MRIs, X-rays, etc.
  • Responses to surveys or questionnaires used to gather health outcomes
  • Values of the study’s endpoints/outcome measures used to assess treatment effects

Beyond patient-related data, there are also troves of data and documentation related to the trial’s organizational, financial, and operational structures, including agreements with sites and external providers. Clinical trial data might also encompass the additional layer of information related to how the data is handled and processed, including its transcription, consolidation, validation, and review, as well as query management and resolution.

Why is accurate data collection important in clinical trials?

Strong data collection practices are essential for several reasons:

  1. They support regulatory compliance
  2. They help sponsors stay organized and simplify the processing of large quantities of data
  3. They make it easier to identify and correct errors, improving accuracy
  4. They support transparency, which helps guarantee that the findings from a clinical trial are reliable and valid
  5. They help ensure that the final statistical analysis is accurate in determining any treatment effects and/or identifying safety concerns
  6. They assist in the unification of data sources, ensuring consistency across systems and sites within the trial, and also supporting clinical data integration outside of the trial (for example, with electronic health record (EHR) and registry data, or enabling the trial data to be used in other studies or to inform healthcare policy as a source of real-world data)

Above all, clinical data forms the basis for many advances in medical knowledge and informs decision-making in healthcare, so it is important for it to be handled professionally.

Data privacy and confidentiality considerations

Data privacy and confidentiality are a common undercurrent across all aspects of clinical data collection and handling. Data privacy regulations are in place to protect participants' rights and privacy, and to uphold ethical standards. Personally identifiable information (PII) and protected health information (PHI) are two labels that can apply to certain pieces of information collected during trials; any data considered to constitute PHI or PII is subject to applicable laws and regulations (for example, HIPAA in the United States, and the GDPR and the Clinical Trials Regulation (CTR) in the EU). Sponsors must be familiar with the data privacy regulations in each jurisdiction in which they operate, and must comply with them throughout all data collection and processing operations. For a deeper dive into data privacy and confidentiality in clinical trials, have a read through our article: Data Privacy in Clinical Trials: Standards, Definitions, and Best Practices.
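As a rough illustration of how identifying details can be kept apart from study data, the Python sketch below removes direct identifiers from a record and replaces them with a coded subject identifier derived from a keyed hash. The field names, the choice of identifiers, and the 12-character code length are assumptions made for the example. This is not a statement of what HIPAA, the GDPR, or any other regulation requires, and it should not be read as a compliant de-identification procedure.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "medical_record_number"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers.

    A subject code is derived from the medical record number with
    HMAC-SHA256, so the same person always maps to the same code,
    but the code cannot be recomputed without the secret key.
    """
    digest = hmac.new(
        secret_key,
        record["medical_record_number"].encode("utf-8"),
        hashlib.sha256,
    )
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["subject_code"] = digest.hexdigest()[:12]
    return cleaned


raw = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "medical_record_number": "MRN-0001",
    "age": 54,
    "systolic_bp": 128,
}
coded = pseudonymize(raw, secret_key=b"example-key-not-for-real-use")
```

In this sketch the key would need to be stored separately from the study data and restricted to authorized staff, since anyone holding it can link codes back to individuals.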

How is clinical trial data collected? What are the methods of collecting patient data?

Now that we understand what constitutes clinical data, how it’s managed and by whom, and why sponsors must strictly adhere to standards of both quality and data privacy, we can get to the discussion of the actual methods used for collecting clinical trial data. In general, clinical data can be collected manually at the source, but nowadays it almost always ends up in electronic records or systems, such as a clinical data management system (CDMS). Thus, various methods have emerged for capturing clinical data directly in electronic format, which has also opened up new possibilities for data collection beyond the traditional in-person site visits.

Manual data collection

The traditional method of data collection, which is still used today by some sites, is manual recording of study data on paper forms. Research staff at study sites may collect data directly from medical records or through direct observations or interviews with participants conducted during in-person study visits.

Study data is usually first collected in individual case report forms (CRFs) for each study participant. Today, this data is almost always transcribed into electronic format in some sort of clinical trial management system (CTMS) or clinical data management system (CDMS). Manual recording of data on paper CRFs thus requires an additional step of transferring this data to the electronic system, which introduces an extra source of potential data entry error, as handwriting can be misread, misinterpreted, or simply copied inaccurately. Optical character recognition (OCR) technology may make it possible to capture handwritten information from paper forms in an automated fashion, but because handwriting varies so widely, its accuracy can’t be guaranteed and a manual verification step is still likely to be required. Further, correcting any discrepancy in study data requires a formal query resolution/auditing process to ensure transparency and data validity. For these reasons, it’s becoming increasingly common for sponsors to record source data directly in electronic format (or to have sites do so). This is known as direct data capture (DDC) or electronic data capture (EDC), although the latter can technically also refer to the process of capturing manually entered data in the electronic system.
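One simple way to picture the verification step for transcribed paper forms is to compare two independent transcriptions of the same CRF, for example OCR output and a manual entry, and flag any field where they disagree. The Python sketch below does this under assumed field names and values; it is illustrative only and does not represent how any specific system performs verification.

```python
def find_discrepancies(first: dict, second: dict) -> list:
    """List fields whose values differ between two transcriptions of one CRF.

    Each entry is (field, value_in_first, value_in_second). A field that
    is absent from one transcription is reported with None on that side.
    """
    discrepancies = []
    for field in sorted(set(first) | set(second)):
        a = first.get(field)
        b = second.get(field)
        if a != b:
            discrepancies.append((field, a, b))
    return discrepancies


# Hypothetical values: the OCR pass read an "8" where staff typed a "3".
ocr_values = {"subject": "001", "systolic_bp": "128", "pulse": "72"}
manual_values = {"subject": "001", "systolic_bp": "123", "pulse": "72"}

flagged = find_discrepancies(ocr_values, manual_values)
```

A flagged field only shows that the two transcriptions disagree; someone still has to check the paper source to decide which value, if either, is correct.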

Clinical trial electronic data capture systems (EDC) and electronic case report forms (eCRF)

Most modern trials now utilize electronic data capture (EDC) systems, which allow for more efficient and standardized collection of clinical trial data. EDC platforms support features such as automated edit checks to validate data and flag errors, and connect readily with specific data collection tools and other eClinical software solutions.

An EDC system will usually involve electronic case report forms (eCRFs), the electronic version of CRFs. Investigators and study staff can enter study data directly into eCRFs, in a way that maintains a consistent format and structure across sites and facilitates automatic, real-time, and accurate transfer of the data into other systems the sponsor may be using, such as a CTMS. EDC systems also offer real-time data monitoring, efficient query management and resolution, and transparent records of compliance with regulatory requirements.
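The automated edit checks and queries mentioned above can be sketched in a few lines of Python. The example below checks a submitted form for missing required fields and out-of-range values, and produces a query for each problem found. The field names and numeric limits are hypothetical placeholders chosen for illustration; they are not clinical guidance and do not reflect the rules of any particular EDC product.

```python
from dataclasses import dataclass


@dataclass
class Query:
    """A flagged issue that site staff would be asked to resolve."""
    field: str
    message: str


# Hypothetical limits for illustration only; a real study defines its own.
RANGE_CHECKS = {
    "systolic_bp": (60, 260),
    "pulse": (20, 250),
}
REQUIRED_FIELDS = ("subject_id", "visit_date")


def run_edit_checks(form: dict) -> list:
    """Return a list of Query objects for problems found in one form."""
    queries = []
    for field in REQUIRED_FIELDS:
        if not form.get(field):
            queries.append(Query(field, "required value is missing"))
    for field, (low, high) in RANGE_CHECKS.items():
        value = form.get(field)
        if value is None:
            continue  # optional measurement not recorded at this visit
        if not low <= value <= high:
            queries.append(
                Query(field, f"value {value} outside expected range {low}-{high}")
            )
    return queries


form = {"subject_id": "001", "visit_date": "2024-01-15", "systolic_bp": 400, "pulse": 70}
open_queries = run_edit_checks(form)
```

Running the checks at the moment of entry, rather than after transcription, is what allows errors to be flagged while the person who recorded the value can still correct it.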

For more information on electronic data capture, see our in-depth article about EDC. You can also have a look at a piece we published compiling a list of the top providers of EDC systems for clinical trial sponsors.

Electronic clinical outcome assessment (eCOA)

Electronic clinical outcome assessment (eCOA) refers to the use of digital means to record patients’ health outcomes directly in electronic format. This data could be collected remotely by a healthcare provider, study nurse, or physician, or it could be entered by patients themselves, as in the case of electronic patient-reported outcomes (ePRO; see the next section). Clinical outcome assessment (COA) refers to the assessment of study endpoints, which could include things like symptom severity, quality of life, or specific health indicators or physiological parameters being monitored during the study. By enabling these assessments to be performed electronically, eCOA supports remote assessment of study endpoints and health outcomes, which enables remote, decentralized, and direct-to-patient (DTP) trials and reduces issues such as recall bias (which arises when patients are asked at periodic study visits to recall how they felt in the past). eCOA can also improve accuracy by capturing patient experiences and conditions in real time. See our article for further details about the use of eCOA in clinical trials.

Electronic patient-reported outcomes (ePRO)

Electronic patient-reported outcomes (ePROs) refer to the self-reporting of clinical data by study participants, and are considered to be a type of eCOA. Patients might record and report aspects of their symptoms, treatment-related effects, or how they’re feeling in general. This is supported by digital tools and/or web-based platforms, and can take the form of surveys, free-form/open-ended questions, severity scales, etc. Collecting this information electronically allows for more frequent reporting, as well as remote data collection, which reduces the burden that mandatory in-person study visits place on patients. It may also capture more nuanced responses, as patients are generally likely to feel more comfortable at home than at the study site. Real-time reporting also means that the data is more likely to reflect patients’ true day-to-day experiences.

For more information, refer to our article: What is ePRO and how is it used in clinical trials?

Wearable devices and connected devices

Technological advances have also opened up the possibility of passive, automated data collection through wearable and connected devices. Such devices include home monitors and sensors for heart rate, blood pressure, blood glucose, blood oxygen levels, sleep quality, etc. Wearable devices can be employed by the study sponsor as part of an eCOA or ePRO data collection methodology, offering significant advantages in terms of patient convenience and the ability to collect more data points without burdening the patient. They can also support high levels of accuracy, since no human interpretation is involved, although this depends on thorough calibration, validation, and maintenance of all devices and software, as well as proper training of participants on the use of the device(s).
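Because device data arrives in large volumes and depends on the device working properly, it is typically screened and summarized before analysis. The Python sketch below is a minimal illustration of that idea: it sets aside heart-rate readings outside an assumed plausible range and averages the remainder per day. The input format (ISO timestamp and beats per minute) and the range limits are assumptions made for the example, not properties of any specific device or study.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumed plausibility limits in beats per minute (illustrative only).
PLAUSIBLE_BPM = (25, 250)


def daily_mean_heart_rate(readings):
    """Summarize (iso_timestamp, bpm) readings into a per-day mean.

    Returns (summary, rejected): summary maps each date to the mean of
    its plausible readings; rejected lists readings outside the limits,
    kept so they can be reviewed rather than silently discarded.
    """
    by_day = defaultdict(list)
    rejected = []
    for timestamp, bpm in readings:
        if not PLAUSIBLE_BPM[0] <= bpm <= PLAUSIBLE_BPM[1]:
            rejected.append((timestamp, bpm))
            continue
        day = datetime.fromisoformat(timestamp).date().isoformat()
        by_day[day].append(bpm)
    summary = {day: mean(values) for day, values in by_day.items()}
    return summary, rejected


readings = [
    ("2024-03-01T08:00:00", 60),
    ("2024-03-01T20:00:00", 80),
    ("2024-03-01T21:00:00", 0),  # e.g. sensor not in contact with skin
    ("2024-03-02T08:00:00", 70),
]
summary, rejected = daily_mean_heart_rate(readings)
```

Keeping the rejected readings alongside the summary preserves a record of what was excluded and why, which matters for the transparency and validation concerns raised throughout this article.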


Clinical trial data collection is a complex undertaking involving numerous personnel, operations, and, increasingly, software systems and electronic tools. Researchers and sponsors must adhere strictly to data privacy regulations, and take all steps necessary to ensure the accuracy, validity, and security of clinical trial data. There are numerous methods for collecting trial data, which offer sponsors a high degree of flexibility in designing a data management plan that fits the needs of the trial at hand. The increasing use of electronic systems and tools, when properly set up, supports new levels of efficiency and accuracy in clinical data collection, as well as entirely new trial models and increased patient engagement.