
Improving Data Collection for Patient Care and Clinical Trials


Editor's note: Several authors in this feature have noted the need to improve the processes used in drug discovery, with the hope that the development of new science talent can contribute to this goal. This article reports on a health care industry project for improving data collection processes that cover both clinical trials of drugs and patient care. This project includes a trial at Duke University that tests the idea of capturing clinical trial data in a single source and sharing the data using standard protocols, with direct implications for future research procedures and health care quality.

The potential link between the information underlying health care delivery and clinical research remains largely untapped. Information that could benefit all parties---patients, physicians, investigators, regulators, and biopharmaceutical product developers---remains in disparate databases and paper records.

Significant Opportunities for Improvements and Savings

An online survey conducted by the Clinical Data Interchange Standards Consortium (CDISC) and CenterWatch [1] with 750 respondents--including investigative site personnel (355), biopharmaceutical companies (211), service providers for the industry (146), and technology providers (38)--yielded the following results:

  • Clinical trials are conducted using paper data collection as the primary tool (over 75%) despite the fact that electronic data collection tools have been available for more than 2 decades.

  • Biopharmaceutical companies (93%) feel that standards are very important for efficient interchange of clinical data among different parties, and 90% feel that these standards should be extended to facilitate data collection at the investigative site. Site personnel (89%) feel that sponsors of clinical trials should collaborate in the standardization of electronic data collection practices and systems for investigative sites.

  • Sponsors (80%) and service providers (81%) advocate the use of electronic source documentation online now or in the future. Sponsors (70%) and site personnel (73%) indicated that they feel this is a key area where technology can better be leveraged in the future to support clinical research.

  • Pharmaceutical companies/clinical trial sponsors (69%) and service providers/contract research organizations (CROs) (67%) do not feel that the current technological applications for clinical trials have adequate functionality to meet the current needs. Sponsors and CROs feel that there is no clear technology leader in the clinical trials arena.

Critical factors to achieve a higher level of information sharing include: a) the development and adoption of global data interchange standards that are harmonized between health care and clinical trials; b) the use of technology that is more acceptable to users; c) clarification of and adherence to regulatory requirements for health care and clinical trials; d) implementation of new technologies that are being employed by other industries to facilitate data interchange, specifically the use of the eXtensible Markup Language or XML.

The current paper-based process for collecting clinical trial data is inefficient and error prone. Data must be entered by hand, typically on a three-part paper case report form (even when the data already exist in an electronic record), then reentered once or even twice into a clinical trial database, and only then evaluated for errors and inconsistencies. The standards-based and technology-enabled single-source process proposed here can reduce transcription errors, increase efficiency, and facilitate information flow and timeliness of data, thus improving data quality and patient safety.
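To make the reentry step concrete, the short Python sketch below compares two independent keyings of the same paper form and flags the fields that disagree, which is the kind of check that can happen only after reentry in the paper process. It is an illustration, not the procedure used in any particular trial; the function name compare_entries and all field names and values are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """Return the fields whose values differ between two keyings of one form.

    Both arguments are dicts of field name -> keyed value (hypothetical layout).
    """
    discrepancies = {}
    for field in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(field)
        b = second_pass.get(field)
        if a != b:
            # Keep both values so a reviewer can check them against the paper form.
            discrepancies[field] = (a, b)
    return discrepancies


# Hypothetical values keyed twice from the same case report form page.
first = {"subject_id": "001", "systolic_bp": "120", "visit_date": "2003-05-14"}
second = {"subject_id": "001", "systolic_bp": "210", "visit_date": "2003-05-14"}

mismatches = compare_entries(first, second)
print(mismatches)
```

Each flagged field still has to be resolved by going back to the paper form, which is part of the cost a single-source process is meant to avoid.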

The Duke Single-Source Proof of Concept

Work on a proof of concept project--code-named Starbrite--began in early 2003 as a collaboration among the Duke Clinical Research Institute (DCRI), Duke Clinic, and CDISC with several technology partners and financial support from several sponsors. This proof-of-concept project will be conducted in parallel with traditional processes for an ongoing clinical trial. It will take advantage of two converging standards, Health Level Seven's Clinical Document Architecture (HL7's CDA) and CDISC's Operational Data Model (ODM).

The participants in the Starbrite trial bring considerable clinical and research experience. Duke Clinical Research Institute is an academic research organization that manages and conducts clinical trials for a number of biopharmaceutical sponsors and has used numerous electronic data collection tools. Duke Clinic is an early adopter of technology to support documentation of clinical visits. Paper charts in Duke Clinic were retired approximately 2 years ago, putting it into a very small group of U.S. providers who have made this transition. CDISC is a standards development consortium focused on developing standards to facilitate clinical trial data interchange, including regulatory submissions to the Food and Drug Administration (FDA).

This project seeks to demonstrate the extraction of clinical trial data and patient health care records from a single electronic source. Several factors set this proposal apart from earlier eSource (aggregated data from health care databases), electronic data capture, and single-source efforts.

Previous efforts compared with this project:

  • Previous efforts changed provider workflow. This project integrates into existing workflow.

  • Previous efforts extracted data from electronic medical records (EMRs) structured for individual patient records, not for research. This project captures data for trial use first and later merges that into the patient record.

  • Previous efforts required that patient records be fully structured within the EMR, inhibiting use of a transcription interface. This project supplements structured data entry and data reuse with transcription.

  • Previous efforts relied on closed, proprietary data formats. This project uses open standards from established standards organizations (CDISC, HL7).

Several innovations have opened up new opportunities to converge the worlds of health care and clinical research standards and technology. XML, an information encoding meta-language from the World Wide Web Consortium, has gained currency as a universal data descriptor. XML is now incorporated in all major technology solutions, making possible widespread information exchange.

Both CDISC and HL7 have articulated XML strategies that can be mapped to one another at the implementation level and leveraged using off-the-shelf technology. CDISC and HL7 have collaborated in the development of an extensive data model, the HL7 Reference Information Model (RIM).

The current phase of this project is mapping the CDISC ODM, which is an XML implementation, to the HL7 CDA, which is derived from the RIM. The result will be corresponding data representations for clinical and trial data, both based on the RIM. The combination of XML and an underlying data model produces an unprecedented opportunity for semantic interoperability.
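As a rough illustration of what a mapping at the implementation level can look like, the Python sketch below converts item data in an ODM-style fragment into observation entries in a CDA-style section. It is not the project's actual mapping. The fragments are heavily simplified, omit XML namespaces and required attributes, and are not schema-valid ODM or CDA; the item identifiers (IT.SYSBP, IT.DIABP) and the function name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Simplified ODM-style fragment (assumption: no namespaces, illustrative OIDs).
ODM_FRAGMENT = """
<ItemGroupData ItemGroupOID="IG.VITALS">
  <ItemData ItemOID="IT.SYSBP" Value="120"/>
  <ItemData ItemOID="IT.DIABP" Value="80"/>
</ItemGroupData>
"""


def odm_items_to_cda_section(odm_xml):
    """Map each ODM-style ItemData element to a CDA-style observation entry."""
    item_group = ET.fromstring(odm_xml)
    section = ET.Element("section")
    for item in item_group.findall("ItemData"):
        entry = ET.SubElement(section, "entry")
        observation = ET.SubElement(entry, "observation")
        # In this sketch the ODM item identifier is reused directly as the code.
        ET.SubElement(observation, "code", {"code": item.get("ItemOID")})
        ET.SubElement(observation, "value", {"value": item.get("Value")})
    return section


section = odm_items_to_cda_section(ODM_FRAGMENT)
print(ET.tostring(section, encoding="unicode"))
```

A real mapping would also have to reconcile identifiers, units, and coded vocabularies between the two standards, which is where a shared underlying model such as the RIM is expected to help.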

Concepts and Execution

The Starbrite trial will use a single source to create patient clinic notes in the form of the HL7 CDA and to create electronic case report forms that will support a trial currently being conducted by Duke Clinical Research Institute. The clinical trial data will be expressed using the CDISC XML standard operational data model.
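The single-source idea can be pictured as one captured record rendered into two documents. The Python example below is a minimal sketch under stated assumptions, not the Starbrite implementation: the record layout, identifiers, and function names are hypothetical, and the element trees only imitate the general shape of an HL7 CDA clinic note and a CDISC ODM case report form without being schema-valid.

```python
import xml.etree.ElementTree as ET

# Hypothetical record captured once at the clinic visit (the single source).
visit = {
    "subject_key": "001",
    "study_event": "SE.VISIT1",
    "form": "F.VITALS",
    "item_group": "IG.VITALS",
    "observations": {"IT.SYSBP": "120", "IT.DIABP": "80"},
}


def to_clinic_note(record):
    """Build a simplified CDA-style clinic note (not schema-valid CDA)."""
    doc = ET.Element("ClinicalDocument")
    body = ET.SubElement(ET.SubElement(doc, "component"), "structuredBody")
    section = ET.SubElement(ET.SubElement(body, "component"), "section")
    for code, value in record["observations"].items():
        obs = ET.SubElement(ET.SubElement(section, "entry"), "observation")
        ET.SubElement(obs, "code", {"code": code})
        ET.SubElement(obs, "value", {"value": value})
    return doc


def to_case_report_form(record):
    """Build a simplified ODM-style case report form (not schema-valid ODM)."""
    odm = ET.Element("ODM")
    clinical = ET.SubElement(odm, "ClinicalData")
    subject = ET.SubElement(clinical, "SubjectData", {"SubjectKey": record["subject_key"]})
    event = ET.SubElement(subject, "StudyEventData", {"StudyEventOID": record["study_event"]})
    form = ET.SubElement(event, "FormData", {"FormOID": record["form"]})
    group = ET.SubElement(form, "ItemGroupData", {"ItemGroupOID": record["item_group"]})
    for oid, value in record["observations"].items():
        ET.SubElement(group, "ItemData", {"ItemOID": oid, "Value": value})
    return odm


note = to_clinic_note(visit)
crf = to_case_report_form(visit)

# Both documents are derived from the same record, so their values must agree.
note_values = {o.find("code").get("code"): o.find("value").get("value")
               for o in note.iter("observation")}
crf_values = {i.get("ItemOID"): i.get("Value") for i in crf.iter("ItemData")}
print(note_values == crf_values)
```

Because both outputs are generated from one record, there is no transcription step at which the clinic note and the case report form can drift apart.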

The proof-of-concept project will work with a subset of the overall clinical trial data. The actual clinical trial will continue to use the current paper data collection process. The proof of concept will run in parallel with that process and require only minimal modifications to it.

The following diagrams illustrate the two processes. Figure 1 depicts a traditional paper-based data acquisition process for clinical trials, and Figure 2 is the flow chart for the proof-of-concept project.

Figure 1: Traditional dual-purpose, multi-source data collection

Figure 2: Proposed dual-purpose, single-source data collection

Regulatory Environment and Support

Data from the CenterWatch study indicate that over 70% of clinical trial sponsors and service providers feel that the adoption of electronic data collection would be more rapid absent the fear of regulatory repercussions. Regulatory concerns ranked as the top reason listed for adoption delays for electronic clinical trial technologies.

This project will adhere to all appropriate regulatory requirements. The CDISC ODM is compliant with Title 21 Code of Federal Regulations Part 11, including electronic data archiving. Furthermore, it anticipates the direction regulators would like to move in terms of expediting and standardizing submissions.

FDA, in its report Improving Innovation in Medical Technology: Beyond 2002, has stated a strategic goal of speeding the development of medical technologies. Three key areas of interest to FDA include:

  • Reducing delays in reviews of submissions,

  • Implementing a continuous improvement/quality systems approach, and

  • Expanding collaboration in the development of guidance for product development.

FDA representatives take an active part in the development of standards for regulatory review and warehousing of clinical data. FDA currently requires that data be submitted as SAS transport files and PDF documents, but its representatives have expressed an interest in moving to XML once appropriate standards and tools are available.

FDA, CDISC, and HL7 have collaborated for over 2 years through a technical committee formed within HL7 called Regulated Clinical Research and Information Management (RCRIM). A common informatics platform for health care and clinical research is one area of RCRIM interest.


The project will ease the burden of validating the source of clinical research data and improve quality and efficiency by eliminating manual and redundant data entry. Single-source data capture can improve data quality while decreasing the time and resources required for clinical trials. The trial design proposed here would also ease the burden of clinical documentation for the practicing physician. Ultimately, it will lead to richer sources of support for clinical trials and more rapid trials.

The authors

Liora Alschuler is a developer of XML-based standards for electronic health care information and a consultant in their application for providers and system vendors. She is co-chair of the HL7 Structured Documents Technical Committee responsible for HL7's Clinical Document Architecture.

Landen Bain, a member of the CDISC Board, explores emerging technologies and transformative business models in health care. Mr. Bain's current portfolio of investigations includes patient safety, mobile technologies, natural language processing, use of clinical data for research purposes, application of HL7's Clinical Document Architecture, standard clinical vocabularies, clinical genomics, space-based data, and community networks of patient care data.

Rebecca Daniels Kush, Ph.D., is president of Catalysis Inc. and a founder and current president of CDISC. Catalysis Inc. consults in the areas of strategy, process analysis, and redesign, particularly associated with "electronic clinical trials"; project management infrastructure and training; implementation of enabling technologies; and clinical trial metrics.


[1] The study was sponsored by 24 companies within the area of pharmaceutical product development.