Comparative effectiveness research (CER) is designed to inform health-care decisions at both the policy and the individual level by comparing the benefits and harms of therapeutic strategies available in routine practice for the prevention, diagnosis or treatment of a given health condition. The interventions under comparison may be similar treatments, such as competing drugs, or different approaches, such as surgical procedures and drug therapy. The comparison may focus only on the relative medical benefits and risks of the different options, or it may weigh both their costs and their benefits. The methods of comparative effectiveness research (Annu Rev Public Health 2012;33:425-45) defines the key elements of CER as (a) head-to-head comparison of active treatments, (b) study populations typical of day-to-day clinical practice, and (c) a focus on evidence to inform health care tailored to the characteristics of individual patients. In What is Comparative Effectiveness Research, AHRQ highlights that CER requires the development, expansion and use of a variety of data sources and methods to conduct timely and relevant research and to disseminate the results in a form that is quickly usable. The evidence may come from a review and synthesis of available evidence from existing clinical trials or observational studies, or from the conduct of studies that generate new evidence. In Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide, AHRQ also highlights that CER is still a relatively new field of enquiry, with origins across multiple disciplines, and is likely to evolve and be refined over time.
Among resources for keeping up with the evolution in this field, the US National Library of Medicine provides a web site for queries on CER.
The terminology ‘Relative effectiveness assessment (REA)’ is also used when comparing multiple technologies or a new technology against standard of care, while ‘rapid’ REA refers to performing an assessment within a limited timeframe in the case of a new marketing authorisation or a new indication granted for an approved medicine (What is a rapid review? A methodological exploration of rapid reviews in Health Technology Assessments. Int J Evid Based Healthc. 2012;10(4):397-410).
Several initiatives have promoted the conduct of CER and REA and proposed general methodological guidance to help in the design and analysis of such studies.
The Methodological Guidelines for Rapid Relative Effectiveness Assessment of Pharmaceuticals developed by EUnetHTA cover a broad spectrum of issues on REA. They address methodological challenges that are encountered by health technology assessors while performing rapid REA and provide and discuss practical recommendations on definitions to be used and how to extract, assess and present relevant information in assessment reports. Specific topics covered include the choice of comparators, strengths and limitations of various data sources and methods, internal and external validity of studies, the selection and assessment of endpoints (including composite and surrogate endpoints and Health Related Quality of Life [HRQoL]) and the evaluation of relative safety.
AHRQ’s Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide identifies minimal standards and best practices for observational CER. It provides principles on a wide range of topics for designing research and developing protocols, with relevant questions to be addressed and checklists of key elements to be considered. The GRACE Principles provide guidance on evaluating the quality of observational CER studies, to help decision-makers recognise high-quality studies and researchers design and conduct high-quality studies. A checklist to evaluate the quality of observational CER studies is also provided. The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) addressed several key issues of CER in three publications: Part I covers the selection of study design and data sources and the reporting and interpretation of results in the light of policy questions; Part II relates to the validity and generalisability of study results, with an overview of potential threats to validity; Part III covers approaches to reducing such threats and, in particular, to controlling for confounding. The Patient-Centered Outcomes Research Institute (PCORI) Methodology Standards document provides standards for patient-centred outcome research that aim to improve the way research questions are selected, formulated and addressed, and findings reported. The PCORI group has recently described how stakeholders may be involved in PCORI research in Stakeholder-Driven Comparative Effectiveness Research (JAMA 2015;314:2235-2236). In a Journal of Clinical Epidemiology series of articles, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) working group offers a structured process for rating the quality of evidence and grading the strength of recommendations in systematic reviews, health technology assessment and clinical practice guidelines.
The GRADE group recommends that individuals new to GRADE first read the 6-part 2008 BMJ series.
A guideline on methods for performing systematic reviews of existing comparative effectiveness research has been published by the AHRQ (Methods Guide for Effectiveness and Comparative Effectiveness Reviews).
The RWE Navigator website has been developed by the IMI GetReal consortium to provide recommendations on the use of real-world evidence for decision-making on the effectiveness and relative effectiveness of medicinal products. It discusses important topics such as sources of real-world data, study designs, approaches to summarising and synthesising the evidence, modelling of effectiveness and methods to adjust for bias, as well as governance aspects. It also presents a glossary of terms and case studies relevant for RWD research, with a focus on effectiveness research.
While RCTs are considered to provide the most robust evidence of the efficacy of therapeutic options, they have well-recognised qualitative and quantitative limitations, and their results may not reflect how the drug of interest will perform in real-life practice. Moreover, relatively few RCTs are traditionally designed using an alternative therapeutic strategy as a comparator, which limits the utility of the resulting data in establishing recommendations for treatment choices. For these reasons, other research methodologies such as pragmatic trials and observational studies may complement traditional explanatory RCTs in CER.
Explanatory and Pragmatic Attitudes in Therapeutic Trials (J Chron Dis 1967; republished in J Clin Epidemiol 2009;62(5):499-505) distinguishes between two approaches in designing clinical trials: the ‘explanatory’ approach, which seeks to understand differences between the effects of treatments administered in experimental conditions, and the ‘pragmatic’ approach, which seeks to answer the practical question of choosing the best treatment administered in normal conditions of use. The two approaches affect the definition of the treatments, the assessment of results, the choice of subjects and the way in which the treatments are compared. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers (CMAJ 2009;180(10):E47-57) quantifies distinguishing characteristics between pragmatic and explanatory trials and has been updated in The PRECIS-2 tool: designing trials that are fit for purpose (BMJ 2015;350:h2147). A checklist of eight items for the reporting of pragmatic trials was also developed as an extension of the CONSORT statement to facilitate the use of results from such trials in decisions about health-care (Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ 2008;337(a2390):1-8).
The article Why we need observational studies to evaluate effectiveness of health care (BMJ 1996;312(7040):1215-18) documents situations in the field of health care intervention assessment where observational studies are needed because randomised trials are either unnecessary, inappropriate, impossible or inadequate. In a review of five interventions, Randomized, controlled trials, observational studies, and the hierarchy of research designs (N Engl J Med 2000;342(25):1887-92) found that the results of well-designed observational studies (with either a cohort or case-control design) did not systematically overestimate the magnitude of treatment effects. In defense of Pharmacoepidemiology-Embracing the Yin and Yang of Drug Research (N Engl J Med 2007;357(22):2219-21) shows that strengths and weaknesses of RCTs and observational studies make both designs necessary in the study of drug effects. However, When are observational studies as credible as randomised trials? (Lancet 2004;363(9422):1728-31) explains that observational studies are suitable for the study of adverse (non-predictable) effects of drugs but should not be used for intended effects of drugs because of the potential for selection bias.
With regard to the selection and assessment of endpoints for CER, the COMET (Core Outcome Measures in Effectiveness Trials) Initiative aims at developing agreed minimum standardised sets of outcomes (‘core outcome sets’, COS) to be assessed and reported in effectiveness trials of a specific condition, as discussed in Choosing Important Health Outcomes for Comparative Effectiveness Research: An Updated Review and User Survey (PLoS One 2016;11(1):e0146444).
A review of uses of health care utilization databases for epidemiologic research on therapeutics (J Clin Epidemiol 2005;58(4):323-37) considers the application of health care utilisation databases to epidemiology and health services research, with particular reference to the study of medications. Information on relevant covariates, and in particular on confounding factors, may not be available or adequately measured in electronic healthcare databases. To overcome this limitation, CER studies have integrated information from health databases with information collected ad hoc from study subjects. Enhancing electronic health record measurement of depression severity and suicide ideation: a Distributed Ambulatory Research in Therapeutics Network (DARTNet) study (J Am Board Fam Med 2012;25(5):582-93) shows the value of adding direct measurements and pharmacy claims data to data from electronic healthcare records. Assessing medication exposures and outcomes in the frail elderly: assessing research challenges in nursing home pharmacotherapy (Med Care 2010;48(6 Suppl):S23-31) describes how merging longitudinal electronic clinical and functional data from nursing home sources with Medicare and Medicaid claims data can support unique study designs in CER but poses many challenging design and analytic issues. Pragmatic randomised trials using routine electronic health records: putting them to the test (BMJ 2012;344:e55) discusses opportunities for using electronic healthcare records for conducting pragmatic trials.
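As a toy illustration of the record-linkage step that such integrated studies depend on, the sketch below joins clinical measurements extracted from an electronic health record with drug claims from an administrative source on a shared patient identifier. All field names (pid, phq9) and values are hypothetical; real linkage additionally raises identifier-quality, consent and governance issues not shown here.

```python
# Hypothetical extracts: clinical measurements from an EHR and drug
# claims from an administrative source, linked on a shared patient id.
ehr = [
    {"pid": 1, "phq9": 18},   # illustrative depression severity score
    {"pid": 2, "phq9": 4},
]
claims = [
    {"pid": 1, "drug": "sertraline"},
    {"pid": 3, "drug": "fluoxetine"},
]

def link(ehr, claims):
    """Inner join of the two sources on patient id; patients present in
    only one source drop out, which is itself a design issue to report."""
    by_pid = {r["pid"]: r for r in ehr}
    return [{**by_pid[c["pid"]], **c} for c in claims if c["pid"] in by_pid]

linked = link(ehr, claims)
```

Note that only patient 1 appears in both sources, so the linked analysis set is smaller than either source alone; the proportion lost at linkage is worth reporting in any such study.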
A model based on counterfactual theory for CER using large administrative healthcare databases has been suggested, in which causal inference from observational studies based on large administrative health databases is viewed as an emulation of a randomized trial. This ‘target trial’ is made explicit and design and analytic approaches are reviewed in Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available (Am J Epidemiol (2016) 183 (8): 758-764).
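The target trial idea can be made concrete with a small sketch: the emulated trial's eligibility criteria are applied to database records, and each eligible patient is assigned at time zero (treatment initiation) to the strategy actually started, analogous to a randomisation arm. All record fields, values and the washout rule below are hypothetical simplifications, not part of any specific data source or of the cited article.

```python
from datetime import date

# Hypothetical patient records from an administrative database; all
# field names and values are illustrative.
patients = [
    {"id": 1, "first_dispensing": date(2020, 3, 1), "drug": "A",
     "enrolled_since": date(2018, 1, 1), "prior_use": False},
    {"id": 2, "first_dispensing": date(2020, 5, 10), "drug": "B",
     "enrolled_since": date(2019, 11, 1), "prior_use": False},
    {"id": 3, "first_dispensing": date(2020, 6, 2), "drug": "A",
     "enrolled_since": date(2020, 5, 20), "prior_use": False},
]

def emulate_target_trial(records, washout_days=365):
    """Apply the emulated trial's eligibility criteria and assign each
    eligible patient at time zero (first dispensing) to the strategy
    actually initiated."""
    arms = {"A": [], "B": []}
    for r in records:
        # Eligibility: no prior use of a study drug, and continuous
        # enrolment covering the washout window before time zero.
        if r["prior_use"]:
            continue
        if (r["first_dispensing"] - r["enrolled_since"]).days < washout_days:
            continue
        arms[r["drug"]].append(r["id"])
    return arms

arms = emulate_target_trial(patients)
```

In this toy data only patient 1 satisfies the enrolment criterion, so the emulated arms are {"A": [1], "B": []}; in a real emulation the protocol would also specify outcome definition, follow-up and the causal contrast of interest.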
Methodological issues and principles of Chapter 5 of the ENCePP Guide are applicable to CER as well and the textbooks cited in that chapter are recommended for consultation.
The article Methods to assess intended effects of drug treatment in observational studies are reviewed (J Clin Epidemiol 2004;57(12):1223-31) provides an overview of methods that seek to adjust for confounding in observational studies when assessing intended drug effects. Developments in post-marketing comparative effectiveness research (Clin Pharmacol Ther 2007;82(2):143-56) also reviews the roles of propensity scores (PS), instrumental variables and sensitivity analyses to reduce measured and unmeasured confounding in CER. The use of propensity scores and disease risk scores in the context of observational health-care programme research is described in Summary Variables in Observational Research: Propensity Scores and Disease Risk Scores. More recently, the high-dimensional propensity score has been suggested as a method to further improve control for confounding, as the large number of covariates it includes may collectively be proxies for unobserved factors.
Results presented in High-dimensional propensity score adjustment in studies of treatment effects using health care claims data (Epidemiology 2009;20(4):512-22) show that, in a selected empirical evaluation, the high-dimensional propensity score improved confounding control compared to conventional PS adjustment when benchmarked against results from randomized controlled trials. See Chapter 5.3.4 of the Guide for an in-depth discussion of propensity scores. Several methods can be considered to handle confounders in non-experimental CER (Confounding adjustment in comparative effectiveness research conducted within distributed research networks, Med Care 2013;51(8 Suppl 3):S4-S10; Disease Risk Score (DRS) as a Confounder Summary Method: Systematic Review and Recommendations, Pharmacoepidemiol Drug Saf 2013;22(2):122-129). Strategies for selecting variables for adjustment in non-experimental CER have also been proposed (Pharmacoepidemiol Drug Saf 2013;22(11):1139-1145).
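As a minimal didactic illustration of propensity score weighting, the simulation below generates a cohort in which a single confounder drives both treatment assignment and outcome, estimates the propensity score by logistic regression (fitted here with a plain Newton-Raphson routine so no external libraries are assumed), and contrasts the unadjusted treatment-control difference with an inverse-probability-of-treatment-weighted (IPTW) estimate. It is a sketch under simulated data, not a template for a real analysis: diagnostics, weight trimming and variance estimation are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated cohort: confounder x affects both treatment and outcome.
n = 5000
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))   # confounded assignment
y = 2.0 * t + 1.5 * x + rng.normal(size=n)        # true effect = 2.0

def fit_logistic(X, t, iters=25):
    """Logistic regression by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta

X = np.column_stack([np.ones(n), x])
ps = 1 / (1 + np.exp(-X @ fit_logistic(X, t)))    # propensity scores

# IPTW: weight each subject by the inverse probability of the
# treatment actually received, then compare weighted arm means.
w = t / ps + (1 - t) / (1 - ps)
effect = (np.average(y[t == 1], weights=w[t == 1])
          - np.average(y[t == 0], weights=w[t == 0]))

naive = y[t == 1].mean() - y[t == 0].mean()       # biased by confounding
```

With this data-generating process the naive contrast is inflated well above 2.0 because treated subjects have systematically higher x, while the IPTW estimate recovers a value close to the true effect.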
A reason for discrepancies between results of randomised trials and observational studies may be the inclusion of prevalent drug users in the latter. Evaluating medication effects outside of clinical trials: new-user designs (Am J Epidemiol 2003;158(9):915-20) explains the biases introduced by prevalent drug users and how a new-user (or incident user) design eliminates these biases by restricting analyses to persons under observation at the start of the current course of treatment. The Incident User Design in Comparative Effectiveness Research (Pharmacoepidemiol Drug Saf 2013;22(1):1-6) reviews published CER case studies in which investigators used the incident user design, discusses its strengths (reduced bias) and its weakness (reduced precision of comparative effectiveness estimates) and provides recommendations to investigators considering this design. The value of the incident user design, and exceptions to it, have also been reviewed.
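The incident user restriction can be sketched in a few lines, assuming a simple dispensing file: cohort entry is a patient's first dispensing on or after the study start, and any dispensing during a preceding washout window marks the patient as a prevalent user and excludes them. Patient identifiers, dates and the 365-day window are illustrative assumptions.

```python
from datetime import date, timedelta

# Hypothetical dispensing file: (patient_id, dispensing_date) pairs.
dispensings = [
    (1, date(2021, 1, 5)), (1, date(2021, 2, 5)),   # new user in 2021
    (2, date(2020, 6, 1)), (2, date(2021, 3, 1)),   # prevalent user
    (3, date(2021, 4, 10)),                          # new user
]

def incident_users(dispensings, study_start, washout_days=365):
    """Cohort entry is the first dispensing on or after study start;
    a dispensing in the washout window before that date excludes the
    patient as a prevalent user."""
    by_patient = {}
    for pid, d in dispensings:
        by_patient.setdefault(pid, []).append(d)
    cohort = []
    for pid, dates in sorted(by_patient.items()):
        in_study = sorted(d for d in dates if d >= study_start)
        if not in_study:
            continue  # never treated during the study period
        entry = in_study[0]
        window_start = entry - timedelta(days=washout_days)
        if any(window_start <= d < entry for d in dates):
            continue  # prevalent user: dispensing inside the washout
        cohort.append((pid, entry))
    return cohort

cohort = incident_users(dispensings, study_start=date(2021, 1, 1))
```

Here patient 2 is excluded because a dispensing in June 2020 falls inside the washout window before their first in-study dispensing, illustrating the trade-off the literature describes: the restriction reduces bias at the cost of a smaller cohort and hence reduced precision.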