European Network of Centres for Pharmacoepidemiology and Pharmacovigilance

Chapter 4: Study design

4.1. Overview

An epidemiological study measures a parameter of occurrence (generally incidence, prevalence, or a risk or rate ratio) of a health phenomenon (e.g., a disease) in a specified population and with a specified time reference (time point or time period). Epidemiological studies may be descriptive or analytic. Descriptive studies do not aim to evaluate a causal relationship between a population characteristic and the occurrence parameter and generally do not include formal comparisons between population groups. Analytic studies (also called causal inference studies), in contrast, use study populations assembled by the investigators to assess relationships that may be interpreted in causal terms. In pharmacoepidemiology, analytic studies generally aim to quantify the association between exposure to a medicine and a health phenomenon, and test the hypothesis of a causal relationship. They are comparative by nature, e.g., comparing the occurrence of an outcome between users and non-users of a medicine, or between users of different medicinal products.

Studies can be interventional or non-interventional (observational). In interventional studies, the subjects are assigned by the investigator to be either exposed or unexposed. Most often, exposure in these studies is assigned randomly; such studies are known as randomised clinical trials (RCTs) and are typically conducted to test the efficacy of treatments such as new medications. In RCTs, randomisation is used with the intention that the only difference between the exposed and unexposed groups will be the treatment itself. Thus, any differences in the outcome can be attributed to the effect of such treatment. In contrast to experimental studies where exposure is assigned by the investigator, in observational studies the investigator plays no role with regard to which subjects are exposed and which are unexposed. The exposures are either chosen by, or are characteristics of, the subjects themselves. Observational Studies: Cohort and Case-Control Studies (Plast Reconstr Surg. 2010;126(6):2234-42) provides a simple and clear explanation of the different types of observational studies and of their advantages and disadvantages (see also Chapter 4.2. Study designs).

In order to obtain valid estimates of the effect of a determinant on a parameter of disease occurrence, analytic studies must address three factors: random error (chance), systematic error (bias) and confounding. It is important to understand that error is defined as the difference between the measured value and the true value of a particular observation.

  • Random error (chance): the observed effect estimate may be partly or wholly explained by random error arising from the underlying variation in the population. The confidence interval (CI) allows the investigator to estimate the range of values within which the actual effect is likely to fall.

  • Systematic error (bias): the observed effect estimate may be due to systematic error in the selection of the study population or in the measurement of the exposure or disease. Two main types of biases need to be considered, selection bias and information bias. Selection bias results from procedures used to select subjects and from factors that influence study participation. For example, a case-control study may include non-case subjects with a higher prevalence of one category of the exposure of interest than in the source population for the cases. External factors such as media attention to safety issues may also influence healthcare seeking behaviours and measurement of the incidence of a given outcome. Information biases can occur whenever there are errors in the measurement of subject characteristics, for example a lack of pathology results leading to outcome misclassification of certain types of tumours, or lack of validation of exposure, leading to misclassification of the exposed and non-exposed status of some study participants. For example, mothers of children with congenital malformations will recall more instances of medicine use during pregnancy than mothers of healthy children. This is known in epidemiology as “recall bias”, a type of information bias. The consequences of these errors generally depend on whether the distribution of errors for the exposure or disease depends on the value of other variables (differential misclassification) or not (nondifferential misclassification). A simple simulation illustrating how nondifferential misclassification attenuates an effect estimate towards the null is sketched after this list.

  • Confounding: confounding results from the presence of an additional factor, known as a confounder or confounding factor, which is associated with both the exposure of interest and the outcome. As a result, the exposed and unexposed groups will likely differ not only with regard to the exposure of interest, but also with regard to a number of other characteristics, some of which are themselves related to the likelihood of developing the outcome. Confounding distorts the observed effect estimate for the outcome and the exposure under study. As there is not always a firm distinction between bias and confounding, confounding is also often classified as a type of bias.
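As an illustration of the consequences of information bias, the short simulation below is a minimal sketch (in Python, with invented parameters not taken from any cited study) showing how nondifferential misclassification of a binary exposure attenuates a true risk ratio towards the null.

# Minimal simulation: nondifferential exposure misclassification biases
# the risk ratio towards the null. Illustrative only; all numbers are invented.
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
true_rr = 2.0

exposed = rng.random(n) < 0.3                      # 30% truly exposed
baseline_risk = 0.02
risk = np.where(exposed, baseline_risk * true_rr, baseline_risk)
outcome = rng.random(n) < risk

# Misclassify exposure with 80% sensitivity and 95% specificity,
# independently of the outcome (nondifferential misclassification).
sens, spec = 0.80, 0.95
observed_exposed = np.where(exposed,
                            rng.random(n) < sens,
                            rng.random(n) > spec)

def risk_ratio(exp, out):
    return out[exp].mean() / out[~exp].mean()

print(f"Risk ratio with true exposure:          {risk_ratio(exposed, outcome):.2f}")
print(f"Risk ratio with misclassified exposure: {risk_ratio(observed_exposed, outcome):.2f}")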

There are many different situations in which bias may occur, and some authors attribute a name to each of them; the number of such situations is, in theory, unlimited. ENCePP recommends that, rather than naming each of them, investigators understand the underlying mechanisms of information bias, selection bias and confounding, be alert to their presence and likelihood of occurrence in a study, and recognise methods for their prevention, detection, and control at the analytical stage where possible, such as restriction, stratification, matching, regression and sensitivity analyses. Chapter 6 on methods to address bias nevertheless treats time-related bias (a type of information bias with misclassification of person-time) separately, as it may have important consequences on the result of a study and may be dealt with by design and time-dependent analyses.

The role of chance (random error) in the interpretation of evidence in epidemiology has often relied on whether the p-value is below a certain threshold and/or the confidence interval excludes some reference value. The ASA statement on P values: context, process, and purpose (Am Statistician 2016;70(2):129-33) of the American Statistical Association emphasised that a p-value, or statistical significance, does not provide a good measure of evidence regarding a model or hypothesis, nor does it measure the size of an effect or the importance of a result. It is therefore recommended to avoid relying only on statistical significance, such as p-values, to interpret study results (see, for example, Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations, Eur J Epidemiol. 2016;31(4):337-50; Scientists rise up against statistical significance, Nature 2019;567(7748):305-7; It’s time to talk about ditching statistical significance, Nature 2019;567(7748):283; Chapter 15. Precision and Study size in Modern epidemiology, Lash TL, VanderWeele TJ, Haneuse S, Rothman KJ, 4th edition, Philadelphia, PA, Wolters Kluwer, 2021). This series of articles led to substantial changes in the guidelines for reporting study results in manuscripts submitted to medical journals, as discussed in Preparing a manuscript for submission to a medical journal (International Committee for Medical Journal Editors, 2021). Causal analyses of existing databases: no power calculations required (J Clin Epidemiol. 2022;144:203-5) encourages researchers to use large healthcare databases to estimate measures of association as opposed to systematically attempting to test hypotheses (with sufficient power). ENCePP also recommends that, instead of a dichotomous interpretation based on whether a p-value is below a certain threshold, or a confidence interval excludes some reference value, researchers should rely on a more comprehensive quantitative interpretation that considers the magnitude, precision, and possible bias in the estimates, in addition to a qualitative assessment of the relevance of the selected study design. This is considered a more appropriate approach than one that ascribes to chance any result that does not meet conventional criteria for statistical significance.

Given that the large number of observational studies performed urgently with existing data and in sometimes difficult conditions in the early phase of the COVID-19 pandemic raised concerns about the validity of many studies published without peer review, we recommend balancing urgency with the use of appropriate methodology. Considerations for pharmacoepidemiological analyses in the SARS-CoV-2 pandemic (Pharmacoepidemiol Drug Saf. 2020;29(8):825-83) provides recommendations across eight domains: (1) timeliness of evidence generation; (2) the need to align observational and interventional research on efficacy; (3) the specific challenges related to “real‐time epidemiology” during an ongoing pandemic; (4) which design to use to answer a specific question; (5) considerations on the definition of exposures and outcomes and what covariates to collect; (6) the need for transparent reporting; (7) temporal and geographical aspects to be considered when ascertaining outcomes in COVID-19 patients; and (8) the need for rapid assessment. The article Biases in evaluating the safety and effectiveness of drugs for covid-19: designing real-world evidence studies (Am J Epidemiol. 2021;190(8):1452-6) reviews and illustrates how immortal time bias and selection bias were present in several studies evaluating the effects of drugs on SARS-CoV-2 infection, and how they can be addressed. Although these two examples specifically refer to COVID-19 studies, such considerations are applicable to research questions with other types of exposures and outcomes.

COVID-19 pandemic-related disruptions in healthcare are likely to have impacted the design of current as well as future non-interventional, real-world studies. Changes in access to healthcare and healthcare seeking behaviour during the pandemic will create and exacerbate the challenges inherent to observational studies when using real-world data from this period. The article Noninterventional studies in the COVID-19 era: methodological considerations for study design and analysis (J Clin Epidemiol. 2023;153:91-101) presents a general framework for supporting study design of non-interventional studies using real-world data from the COVID-19 era.

Finally, graphical frameworks for presenting study designs are increasingly recommended, to foster transparency, enhance understanding of the design, and support the evaluation of study protocols and the interpretation of study results, as illustrated in A Framework for Visualizing Study Designs and Data Observability in Electronic Health Record Data (Clin Epidemiol. 2022;14:601-8) and Visualizations throughout pharmacoepidemiology study planning, implementation, and reporting (Pharmacoepidemiol Drug Saf. 2022;31(11):1140-52).

4.2. Types of study design

This chapter briefly describes the main types of study designs. Specific aspects or applications of these designs are presented in Chapter 4.4. These designs are fully described in several textbooks cited in the Introduction, for example, Modern Epidemiology, 4th ed. (T. Lash, T.J. VanderWeele, S. Haneuse, K. Rothman. Wolters Kluwer, 2021).

The choice of the study design should be primarily driven by the need to obtain valid evidence regarding the objective(s) of the study by mitigating the risk of selection bias, information bias and confounding (see Chapter 6). Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available (Am J Epidemiol. 2016;183(8):758-64) has proposed target trial emulation as a strategy that uses existing tools and methods to formalise the design and analysis of observational studies. It stimulates investigators to identify potential sources of concern and develop a design that best addresses these concerns and the risk of bias. Target trial emulation is described in Chapter 4.2.6. The increasing ability to use electronic data from routine healthcare systems has opened up new opportunities for investigators to conduct studies. Many investigators use the data source(s) they have access to and are familiar with in terms of potential bias, confounding and missing data.

4.2.1. Cohort studies

In a cohort study, the investigator identifies a population from which the study subjects will be selected, defines two or more groups of subjects (referred to as study cohorts) who are at risk for the outcome of interest and differ according to their exposure, and follows them over time to observe the occurrence of the outcome of interest in the exposed and unexposed cohorts. A cohort study may also include a single cohort that is heterogeneous with respect to exposure history, and occurrence of the outcome is measured and compared between exposure groups within the cohort. The follow-up time of each subject in the cohorts is counted, and the total person-time experience serves as the denominator for the calculation of the incidence rate of the outcome of interest. Cohorts are called fixed when individuals may not move from one exposure group to the other. They are called closed when entry is not allowed after the cohort’s inception. The population of a cohort may also be called dynamic (or open) if it can gain and lose members who contribute to the person-time experience for the duration of their presence in the cohort. The main advantages of a cohort study are the possibility to calculate directly interpretable incidence rates of an outcome and to investigate multiple outcomes for a given exposure. The cohort design is also well suited to studies using large electronic records (such as electronic healthcare records and administrative claims data) where individual data are collected over long periods of time, allowing the study of the effects of drug exposures on outcomes occurring later. Disadvantages are the need for a large sample size and possibly a long study duration to study rare outcomes, although the use of existing electronic healthcare databases allows large cohorts to be observed and analysed retrospectively (see Chapters 8.2 and 9).
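As a minimal numerical sketch of the measures described above (using invented counts and person-time, not data from any cited study), the following code computes incidence rates in an exposed and an unexposed cohort and the incidence rate ratio with an approximate 95% confidence interval based on the usual log-normal approximation.

# Incidence rates and incidence rate ratio from cohort data.
# Counts and person-time below are invented for illustration.
from math import exp, log, sqrt

cases_exposed, persontime_exposed = 60, 12_000.0      # person-years
cases_unexposed, persontime_unexposed = 40, 16_000.0  # person-years

rate_exposed = cases_exposed / persontime_exposed
rate_unexposed = cases_unexposed / persontime_unexposed
irr = rate_exposed / rate_unexposed

# 95% CI for the rate ratio using the log-normal approximation:
# SE(log IRR) = sqrt(1/a + 1/b), where a and b are the case counts.
se_log_irr = sqrt(1 / cases_exposed + 1 / cases_unexposed)
ci_low = exp(log(irr) - 1.96 * se_log_irr)
ci_high = exp(log(irr) + 1.96 * se_log_irr)

print(f"Rate (exposed):   {1000 * rate_exposed:.2f} per 1,000 person-years")
print(f"Rate (unexposed): {1000 * rate_unexposed:.2f} per 1,000 person-years")
print(f"IRR: {irr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")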

Cohort studies are commonly used in pharmacoepidemiology to study the utilisation and effects of medicinal products. At the beginning of the COVID-19 pandemic, it was the design of choice to compare the risk and severity of SARS-CoV-2 infection in persons using or not using certain types of medicines. An example is Renin-angiotensin system blockers and susceptibility to COVID-19: an international, open science, cohort analysis (Lancet Digit Health 2021;3(2):e98-e114) where electronic health records were used to identify and follow patients aged 18 years or older with at least one prescription for RAS blockers, calcium channel blockers, thiazide or thiazide-like diuretics. Four outcomes were assessed: COVID-19 diagnosis, hospital admission with COVID-19, hospital admission with pneumonia, and hospital admission with pneumonia, acute respiratory distress syndrome, acute kidney injury, or sepsis.

4.2.2. Case-control studies

In a case-control study, the investigator first identifies cases of the outcome of interest and establishes their exposure status, but the denominators (person-time of observation) needed to calculate their incidence rates are not measured. A referent (traditionally called “control”) group without the outcome of interest is then sampled to estimate the relative distribution of the exposed and unexposed denominators in the source population from which the cases originate. Only the relative size of the incidence rates can therefore be calculated. Advantages of a case-control study include a computational efficiency far superior to that of the cohort design, the possibility to initiate a study based on a set of cases already identified (e.g., in a hospital) and the possibility to study rare outcomes and their association with multiple exposures or risk factors. One of the main difficulties of case-control studies is the appropriate selection of controls independently of exposure or other relevant risk factors, in order to ensure that the distribution of exposure categories among controls is a valid representation of the distribution in the source population. Another disadvantage is the difficulty of studying rare exposures, as a large sample of cases and controls would be needed to identify exposed groups large enough for the planned statistical analysis.
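To make the estimation step concrete, the sketch below (invented counts, not from any cited study) shows how the exposure odds ratio is obtained from a case-control 2×2 table, with a 95% confidence interval from the standard (Woolf) log-odds approximation.

# Odds ratio from a case-control 2x2 table (invented counts).
from math import exp, log, sqrt

a = 120   # exposed cases
b = 80    # unexposed cases
c = 300   # exposed controls
d = 500   # unexposed controls

odds_ratio = (a * d) / (b * c)

# Woolf's method: SE(log OR) = sqrt(1/a + 1/b + 1/c + 1/d)
se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = exp(log(odds_ratio) - 1.96 * se_log_or)
ci_high = exp(log(odds_ratio) + 1.96 * se_log_or)

print(f"Odds ratio: {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")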

In order to increase the efficiency of exposure assessment in case-control studies, an alternative approach is a design in which the source population is a cohort. The nested case-control design includes all cases occurring in the cohort and a pre-specified number of controls randomly chosen from the population at risk each time a case (or other relevant event) occurs. A case-cohort study includes all cases and a randomly selected sub-cohort from the population at risk. Advantages of such designs are that they allow the conduct of a set of case-control studies from a single cohort and make efficient use of electronic healthcare record databases where data on exposures and outcomes are already available.
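The control-sampling step of a nested case-control study can be illustrated as follows (a minimal Python sketch with a simulated cohort and hypothetical variable names): at each case's event time, a fixed number of controls is drawn from the subjects still at risk at that time (incidence density, or risk-set, sampling).

# Risk-set (incidence density) sampling for a nested case-control study.
# The cohort data are simulated; field names are hypothetical.
import random

random.seed(1)

# Each subject: (id, entry_time, exit_time, event) with event=True if the
# subject became a case at exit_time, False if censored.
cohort = [(i,
           0.0,
           round(random.uniform(0.5, 10.0), 2),
           random.random() < 0.05)
          for i in range(5_000)]

cases = [s for s in cohort if s[3]]
m = 4  # number of controls sampled per case

sampled_sets = []
for case in cases:
    case_time = case[2]
    # Risk set: subjects under follow-up at the case's event time
    # (entered before, not yet exited), excluding the case itself.
    risk_set = [s for s in cohort
                if s[1] <= case_time < s[2] and s[0] != case[0]]
    controls = random.sample(risk_set, min(m, len(risk_set)))
    sampled_sets.append((case, controls))

print(f"{len(cases)} cases, each matched to up to {m} controls "
      f"sampled from their risk set")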

The study Impact of vaccination on household transmission of SARS-COV-2 in England (N Engl J Med. 2021;385(8):759-60) is a nested case-control study where the cohort was defined by the occurrence of a laboratory-confirmed COVID-19 case in a household between 4 January 2021 and 28 February 2021. A ‘case’ was defined as a secondary case occurring in the same household as a COVID-19 case and a ‘control’ was identified as a person without infection. Exposure was defined by the presence of a vaccinated COVID-19 case vs. an unvaccinated COVID-19 case in the same household, with the restriction that the vaccinated COVID-19 case had to be vaccinated at least 21 days prior to being diagnosed. The statistical analysis calculated the odds ratios and 95% confidence intervals for household members becoming ‘cases’ if the COVID-19 case was vaccinated 21 days or more before testing positive, vs. household members where the COVID-19 case was not vaccinated.

4.2.3. Case-only designs

4.2.3.1. General considerations

Although case-only designs are not considered traditional study designs, they are increasingly used and have been the topic of a large amount of methodological research. Case-only designs are designs in which cases are the only subjects; they reduce confounding by using the exposure and outcome history of each case as its own control, thereby eliminating confounding by characteristics that are constant over time, such as sex, socio-economic factors, genetic factors or chronic diseases. They are also best suited to studying transient exposures in relation to acute outcomes. Control yourself: ISPE-endorsed guidance in the application of self-controlled study designs in pharmacoepidemiology (Pharmacoepidemiol Drug Saf. 2021;30(6):671–84) proposes a common terminology to facilitate critical thinking in the design, analysis and review of studies, called by the authors ‘Self-controlled Crossover Observational PharmacoEpidemiologic (SCOPE)’ studies. These are split into outcome-anchored designs (case-crossover, case-time-control and case-case-time-control) and exposure-anchored designs (self-controlled case series and self-controlled risk interval), which are suitable for slightly different research questions.

A simple form of a self-controlled design is the sequence symmetry analysis (initially described as prescription sequence symmetry analysis), introduced as a screening tool in Evidence of depression provoked by cardiovascular medication: a prescription sequence symmetry analysis (Epidemiology 1996;7(5):478-84). Hypothesis-free screening of large administrative databases for unsuspected drug-outcome associations (Eur J Epidemiol 2018;33(6):545-55) demonstrates how the sequence symmetry analysis can screen across a very wide range of exposures and outcomes.
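As a minimal sketch of the computation underlying a sequence symmetry analysis (hypothetical dispensing dates; no adjustment for prescribing trends, which a real analysis would require), the crude sequence ratio is the number of patients who initiate the index drug before the marker drug divided by the number with the reverse order, among patients initiating both within a defined interval.

# Crude sequence ratio for a sequence symmetry analysis.
# Dates are hypothetical; a real analysis would also adjust for
# prescribing trends (null-effect sequence ratio).
from datetime import date

# (patient_id, first index-drug dispensing, first marker-drug dispensing)
first_dispensings = [
    (1, date(2020, 1, 10), date(2020, 3, 2)),
    (2, date(2020, 5, 20), date(2020, 2, 14)),
    (3, date(2021, 2, 1), date(2021, 6, 30)),
    (4, date(2021, 8, 15), date(2021, 7, 1)),
    (5, date(2020, 9, 9), date(2020, 11, 23)),
]

max_gap_days = 365  # only count pairs initiated within one year of each other

index_first = marker_first = 0
for _, index_date, marker_date in first_dispensings:
    gap = abs((marker_date - index_date).days)
    if gap == 0 or gap > max_gap_days:
        continue
    if index_date < marker_date:
        index_first += 1
    else:
        marker_first += 1

crude_sequence_ratio = index_first / marker_first
print(f"Crude sequence ratio: {crude_sequence_ratio:.2f} "
      f"({index_first} index-first vs {marker_first} marker-first)")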

4.2.3.2. Case-crossover design

The case-crossover (CCO) design compares the risk of exposure in a time period prior to an outcome, with that in an earlier reference time-period, or set of time periods, to examine the effect of transient exposures on acute events (see The Case-Crossover Design: A Method for Studying Transient Effects on the Risk of Acute Events, Am J Epidemiol 1991;133(2):144-53). The case-time-control design is a modification of the case-crossover design which uses exposure history data from a traditional control group to estimate and adjust for the bias from temporal changes in prescribing (The case-time-control design, Epidemiology 1995;6(3):248-53). However, if not well matched, the case-time-control group may reintroduce selection bias (see Confounding and exposure trends in case-crossover and case-time-control designs, Epidemiology 1996;7(3):231-9). Methods have been suggested to overcome the exposure-trend bias while controlling for time-invariant confounders (see Future cases as present controls to adjust for exposure trend bias in case-only studies, Epidemiology 2011;22(4):568-74). Persistent User Bias in Case-Crossover Studies in Pharmacoepidemiology (Am J Epidemiol. 2016;184(10):761-9) demonstrates that case-crossover studies of medicines that may be used indefinitely are biased upward. This bias is alleviated, but not removed completely, by using a control group. Evaluation of the Case-Crossover (CCO) Study Design for Adverse Drug Event Detection (Drug Saf. 2017;40(9):789-98) showed that the CCO design adequately performs in studies of acute outcomes with abrupt onsets and exposures characterised as transient with immediate effects.
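In its simplest form, with one hazard window and one referent window per case, the case-crossover estimate reduces to the matched-pair (conditional) odds ratio based on discordant cases only, as sketched below with invented counts.

# Case-crossover with one hazard window and one referent window per case:
# the conditional (matched-pair) odds ratio uses only discordant cases.
# Counts are invented for illustration.
from math import exp, log, sqrt

exposed_hazard_only = 45    # exposed in hazard window, not in referent window
exposed_referent_only = 20  # exposed in referent window, not in hazard window

or_cco = exposed_hazard_only / exposed_referent_only
se_log_or = sqrt(1 / exposed_hazard_only + 1 / exposed_referent_only)
ci_low = exp(log(or_cco) - 1.96 * se_log_or)
ci_high = exp(log(or_cco) + 1.96 * se_log_or)

print(f"Case-crossover OR: {or_cco:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")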

The self-controlled case-series design (SCCS) and the self-controlled risk interval (SCRI) method were initially developed more specifically for vaccine studies and include only cases with an exposure history, with the observation period for each case and each exposure divided into risk window(s) (e.g., number of days immediately following each exposure) and a control window (observed time outside this risk window). 

4.2.3.3. Self-controlled case series

A good overview of the self-controlled case series (SCCS) is provided in Tutorial in biostatistics: the self-controlled case series method (Stat Med. 2006;25(10):1768-97), Self-controlled case series methods: an alternative to standard epidemiological study designs (BMJ. 2016; 354) and Investigating the assumptions of the self-controlled case series method (Stat Med. 2018;37(4):643-58). 

SCCS studies estimate a relative incidence, that is, incidence rates within the risk window(s) after exposure relative to incidence rates within the control window(s). The SCCS design inherently controls for time-invariant and between-individual confounding, but potential confounders that vary over time within the same person (e.g., confounding by indication) still need to be controlled for.

Three assumptions of the SCCS are that 1) events arise independently within individuals (e.g., fractures do not affect the occurrence of a subsequent fracture), 2) events do not influence subsequent follow-up, and 3) the event itself does not affect the chance of being exposed. However, SCCS studies can be adapted to circumvent these assumptions in specific situations. The third assumption is generally the most limiting one, but where the event only temporarily affects the chance of exposure, additional ‘pre-exposure’ windows can be included; otherwise Case series analysis for censored, perturbed, or curtailed post-event exposures (Biostatistics 2009;10(1):3-16) describes an extended SCCS method that can address permanent changes to the chance of exposure post-event where exposure windows are short, and is suitable where the event of interest is death.

Tutorial in biostatistics: the self-controlled case series method (Stat Med. 2006;25(10):1768-97) details how to fit SCCS models using standard statistical packages. The book Self-Controlled Case Series Studies: A Modelling Guide with R (P. Farrington, H. Whitaker, Y. G. Weldeselassie, 1st Edition, Chapman and Hall/CRC, 2021) provides a more detailed account. Examples from the tutorial and book are available from http://sccs-studies.info/.
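As a language-agnostic complement to these R resources, the sketch below (Python, simulated data, assuming a simplified setting with one event per case, a single risk window and no age adjustment) maximises the corresponding SCCS conditional likelihood numerically to recover the relative incidence.

# Simplified SCCS: one event per case, a single risk window per case,
# no age or season adjustment. Data are simulated for illustration.
# Conditional on an event occurring during the observation period,
# P(event in risk window) = rho*r_i / (rho*r_i + c_i), where rho is the
# relative incidence and r_i, c_i are the risk and control window lengths.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(7)
n_cases = 500
risk_len = rng.integers(20, 40, n_cases).astype(float)       # days at risk
control_len = rng.integers(200, 400, n_cases).astype(float)  # control days
true_rho = 2.5

p_risk = true_rho * risk_len / (true_rho * risk_len + control_len)
event_in_risk = rng.random(n_cases) < p_risk

def neg_log_lik(log_rho):
    rho = np.exp(log_rho)
    p = rho * risk_len / (rho * risk_len + control_len)
    return -np.sum(np.where(event_in_risk, np.log(p), np.log(1 - p)))

fit = minimize_scalar(neg_log_lik)          # one-dimensional maximum likelihood
rho_hat = float(np.exp(fit.x))
print(f"Estimated relative incidence: {rho_hat:.2f} (true value {true_rho})")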

An illustrative example of an SCCS study is Opioids and the Risk of Fracture: a Self-Controlled Case Series Study in the Clinical Practice Research Datalink (Am J Epidemiol. 2021;190(7):1324-31) where the relative incidence of fracture was estimated by comparing time windows when cases were exposed following an opioid prescription and unexposed to opioids. Multiple contiguous risk windows were included to capture changes in risk from new use through to long-term use. A washout window was included after prescriptions stopped, and a pre-exposure window was included to address potential bias from event-dependent exposure. Age, season and exposure to fracture risk–increasing drugs were adjusted for. SCCS assumptions were checked using sensitivity analyses, including taking first fractures only to address independence of events, and excluding individuals who died to address events influencing follow-up.

Use of the self-controlled case-series method in vaccine safety studies: review and recommendations for best practice (Epidemiol Infect. 2011;139(12):1805-17) assesses how the SCCS method has been used across 40 vaccine studies, highlights good practices, and provides guidance on how the method should be used and reported. Using several analytical approaches is recommended, as it can reinforce conclusions or shed light on possible sources of bias when these differ for different study designs. When should case-only designs be used for safety monitoring of medical products? (Pharmacoepidemiol Drug Saf 2012;21(Suppl. 1):50-61) compares the SCCS and case-crossover methods as to their use, strengths, and major differences (directionality). It concludes that case-only analyses of intermittent users complement the cohort analyses of prolonged users because their different biases compensate for one another. It also provides recommendations on when case-only designs should, and should not, be used for drug safety monitoring. Empirical performance of the self-controlled case series design: lessons for developing a risk identification and analysis system (Drug Saf. 2013;36(Suppl. 1):S83-S93) evaluates the performance of the SCCS design using 399 drug-health outcome pairs in 5 observational databases and 6 simulated datasets to assess four outcomes and five design choices. The Use of active Comparators in self-controlled Designs (Am J Epidemiol. 2021;190(10):2181-7) showed that presence of confounding by indication can be mitigated by using an active comparator, using an empirical example of a study of the association between penicillin and venous thromboembolism (VTE), with roxithromycin, a macrolide antibiotic, as the comparator, and upper respiratory infection, a transient risk factor for VTE, representing time-dependent confounding by indication.

4.2.3.4. Self-controlled risk interval design

The self-controlled risk interval (SCRI) design is a restricted SCCS design suitable when exposure risk windows are short. Rather than using all the follow-up time available, short control windows before and/or after risk windows are selected; gaps between risk and control windows may be included, e.g., to allow for a washout period. Power may be reduced as compared with the SCCS, but will often suffice for use with large databases where events are not very rare. Since each individual’s observation period is short, age and time effects often do not require control. In Use of FDA's Sentinel System to Quantify Seizure Risk Immediately Following New Ranolazine Exposure (Drug Saf. 2019;42(7):897-906), new users were restricted to patients with 32 days of continuous exposure to ranolazine (i.e., capturing individuals that typically would have a 30-day dispensing). The observation period began the day after the start of the incident ranolazine dispensing and ended on the 32nd day after the index date, with two risk windows covering days 1-10 and 11-20, and the control window days 21-32. The relative incidence is calculated, using cases only, as the ratio of the number of events in the risk interval to the number of events in the control interval, multiplied by the ratio of the length of the control interval to the length of the risk interval.
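The formula quoted above can be written down directly; the sketch below uses invented counts (not the ranolazine study's data) and adds an approximate 95% confidence interval by treating the number of events in the risk interval, conditional on the total number of events, as binomial and applying a Wilson interval.

# SCRI relative incidence: (events in risk interval / events in control
# interval) x (length of control interval / length of risk interval),
# using cases only. Counts below are invented for illustration.
from math import sqrt

events_risk, events_control = 30, 12
len_risk, len_control = 20.0, 12.0   # days, e.g. risk days 1-20, control 21-32

relative_incidence = (events_risk / events_control) * (len_control / len_risk)

# 95% CI: conditional on the total number of events, the number in the risk
# interval is binomial with p = rho*len_risk / (rho*len_risk + len_control).
# Wilson interval for p, then back-transform to rho.
n = events_risk + events_control
p_hat = events_risk / n
z = 1.96
centre = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)

def p_to_rho(p):
    return (p / (1 - p)) * (len_control / len_risk)

ci_low, ci_high = p_to_rho(centre - half), p_to_rho(centre + half)
print(f"Relative incidence: {relative_incidence:.2f} "
      f"(95% CI {ci_low:.2f}-{ci_high:.2f})")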

According to the Master Protocol: Assessment of Risk of Safety Outcomes Following COVID-19 Vaccination (bestinitiative.org) (2021), the standard SCCS design is more adaptable and is thus preferred when risk or control windows may be less well-defined, when there is a need to increase statistical power, or when unmeasured time-varying confounding is a lesser concern. The SCCS design can also be more easily used to assess multiple occurrences of independent events within an individual. The SCRI design is preferred when it is feasible to have strictly defined risk and control windows for outcomes of interest, or when time varying confounding is a concern. Despite the short observation periods, SCRI may be vulnerable to time-varying confounders; a means of adjustment in SCRI studies, e.g., for steep age effects sometimes seen in studies of childhood vaccine safety, is provided in Quantifying the impact of time-varying baseline risk adjustment in the self-controlled risk interval design (Pharmacoepidemiol Drug Saf. 2015;24(12):1304-12).

4.2.4. Cross-sectional studies

Cross-sectional studies are studies that seek to collect information on a study population at a specified time point without considering the relative timing of putative outcomes and exposures. Cross-Sectional Studies: Strengths, Weaknesses, and Recommendations (Chest 2020;158(1S):S65-S71) provides recommendations for the conduct of such studies, as well as use cases.

The data collected at the time point may include both exposure and outcome data. In studies looking at the association between drug use and a clinical outcome, use of prevalent drug users (i.e., patients already treated for some time before study follow-up begins) can introduce two types of bias. Firstly, prevalent drug users are “survivors” of the early period of treatment, which can introduce substantial (selection) bias if the risk varies with time. Secondly, covariates relevant for drug use at the time of entry (e.g., disease severity) may be affected by previous drug utilisation, or patients may differ regarding health-related behaviours (healthy user effect). No firm inference on a causal relationship can therefore be made from the results.

The study The incidence of cerebral venous thrombosis: a cross-sectional study (Stroke 2012;43(12):3375-7) was used to provide an estimate of the background incidence of cerebral sinus venous thrombosis (CSVT) in the context of the safety assessment of COVID-19 vaccines. Patients were identified from all 19 hospitals from two Dutch provinces using specific code lists. Review of medical records and case ascertainment were conducted to include only confirmed cases. Incidence was calculated using population figures from census data as the denominator.

4.2.5. Ecological studies and case-population studies

In ecological studies, populations are the unit of analysis, for example, comparing measures of a drug’s utilisation across countries and correlating it with these countries’ aggregate incidence rate of an outcome. Fundamentals of the ecological design are described in Ecologic studies in epidemiology: concepts, principles, and methods (Annu Rev Public Health 1995;16:61-81) and a ‘tool box’ is presented in Study design VI - Ecological studies (Evid Based Dent. 2006;7(4):108).

As illustrated in Control without separate controls: evaluation of vaccine safety using case-only methods (Vaccine 2004;22(15-16):2064-70), ecological analyses assume that a strong correlation between the trend in an indicator of an exposure (vaccine coverage in this example) and the trend in incidence of a disease (trends calculated over time or across geographical regions) is consistent with a causal relationship. Such comparisons at the population level may only generate hypotheses as they do not allow controlling for time-related confounding variables, such as age and seasonal factors. Moreover, they do not establish whether the outcome primarily occurred in the exposed individuals.

Case-population studies are a form of ecological studies where cases are compared to an aggregated comparator consisting of population data. The case-population study design: an analysis of its application in pharmacovigilance (Drug Saf. 2011;34(10):861-8) explains this design and its application in pharmacovigilance for signal generation and drug surveillance. The design is also explained in Chapter 2: Study designs in drug utilization research of the textbook Drug Utilization Research - Methods and Applications (M Elseviers, B Wettermark, AB Almarsdóttir, et al. Editors. Wiley Blackwell, 2016). An example is a multinational case-population study aiming to estimate population rates of a suspected adverse event using national sales data in Transplantation for Acute Liver Failure in Patients Exposed to NSAIDs or Paracetamol (Drug Saf. 2013;36(2):135–44). Based on the same study, Choice of the denominator in case population studies: event rates for registration for liver transplantation after exposure to NSAIDs in the SALT study in France (Pharmacoepidemiol Drug Saf. 2013;22(2):160-7) compared sales data and healthcare insurance data as denominators to estimate population exposure and found large differences in the event rates. Choosing the wrong denominator in case-population studies might generate erroneous results. The choice of the right denominator depends not only on a valid data source but also on the hazard function of the adverse event.

The case-population approach has also been adapted for vaccine safety surveillance, in particular for prospective investigation of urgent vaccine safety concerns or for the prospective generation of vaccine safety signals (see Vaccine Case-Population: A New Method for Vaccine Safety Surveillance, Drug Saf. 2016 Dec;39(12):1197-1209).

A pragmatic approach towards case-population studies is recommended: in situations where nation-wide or region-wide electronic health records (EHRs) are available and allow assessing the outcomes and confounders with sufficient validity, a case-population approach is neither necessary nor desirable, as one can perform a population-based cohort or case-control study with adequate control for confounding. In situations where outcomes are difficult to ascertain in EHRs, or where such databases do not exist, the case-population design might give an approximation of the absolute and relative risk when both events and exposures are rare. This is limited by the ecological nature of the reference data that restricts the ability to control for confounding.

Other forms of ecological studies include interrupted time-series analyses (see Chapter 4.3.3) and the case-coverage (ecological) design mainly used for vaccine monitoring (see Chapter 16.2).

4.2.6. Target trial emulation

4.2.6.1. General principles

Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available (Am J Epidemiol. 2016;183(8):758-64) introduced target trial emulation in pharmacoepidemiology as a conceptual framework helping researchers to identify and avoid potential biases in observational studies. Target trial emulation is a strategy that uses existing tools and methods to formalise the design and analysis of such studies. It stimulates investigators to identify potential sources of concerns and develop a design that best addresses these concerns and minimises the risk of bias. The first step of the strategy is to design a hypothetical ideal randomised trial (“target trial”) that would answer the research question. The target trial is described with regards to all design elements: the eligibility criteria, the treatment strategies, the assignment procedure, the follow-up, the outcome, the causal contrasts, and the analysis plan. In the second step, the researcher specifies how best to emulate the design elements of the target trial using the available observational data and considering analytic approaches given the trade-offs in an observational setting.

The target trial paradigm has been shown to prevent some common biases, such as immortal time bias or prevalent user bias, while also identifying situations where adequate emulation may not be possible using the data at hand (see Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses, J Clin Epidemiol. 2016;79:70-5). Target Trial Emulation: A Framework for Causal Inference From Observational Data (JAMA. 2022;328(24):2446-7) stresses, however, that the lack of randomisation and blinding still requires close attention to the prevention and/or control of selection bias, information bias and confounding, as described in Chapter 6. Successful emulation of a target trial also requires proper definition of time zero, i.e., the start of follow-up in the observational data. Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available (Am J Epidemiol. 2016;183(8):758-64) describes two unbiased choices of time zero when eligibility criteria can be met at multiple times.

The need to explicitly describe the design elements that emulate the clinical trial provides transparency on the study design, the assumptions needed to emulate the trial, and the definition of causal effects, which also increases replicability of the study. The design of both the target trial and its emulation should be compared in a table, following the example of Emulating a Target Trial of Interventions Initiated During Pregnancy with Healthcare Databases: The Example of COVID-19 Vaccination (Epidemiology 2023;34(2):238-46).

Statistical aspects of target trials are discussed in Chapters 3.6 (The target trial) and 22 (Target trial emulation) of the Causal Inference Book (Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC).

4.2.6.2. Extensions of the approach

Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses (J Clin Epidemiol. 2016;79:70-5) gives recommendations on how to deal with more complex scenarios in target trial emulation. The problem of multiple eligible time zeros for a patient can be resolved either by random selection of one of them or by using them all, emulating a sequence of nested trials with increasing time zero. Inverse probability weighting is proposed to estimate the per-protocol effect of sustained treatment, accounting for potential selection bias due to informative censoring.
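A simplified sketch of the 'sequence of nested trials' option is given below (Python/pandas, simulated data, hypothetical column names, no outcome model or weighting): each calendar month defines an emulated trial in which persons who are eligible and have not yet initiated treatment are assigned to the 'initiate' or 'do not initiate' arm according to what they do in that month, with time zero set to that month. In a real analysis the same person contributes to several trials, which must be accounted for, e.g., with robust variance estimation.

# Sequence of nested trials: one emulated trial per calendar month.
# Simulated data with hypothetical column names; no outcome model is fitted.
import pandas as pd
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
people = pd.DataFrame({
    "person_id": range(n),
    # month of first treatment initiation (NaN = never initiates)
    "init_month": np.where(rng.random(n) < 0.4,
                           rng.integers(0, 12, n), np.nan),
    # month in which eligibility criteria are first met
    "eligible_from": rng.integers(0, 6, n),
    # end of follow-up (month)
    "end_month": rng.integers(12, 24, n),
})

trials = []
for m in range(12):                                        # one trial per month
    eligible = people[
        (people["eligible_from"] <= m)                     # already eligible
        & (people["end_month"] > m)                        # still followed up
        & (~(people["init_month"] < m))                    # not yet initiated
    ].copy()
    eligible["trial_month"] = m                            # time zero
    eligible["arm"] = np.where(eligible["init_month"] == m,
                               "initiate", "do not initiate")
    trials.append(eligible)

emulated = pd.concat(trials, ignore_index=True)
print(emulated.groupby(["trial_month", "arm"]).size().unstack(fill_value=0).head())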

A three-step method (cloning, censoring, weighting) has been proposed in How to estimate the effect of treatment duration on survival outcomes using observational data (BMJ. 2018;360:k182) to overcome bias in studies on the effect of treatment duration (and cumulative dose), which are often impaired by selection bias, and to achieve better comparability with the treatment assignment performed in clinical trials. A clone-censor-weight approach is also recommended to deal with situations where individuals’ data are consistent with several strategies.

Emulating a target trial in case-control designs: an application to statins and colorectal cancer (Int J Epidemiol. 2020;49(5):1637–46) describes how to emulate a target trial using case-control data and demonstrates that better emulation reduces the discrepancies between observational and randomised trial evidence.

4.2.6.3. Target trial emulation in causal inference studies

A causal inference study is a study designed to investigate, at the individual patient level, the causal effect of an exposure in comparison to non-exposure or to another exposure. In the context of pharmacoepidemiology, the exposure is generally a medical treatment, and the outcome of interest is generally a measure of its safety or effectiveness.

ENCePP recommends that, unless an alternative strategy is justified, target trial emulation should be considered for non-interventional causal inference studies to improve internal validity and increase transparency on definitions and assumptions.

Consideration of the estimand framework (as described in the ICH Addendum on Estimands and Sensitivity Analysis in Clinical Trials to the Guideline on Statistical Principles for Clinical Trials, 2019) for the design of the hypothetical trial may provide additional coherence and transparency on definitions of exposures, endpoints, intercurrent events (ICEs), strategies to manage ICEs, the approach to missing data and sensitivity analyses to be emulated in the observational study. In particular, the observational study may benefit from the formalised identification of the ICEs in the hypothetical trial.

4.2.6.4. Examples

In the context of the COVID-19 pandemic, several observational studies on vaccine effectiveness used target trial emulation. The observational study BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting (N Engl J Med. 2021;384(15):1412-23) emulated a target trial of the effect of the BNT162b2 vaccine on COVID-19 outcomes by matching vaccine recipients and controls on a daily basis on a wide range of potential confounding factors. The large population size of four large healthcare organisations allowed nearly perfect matching, resulting in a consistent pattern of similarity between the groups in the days just before day 12 after the first dose, the anticipated onset of the vaccine effect. A similar target trial emulation design was used in Comparative Effectiveness of BNT162b2 and mRNA-1273 Vaccines in U.S. Veterans (N Engl J Med. 2022;386(2):105-15).
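A highly simplified sketch of such daily matching is shown below (simulated data, hypothetical column names, greedy 1:1 exact matching on two covariates only): on each calendar day, persons vaccinated that day are matched to persons not yet vaccinated, who are then followed from the same day. The cited studies matched on many more characteristics and censored comparators at their own later vaccination, which is beyond this sketch.

# Simplified daily exact matching of vaccinated persons to not-yet-vaccinated
# comparators, as in rolling-cohort vaccine effectiveness emulations.
# Simulated data with hypothetical column names; greedy 1:1 matching only.
import pandas as pd
import numpy as np

rng = np.random.default_rng(11)
n = 2_000
pop = pd.DataFrame({
    "person_id": range(n),
    "age_group": rng.choice(["16-39", "40-59", "60+"], n),
    "sex": rng.choice(["F", "M"], n),
    # day of vaccination (NaN = unvaccinated during the study period)
    "vax_day": np.where(rng.random(n) < 0.5, rng.integers(0, 60, n), np.nan),
})

matched_pairs, used = [], set()
for day in range(60):
    todays_vaccinees = pop[pop["vax_day"] == day]
    # comparators: not yet vaccinated by this day and not already matched
    candidates = pop[(~(pop["vax_day"] <= day))
                     & (~pop["person_id"].isin(used))]
    for _, v in todays_vaccinees.iterrows():
        pool = candidates[(candidates["age_group"] == v["age_group"])
                          & (candidates["sex"] == v["sex"])
                          & (~candidates["person_id"].isin(used))]
        if pool.empty:
            continue
        c = pool.iloc[0]
        used.update([v["person_id"], c["person_id"]])
        matched_pairs.append((day, v["person_id"], c["person_id"]))

print(f"Matched {len(matched_pairs)} vaccinee-comparator pairs")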

In the field of pregnancy epidemiology, Emulating a Target Trial of Interventions Initiated During Pregnancy with Healthcare Databases: The Example of COVID-19 Vaccination (Epidemiology 2023;34(2):238-46) describes a step-by-step specification of the protocol components of a target trial and their emulation, including sensitivity analyses using negative controls to evaluate the presence of confounding and, as an alternative to a cohort design, a case-crossover or case-time-control design to eliminate confounding by unmeasured time-fixed factors.

In oncology, The value of explicitly emulating a target trial when using real world evidence: an application to colorectal cancer screening (Eur J Epidemiol. 2017;32(6):495-500) compared an observational analysis that explicitly emulated a target trial of screening colonoscopy with simpler observational analyses that do not synchronise treatment assignment and eligibility determination at time zero and/or do not allow for repeated eligibility. This comparison suggests that the lack of an explicit emulation of the target trial leads to biased estimates and shows that allowing for repeated eligibility increases the statistical efficiency of the estimates.

4.2.6.5. Target trial emulation vs. replication of an existing RCT

It is important to distinguish between target trial emulation, i.e., the emulation of a hypothetical ideal RCT, and the replication of existing RCTs, which is sometimes also called emulation. The aim of target trial emulation is to use a framework to conduct a study that avoids common biases and to transparently describe its underlying assumptions and limitations. Replication studies of existing RCTs, however, try to come as close as possible to the results of the existing, non-ideal RCT, to prove the validity of the data and the study design.

Emulation of Randomized Clinical Trials With Nonrandomized Database Analyses: Results of 32 Clinical Trials (JAMA 2023;329(16):1376-85) concludes that real-world evidence studies can reach similar conclusions as RCTs when design and measurements can be closely emulated, but this may be difficult to achieve. Concordance in results varied depending on the agreement metric. Emulation differences, chance, and residual confounding can contribute to divergence in results and are difficult to disentangle. Several studies have compared the results of randomised clinical trials and of observational target trial emulations designed to ask similar questions. Comparing Effect Estimates in Randomized Trials and Observational Studies From the Same Population: An Application to Percutaneous Coronary Intervention (J Am Heart Assoc. 2021;10(11):e020357) highlighted differences between the two study designs that may affect the results and be generalisable to other types of interventions: the observational study, conducted in the same registry as that used to recruit clinical trial patients, needed to be performed in a period that preceded the clinical trial; eligibility criteria differed as not all the necessary data were available for the observational study and no exclusion was based on informed consent; some outcomes could not be defined similarly and some potential confounding factors could not be measured in the observational study.

Emulation differences versus biases when calibrating RWE findings against RCTs (Clin Pharmacol Ther. 2020;107(4):735-7) provides guidance on how to investigate and interpret differences in the estimates of treatment effect in the two study types. It is also emphasised that observational effectiveness studies should not aim at emulating RCTs but at investigating questions that cannot be answered by RCTs, as in cases where randomisation would be difficult or unethical.

4.2.7. Pragmatic trials and large simple trials

4.2.7.1. Pragmatic trials

Randomised controlled trials (RCTs) are considered the gold standard for demonstrating the efficacy of medicinal products and for obtaining an initial estimate of the risk of adverse outcomes. However, they are not necessarily indicative of the benefits, risks or comparative effectiveness of an intervention when used in clinical practice. The ADAPT-SMART Glossary defines pragmatic clinical trials (PCTs) as ‘trials [that] examine interventions under circumstances that approach real-world practice, with more heterogeneous patient populations, possibly less-standardised treatment protocols, and delivery in routine clinical settings as opposed to a research environment’.

The GetReal Trial Tool: design, assess and discuss clinical drug trials in light of Real World Evidence generation (J Clin Epidemiol. 2022;149:244-253) more broadly defines PCTs as ‘methodologies which incorporate real-world elements into clinical trial design, maintaining randomisation’ and describes the GetReal Trial Tool, designed to assess the impact of design choices on generalisability to routine clinical practice, while taking into account risk of bias, precision, acceptability and operational feasibility.

The book Pragmatic Randomized Clinical Trials Using Primary Data Collection and Electronic Health Records (1st Edition - April 8, 2021, Eds: Cynthia Girman, Mary Ritchey) addresses practical aspects and challenges of the design, implementation, and dissemination of PCTs. The publication Series: Pragmatic trials and real world evidence: Paper 1. Introduction (J Clin Epidemiol. 2017;88:7-13) describes the main characteristics of this design and the complex interplay between design options, feasibility, acceptability, validity, precision, and generalisability of the results, and the review Pragmatic Trials (N Engl J Med. 2016;375(5):454-63) discusses the context in which a pragmatic design is relevant, and its strengths and limitations based on examples. Pragmatic trials revisited: applicability is about individualization (J Clin Epidemiol. 2018;99:164-6) advocates for more patient-oriented pragmatic trials and suggests 1) developing new study designs that focus on a single person, 2) incorporating patients’ perspectives on their care, and 3) integrating clinical research and medical care.

PCTs are focused on evaluating benefits and risks of treatments in patient populations and settings that are more representative of routine clinical practice. To ensure generalisability, PCTs should represent the patients to whom the treatment will be applied; for instance, inclusion criteria may be broader (e.g., allowing co-morbidity, co-medication, a wider age range) and the follow-up may be minimised and allow for treatment switching. Real-World Data and Randomised Controlled Trials: The Salford Lung Study (Adv Ther. 2020;37(3):977-997) and Monitoring safety in a phase III real-world effectiveness trial: use of novel methodology in the Salford Lung Study (Pharmacoepidemiol Drug Saf. 2017;26(3):344-352) describe the model of a phase III PCT where patients were enrolled through primary care practices using minimal exclusion criteria and without extensive diagnostic testing, and where potential safety events were captured through patients’ electronic health records and triggered review by the specialist safety team.

Pragmatic explanatory continuum summary (PRECIS): a tool to help trial designers (CMAJ. 2009;180(10):E45-E57) describes a tool to support pragmatic trial designs and help define and evaluate the degree of pragmatism. The Pragmatic–Explanatory Continuum Indicator Summary (PRECIS) tool has been further refined and now comprises nine domains, each scored on a 5-point Likert scale ranging from very explanatory to very pragmatic, with an exclusive focus on the issue of applicability (The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147). A checklist and additional guidance are provided in Improving the reporting of pragmatic trials: an extension of the CONSORT statement (BMJ. 2008;337:a2390), and Good Clinical Practice Guidance and Pragmatic Clinical Trials: Balancing the Best of Both Worlds (Circulation 2016;133(9):872-80) discusses the application of Good Clinical Practice to pragmatic trials, and the use of additional data sources such as registries and electronic health records for “EHR-facilitated” PCTs.

Based on the evidence that costs and complexity of conducting randomised trials lead to more restrictive eligibility criteria and shorter durations of trials, and therefore reduce the generalisability and reliability of the evidence about the efficacy and safety of interventions, the article The Magic of Randomization versus the Myth of Real-World Evidence (N Engl J Med. 2020;382(7):674-678) proposes measures to remove practical obstacles to the conduct of randomised trials of appropriate size.

The BRACE CORONA study (Effect of Discontinuing vs Continuing Angiotensin-Converting Enzyme Inhibitors and Angiotensin II Receptor Blockers on Days Alive and Out of the Hospital in Patients Admitted With COVID-19: A Randomized Clinical Trial, JAMA. 2021;325(3):254-64) is a registry-based pragmatic trial that included patients hospitalised with COVID-19 who were taking ACEIs or ARBs prior to hospital admission, to determine whether discontinuation vs. continuation of these drugs affects the number of days alive and out of the hospital. Patients with a suspected COVID-19 diagnosis were included in the registry and followed up until diagnosis confirmation and randomised to either discontinue or continue ACEI or ARB therapy for 30 days. There was no specific treatment modification beyond discontinuing or continuing use of ACEIs or ARBs; the study team provided oversight on drug replacement based on current treatment guidelines. Treatment adherence was assessed based on medical prescriptions recorded in electronic health records after discharge.

4.2.7.2. Large simple trials

Large simple trials are pragmatic clinical trials with minimal data collection narrowly focused on clearly defined outcomes important to patients as well as clinicians. Their large sample size provides adequate statistical power to detect even small differences in effects, the clinical relevance of which can subsequently be assessed. Additionally, large simple trials include a follow-up time that mimics routine clinical practice.

Large simple trials are particularly suited when an adverse event is very rare or has a delayed latency (with a large expected attrition rate), when the population exposed to the risk is heterogeneous (e.g., different indications and age groups), when several risks need to be assessed in the same trial or when many confounding factors need to be balanced between treatment groups. In these circumstances, the cost and complexity of a traditional RCT may outweigh its advantages and large simple trials can help keep the volume and complexity of data collection to a minimum.

Outcomes that are simple and objective can also be measured from the routine process of care using epidemiological follow-up methods, for example by using questionnaires or hospital discharge records. Classical examples of published large simple trials are An assessment of the safety of paediatric ibuprofen: a practitioner based randomised clinical trial (JAMA. 1995;279:929-33) and Comparative mortality associated with ziprasidone and olanzapine in real-world use among 18,154 patients with schizophrenia: The Zodiac Observational Study of Cardiac Outcomes (ZODIAC) (Am J Psychiatry 2011;168(2):193-201).

Note that the use of the term ‘simple’ in the expression ‘large simple trials’ refers to data structure and not to data collection. It is used in relation to situations in which a small number of outcomes are measured. The term may therefore not adequately reflect the complexity of the studies undertaken; see, for example, Methods for the Watch the Spot Trial. A Pragmatic Trial of More- versus Less-Intensive Strategies for Active Surveillance of Small Pulmonary Nodules (Ann Am Thorac Soc. 2019;16(12):1567-76).

4.2.7.3. Randomised database studies

Randomised database studies can be considered a special form of a large simple trial where patients included in the trial are enrolled from a healthcare system with electronic records. Eligible patients may be identified and flagged automatically by the software, with the opportunity of allowing comparison of included and non-included patients with respect to demographic characteristics and clinical history. Database screening or record linkage can be used to collect outcomes of interest otherwise assessed through the normal process of care. Patient recruitment, informed consent and proper documentation of patient information are hurdles that still need to be addressed in accordance with the applicable legislation for RCTs.

Randomised database studies attempt to combine the advantages of randomisation and observational database studies. These and other aspects of randomised database studies are discussed in The opportunities and challenges of pragmatic point-of-care randomised trials using routinely collected electronic records: evaluations of two exemplar trials (Health Technol Assess. 2014;18(43):1-146) which illustrates the practical implementation of randomised studies in general practice databases. More recent work has been conducted to extend quality standards in the Consolidated Standards of Reporting Trials (CONSORT) to also include database studies: CONSORT extension for the reporting of randomised controlled trials conducted using cohorts and routinely collected data (CONSORT-ROUTINE): checklist with explanation and elaboration (BMJ. 2021;373:n857). These quality standards for reporting also have implications on trial design and conduct.

Published examples of randomised database studies are still scarce; however, this design is becoming more common with the increasing use of electronic health records. Pragmatic randomised trials using routine electronic health records: putting them to the test (BMJ. 2012;344:e55) describes a project to implement randomised trials in the everyday clinical work of general practitioners, comparing treatments that are already in common use, and using routinely collected electronic healthcare records both to identify participants and to gather results. The above-mentioned Salford Lung Study, and the study described in Design of a pragmatic clinical trial embedded in the Electronic Health Record: The VA's Diuretic Comparison Project (Contemp Clin Trials 2022;116:106754) belong to this category.

A particular form of randomised database studies is the registry-based randomised trial, which uses an existing registry as a source for the identification of cases, their randomisation and their follow-up. The editorial The randomized registry trial - the next disruptive technology in clinical research? (N Engl J Med. 2013;369(17):1579-81) introduces this concept. This hybrid design aims at achieving both internal and external validity by performing an RCT in a data source with higher generalisability (such as registries). Examples are the TASTE trial, which followed patients in the long term using data from a Scandinavian registry (Thrombus aspiration during ST-segment elevation myocardial infarction, N Engl J Med. 2013;369:1587-97), and A registry-based randomized trial comparing radial and femoral approaches in women undergoing percutaneous coronary intervention: the SAFE-PCI for Women (Study of Access Site for Enhancement of PCI for Women) trial (JACC Cardiovasc Interv. 2014;7:857-67).

The importance of large simple trials has been highlighted by their role in evaluating well-established products repurposed for the treatment of COVID-19. The PRINCIPLE Trial platform (for trials in primary care) and the RECOVERY Trial platform (for trials in hospitals) have been recruiting large numbers of study participants and sites within short periods of time. In addition to brief case report forms, important clinical outcomes such as death, intensive care admission and ventilation were ascertained through data linkage to existing data streams. The study Lopinavir-ritonavir in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial (Lancet 2020;396:1345–52) found that in patients admitted to hospital with COVID-19, lopinavir–ritonavir was not associated with reductions in 28-day mortality, duration of hospital stay, or risk of progressing to invasive mechanical ventilation or death. On the other hand, in Dexamethasone in Hospitalized Patients with Covid-19 (N Engl J Med. 2021;384(8):693-704), the RECOVERY trial reported that the use of dexamethasone resulted in lower 28-day mortality in patients who were receiving either invasive mechanical ventilation or oxygen alone at randomisation. Inhaled budesonide for COVID-19 in people at high risk of complications in the community in the UK (PRINCIPLE): a randomised, controlled, open-label, adaptive platform trial (Lancet 2021;398:843-55) reported on the effectiveness of an inhaled corticosteroid in community patients with COVID-19. The streamlined and reusable approaches to data collection in these still-recruiting platform trials were essential to enrolling large numbers of trial participants and evaluating multiple treatments rapidly.

4.3. Specific aspects of study design

4.3.1. Positive and negative control exposures and outcomes

The validity of causal associations may be tested by using control exposures or outcomes. A negative control outcome is a variable known not to be causally affected by the treatment of interest. Likewise, a negative control exposure is a variable known not to causally affect the outcome of interest. Conversely, a positive control outcome is a variable that is understood to be positively associated with the exposure of interest and a positive control exposure is one which is known to increase the risk of the outcome of interest.

Well-selected positive and negative controls help assess whether the data and study design at hand can reproduce known associations and correctly demonstrate the absence of association where none is expected. Positive controls with negative findings and negative controls with positive findings may signal the presence of bias, as illustrated in Utilization of Positive and Negative Controls to Examine Comorbid Associations in Observational Database Studies (Med Care 2017;55(3):244-51). This general principle, with additional examples, is described in Negative Controls: A Tool for Detecting Confounding and Bias in Observational Studies (Epidemiology. 2010;21(3):383-8) and Control Outcomes and Exposures for Improving Internal Validity of Nonrandomized Studies (Health Serv Res. 2015;50(5):1432-51). Negative controls have also been used to identify other sources of bias, including selection bias and measurement bias, in Brief Report: Negative Controls to Detect Selection Bias and Measurement Bias in Epidemiologic Studies (Epidemiology. 2016;27(5):637-41) and in Negative control exposure studies in the presence of measurement error: implications for attempted effect estimate calibration (Int J Epidemiol. 2018;47(2):587-96). The use of negative and positive controls has therefore been recommended as a diagnostic test to evaluate whether the study design produced valid results. Practical considerations for their selection are provided in Chapter 18. Method Validity of The Book of OHDSI (2021).
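
As a minimal illustration of this diagnostic use, the sketch below (with hypothetical estimates, not taken from the cited publications) checks how often the confidence intervals of negative control estimates cover the null; coverage far below the nominal 95% suggests residual systematic error.

```python
import numpy as np

# Hypothetical negative control results: hazard ratio estimates with 95% CIs.
# For true null associations, roughly 95% of CIs should include 1.0;
# much lower coverage suggests systematic error (bias or residual confounding).
negative_controls = [
    # (hazard_ratio, ci_lower, ci_upper)
    (1.10, 0.90, 1.35),
    (1.45, 1.10, 1.92),   # CI excludes 1 -> possible systematic error
    (0.95, 0.75, 1.20),
    (1.30, 1.02, 1.66),   # CI excludes 1 -> possible systematic error
    (1.05, 0.85, 1.30),
]

covers_null = [lo <= 1.0 <= hi for _, lo, hi in negative_controls]
coverage = np.mean(covers_null)
print(f"Coverage of the null by 95% CIs: {coverage:.0%} (expected ~95%)")
```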

Selecting drug-event combinations as reliable controls nevertheless poses important challenges: for negative controls it is difficult to prove the absence of an association, and selecting positive controls is even more problematic because what is needed is not only a known association but also an accurate estimate of its effect size. This has led to attempts to establish libraries of controls that can be used to characterise the performance of different observational datasets in detecting various types of associations with a number of different study designs. Although, according to Evidence of Misclassification of Drug-Event Associations Classified as Gold Standard 'Negative Controls' by the Observational Medical Outcomes Partnership (OMOP) (Drug Saf. 2016;39(5):421-32), the methods used to identify negative and positive controls may be questioned, this approach may make it possible to separate random and systematic errors in epidemiological studies, providing a context for evaluating the uncertainty surrounding effect estimates.

Beyond the detection of bias, positive and negative controls can be used to correct for unmeasured confounding, for example through empirical calibration of p-values or confidence intervals, as described in Interpreting observational studies: Why empirical calibration is needed to correct p-values (Stat Med. 2014;33(2):209-18), Robust empirical calibration of p-values using observational data (Stat Med. 2016;35(22):3883-8) and Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data (Proc Natl Acad Sci USA 2018;115(11):571-7).
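
The following is a simplified sketch of the idea behind empirical p-value calibration, using hypothetical numbers; the published method fits the systematic error distribution by maximum likelihood, taking the sampling error of each negative control estimate into account (as implemented, for example, in the OHDSI EmpiricalCalibration R package).

```python
import numpy as np
from scipy import stats

# Hypothetical log hazard ratios observed for negative controls (true null effects).
# Their spread reflects both sampling error and systematic error.
neg_control_log_hr = np.log([1.10, 1.45, 0.95, 1.30, 1.05, 1.20, 0.90, 1.35])

# Crude characterisation of the systematic error distribution
# (simplification: the sampling error of each negative control estimate is ignored here).
mu, sigma = neg_control_log_hr.mean(), neg_control_log_hr.std(ddof=1)

# Estimate for the outcome of interest: HR = 1.50 with standard error 0.10 on the log scale.
log_hr, se = np.log(1.50), 0.10

# Traditional p-value assumes no systematic error.
p_traditional = 2 * (1 - stats.norm.cdf(abs(log_hr) / se))

# Calibrated p-value asks how extreme the estimate is under the empirical null,
# which combines systematic error and sampling error.
p_calibrated = 2 * (1 - stats.norm.cdf(abs(log_hr - mu) / np.sqrt(sigma**2 + se**2)))

print(f"Traditional p-value: {p_traditional:.4f}")
print(f"Calibrated p-value:  {p_calibrated:.4f}")
```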

The empirical calibration approach has been used in both the case-based study design (Empirical assessment of case-based methods for identification of drugs associated with acute liver injury in the French National Healthcare System database (SNDS), Pharmacoepidemiol Drug Saf. 2021;30(3):320-33) and the cohort design (Risk of depression, suicide and psychosis with hydroxychloroquine treatment for rheumatoid arthritis: a multinational network cohort study, Rheumatology (Oxford) 2021;60:3222-34). While this method may reduce the number of false positive results, it may also reduce the ability to detect a true safety or efficacy signal and is computationally expensive, as suggested in Limitations of empirical calibration of p-values using observational data (Stat Med. 2016;35(22):3869-82) and Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data (Proc Natl Acad Sci USA 2018;115(11):571-7).

An Overview of key negative controls techniques has been published by the Duke-Margolis Center for Health Policy, providing a brief description of key assumptions, strengths and limitations of using negative controls (Duke-Margolis/FDA workshop on Understanding the Use of Negative Controls to Assess the Validity of Non-Interventional Studies of Treatment Using Real-World Evidence, March 8, 2023).

4.3.2. Use of an active comparator

The main purpose of using an active comparator is to reduce confounding by indication and by disease severity. Its use is optimal in the context of the new user design (see Chapter 6.1.1), where patients with the same indication initiating different treatments are compared, as described in The active comparator, new user study design in pharmacoepidemiology: historical foundations and contemporary application (Curr Epidemiol Rep. 2015;2(4):221-8). Active comparators implicitly restrict comparisons to patients with an indication for treatment who are actually receiving treatment. Use of an active comparator therefore reduces not only confounding by indication but also confounding by frailty and healthy user bias.
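
The following sketch outlines, with hypothetical table and column names, how an active-comparator, new-user cohort might be assembled from dispensing records; real implementations depend on the data source, the protocol definitions of washout and eligibility, and the exposures under study.

```python
import pandas as pd

# Hypothetical dispensing records (one row per dispensing of either study drug).
rx = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 4],
    "drug":       ["A", "A", "B", "B", "A", "A"],
    "date": pd.to_datetime([
        "2020-03-01", "2020-05-01", "2020-04-10",
        "2019-12-01", "2020-06-15", "2020-02-20",
    ]),
})

WASHOUT_DAYS = 365  # no use of either drug in the prior year defines a 'new user'

# The first dispensing of either drug defines the index date and the exposure group
# (drug A = treatment of interest, drug B = active comparator).
index_rx = rx.sort_values("date").groupby("patient_id", as_index=False).first()

# Exclude patients with any dispensing of either drug during the washout window
# before their index date (a real study would also require continuous enrolment).
merged = rx.merge(index_rx, on="patient_id", suffixes=("", "_index"))
prior_use = merged[
    (merged["date"] < merged["date_index"])
    & (merged["date"] >= merged["date_index"] - pd.Timedelta(days=WASHOUT_DAYS))
]["patient_id"].unique()

cohort = index_rx[~index_rx["patient_id"].isin(prior_use)]
print(cohort)  # new users of A vs. new users of B, with index dates
```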

Active-comparator design and new-user design in observational studies (Nat Rev Rheumatol. 2015;11:437-41) points out that active comparator studies provide insight into how safe or effective a therapy is compared with a therapeutic alternative, which is usually the more meaningful research question. Ideally, an active comparator should be interchangeable with the therapy of interest and represent the counterfactual risk of a given event under the therapeutic alternative. This means that the active comparator should be indicated for the same disease and disease severity and have the same absolute or relative exclusion criteria. The active comparator represents the background risk in the diseased population and should be known to have no effect on the event(s) of interest or on competing events. If the effect of the active comparator is unknown, multiple comparators, including non-users, should be used.

Identification of an active comparator should be based on clinician input and the relevant guidelines, the acceptability of its use within the chosen data source should be verified, and the balance in patient characteristics should be reviewed, as described in Core concepts in pharmacoepidemiology: Confounding by indication and the role of active comparators (Pharmacoepidemiol Drug Saf. 2022;31(3):261-9). In situations where an acceptable active comparator is lacking, for example because of the unavailability of a therapeutic alternative, extensive channelling or reimbursement restrictions, the validity of the planned study needs to be assessed. Alternative methods to reduce confounding by indication should then be considered, such as the use of inactive comparators, or approaches based on propensity scores or instrumental variable analysis to balance patients’ characteristics.

4.3.3. Interrupted time series analyses and Difference-in-Differences method

Interrupted time series (ITS) studies are becoming the standard approach for evaluating the effectiveness of population-level interventions implemented at a specific point in time with clearly defined before-after periods (such as a policy effective date or a regulatory action date). ITS is a quasi-experimental design that evaluates the longitudinal effects of an intervention through regression modelling. The expected pre-intervention trend for the outcome of interest is estimated and projected forward as the counterfactual scenario in the absence of the intervention; this expected trend serves as the comparator, and the impact of the intervention is evaluated by examining any change in the outcome occurring after the intervention (Interrupted time series regression for the evaluation of public health interventions: a tutorial, Int J Epidemiol. 2017;46:348-55).

ITS analysis requires that several assumptions are met and its implementation is technically sophisticated, as explained in Regression based quasi-experimental approach when randomisation is not an option: Interrupted time series analysis (BMJ. 2015;350:h2750). The use of ITS regression in pharmacovigilance impact research is illustrated in Chapter 16.4.
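
A minimal sketch of such a segmented regression model is shown below, using simulated monthly data and assumed effect sizes for illustration only; in practice, autocorrelation, seasonality and, for count outcomes, overdispersion usually also need to be addressed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

# Hypothetical monthly prescribing rate, with a regulatory action after month 36.
n_months = 72
df = pd.DataFrame({"time": np.arange(1, n_months + 1)})
df["post"] = (df["time"] > 36).astype(int)          # level-change indicator
df["time_since"] = np.maximum(0, df["time"] - 36)   # slope change after the action
df["rate"] = (
    50 + 0.3 * df["time"]                           # pre-intervention trend
    - 8 * df["post"] - 0.5 * df["time_since"]       # assumed drop in level and slope
    + rng.normal(0, 2, n_months)                    # random noise
)

# Segmented regression: rate = b0 + b1*time + b2*post + b3*time_since
model = smf.ols("rate ~ time + post + time_since", data=df).fit()
print(model.summary().tables[1])  # b2 = change in level, b3 = change in slope
```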

When data on both exposed and control populations are available, Difference-in-Differences (DiD) methods are sometimes preferable. These methods compare the mean or trend of the outcome in the exposed and control groups before and after a given time point (usually the start of a treatment or intervention), providing insight into the change in the outcome in the exposed population relative to the change in the control population. This can be a more robust approach to causal inference than ITS because the exposed group is compared with a control group subject to the same time-varying factors: DiD first takes the before-after difference within each group and then subtracts the difference in the control group from that in the exposed group, thereby removing the influence of time-varying factors common to both groups and isolating the impact of the intervention.
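
A minimal worked example of the DiD computation is sketched below, using hypothetical group means; in a regression formulation on individual-level data, the coefficient of the group-by-period interaction term gives the same estimate.

```python
# Hypothetical mean outcome (e.g., prescribing rate per 1,000 patients).
means = {
    ("exposed", "before"): 40.0, ("exposed", "after"): 30.0,
    ("control", "before"): 38.0, ("control", "after"): 35.0,
}

# Difference-in-Differences: change in the exposed group minus change in the control group.
change_exposed = means[("exposed", "after")] - means[("exposed", "before")]   # -10
change_control = means[("control", "after")] - means[("control", "before")]  #  -3
did = change_exposed - change_control                                         #  -7
print(f"DiD estimate of the intervention effect: {did}")

# Equivalent regression formulation on individual-level data:
#   y ~ exposed + post + exposed:post
# where the coefficient of the interaction term exposed:post is the DiD estimate
# (e.g., smf.ols("y ~ exposed * post", data=df).fit() with statsmodels).
```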

A basic introduction can be found in Impact evaluation using Difference-in-Differences (RAUSP Management Journal 2019;54:519-532) and further extensions, for example the assessment of variation in treatment timing, in Difference-in-differences with variation in treatment timing (Journal of Econometrics 2021;225:254-77). A good overview applied to public health policy research is available in Designing Difference in Difference Studies: Best Practices for Public Health Policy Research (Annu Rev Public Health 2018;39:453-469). A recent review from the econometrics perspective discusses possible avenues when some core assumptions are violated and models with relaxed hypotheses are needed, and provides recommendations that can be applied to pharmacoepidemiology (What’s Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature, J Econom. 2023;235(2):2218-44).