10.1. General considerations

10.2. Timing of the statistical analysis plan

10.3. Elements of the statistical analysis plan

There is a considerable body of literature explaining statistical methods for observational studies but very little addressing the statistical analysis plan (SAP). A clear guide to general principles and the need for a SAP is provided in *Design of Observational Studies* (P.R. Rosenbaum, Springer Series in Statistics, 2010. Chapter 18), which also gives useful advice on how to test complex hypotheses in a way that minimises the chances of drawing incorrect conclusions.

A study is generally designed with the objective of addressing a set of research questions. A main component of a study is an initial raw dataset with a set of numerical and categorical observations that do not usually provide a direct answer to the questions that the study is designed to address. The SAP details the mathematical calculations that will be performed on the observed data in the study and the patterns of results that will be interpreted as supporting answers to the questions. An important part of the SAP should explain how issues with the data will be handled in such calculations, for example, missing or incomplete data.

Planning analyses for randomised clinical trials is covered in a number of publications. These often give checklists of the different components of the SAP and much of this applies equally to non-randomised designs. Good references in this respect are the ICH E9 Statistical Principles for Clinical Trials and its addendum on estimands and sensitivity analysis in clinical trials (E9-R1), and the Guide to the statistical analysis plan (Paediatr Anaesth. 2019;29(3):237-242).

The following objectives of a SAP apply to most studies, including observational studies.

- *Transparency* as to how the analysis will proceed by specifying in advance the methodology that will be applied. A SAP should always be completed prior to data analysis. Revisions after the start of the analysis might be possible provided these changes are noted and justified in a revised SAP.

- *Clear communication* to the study team, especially statisticians, involved in the study. It promotes good planning and efficiency for other stakeholders such as reviewers and the target audience of the study. Readers of observational research might dismiss important findings when they were not prespecified.

- *Replication* so that in the future, for similar studies, the same analytical steps can be performed. The SAP should be sufficiently detailed so that it can be followed and reproduced by any competent statistician. Thus, it should provide clear and complete templates for each analysis.

Pre-specification of statistical and epidemiological analyses can be challenging for data that are not collected specifically to answer the study questions. This is often the case in observational studies, where secondary data are used. However, thoughtful specification of the way missing values will be handled or the use of a small part of the data as a pilot set to guide analysis can be useful techniques to overcome such problems. A feature common to most studies is that some analyses that are not pre-specified will be performed in response to observations in the data to help interpretation of results. It is important to distinguish between such data-driven analyses and the pre-specified findings. Post-hoc modifications to the analysis strategy should be noted and explained. The SAP provides a confirmation of this process.
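The pilot-set technique mentioned above can be sketched as follows. This is only a minimal illustration, not a prescribed method: the function name `split_pilot`, the 10% fraction, the seed value and the use of integer record identifiers are all assumptions made for the example. The point is that the split is made reproducibly and before any outcome data are inspected.

```python
import random

def split_pilot(record_ids, pilot_fraction=0.1, seed=20240101):
    """Split record IDs into a pilot set and a main analysis set.

    The fraction and seed are illustrative assumptions; in practice
    both would be fixed in the SAP before any data are inspected.
    """
    ids = sorted(record_ids)      # sort so the split does not depend on input order
    rng = random.Random(seed)     # local generator: reproducible, no global state
    rng.shuffle(ids)
    n_pilot = round(len(ids) * pilot_fraction)
    return ids[:n_pilot], ids[n_pilot:]

# Hypothetical dataset of 1000 records identified by the integers 1..1000.
pilot_ids, main_ids = split_pilot(range(1, 1001))
```

Analyses explored on the pilot records would then be reported as data-driven, while the pre-specified analyses are run on the remaining records only.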

Specific to observational studies, strong emphasis will be given to measures applied to control and to quantify levels of bias. Factors that may bias the results of observational studies are described in Chapter 6.1. Avoiding bias in observational studies: part 8 in a series of articles on evaluation of scientific publications (Dtsch Arztebl Int. 2009;106(41):664-8) explains how these main methodological problems can be avoided by careful planning. Part of the SAP will be devoted to converting scientific understanding of the causal relationship between the exposures and outcomes that are the primary focus of the study and other variables that are available in the dataset into a credible mathematical model. It is also advisable to consider appropriate negative controls in the analysis – (exposure, outcome) pairs that are strongly believed not to be causally related for which a similar model is considered reasonable – as these may indicate bias, or unknown or unmeasured confounding (see Chapter 5.3.4).

The study protocol will have specified the questions to be addressed by the study and will contain a generic description of the study design and the statistical methods. However, the SAP is likely to be the document in which the statistics to be calculated, and tabular and graphical presentations, are fully described. It provides a more detailed elaboration of the statistical modelling of outcomes, plans for measuring the outcomes and predictor variables, whether there will be control variables, confidence intervals, multiplicity issues and how missing data will be handled (for handling of missing data, see Chapter 6.3). Since the decision criteria for the study are specified in terms of the observed values of these detailed statistics, it is worth formulating the SAP at an early stage and, in particular, before any informal inspection of aspects of the data or results that might influence opinions regarding the study hypotheses. Ideally the SAP will be developed as soon as the protocol is finalised.

A particular concern in retrospective studies is that decisions about the analysis should be made blinded to any knowledge of the results. This should be a consideration in the study design, particularly when feasibility studies are to be performed to inform the design phase. Feasibility studies should be independent of the main study results (see Chapter 2).

A SAP should always be completed before the data are unblinded to the statistician. This contributes to the transparency of the study and provides assurance that the set of analyses has not been influenced by the data. Making alterations to a planned statistical analysis after seeing the data increases the risk of bias and inflates the probability of type I errors.

A SAP is usually structured to reflect the protocol but will provide more granularity regarding the statistical methodology. Ideally it includes and addresses the following elements in detail:

- Objectives and testable hypothesis to answer a well framed question.

Defining primary and secondary objectives is important to avoid 'data dredging'. A hypothesis is the product of deductive reasoning, going from general premises to specific results one would expect if those general premises are indeed true. This usually involves a set of possible relationships between a set of variables. It should be clearly stated how each outcome will be measured. Negative findings may be as important as positive findings.

- Formal definitions of any outcomes.

Outcome variables based on historical data may involve complex transformations to approximate clinical variables not explicitly measured in the dataset used. These transformations should be distinguished from those made to improve the fit of a statistical model. In either case the rationale should be given. In the latter case this will include which tests of fit will be used and under what conditions a transformation will be used. In addition to the outcomes, the other variables used in the study need to be formally defined, including their formatting (e.g. categorisation, dichotomisation), modifications or derivations, with special attention to time-dependent variables (e.g. age, BMI).

- Study methods addressing the elements of study design (see Chapter 5) and sample size.

The SAP should make explicit the data source(s) from which the expected variation of relevant quantities and the clinically relevant differences are derived. It should be noted that in observational studies on data that already exist and where no additional data can be collected, sample size is not preclusive and the ethical injunction against 'underpowered' studies has no obvious force provided the results, in particular the 'absence of effect' and 'insufficient evidence', are properly presented and interpreted.

- Interim analyses.

If considered, interim analyses can be beneficial. Criteria, circumstances and possible drawbacks for performing an interim analysis and possible actions (including stopping rules) that can be taken on the basis of such an analysis should be presented.

- Study population.

This section includes a description of the study data sources and linkage methods, inclusion and exclusion criteria, withdrawal/follow-up, baseline patient characteristics and potential confounding variables and analysis population.

- Analytical methods.

This section should describe the effect measures and statistical methods used for each primary and secondary objective; how the achieved patient population will be characterised; handling of confounding and assessment of bias; statistical methods to handle missing data; assessment of goodness of fit; and any sensitivity analyses considered.

- Statistical principles including confidence intervals and level of significance.

When false positives are a greater concern, a higher confidence level (equivalently, a smaller significance level) should be considered. Any planned adjustment of the significance level to control the type I error that can arise from comparisons across multiple subgroups or analysis of multiple predictors or outcomes (secondary analyses) should be presented.
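As an illustration of a pre-specified multiplicity adjustment, the sketch below computes Bonferroni- and Holm-adjusted p-values. These two procedures are chosen here only as common examples; the guide does not prescribe a particular method, and the function names and p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

def holm_adjust(p_values):
    """Holm step-down adjustment, returned in the original order of the tests."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)          # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical p-values from three secondary analyses.
raw_p = [0.01, 0.04, 0.03]
bonferroni_p = bonferroni_adjust(raw_p)
holm_p = holm_adjust(raw_p)
```

Each adjusted p-value is then compared with the overall significance level stated in the SAP. Holm is never more conservative than Bonferroni, which is one reason it is often preferred when the number of comparisons is fixed in advance.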

- Decision criteria.

If decisions are drawn from the study results, a section of the SAP should explain the different outcomes that might be selected for each decision, which statistics influence the decision making process and which values of the statistics will be considered to support each outcome.

Often statistical analyses will employ standard procedures incorporated in statistical packages that provide outputs seen as implicit decision criteria – for instance default significance thresholds for p-values (e.g. 5%) or default confidence levels (e.g. 95%). However, different objectives of the study may require lower or higher strength of evidence – for instance, policy recommendations regarding drug licensing may require a lower chance of false positive decisions than the classical one when deciding whether further investigation is needed for a product safety issue. Hence, consideration of decision-making criteria with explicit reference to the type of decision to be made is beneficial.
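The dependence of a decision on the chosen strength of evidence can be sketched with a small example. It assumes a simple cohort comparison summarised as a risk ratio with a Wald confidence interval on the log scale, and a hypothetical decision rule ("the lower confidence bound exceeds 1"). The counts, function names and the two confidence levels are illustrative assumptions, not recommendations.

```python
import math
from statistics import NormalDist

def risk_ratio_ci(events_exp, n_exp, events_unexp, n_unexp, level):
    """Risk ratio with a Wald confidence interval computed on the log scale."""
    rr = (events_exp / n_exp) / (events_unexp / n_unexp)
    se_log_rr = math.sqrt(1 / events_exp - 1 / n_exp
                          + 1 / events_unexp - 1 / n_unexp)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # two-sided critical value
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

def supports_increased_risk(counts, level):
    """Hypothetical pre-specified rule: lower confidence bound above 1."""
    _, lower, _ = risk_ratio_ci(*counts, level=level)
    return lower > 1.0

# Hypothetical counts: 30/1000 events in the exposed, 15/1000 in the unexposed.
counts = (30, 1000, 15, 1000)
at_95 = supports_increased_risk(counts, level=0.95)   # conventional default
at_99 = supports_increased_risk(counts, level=0.99)   # stricter criterion
```

With these illustrative counts the rule is met at the 95% level but not at the 99% level, so the same data lead to different decisions depending on the criterion. This is why the SAP should state the criterion for each type of decision in advance.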

For further reading on how to draft a SAP tailored to observational studies, see DEBATE-statistical analysis plans for observational studies (BMC Med Res Methodol. 2019;19(1):233) and The value of statistical analysis plans in observational research: defining high-quality research from the start (JAMA. 2012 Aug 22;308(8):773-4).