ENCePP Guide on Methodological Standards in Pharmacoepidemiology


Chapter 3: Development of the study protocol

The study protocol is the core document of a study and should be drafted as one of the first steps in any research project, once the research question has been clearly defined. The final version must describe precisely everything being done in the study so that the study can be reproduced. The protocol should be amended as needed, and all amendments should be justified.


For post-authorisation safety studies (PASS) described in the Guideline on good pharmacovigilance practices (GVP) Module VIII - Post-authorisation safety studies, the Commission Implementing Regulation (EU) No 520/2012 provides legal definitions of the start of data collection (the date from which information on the first study subject is first recorded in the study dataset, or, in the case of secondary use of data, the date from which data extraction starts) and of the end of data collection (the date from which the analytical dataset is completely available). These dates provide timelines for the commencement of the study and the submission of the final study report to the competent authorities. The Regulation also provides the format of protocols, abstracts and final study reports for imposed PASS. Based on these formats, the European Medicines Agency (EMA) published detailed templates for the protocol and final study report, which it recommends using for all PASS, including meta-analyses and systematic reviews.


The ISPE Guidelines for Good Pharmacoepidemiology Practices (GPP) provide guidance on what is expected from a pharmacoepidemiology study protocol and on the different aspects to be covered. They state that the protocol should include a description of data quality and integrity, including abstraction of original documents, extent of source data verification, and validation of endpoints. The FDA’s Best Practices for Conducting and Reporting Pharmacoepidemiologic Safety Studies Using Electronic Health Care Data Sets describes the elements that should be addressed in the protocols of such studies, including the choice of data sources and study population, the study design and the analyses. The ENCePP Checklist for Study Protocols is intended to prompt researchers to consider important epidemiological aspects when designing a pharmacoepidemiological study and writing a study protocol. The Agency for Healthcare Research and Quality (AHRQ) published Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide, which includes best practice principles and checklists on a wide range of topics that are also applicable to observational studies outside the scope of comparative effectiveness research.


For studies involving human patients, requirements for consent forms and specific ethical guidelines apply (see Chapter 9.2). The protocol should cover at least the following aspects:

  • The research question the study is designed to answer, which might be purely descriptive, exploratory or explanatory (hypothesis driven). The protocol should include a background description that explains the origin (scientific, regulatory, etc.) and current knowledge on the research question. It will also explain the context of the research question, including what data are currently available and how these data can or cannot contribute to answering the question. The context will also be defined in terms of what information sources can be used to generate appropriate data and how the proposed study methodology will be shaped around these. See Chapter 2 for more information.
  • The main study objective and possible secondary objectives, which are operational definitions of the research question. In defining secondary objectives, consideration could be given to time and cost, which may impose constraints and choices, for example in terms of sample size, duration of follow-up or data collection.


  • The source and study population to be used to answer the research question. The protocol should describe whether this population is already identified, and whether data are already available (secondary data collection) or whether it needs to be recruited de novo (primary data collection). The boundaries of the desired population will be defined, including inclusion/exclusion criteria, timelines (such as index dates for inclusion in the study) and any exposure or events defining the population.


  • Exposure of interest that needs to be pre-specified and defined, including duration and intensity of exposure, source of data and methods of ascertainment.


  • Outcomes of interest that need to be pre-specified and defined, including data sources, operational definitions and methods of ascertainment such as data elements in field studies or appropriate codes in database studies.



  • The covariates and potential confounders that need to be pre-specified and defined, including how they will be measured.


  • The statistical plan for the analysis of the resulting data, including statistical methods and software, adjustment strategies, and how the results are going to be presented.


  • The identification of potential biases and the ways in which they will be minimised.


  • Major assumptions, critical uncertainties and challenges in the design, conduct and interpretation of the results of the study given the research question and the data used.


  • Ethical considerations, as described in Chapter 9.
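
As an illustration of what pre-specifying the study population can look like in a database study, the sketch below encodes a study period, an index date rule and inclusion/exclusion criteria as explicit, reviewable rules. It is a minimal sketch and not part of any cited guideline or template: the record fields, the study period, the age threshold and the function names are all hypothetical and would be replaced by the definitions written in the actual protocol.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical protocol parameters (assumptions for illustration only).
STUDY_START = date(2015, 1, 1)
STUDY_END = date(2019, 12, 31)
MIN_AGE_YEARS = 18


@dataclass
class Patient:
    patient_id: str
    birth_date: date
    first_exposure: Optional[date]  # first recorded exposure, if any
    prior_outcome: bool             # outcome recorded before first exposure


def age_on(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def index_date(p: Patient) -> Optional[date]:
    """Index date: first exposure, if it falls within the study period."""
    if p.first_exposure is None:
        return None
    if STUDY_START <= p.first_exposure <= STUDY_END:
        return p.first_exposure
    return None


def is_eligible(p: Patient) -> bool:
    """Apply the pre-specified inclusion and exclusion criteria."""
    idx = index_date(p)
    if idx is None:                                # inclusion: exposed in study period
        return False
    if age_on(p.birth_date, idx) < MIN_AGE_YEARS:  # inclusion: adult at index date
        return False
    if p.prior_outcome:                            # exclusion: prevalent outcome
        return False
    return True


patients = [
    Patient("A", date(1980, 6, 15), date(2016, 3, 1), False),
    Patient("B", date(2000, 1, 1), date(2016, 5, 1), False),   # under 18 at index
    Patient("C", date(1970, 1, 1), date(2014, 12, 31), False), # outside study period
    Patient("D", date(1970, 1, 1), date(2017, 7, 1), True),    # prior outcome
    Patient("E", date(1970, 1, 1), None, False),               # never exposed
]
cohort = [p.patient_id for p in patients if is_eligible(p)]
```

Keeping each criterion as a separate, commented check makes it easier to compare the implemented cohort definition with the protocol text and to document any later amendment.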

Various data collection forms, including the Case Report Form (CRF), lists of disease codes or descriptions of the data elements, may be appended to the protocol to provide an exact representation of how the data will be collected. The study protocol could include a section specifying how the CRF will be piloted, tested and finalised. Amendments to final CRFs should be justified. For field studies, physician or patient forms would be included depending on the data collection methodology. Other forms may be included as needed, such as patient information, consent forms or patient-oriented summaries.


