The POS over the period of January 1, 2005, to October 31, 2015, computed using a 3-year rolling window from January 1 in year |$t-2$| to December 31 in year |$t$|⁠, with the exception of the last window, which terminates on October 31, 2015. However, the methodology used by the authors does not necessarily make that true in this case. SE denotes the standard error. A drug development program is the investigation of a particular drug for a single indication (see top diagram of Figure S2 of the supplementary material available at Biostatistics online). The POS for a given Phase |$i$|⁠, denoted by POS|$_{i,i+1}$|⁠, is defined as the probability that the drug development program advances to the next phase. Since the FDA has a 6-month period to decide if it wishes to follow-up on a filing, and an additional 18 months to deliver a verdict, this places the overall time between Phase 3 and Approval to about 30 months, hence we set |$t_3 = 900$| days. In several cases, our results differ significantly in detail from widely cited statistics. The overall POS (POS|$_{1,\rm APP}$|⁠) ranges from a minimum of 3.4% for oncology to a maximum of 33.4% for vaccines (infectious disease). This is particularly important for estimating a drug candidate's POS|$_{1,{\rm APP}}$|⁠, which is typically estimated by multiplying the empirical POS of Phase 1 (safety), 2 (efficacy for a given indication), and 3 (efficacy for larger populations and against alternatives) trials. In summary, our algorithm allows us to impute missing trial data, and by counting the number of phase transitions, we can estimate the phase and overall POS. This suggests higher risks in oncology projects and may explain their lower approval rate. It may be that trials that attempt to evaluate the effectiveness of biomarkers are more likely to fail, leading to a lower overall POS compared to trials that only use biomarkers in patient stratification. 
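The multiplication of phase-level estimates described above can be illustrated with a minimal sketch. The function name `overall_pos` and the input values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
from math import prod


def overall_pos(phase_pos):
    """Multiply per-phase POS values, e.g. POS_{1,2}, POS_{2,3}, POS_{3,APP}."""
    if not all(0.0 <= p <= 1.0 for p in phase_pos):
        raise ValueError("each phase POS must lie in [0, 1]")
    return prod(phase_pos)


# Illustrative inputs only (not figures from the text).
example = overall_pos([0.5, 0.4, 0.5])
```

Because the overall figure is a product, an error in any single phase estimate carries through proportionally to POS_{1,APP}.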
Gathering such data is expensive, time-consuming, and susceptible to error. Details of our robustness results are provided in Section A15 of the supplementary material available at Biostatistics online. We find that the overall success rate for all drug development programs did decrease between 2005 (11.2%) and 2013 (5.2%), as anecdotal reports suggest. These findings are similar in spirit to the analysis by Thomas and others (2016), which also found substantial improvement in the overall POS when biomarkers were used. The algorithm from Figure S5 in the Supplementary Material is not used, as it would overestimate the phase success if applied to a short window. Let |$n^j$| be the number of drug development paths with observed Phase |$j$| trials, and |$n^j_s$| be the number of drug development paths where we observe phase transitions of state |$s$| of Phase |$j$| (defined below). Conflict of Interest: No conflicts of interest are declared for Chi Heem Wong and Kien Wei Siah. While we used the entire data set from January 1, 2000, to October 31, 2015, it has to be noted that there are only 3548 data points relating to orphan drugs, with the majority (95.3%) of the trials' statuses observed on or after January 1, 2005. In our first experiment, we attempt to replicate Thomas and others (2016) by using only data between 2006 and 2015. Our phase-specific POS estimates are higher in all phases. However, this assumption breaks down when we look at short windows of duration, for example, in a rolling window analysis to estimate the change in the POS over time. We computed the results using the path-by-path method. See Figure S2 (bottom) of the supplementary material available at Biostatistics online for an illustration.
Compared to Thomas and others (2016), we find that our phase POS is higher in Phases 1 and 2, but lower in Phase 3, due to our use of the path-by-path method for calculating the POS. We term this the 'path-by-path' approach. The overall POS (POS|$_{1,\rm APP}$|⁠) increases when considering only lead indications, which is in line with the findings by Hay and others (2014). We estimate aggregate success rates, completion rates (CRs), phase-transition probabilities, and trial durations, as well as more disaggregated measures across various dimensions such as clinical phase, disease, type of organization, and whether biomarkers are used. Materials and methods: We undertook a literature search for randomised clinical trials reporting a 5-year survival benefit attributable solely to cytotoxic chemotherapy in adult malignancies. The computed success rates are comparable to those from our original data set, with deviations of less than 2.1 percentage points despite having approximately 30% fewer data points. In this article, we construct estimates of the POS and other related risk characteristics of clinical trials using 406 038 entries of industry- and non-industry-sponsored trials, corresponding to 185 994 unique trials over 21 143 compounds from Informa Pharma Intelligence's Trialtrove and Pharmaprojects databases from January 1, 2000 to October 31, 2015. The CR at Phase |$i$| refers to the proportion of Phase |$i$| trials that are tagged as completed. As the use of biomarkers to select patients, enhance safety, and serve as surrogate clinical endpoints has become more common, it has been hypothesized that trials using biomarkers are more likely to succeed. The probability of success (POS) of a clinical trial is critical for clinical researchers and biopharma investors to evaluate when making scientific and economic decisions. Trials using biomarkers exhibit almost twice the overall POS (POS|$_{1,\rm APP}$|⁠) compared to trials without biomarkers (10.3% vs. 5.5%). 
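The completion rate defined above is a simple proportion, which the following sketch makes concrete. The `(phase, status)` tuple layout and the status string `"completed"` are assumptions made for illustration; the actual database fields are not described in the text.

```python
def completion_rate(trials, phase):
    """Share of trials in `phase` whose status is tagged as completed.

    `trials` is an iterable of (phase, status) pairs; this layout is hypothetical.
    """
    statuses = [status for p, status in trials if p == phase]
    if not statuses:
        return float("nan")  # no trials observed in this phase
    return sum(status == "completed" for status in statuses) / len(statuses)


# Toy records, not data from the study.
toy_trials = [
    (1, "completed"),
    (1, "terminated"),
    (1, "completed"),
    (2, "completed"),
]
cr_phase_1 = completion_rate(toy_trials, 1)
```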
In what follows we assess clinical development success rates and other proxies of social value for a sample of pediatric Phase 1 trials in oncology to examine how frequently such trials influence clinical development. We find that 13.8% of all drug development programs eventually lead to approval, which is higher than the 10.4% reported by Hay and others (2014) and the 9.6% reported by Thomas and others (2016). In our industry-sponsored analysis, we counted 41 040 development paths or 67 752 phase transitions after the imputation process. These assumptions allow us to more accurately reconstruct 'drug development paths' for individual drug-indication pairs, which in turn yield more accurate POS estimates. Using a sample of 406 038 entries of clinical trial data for over 21 143 compounds from January 1, 2000 to October 31, 2015, we estimate aggregate clinical trial success rates and durations. Also, no funding bodies had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. In this article, we introduce the "path-by-path" approach that traces the proportion of development paths that make it from one phase to the next. This mixed result suggests that synergies between industry and non-industry organizations can be exploited through collaboration. Implicit in the path-by-path computation method is the assumption that we have relatively complete information about the trials involved in drug development programs. In order to derive the most accurate numbers possible for clinical trial success rates by phase and therapeutic area, a group of authors from MIT analyzed a mountain of data on drugs and vaccines from January 1, 2000 to October 31, 2015. Prudent resource allocation relies on the accurate and timely assessment of risk. We elaborate on this in Section A2 of the supplementary material available at Biostatistics online. 
To avoid confusion and facilitate the comparison of our results with those in the extant literature, we begin by defining several key terms. In this article, we attempt to use trial data to trace every drug/indication/sponsor triplet from first trial to last. Furthermore, as 92.3% of the trials using biomarkers in our database are observed only on or after January 1, 2005, we do not include trials before this date to ensure a fair comparison of the POS between trials that do and do not use biomarkers. This POS is computed using the phase-by-phase method, our adaptation of the methodology of Hay and others (2014), which reports the proportion of phase transitions that advance to the next phase. This is done by considering only those drug development programs with phases that ended between |$t_1$| and |$t_2$| in the computation of the POS. In the landmark study of this area, Hay and others (2014) analyzed 7372 development paths of 4451 drugs using 5820 phase transitions. The timing of the upward trend coincides with the time period during which the FDA has been approving more novel drugs, compared to the historical mean (see U.S. Food and Drug Administration, Center for Drug Evaluation and Research, 2016). It is highly risky, expensive, and takes a long time to realize a return, if any. To determine if a drug development program has been terminated in the last observed phase or is still ongoing, we use a simple heuristic: if the time elapsed between the end date of the most recent Phase |$i$| and the end of our sample exceeds a certain threshold |$t_i$|⁠, we conclude that the trial has terminated. This is done by modifying Algorithm 1 (see Figure S5 of supplementary material available at Biostatistics online) to increment counts only if there exists a biomarker trial in that phase. Trial length is a key determinant of the financial risk and reward of drug development projects. Instead of finding a huge increase in the overall POS, we find no significant difference.
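The termination heuristic described above can be sketched as follows. The function name `is_terminated` is hypothetical. The sample end date (October 31, 2015) and the 900-day Phase 3 threshold come from the text; the thresholds for the other phases are not given in this section, so the caller must supply them.

```python
from datetime import date

SAMPLE_END = date(2015, 10, 31)  # end of the sample, as stated in the text
T3_DAYS = 900                    # Phase 3 threshold stated in the text


def is_terminated(last_phase_end, threshold_days, sample_end=SAMPLE_END):
    """Return True if the gap to the end of the sample exceeds the threshold."""
    elapsed_days = (sample_end - last_phase_end).days
    return elapsed_days > threshold_days


# A Phase 3 trial ending on January 1, 2012 is 1399 days before the sample end.
old_program = is_terminated(date(2012, 1, 1), T3_DAYS)
# One ending on January 1, 2015 is only 303 days before the sample end.
recent_program = is_terminated(date(2015, 1, 1), T3_DAYS)
```

Under this rule the first program would be classified as terminated and the second as still ongoing.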
In our database, only 7.1% of all drug development paths that use biomarkers use them in all stages of development. By summing up the individual durations across Phases 1 through 3 and across therapeutic area, we find that the median time spent in the clinic ranged from 5.9 to 7.2 years for non-oncology trials, but the median duration for oncology trials was 13.1 years. Since skipping Phase 2 trials is motivated by compelling Phase 1 data, imputing the successful completion of Phase 2 trials in these cases to trace drug development paths may not be a bad approximation. We further note that if no phase transitions are missing, the path-by-path and phase-by-phase methods should produce the same results, but the former will be more representative of actual approval rates if phase transitions are missing. There exist some cases where Phase 2 trials are skipped, as with the recent example of Aducanumab (BIIB037), Biogen's Alzheimer's candidate, as reported by Root (2014). Apart from the gains in efficiency, our algorithmic approach allows us to perform previously infeasible computations, such as generating time-series estimates of POS and related parameters. This may potentially increase drug development costs and lower the profitability of the drugs in the long run. Looking at the distribution, we find that most disease area Phase I success rates cluster within +/-10% of the overall Phase I success rate. The database encodes each unique quartet of trial identification number, drug, indication, and sponsor as a data point. However, after declining to 1.7% in 2012, this rate has improved to 2.5% and 8.3% in 2014 and 2015, respectively. The overall POS presented in this study, Hay and others (2014), and Thomas and others (2016) are much higher than the 1% to 3% that is colloquially seen as it is conditioned on the drug development program entering Phase 1. 
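The relationship between the two computation methods can be illustrated with a simplified sketch. This is not the authors' Algorithm 1: it assumes every path is already resolved (none still ongoing), encodes stages as the hypothetical integers 1 to 3 for the clinical phases and 4 for approval, and imputes every stage below the highest one observed.

```python
APPROVAL = 4  # hypothetical stage code; 1, 2, 3 are the clinical phases


def impute(observed):
    """Fill in all stages up to the highest observed stage."""
    return frozenset(range(1, max(observed) + 1))


def pos_path_by_path(paths, j):
    """Share of paths reaching stage j (after imputation) that reach j + 1."""
    at_j = [r for r in map(impute, paths) if j in r]
    return sum(j + 1 in r for r in at_j) / len(at_j) if at_j else float("nan")


def pos_phase_by_phase(paths, j):
    """Same proportion, but using only stages that were actually observed."""
    at_j = [set(p) for p in paths if j in p]
    return sum(j + 1 in p for p in at_j) / len(at_j) if at_j else float("nan")


# Toy paths, not data from the study.
complete = [{1, 2, 3, APPROVAL}, {1, 2}, {1}]    # no missing transitions
gappy = [{1, 2, 3, APPROVAL}, {1, 3}, {1}, {2}]  # Phase 2 or Phase 1 missing

same = pos_path_by_path(complete, 1) == pos_phase_by_phase(complete, 1)
path_estimate = pos_path_by_path(gappy, 1)
phase_estimate = pos_phase_by_phase(gappy, 1)
```

With the complete toy paths the two estimates coincide; with the gappy ones the observed-only count misses the program that skipped a recorded Phase 2, so the two estimates diverge.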
Conditioned on one or more trial(s) being completed, the sponsor can choose to either pursue Phase |$i+1$| trials, or simply terminate development. This discrepancy can be attributed to their identification of only non-oncology indications as 'rare diseases' and their use of the phase-by-phase method of computing the POS. The largest increase is seen in POS|$_{2,3}$|⁠, where we obtained a value of 58.3% compared to 32.4% in Hay and others (2014) and 30.7% in Thomas and others (2016). While the use of biomarkers in the stratification of patients improves the POS in all phases, it is most significant in Phases 1 and 2. Table 4 contains POS estimates for drugs that treat rare diseases, also known as 'orphan drugs'. The data set included 406,038 trials (of which 185,994 were unique) and well over 21,000 compounds. The POS by therapeutic group, using data from January 1, 2000, to October 31, 2015. Hence the overall probability of success, that is, moving a drug from Phase 1 to approval, which Hay and others (2014) call the likelihood of approval (LOA), is POS|$_{1,{\rm APP}}$|⁠. We find that the median clinical trial durations are 1.6, 2.9, and 3.8 years, for trials in Phases 1, 2, and 3, respectively. After deleting 46 524 entries with missing dates and unidentified sponsors, and 1818 entries that ended before January 1, 2000, 406 038 data points remain. Before presenting these and other results, we begin by discussing our methodology and describing some features of our data set. This is the largest investigation thus far into clinical trial success rates and related parameters. However, a major caveat is that just because a drug or vaccine is deemed a success by receiving FDA approval does not mean it works particularly well. Our database contains information from both US and non-US sources.
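The data-cleaning step described above (dropping entries with missing dates or unidentified sponsors, and entries that ended before January 1, 2000) might look like the sketch below. The `Entry` record and its field names are hypothetical, and treating a missing start or end date as "missing dates" is an assumption.

```python
from datetime import date
from typing import NamedTuple, Optional

CUTOFF = date(2000, 1, 1)  # start of the sample, as stated in the text


class Entry(NamedTuple):
    """Hypothetical record layout; the real database schema is not given."""
    trial_id: str
    sponsor: Optional[str]
    start: Optional[date]
    end: Optional[date]


def keep(entry):
    """Keep entries with both dates, a known sponsor, and an end on/after CUTOFF."""
    if entry.start is None or entry.end is None:
        return False
    if not entry.sponsor:
        return False
    return entry.end >= CUTOFF


# Toy records, not data from the study.
raw = [
    Entry("T1", "Sponsor A", date(2001, 3, 1), date(2003, 6, 30)),
    Entry("T2", None, date(2004, 1, 1), date(2005, 1, 1)),        # no sponsor
    Entry("T3", "Sponsor B", None, date(2006, 1, 1)),             # missing date
    Entry("T4", "Sponsor C", date(1997, 5, 1), date(1999, 12, 31)),  # too early
]
cleaned = [e for e in raw if keep(e)]
```

If the two deleted groups do not overlap, the counts in the text imply 406 038 + 46 524 + 1818 = 454 380 entries before filtering.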
The result for 2015 has to be treated with caution, as boundary effects increase the success rates artificially. The relative performance of the various therapeutic groups remains the same when considering only lead indications, with oncology remaining the lowest performing group at 11.4% for POS|$_{1,\rm APP}$|⁠. Clinical trials are research studies that involve people. Biostatistics 20(2): April 2019, Pages 273-286. The POS of orphan drug development programs. However, the success rate varies wildly depending on the therapeutic area. The overall success rate is mainly driven by changes in POS|$_{1,2}$| and POS|$_{2,3}$|⁠. As some observers have suggested, companies may have been more careful with licensing compounds and gotten better at identifying potential failures (see Smietana and others, 2016), thus leading to higher productivity. The practice of initiating clinical trials for multiple indications using the same drug is prevalent in the industry, as documented in Table S2 in Section A5 of the supplementary material available at Biostatistics online. We find that the POS from the truncated sample differs from the full sample by less than 2.1 percentage points for all therapeutic groups, while the overall POS is 0.6 percentage points lower than the overall POS of the full sample. We provide a more detailed analysis of the differences between our analysis and Thomas and others (2016) in Section A7 of the supplementary material available at Biostatistics online. However, clinical trials are almost always beneficial for cancer patients. Trends in risks associated with new drug development: success rates for investigational drugs.