Most Phase I oncology trials are primarily intended to establish the safety profile of a new treatment. They usually focus on toxicity alone to determine either the maximum tolerated dose (MTD), defined as the highest dose whose probability of toxicity is below a pre-specified target toxicity rate, or a recommended dose (RD) for use in later-phase clinical trials. Methods for oncology dose-escalation trials fall into two broad classes: rule-based designs, including the traditional 3 + 3 design, and model-based designs.
Model-based designs assume a model for the dose-toxicity relationship and make it possible to define the MTD as the dose with a given probability of causing a dose-limiting toxicity (DLT) in a patient. A DLT is a pre-defined toxicity that is serious enough to raise concern about that dose level. Overviews of single-agent trial designs and of the advantages of model-based designs over other available options are available in the literature.
An example of a model-based design is the Bayesian dose toxicity model (BDTM), which is a Bayesian logistic regression model (BLRM) described in the literature. In this model, the statistical focus is on inference for the DLT rates, which is model-based and uses the actual trial data together with other relevant trial-external (historical) data incorporated through prior distributions.
In a standard phase I design, it is assumed that both the probability of toxicity and the probability of efficacy of a new drug increase with dose. This assumption could hold for cytotoxic agents. However, the recent development of molecularly targeted agents (MTAs) challenges this assumption, particularly in immunotherapy trials, where the dose-efficacy relationship may be bell-shaped: at low dose levels there is no efficacy; at some optimal dose levels the maximal activity is reached; and at dose levels that exceed the optimal levels the efficacy starts to diminish and the drug likely causes over-stimulation, with either no efficacy or even life-threatening toxicity.
In this paper, we present a simple method of incorporating toxicity and efficacy data using bivariate modelling to improve dose escalation decisions and to further guide the determination of the RD. In Section 2, the dose-toxicity and dose-efficacy models are presented. In Section 3, the proposed method of dose-escalation, which is practical and utilizes both dose-toxicity and dose-efficacy models as well as the bivariate model combining both models, is presented. In addition, the operating characteristics of the models are presented using several scenarios of dose-response and dose-toxicity shapes. The paper concludes with a discussion in Section 4.
2. Materials and Methods
In this section, we propose a bivariate model that describes both the toxicity and the efficacy response of a new treatment. The two univariate models are described first, and a joint probability is then derived using the global cross-ratio method to allow for association between the dose-toxicity and dose-efficacy relationships.
2.1. Toxicity Model
The toxicity model used in this paper is called the Bayesian Dose Toxicity Model (BDTM). It is a two-parameter logistic regression model, previously introduced in the literature, that assesses the dose-toxicity relationship and is given as:
$$\operatorname{logit}(p(d)) = \log(\alpha_1) + \beta_1 \log\left(\frac{d}{d^*}\right)$$
where p(d) is the probability that a patient has a DLT at dose d, α1 and β1 are the model parameters, and d* is a scaling dose allowing for the interpretation of α1 as the odds of a DLT at d*. The logit function is defined as
$$\operatorname{logit}(p) = \log\left(\frac{p}{1 - p}\right).$$
It is assumed that the probability of toxicity increases monotonically with dose and therefore β1 is assumed to take non-negative values.
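As an illustration, the dose-toxicity curve can be sketched in a few lines of Python. The sketch assumes the usual two-parameter form logit(p(d)) = log(α1) + β1 log(d/d*), which is consistent with α1 being the odds of a DLT at d*. The function name, the dose grid, and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def dlt_probability(dose, alpha1, beta1, ref_dose):
    """P(DLT) at `dose`, assuming logit(p) = log(alpha1) + beta1 * log(dose / ref_dose)."""
    if alpha1 <= 0 or beta1 < 0:
        raise ValueError("alpha1 must be positive and beta1 non-negative")
    logit = math.log(alpha1) + beta1 * math.log(dose / ref_dose)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical values: odds of a DLT at the scaling dose (300) are 0.25.
doses = [75, 150, 300, 600, 1200]
probs = [dlt_probability(d, alpha1=0.25, beta1=1.0, ref_dose=300) for d in doses]
```

With these values the DLT probability at the scaling dose equals 0.25 / 1.25 = 0.2, and the non-negative β1 guarantees that the curve never decreases with dose.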
2.2. Efficacy Model
The efficacy model is a Bayesian Dose Efficacy Model (BDEM), a polynomial logistic regression model describing the relationship between tumor response and dose level. It is given below:
$$\operatorname{logit}(r(d)) = \log(\alpha_2) + \beta_2 \log\left(\frac{d}{d^*}\right) + \gamma \left[\log\left(\frac{d}{d^*}\right)\right]^2$$
where r(d) is the probability that a patient has a response at dose d, α2, β2 and γ are the model parameters, and d* is a scaling dose allowing for the interpretation of α2 as the odds of a response at d*. Here the response is assumed to be a binary outcome, either an efficacy response or a surrogate biomarker endpoint.
This model generalizes the monotonic relationship observed for cytotoxic therapies and can handle several dose-response shapes observed for both cytotoxic and cytostatic therapies. For illustrative purposes, Figure 1 shows some dose-response shapes supported by the model. The quadratic term is included to give the model the flexibility to describe a probability of efficacy that levels off or diminishes after a certain dose level. This is the case when γ < 0. In contrast, γ > 0 corresponds to an unrealistic scenario with a U-shaped curve (i.e. high efficacy at lower and higher doses, and low efficacy at intermediate doses). Therefore, in the remainder of this paper, it is assumed that γ < 0.
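The bell shape produced by a negative quadratic coefficient can be checked numerically. The sketch below assumes the quadratic term acts on log(d/d*), so that α2 remains the odds of a response at d*. The function names and parameter values are hypothetical.

```python
import math


def response_probability(dose, alpha2, beta2, gamma, ref_dose):
    """P(response), assuming logit(r) = log(alpha2) + beta2*x + gamma*x**2 with x = log(dose/ref_dose)."""
    x = math.log(dose / ref_dose)
    logit = math.log(alpha2) + beta2 * x + gamma * x * x
    return 1.0 / (1.0 + math.exp(-logit))


def peak_dose(beta2, gamma, ref_dose):
    """Dose maximising the response probability when gamma < 0 (vertex of the parabola in x)."""
    if gamma >= 0:
        raise ValueError("a peak exists only for gamma < 0")
    return ref_dose * math.exp(-beta2 / (2.0 * gamma))


# Hypothetical values giving a peak above the scaling dose of 300.
peak = peak_dose(beta2=1.0, gamma=-0.5, ref_dose=300)
r_at_peak = response_probability(peak, 0.5, 1.0, -0.5, 300)
r_at_top_dose = response_probability(1200, 0.5, 1.0, -0.5, 300)
```

Because the logit is a downward-opening parabola in log(d/d*) when γ < 0, the response probability rises up to the peak dose and declines beyond it.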
Figure 1. Examples of dose-response models. The black curve corresponds to a monotonic model (γ = 0 and β2 > 0). The red and blue curves correspond to models with a non-positive quadratic parameter.
2.3. Modelling Association Structure
Several methods have been proposed in the literature to describe association between two outcomes. Here, we propose to use the global cross-ratio model to characterize the association structure between toxicity and efficacy.
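The joint probability P11 of efficacy and toxicity for each dose level j is defined as:
$$P_{11}(j) = \begin{cases} \dfrac{a_j - \sqrt{a_j^2 + b_j}}{2(\psi - 1)}, & \psi \neq 1, \\ p_j \, r_j, & \psi = 1, \end{cases}$$
where $a_j = 1 + (p_j + r_j)(\psi - 1)$ and $b_j = -4\psi(\psi - 1)\, p_j r_j$. Here $p_j = p(d_j)$ and $r_j = r(d_j)$ are the marginal probabilities of toxicity and efficacy at dose level j, and
$$\psi = \frac{P_{11} P_{00}}{P_{10} P_{01}}$$
is the odds ratio describing the association between the toxicity and the efficacy.
Once the $P_{11}(j)$ are obtained, the other three joint probabilities can be recovered easily from the margins: $P_{10} = p_j - P_{11}$, $P_{01} = r_j - P_{11}$ and $P_{00} = 1 - p_j - r_j + P_{11}$, where the first index refers to toxicity and the second to efficacy.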
The bivariate logistic model is specified by modelling the marginal distributions and the odds ratio.
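For readers who want to experiment with the association structure, the following sketch computes the four joint cell probabilities from the two margins and an odds ratio. It assumes the standard closed-form solution of the global cross-ratio (Plackett-type) model and a single odds ratio `psi`. The function name is hypothetical, and the first index is taken to refer to toxicity and the second to efficacy.

```python
import math


def joint_probabilities(p_tox, p_eff, psi):
    """Return (P11, P10, P01, P00) for margins p_tox, p_eff and odds ratio psi > 0."""
    if psi <= 0:
        raise ValueError("the odds ratio must be positive")
    if abs(psi - 1.0) < 1e-12:
        # Independence: the joint probability is the product of the margins.
        p11 = p_tox * p_eff
    else:
        a = 1.0 + (p_tox + p_eff) * (psi - 1.0)
        b = -4.0 * psi * (psi - 1.0) * p_tox * p_eff
        p11 = (a - math.sqrt(a * a + b)) / (2.0 * (psi - 1.0))
    p10 = p_tox - p11              # toxicity, no efficacy
    p01 = p_eff - p11              # efficacy, no toxicity
    p00 = 1.0 - p_tox - p_eff + p11
    return p11, p10, p01, p00


# Hypothetical margins and odds ratio, for illustration only.
p11, p10, p01, p00 = joint_probabilities(p_tox=0.3, p_eff=0.5, psi=2.0)
recovered_psi = (p11 * p00) / (p10 * p01)
```

A useful sanity check is that the odds ratio recomputed from the four cells matches the input value and that the cells sum to one.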
2.4. Bayesian Framework
Bayesian methods for dose finding trials are attractive due to the sequential nature of the data collection in dose escalation trials. To perform a Bayesian analysis, we must specify a prior distribution for each parameter as well as the association parameter. In this paper and without loss of generality, non-informative priors are assumed for the bivariate logistic model parameters and for the odds ratios. Readers are encouraged to use different assumptions when prior knowledge is available.
For the efficacy, toxicity and association (odds ratio) models, posterior distributions are obtained by MCMC using the JAGS software through the rjags package in R.
After the priors have been specified, the Bayesian approach updates the information on the model parameters based on the observed data. The posterior distributions of the model parameters are derived and used for decision making. These distributions are summarized by their means and credible intervals. For the toxicity model, we are interested in finding a dose whose probability of DLT lies in the targeted toxicity interval. In general, the following classification of the DLT rate is used:
[0, 16%) corresponds to under-dosing;
[16%, 33%) corresponds to targeted toxicity;
[33%, 100%] corresponds to excessive toxicity.
The following classification is used for the probability of efficacy as well as for the joint probability of efficacy and no toxicity:
[0, 30%) corresponds to unacceptable activity;
[30%, 60%) corresponds to moderate activity;
[60%, 100%] corresponds to strong activity.
The efficacy thresholds above were chosen for illustration purposes and may be tuned to the specific study.
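In practice, the probability that a rate falls in each interval is estimated from the MCMC output. A minimal sketch is given below, assuming the posterior draws of a rate at one dose are available as a plain list and that each lower bound is inclusive. The function name and the draws are hypothetical.

```python
def interval_probabilities(draws, cutoffs):
    """Fraction of posterior draws in [0, c1), [c1, c2) and [c2, 1]."""
    lower, upper = cutoffs
    n = len(draws)
    if n == 0:
        raise ValueError("at least one posterior draw is required")
    n_low = sum(1 for x in draws if x < lower)
    n_high = sum(1 for x in draws if x >= upper)
    n_mid = n - n_low - n_high
    return n_low / n, n_mid / n, n_high / n


# Hypothetical posterior draws of the DLT rate at one dose level.
dlt_draws = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.18, 0.22, 0.12, 0.35]
under, target, excessive = interval_probabilities(dlt_draws, cutoffs=(0.16, 0.33))
```

The same function applies to the efficacy rate or to the joint rate of efficacy and no toxicity by passing `cutoffs=(0.30, 0.60)`.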
3. Simulation Study Set-Up
3.1. Dose-Finding Procedure
For dose-escalation decisions, each new cohort will consist of 3 to 6 patients treated at the specified dose level, although additional patients could be enrolled at any dose level to collect more data. The first cohort will be treated at the starting dose. Patients must have a minimum drug exposure and safety evaluation to be considered evaluable for dose-escalation decisions. The adaptive Bayesian models described in the previous sections incorporate all DLT and efficacy information to identify the dose levels that do not exceed the MTD and that maximize activity. After each cohort of patients is completed at a given dose level, the posterior distributions of the DLT rates, the posterior distributions of the activity rates, and their joint distributions are calculated at all dose levels. When efficacy data are not yet available for a given dose level, the efficacy model and the cross-ratio model can be run using only the efficacy data available from previous dose levels. The outputs from modelling the dose-DLT and dose-efficacy relationships are combined to calculate the joint probability of efficacy and no toxicity, and the Bayesian inference provides a set of dose levels considered safe and active for future administration. The next dose is selected using the following algorithm:
1) Among the possible doses, select the dose or doses that satisfy the Escalation with Over-dose Control (EWOC) criterion (less than 25% chance of excessive toxicity) under the BDTM.
2) Among the doses selected in step 1, choose the one with the highest joint probability of activity and no toxicity, P01, under the cross-ratio model, according to the classification described above.
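The two steps above can be sketched as a small selection function. This is a simplified illustration rather than the authors' implementation: it assumes that, for each candidate dose, the posterior probability of excessive toxicity and a posterior summary of P01 (here its mean) have already been computed. All names and numbers are hypothetical.

```python
def select_next_dose(doses, prob_excessive_tox, eff_no_tox, ewoc_limit=0.25):
    """Pick the EWOC-admissible dose with the highest summary of P01.

    doses              -- candidate dose levels
    prob_excessive_tox -- posterior P(DLT rate in excessive interval), per dose
    eff_no_tox         -- posterior summary of P01 (efficacy, no toxicity), per dose
    Returns None when no dose satisfies the EWOC criterion.
    """
    admissible = [i for i, q in enumerate(prob_excessive_tox) if q < ewoc_limit]
    if not admissible:
        return None
    best = max(admissible, key=lambda i: eff_no_tox[i])
    return doses[best]


# Hypothetical posterior summaries for five dose levels.
doses = [75, 150, 300, 600, 1200]
prob_excessive = [0.01, 0.05, 0.12, 0.30, 0.60]
mean_p01 = [0.10, 0.25, 0.45, 0.55, 0.40]
next_dose = select_next_dose(doses, prob_excessive, mean_p01)
```

In this example 600 has the highest P01 summary but fails the EWOC criterion, so the function returns 300, which reflects the priority given to safety in step 1.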
Dose escalation will continue until identification of the MTD/RD. This will occur when the following conditions are met:
1) At least 6 patients have been treated at this dose;
2) This dose satisfies one of the following conditions:
a) The posterior probability of targeted toxicity at this dose exceeds 50% and is the highest among potential doses (BDTM only).
b) The posterior joint probability of unacceptable activity is less than 10% and the posterior mean of the activity and no toxicity rate is larger than 60% (BDEM + BDTM).
c) A minimum of 21 patients have already been treated.
The thresholds above (10% and 60%), used to reject doses with low activity and to select doses with acceptable activity without toxicity, were chosen for illustration purposes and may be tuned to the specific study.
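One possible reading of the stopping conditions is sketched below: the dose must have at least 6 patients, and at least one of conditions a), b) or c) must hold. The function and argument names are hypothetical, the 21-patient minimum is assumed to refer to the whole trial, and the posterior quantities are assumed to be precomputed.

```python
def mtd_rd_declared(n_at_dose, n_total, prob_target_tox, is_highest_target,
                    prob_unacceptable_activity, mean_eff_no_tox):
    """Return True when the current dose can be declared the MTD/RD."""
    if n_at_dose < 6:
        return False                                   # condition 1
    rule_a = prob_target_tox > 0.50 and is_highest_target
    rule_b = prob_unacceptable_activity < 0.10 and mean_eff_no_tox > 0.60
    rule_c = n_total >= 21
    return rule_a or rule_b or rule_c                  # condition 2


# Hypothetical posterior summaries after several cohorts.
declared = mtd_rd_declared(n_at_dose=6, n_total=15, prob_target_tox=0.40,
                           is_highest_target=True,
                           prob_unacceptable_activity=0.05,
                           mean_eff_no_tox=0.65)
```

Here rule a) fails because the probability of targeted toxicity is below 50%, but rule b) holds, so the dose is declared.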
The example below illustrates the dose-escalation part of a phase I study. For each cohort, the numbers of DLTs and tumor responses were evaluated. At each dose-escalation meeting, the dose-finding procedure described above was used to guide the dose recommendation. Table 1(a) summarizes the data used to guide the dose escalation for the first 3 cohorts, and Table 1(b) gives a hypothetical scenario of the data that could be collected for future cohorts.
Based on the results of the univariate models (BDTM and BDEM) and of the bivariate model shown in Table 2, all the doses presented meet the EWOC criterion, and maximal activity with no toxicity is reached at the 5 and 10 mg/kg dose levels. The RD could be declared in this range, although the BDTM allows escalation to 20 mg/kg.
Table 1. (a) Data used for the dose escalation for the first 3 cohorts; (b) Hypothetical scenario for the future cohorts.
Table 2. Bayesian inference.
3.2. Operating Characteristics
In order to show how the design performs, 5 hypothetical and relevant scenarios are investigated. We consider 5 dose levels ranging from 75 mg to 1200 mg, and we assign different probabilities of toxicity and activity:
Scenario 1: Peak of Activity occurs inside the excessive toxicity set;
Scenario 2: Peak of Activity occurs at edge of acceptable set (close to 33% threshold);
Scenario 3: Peak of Activity occurs inside targeted toxicity set;
Scenario 4: Peak of Activity occurs inside under-dosing set;
Scenario 5: All doses except the starting dose are too toxic.
For each scenario, data for 1000 trials were generated, with randomly chosen cohorts of size 3 to 6. The starting and maximal doses were chosen as 75 mg and 1200 mg respectively. The maximum number of patients per trial was set to 60. The trial was stopped using the same rules described in Section 3.1.
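The data-generating step for one cohort can be sketched as follows. The sketch assumes that the four joint cell probabilities at the current dose are known under the scenario and that the cohort size is drawn uniformly between 3 and 6. The function name and the cell probabilities are hypothetical.

```python
import random


def simulate_cohort(cell_probs, rng, min_size=3, max_size=6):
    """Draw (toxicity, efficacy) outcomes for one cohort of random size.

    cell_probs maps (tox, eff) pairs in {0, 1} x {0, 1} to their joint probability.
    """
    size = rng.randint(min_size, max_size)       # inclusive on both ends
    outcomes = list(cell_probs.keys())
    weights = list(cell_probs.values())
    return rng.choices(outcomes, weights=weights, k=size)


# Hypothetical true joint probabilities at one dose level.
cells = {(1, 1): 0.10, (1, 0): 0.10, (0, 1): 0.50, (0, 0): 0.30}
rng = random.Random(2024)                        # fixed seed for reproducibility
cohort = simulate_cohort(cells, rng)
n_dlt = sum(tox for tox, _ in cohort)
n_resp = sum(eff for _, eff in cohort)
```

Repeating this step cohort by cohort, refitting the models, and applying the selection and stopping rules yields one simulated trial; the operating characteristics are then averaged over the simulated trials.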
The parameter settings and simulation results for each scenario are respectively presented in Tables 3-7.
For each scenario, the proportion of patients allocated to each dose and the percentage of trials selecting each dose as the MTD are reported for the bivariate model (BDTM + BDEM) and for the BDTM alone. The last column in each table provides the average sample size and the percentage of trials stopped without determination of the MTD/RD.
Table 3. Scenario 1: Peak of activity occurs inside the excessive toxicity set.
Table 4. Scenario 2: Peak of activity occurs at edge of acceptable set.
Table 5. Scenario 3: Peak of activity occurs inside targeted toxicity set.
Table 6. Scenario 4: Peak of activity occurs inside under-dosing set.
Table 7. Scenario 5: All doses except the starting dose are too toxic.
The simulated operating characteristics show that the bivariate (BDEM + BDTM) model performs well under the five hypothetical profiles investigated. The results from the bivariate model and the BDTM are consistent, and the bivariate model selects the MTD/RD within the expected interval in a higher proportion of trials. In Scenario 1, the simulations show a 71% chance of identifying the RD within the targeted toxicity and strong activity interval, compared with only 36% when using the BDTM alone. In Scenario 5, where all doses except the starting dose are too toxic, the simulations show that BDTM + BDEM and BDTM alone are consistent in enrolling only a small proportion of patients at toxic doses. This outcome is expected because the algorithm gives priority to safety. The chance of stopping the trial due to high toxicity of all dose levels is higher with BDTM + BDEM than with the BDTM alone. In Scenario 2, the simulations show an 82.1% chance of identifying the RD at 300 mg although the peak of activity is reached at 600 mg. This is partly because the probability of toxicity at 600 mg is close to the 33% level, which is the lower bound of the excessive toxicity interval. However, in this scenario the bivariate model outperforms the BDTM in terms of RD selection and allocation of patients to the doses with higher activity and acceptable toxicity. In Scenario 3, the toxicity rates of all doses are below 33% and are considered within the acceptable toxicity set. Therefore, the bivariate model selected the RD at either 300 mg or 600 mg with a probability of 98.2%.
In Scenario 4, the simulations show that when using the bivariate model, only 6.5% of patients are treated at 600 mg level compared with 36.7% when using BDTM model alone. Although the simulations show similar average sample sizes for the two models, one can declare RD at 300 mg using the bivariate model and therefore reduce the duration of the study. In conclusion, the simulations performed illustrate that the bivariate model has good operating characteristics and outperforms the BDTM alone.
4. Discussion
Several reviews report that toxicity has been the most prevalent endpoint used to define the RD in phase I trials. However, novel cancer therapies may not have dose-dependent toxicities, and additional data and parameters are needed to further guide the RD determination. The approach presented in this paper takes both safety and efficacy data into consideration and can improve the determination of the RD by jointly modelling efficacy and toxicity. The bivariate model has a flexible structure that considers the toxicity-efficacy association and the marginal structure of the toxicity and efficacy models. In addition, the BDEM allows different dose-response shapes to be fitted. The setting is also flexible, since it uses all the data available at any given time point and does not prevent escalation from proceeding when only safety data are available and efficacy data are not critical for the next decision. In practice, it is still possible for the clinical team to override the model decision based on any additional available data and to select a different safe dose (i.e. one at which the target toxicity is not exceeded). The operating characteristics show that the bivariate model performs better than the BDTM alone over a variety of true model settings.
On the other hand, the design methodology described in this paper can be modified to use a different toxicity modelling and generalized to take into consideration trials with combined drugs. A limitation of this design is that it depends on the availability of the tumor response. Close collaboration between clinicians and statisticians is needed to choose appropriate surrogate efficacy endpoints, and to define clinically relevant thresholds to be used for these decisions. Additional work is in progress to assess different forms of association between efficacy and toxicity as an alternative to the global cross-ratio model and their impact on the estimation of the joint probabilities. The use of continuous data for the efficacy model combined with PK data as well as toxicity models is also under investigation.
The authors acknowledge Daniel Lorand for his valuable comments that improved an earlier version of the article.
Mounir Aout is currently an employee of Hoffmann-La Roche Ltd.
This paper contains the authors’ current thinking on the subject. Any views expressed in this paper do not reflect those of the authors’ affiliations or any other entity.