In this paper, we propose a new diagnostic plotting method for the proportional hazards (PH) model  with continuous survival time , which may be right censored, and with possible time-dependent covariates or time-varying re- gression coefficients . The new method has some advantages over the existing methods in the literature, especially when the data do not satisfy any PH model or when the PH model is mis-specified. It provides an alternative diagnostic plot for model checking.
Denote the conditional hazard function and survival function of for given by and , respectively, or simply and . Let , be i.i.d. copies of with the distribution function . The PH model was first defined by , where is the baseline hazard function, is a covariate vector, is a para- meter vector, and are unknown, and does not depend on . The model is referred as the time-independent covariate PH (TIPH) model. This model has been extended in two ways: 1) the covariate is time-dependent, i.e., is a function of time ; 2) the regression coefficient is time-varying, i.e., is a function of time . For the time-dependent covariates PH (TDPH) model, Kalbfleisch and Prentice  distinguish the external ones from the internal ones.
It is often that the external time-dependent covariate can be written as , where is a function, are i.i.d. co- pies from the time-independent random vector and is a function of time t. A simple example of is , where is a function not depending on time . Without loss of generality (WLOG), we can assume . Two simple examples of are (a) and (b) , where is a constant but may depend on subject  . The PH model in case (b) is referred as the linearly-dependent PH model (LDPH model) hereafter. We also consider other forms of for the external case. For the time-varying regression coefficients PH model, in addition to the assumptions made on the TDPH model, one further assumes . A special case of this model is the piecewise PH (PWPH) model with cut-points  .
An important step in the data analysis under the PH model is to check whether the model is indeed appropriate for the data. To this end, it is desirable to have some diagnostic plotting methods for the PH model. In the literature, some diagnostic plotting methods under the semi-parametric set-up are designed to inspect whether the data follow the TIPH model , (i.e., is constant). One well-known diagnostic method for the PH model is the log-minus-log plots (log-log plots).
Several other graphical methods using residuals to check the PH model assumption have been proposed in the literature  . By plotting the estimator of the cumulative hazard functions evaluated at each Cox-Snell residuals against the residuals, one expects a straight line pattern if the data set satisfies the PH model. However, as mentioned by Baltazar-Aban and Pena  this plotting method has serious defect. This method may fail to identify the poor fitting in many circumstances. Martingale residuals and its cumulative sums are in- troduced to examine the functional form of the covariate    . By plotting the Martingale residuals without a covariate, say , against this missing covariate, one can observe the functional form of . And by plotting the score process versus follow-up time, one can examine the violation of the proportional hazards assumption. Scheike and Martinussen  later extend this method to the time-varying model. Deviance residuals  are then introduced. These methods provide more symmetric plots and are useful in identifying the outliers. Moreover, one may use Schoenfeld residuals or the cumulative sums form  . The residual plotting methods also induce several residual tests for the PH models.
We provide a new diagnostic plotting method for the PH model. The main idea is to plot the Kaplan-Meier estimator (KME) of against a proper estimator of the marginal distribution of under the selected model. Thus it is called the marginal distribution (MD) plot. The MD plot can be described as a 5-step procedure: 1) Fit the Cox model you have in mind to obtain the regression coefficients. 2) Choose a reference value for the covariate , say , such that there exist many observations in its neighborhood, say . 3) Estimate the survival function of for using . 4) Use the estimator in 3) to estimate the marginal survival function of . 5) Compare the estimator of the marginal survival function in 4) with the KME of .
The paper is organized as follows. In Section 2, we propose the MD plot and other supplementary diagnostic plots. In Section 3, we present simulation results on the performance of the plot. We also compare the MD plot to the current residual plots. In Section 4, we apply the new diagnostic plot to the long-term breast cancer follow-up data analyzed in Wong et al.  . Section 5 is a concluding remark, where we explain that the MD plot can also be served as a naive hypothesis test for the PH models.
2. The Main Results
The assumption and notations are given in 2.1. The idea of the marginal approach is introduced in 2.2. The method is explained in 2.3 and 2.4.
2.1. Assumptions and Notations
Let be the collection of all PH models specified by
where and are now possible vectors of functions of  . For model checking under the semi-parametric set-up, the null hypothesis is
where is a -dimensional random covariate vector and the baseline hazard function is unknown. Let be the collection of all possible joint dis- tribution functions of . Notice that does not need to belong to . Abusing notation, by , we mean that , where is a diagonal matrix with diagonal elements ’s.
Our method involves the mode, say a vector , of the distribution of the random vector U. That is, and , , where is a norm, e.g., , where .
Proposition 1. If satisfies Model (1), then for each , , where and .
In view of Proposition 1, hereafter, WLOG, we can assume that
AS1. The zero vector 0 is a mode of and it satisfies that .
Otherwise, let be a mode of and define , where . Then by Proposition 1, Model (1) is equivalent to another PH model, where takes place of the role of the baseline hazard function and is also unknown.
Since may not satisfy the PH model in the null hypothesis (see (2)), one can define a new conditional survival function of a new response variable, say , for given Z such that satisfies the PH model in the null hypothesis . Correspondingly, one can define the new marginal survival function of , say . That is,
Notice that and are equivalent if exists. Abusing notation, write . Notice that is a function of the unknown para- meter . Given the distribution function , if is not a function of , then is the almost sure limit of the maximum partial likelihood estimator (MPLE) of under (see Example 1 below), otherwise, it is conceivable that is some limiting point of the estimator of  .
Example 1. Assume that is a uniform distribution in the region , where is the set bounded by the four straight lines , , and , and is the set bounded by , , and . Then the family of distributions does not satisfy the PH model, and for . If one fits the TIPH model , with data from without knowing , then by (3), where , which is the limit of the MPLE based on the random sample from under . and are uniquely determined by and .
Lemma 1. If does not follow the model defined in (see (2)), then (a) , and (b) . Otherwise, (a) , (b) and (c) .
The proof of Lemma 1 is trivial and is skipped.
2.2. The Marginal Distribution Plot
Motivated by Lemma 1, the new plotting method we propose here is to plot an estimator of , say , against (the KME of ) together with the confidence band of , and
where , (see (3)), is a
consistent estimator of under , and is a consistent estimator of under the assumption that , even if the data do not satisfy the pre-assumed PH model in (see Remark 1). In the latter case, it is conceivable that gives a close image of If the graphs of and are close, then it suggests that in (2) is true. We thus call this plot the marginal distribution plot.
Since the main issue of the MD plot is the estimator and it is not trivial to find due to in (3), the main focus of the paper is to introduce how to construct . We shall explain in details how to obtain through various . We also give for the general piecewise continuous or .
For simplicity, we shall first explain our method when or is a univariate covariate and . We introduce the generalization to the case of a covariate vector (or matrix) in Remark 3 and to the time-dependent model for general in 2.3.4.
Suppose that are i.i.d. copies from a random vector , where may be subject to right censoring by censoring variable . The success of the MD plot relies on the proper estimators of , and . Denote them by , and , respectively. Since 0 is a mode of the covariate by assumption AS1, a consistent estimator of is the KME, denoted by , based on the data satisfying , where with (e.g., ), and is a given positive number (e.g., the inter-quartile-range or the standard deviation of ’s). WLOG, let the first observations be all the observations satisfying . If ’s are right censored, then the KME is based on , , where . For ease of explanation, we only consider the case of complete data in this section. The extension to the right censored data is straightforward and we present simulation results with right-censored data in Section 3. For the complete data, the KME of based on the first ob- servations is
will be introduced in details in 2.3 and
Remark 1. Since is a kernel estimator, it is well known that under certain regularity conditions, converges to in probability and
converges to in probability for each given ,
and . If the given PH model holds, then and we expect that the graphs of and are close, as and are consistent estimators of and , respectively. Otherwise it is likely that and two curves and are apart.
Remark 2. One may wonder whether in (4) can be replaced by the existing estimators of the baseline survival function under the PH model, denoted by . For instance, under the TIPH model, several consistent estimators of , say , can be obtained from the standard statistical packages. For example, the Breslow estimator of baseline survival function can be obtained by applying in R. However, if the given TIPH model does not satisfy the data, is inconsistent, whereas given in (4) is still consistent. In fact, given a joint distribution of a random vector , if it does not satisfy the TIPH model, there exists at least one pair of survival function and
vector satisfying , e.g., . It means
that the estimator using will always suggest that the given TIPH model fits the regression data even though the data set may not satisfy the TIPH model, and hence is not a proper choice. Simulation study in 3.1 suggests that under the assumption in Example 1, converges to in probability and using fails to identify the wrong model assumption.
2.3. Estimation of
We shall first illustrate the main idea through three typical cases when is a constant: 1) i.e., the TIPH model, 2) i.e., the PWPH model, 3) i.e., the LDPH model. We also discuss the general case of .
1) Case of , i.e., the TIPH model. Since by (3), define
where is a consistent estimator of under the selected PH model, e.g, the MPLE.
Remark 3. Even though in (4) and (5) is a random variable, it is easy to extend to the case that is a vector. Assume is a - dimensional random vector, is much larger than and is bounded. We can define or . In the simulation study, we know the mode of . In applications, the sample mode is not well defined. We choose a “sample mode” of ’s, say as follows: 1) Select a proper radius r and choose points such that in a neighborhood of with rasius r, , there are more than 20 (or ) observations. Of course, we do not want to be too large. 2) Among these points , choose the one, say , with the largest number of observations in among all . 3) Then set , , (in view of Proposition 1).
2) Case of , i.e., the PWPH model with one cut point. Now and
The covariate in (2) is , where is a 2 ´ 2 diagonal matrix. Then and
where may depend on the -th observation. And corresponds to a special case of the PWPH model with one cut-point. For the PWPH model with more than one cut-point, the derivation of is similar. Typically for the PWPH model with two cut-points,
where , the covariate , and is a 3 ´ 3 diagonal matrix. Then (3) yields
where is an estimator of .
3) Case of , i.e., the LDPH model. In this situation, is not a simple form in terms of . Let be defined as in (4) and let be the discontinuous points of for , we propose to estimate and by
and . The reason is as follows. Notice that
defined in (4) is a step function. It has the same consistency property as
, , , and at the discontinuous points of . Since
Write . At the discontinuous points of ,
leads to (7). is simpler than in implementation and they have the same asymptotic properties. Simulation results in section 3 suggest that is consistent under the selected PH model.
4) The case of other forms of or . It can be shown that the estimator in the previous three cases of are all in the form that it is a step function with discontinuous points , which are as the same as those of , and
where ’s are defined in the case of . It can be shown that if is piece-wise continuous in , then it also leads to a con- sistent estimator of .
2.4. Supplementary Diagnostic Plots
The MD plot needs to know . One possible way to conjecture the form of the function for a time-dependent covariate is to extend the PWPH plotting method in Wong et al.  as follows:
is a positive constant and belongs to the support of . For instance, for a PWPH model defined in (6),
which corresponds to two lines: and , and the cut-points can be determined from the PWPH plot (see Figure 3 in Section 3).
Let be all the distinct exact observations. In view of the ex- pression in (7) under the PH model with , if , then
where and are given in (11) and (12). Thus another diagnostic plotting method is
and check whether it appears as a two-piecewise-linear curve: one is and another one is . If so, it is likely that , where is the intersection of the two line segments. We shall call the plotting method the LDPH plot, as corresponding to the LDPH model. The advantage of the PWPH plot and the LDPH plot is that they provide clues on the cut-points, which are needed in the MD plot, unless the cut-point is given.
Remark 4. If the cut-points in the PWPH or TDPH model vary from observation to observation, then the PWPH plot as in (11) and LDPH plot as in (13) do not work. However the cut-points are also observations in such cases, in addition to ’s (in the case of complete data). Thus we do not need to guess the cut-points, and one can replace in (12) by
where is a predetermined reference cut-point and and are positive constants.
3. Simulation Study
In order to compare the MD plot to the other plots, we present three sets of simulation results in 3.1, 3.2 and 3.3, respectively. The mode is 0 as assumed in AS1. can be uniform or exponential distributions. The covariate can be discrete or continuous. The data do not need to be from a PH model. There is no unit in time due to the simulation.
3.1. Simulation Study under the Assumptions in Example 1
Two random samples of and pseudo random numbers are generated from U(0,1) distribution. For each , generate from with probability 0.5 and from with probability 0.5. These satisfy the assumptions given in Example 1. Let . Then the family of distributions does not satisfy the null hypothesis , but it can be shown that does.
The sample of size is only used for the MD plots in panels (1,3) and (3,3) of Figure 1 with data ’s and data ’s respectively.
3.1.1. The MD Plots Successfully Identify the Violation of the TIPH Model Assumption When Data Are (Yi,Zi)’s
Since does not satisfy the PH model, a proper estimate of is expected to deviate from . It is seen from the two MD plots with data ’s in panels (1,2) and (1,3) of Figure 1 that (a) when , lies most of the time outside or around the edge of the confidence band of ; (b) when , is totally outside the confidence band of . The MD plots suggest that our estimator appears quite different from even when and they suggest that the data , are not from the TIPH Model, as expected.
3.1.2. The MD Plots Correctly Support the TIPH Model for (Yi,Wi)’s
Since satisfies the PH model, a proper estimate of should be close to . It is seen from both of the MD plot with data ’s in panels (3,2) and (3,3) of Figure 1 that the curves of corresponding to and both lie within the confidence bands of . It is a strong evidence that the data ’s are from the TIPH model, as expected.
3.1.3. About Defined in Remark 2
To show that the consistent estimator of is the key in the MD approach, we
Figure 1. Diagnostic plots based on Example 1.
present in panels (1,1) and (3,1) of Figure 1 the plots of the curves of “ph est” ( ) and the curves of , together with its confidence band, with data ’s and ’s, respectively. It is seen that the curves of “ph est” and almost coincide for both data sets even when . These plots suggest that the curve of “ph est” is always close to the curve of the KME , regardless of whether the pre-assumed model fits the data. Thus is not a proper choice for the diagnostic plot.
3.1.4. About Residual Plots
In panels (2,1) and (4,1) of Figure 1, the residual plots using the score process by the cumulative martingale residuals method are plotted with data ’s and ’s, respectively. There is no big difference in pattern between these two plots. Notice that does not satisfy the TIPH model but does. Thus this residual plot can not tell whether the data satisfy the TIPH model in this example.
In panels (2,2) and (4,2) of Figure 1, the deviance residuals are plotted with data ’s and ’s, respectively. In panels (2,3) and (4,3) of Figure 1, the scaled Schoenfeld residuals are plotted with data ’s and ’s, respectively. There is no big difference in pattern between these two pairs of plots. Thus these two types of residual plots can not tell whether the data satisfy the TIPH model in this example.
3.1.5. About Residual Tests
We also carried out simulation study on the testing with data ’s and using the residual test in the existing R package. Our simulation study suggests that for , 100 or 200 and with a replication of 5000, the residual test does not reject the incorrect for more than of the time. Thus, it is not surprised that the residual plots do not work well.
3.2. Simulation Based on Data from the TIPH Model
A sample of complete data with is generated from the TIPH model: , where , and . Panels (1,1), (1,2), (1,3) and (3,1) in Figure 2 presents four different plots by fitting the data set into the TIPH model . Thus the covariate is mis-specified.
The second sample of complete data with size is generated from the model , , where , and , . Panels (2,1), (2,2), (2,3) and (4,1) in Figure 2 presents 4 plots by fitting the data set into the TIPH model . Thus the covariate is correctly specified.
3.2.1. Residuals Plots
Panels (1,1) and (2.1) in Figure 2 display the residual plots using the score process by the cumulative martingale residuals method by fitting data from the first and the second TIPH model respectively into . It is
Figure 2. Diagnostic plots based on data from Model: or .
not easy to distinguish the mis-specified model from the correct one by this pair of residual plots.
Similarly, in Figure 2, Panels (3,1) and (4,1) display the scaled Schoenfeld residual plots for fitting the first and second samples, respectively, into the TIPH model . Moreover, Panels (1,2) and (2,2) present the log-log plots in a similar pattern. Again, those methods do not provide enough information about the overall fitting of the models.
3.2.2. Residuals Tests
In fact, with the data from the first mis-specified model and with a moderate sample sizes , our simulation study with a replication of 5000 suggests that the residual test in the existing R package (e.g., ) would not reject the mis-specified TIPH model for more than of the time. Thus it is not surprised that the residual plots cannot detect the mis-specified TIPH model.
3.2.3. MD Plots
The MD plot in panel (1,3) fits the mis-specified TIPH model with data from the first sample. It successfully identifies that the functional form of the covariates is mis-specified for the first data set, as is almost totally outside the confidence band of . In other words, the MD approach suggests that the first data set does not follow the PH model . On the other hand, the MD plot in panel (2,3) successfully identifies that the functional form of the covariates is correct for the second data set, as is totally inside the confidence band of .
Based on the second sample, the modified PWPH plot and the LDPH plot are displayed in panels (3,2) and (4,2); the MD plots under the PWPH and LDPH Models are displayed in panels (3,3) and (4,3). The PWPH plot in panel (3,2) suggests that the data are either from a TIPH Model, or from a PWPH model with one cut-point at . The MPLE of the regression coefficient under the TIPH Model is and the MPLE of the regression coefficients under the PWPH model with is with . Both and are not significantly different from , as their differences from are within two SEs. Both MD plots in panels (2,3) and (3,3) suggest that the PWPH model with at most one cut-point fits the data, as expected, as both curves of are totally inside the confidence band of .
The LDPH plot in panel (4,2) suggests that the LDPH Model may fit the data, but it is seen from panel (4,3) that even within the interval [0,1], only less than 30% of the curve of lies inside the confidence band of . Thus the MD plot suggests that the data are not from the LDPH Model, as expected. It is seen that the LDPH plot performs not as good as the MD plot.
3.3. Simulation on the LDPH Model
A sample of right-censored data is generated under the LDPH Model, where , , , and . has a Poisson distribution with mean 1, and . It is subject to right censoring and the right censoring variable .
The modified PWPH plots and the LDPH plot are given in panels (1,1), (2,1) and (3,1) in Figure 3. The MD plots and the qqplots for the TIPH, PWPH and LDPH Models are given in the second and the third columns of Figure 3.
Both the PWPH plot and the MD plots with corresponding qqplot in panels (1,1), (1,2) and (1,3) suggest that the data are not from the TIPH Model. We
Figure 3. Diagnostic plots based on data from the LDPH Model.
need the information from the PWPH plot to decide the cut-point needed in the MD plots. The PWPH plot in panel (2,1) suggests that the data may be from a PWPH model with a cut-point satisfying , that is, . However, the MD plot in panel (2,2) suggests that the data are not from the PWPH model. This again indicates that the MD plot performs better than the modified PWPH plot.
The LDPH plot in panel (3,1) in Figure 3 suggests that the data are from the LDPH Model with the cut-point in . The MD plot and the cor- responding qqplot in panels (3,2) and (3,3) suggest that the data are from the LDPH Model, as expected. Thus the LDPH plot has the advantage that it can suggest the cut-point in the case that is not given in the LDPH Model. Since is discrete in this case, the PWPH plot and the LDPH plot perform better than those in the previous simulation studies when is continuous.
4. Data Analysis
A common situation that will involve the use of the PH model is a long-term clinical follow-up study. In such a study, the impact of a prognostic variable may change at different time periods. This is the case in the breast cancer data analyzed in Wong et al.  . The data are obtained from an Institutional Review Board approved long-term clinical follow-up study on 371 women with stages I-III unilateral invasive breast cancer treated by surgery at Memorial Sloan- Kettering Cancer Center in New York City between 1985 and 2001. The median follow-up time of the study is 7.4 years (range is 1 month-180 months (14.8 years)), which is the longest among published studies on bone marrow micro- metastasis (BMM).
One objective of the study is to investigate whether tumor diameter is significant in predicting early or late relapse. Then the relapse time is the response and the tumor diameter is the covariate. Clinical consideration and survival plots suggest late failure can be considered at time greater than 5 years from initial breast cancer surgery. Data analysis based on a PWPH model with two cut-points at 2 years and 5 years with covariate is carried out in Wong et al.  . We apply our diagnostic plotting methods to check whether such a PWPH model fits our data in the case that is continuous and is discretized as a dichotomized variable ( cm or cm), as is done in Wong et al.  . In the latter case, we try the log-log plots as well.
We apply our data to the TIPH model with covariate (which is basically continuous), the regression coefficient is and is significant. We shall only present our new methods for the case that the tumor diameter is continuous in panel (2,1) of Figure 4. The modified log-log plots in panel (1,1) seems to suggest that the TIPH Model fits the data, but the MD plot in panel (2,1) suggests that the TIPH Model does not fit our data, as the curve of lies almost totally outside the confidence band of . Thus the partial like- lihood estimate is not valid.
Wong et al.  suggest to do data analysis by dichotomizing according
Figure 4. Diagnostic plots for cancer data with the tumor diameter measurements.
to or . Then the covariate is discrete, and the standard log-log plots, as well as the PWPH plot proposed by Wong et al.  , is applicable. These two plots are given in panels (1,2) and (1,3) of Figure 4. In view of the graph, the log-log plots suggest that the TIPH Model does not fit the discretized data. Moreover, it does not present useful information on the other possible PH models. However, the PWPH plot in panel (1,3) does suggest that the PWPH model with two cut-points close to 1.5 years and 6 years fits the discretized data. Wong et al.  suggest to do data analysis using a PWPH model with the cut-points at 2 and 5 years (which are close to 1.5 and 6). Thus we further compute the partial likelihood MLE for model with . The estimates , and are −1.76, 2.02 and 0.34, with p-value 0.77, 0.001 and 0.52, respectively. In panels (2,2) and (2,3) of Figure 4, we provide the MD plots based on the discrete and continuous covariates, respectively, for fitting the PWPH model. In particular, since the curve of lies totally inside the confidence band of in panel (2,2), the MD plot suggests that the PWPH model considered in Wong et al.  fits the discretized data. Thus their data analysis is valid. However, the MD plot in panel (2,3) indicates that the original data set with the continuous covariate does not fit a PWPH model, as the curve of lies totally on one edge of the confidence band of .
5. Concluding Remarks
Our simulation results and the data analysis suggest that the MD plot has certain advantages over the existing residual plots, especially when the null hypothesis in (2) is mis-specified or the data are not from any PH model. Our MD plot does not involve residuals studied in the literature, and this is the first difference between the residual approaches and the MD approach.
5.1. Drawback and Remedy of the MD Plot
The MD approach is closely related to in (2) with , where the parameter is unknown but is given. The MD plot is applicable to all PH models with and with all types of covariates , provided that is given. The assumption of a given may be viewed as a drawback of the MD plot if one wants to find a certain PH model to fit the data. There are several ways to overcome this drawback.
1) For the case that , we also propose a modified PWPH plot and LDPH plot in 2.4 for inferring the functional form of .
2) One can apply the MD approach to several possible typical models. For instance, in 3.2 and 3.3, given a data set, the MD approach finds which of the 3 semi-parametric PH models fits.
3) Of course, one can also make use of the existing residual approaches in the literature for guessing . There is no harm to inspect all diagnostic plots available based on the data.
On the other hand, the MD plot can further check the validity of the function forms suggested by the existing residual plots. As illustrated in the paper, with the help of the confidence band of the KME , it is more reliable and more informative than the residual plots on whether the model suggested by the residual plots is appropriate for the data. Thus the MD plot is a nice complement to the existing diagnostic methods, not a replacement to them.
5.2. A Naive but Valid Model Test
As seen from the simulation results, when the data are not from any PH model (see Example 1 in 3.1) or is mis-specified (see simulation on the TIPH Model in 3.2), the MD plots can successfully reject , provided that is large enough . Thus the MD plot can play a role of a model checking test , that is,
On the contrary, in such case, our simulation results suggest that the residual plots often cannot detect that is incorrect and the residual tests often do not reject . There must be some reasons.
As summarized in Therneau and Grambsch  “Interestingly, nearly all of the tests for proportional hazards that have been proposed in the literature are tests” “and differ only in their choice of the time transform g(t)”. The parameter space for the residual approach contains all distributions that follow Model (1) with , where . For instance, in the case of Example 1, let . So a residual test is to test v.s. , instead of testing in (2) v.s. is not true. The parameter space for testing v.s. is a subset of the para- meter space for testing v.s. , where is the collection of all , where is continuous. When the data are not from any PH model or is mis-specified, the residual tests are not valid. Moreover, in that case, it is some- what true that most of the time, and thus when is not rejected, it is likely to make type II error. In real applications, we do not know whether the data are indeed from some PH model, or whether we select a correct sub-model of the PH model.
Remark 5. In summary, if the existing residual tests reject , the decision is likely to be correct. Otherwise, we have no confidence to believe that the test is valid. In this regard, it is interesting to point out that the parameter space for the aforementioned “test” induced by the MD plot is actually . Thus the MD approach is valid even if . It can also detect the incorrect model assumption by testing the new null hypothesis when is not too small, but the residual approach cannot accomplish this task. This is the difference between the MD approach and the residual approach. In application, if is not rejected, the residual tests often make type II error, and we strongly suggest to consider the MD approach then.
Of course, very often, neither case (I) nor (II) happens, then we need some more rigorous model checking tests. The difference of and is a natural choice for a test statistic, and it is one of our on-going research. The other is to extend the MD approach to other common regression models, such as the linear regression model and the generalized additive model.
We thank the Editor and the referee for their comments.
 Baltazar-Aban, I. and Pena, E. (1995) Properties of Hazard-Based Residuals and Implications in Model Diagnostics. Journal of the American Statistical Association, 90, 185-197.
 Scheike, T.H. and Martinussen, T. (2004) On Estimation and Tests of Time-Varying Effects in the Proportional Hazards Model. Scandinavian Journal of Statistics, 31, 51-62.
 Tian, L., Zucker, D. and Wei, L.J. (2005) On the Cox Model with Time-Varying Regression Coefficients. Journal of the American Statistical Association, 100, 172-183.