Back
 OJS  Vol.11 No.5 , October 2021
A Flexible Joint Longitudinal-Survival Model for Analyzing Longitudinally Sampled Biomarkers
Abstract: We propose a flexible joint longitudinal-survival framework to examine the association between longitudinally collected biomarkers and a time-to-event endpoint. More specifically, we use our method for analyzing the survival outcome of end-stage renal disease patients with time-varying serum albumin measurements. Our proposed method is robust to common parametric assumptions in that it avoids explicit specification of the distribution of longitudinal responses and allows for a subject-specific baseline hazard in the survival component. Fully joint estimation is performed to account for uncertainty in the estimated longitudinal biomarkers that are included in the survival model.

1. Introduction

In this paper, we propose a flexible joint longitudinal-survival framework for quantifying the association between longitudinally measured biomarkers (e.g., serum albumin) and time-to-death among end-stage renal disease (ESRD) patients. ESRD is a condition where the filtration performed by the kidneys has been reduced to a point at which life can no longer adequately be sustained. According to data from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), over 850,000 persons in the United States are being treated for ESRD and many more suffer from early stage chronic kidney disease. The standard of care for adult ESRD patients that do not have access to a viable transplant is hemodialysis.

Hemodialysis patients experience high rates of morbidity and mortality. Multiple epidemiologic studies have shown that indices of protein-energy malnutrition (PEM) are a strong predictor of total mortality among hemodialysis patients [1] [2] . Serum albumin, a protein biomarker and surrogate for nutritional status, is among the indices of PEM that have been associated with mortality. Fung et al. found that among hemodialysis subjects, baseline albumin level and the slope of albumin over time were independent risk factors for mortality, suggesting that the low albumin level is not simply a consequence of co-morbidities associated with dialysis but may be a precursor [1] . This analysis did not, however, consider other potential characteristics of the within-subject changes in serum albumin that may also be associated with mortality in this high risk population. It is natural to hypothesize that the rate of change in albumin at a given time and high within-subject variability in serum albumin measured over time may also be indicative of increased mortality. That is, non-linear patterns or high instability around a patient’s first-order trend are likely an indication of nutritional instability (possibly due to inadequate dialyzing) and hence may be a risk factor for morbidity and mortality. While such associations are plausible, they have not been considered in the literature to the best of our knowledge. The reason for this is, at least partly, due to the difficulty in summarizing and efficiently estimating flexible subject-specific longitudinal trajectories in the context of a joint-longitudinal modeling framework. If these hypotheses were established, however, the resulting summary measures would provide nephrologists with additional biomarkers to monitor ESRD patients and potentially decrease the risk of mortality in these patients.

Survival analysis often involves evaluating effects of longitudinally measured biomarkers on mortality. When longitudinal measures are sparsely collected, incomplete, or prone to measurement error, including them directly as a traditional time-varying covariate in a survival model may lead to biased regression estimates [3] . To address this issue, one could apply a two-stage method, where the first stage consists of modeling the longitudinal components via a mixed-effects model, and in the second stage, the modeled values or their summaries (e.g., first-order trends) are included in a survival model [4] [5] . Standard approaches for analyzing longitudinal covariates include frequentist mixed effects models [6] [7] and Bayesian hierarchical models [8] [9] . In general, these approaches parameterize the population mean model as a function of potentially time-varying covariates, subject-specific deviations from the population mean via random effects, and residual variability by subject level variance-covariance matrices that generally account for serial auto-correlation. More recently, several authors have proposed Gaussian process (GP) models as an alternative to more standard longitudinal regression methods since they easily allow for flexible model specification in a coherent probabilistic framework [10] [11] .

Two-stage methods in general fail to account for uncertainty in the estimated longitudinal summary measures. To overcome this issue, several joint longitudinal-survival models have been proposed [3] [12] - [20] . These models account for uncertainty in longitudinal measures by modeling them simultaneously with the survival outcome. However, most existing joint models still rely on multiple restrictive parametric and semi-parametric assumptions and generally focus only on associating the first moment of the distribution of the longitudinal covariate with survival. In this paper, we address these issues by developing a flexible joint longitudinal-survival framework that avoids simple distributional assumptions on longitudinal measures and allows for subject-specific baseline hazard in modeling the survival outcome.

In the current work, we present a joint longitudinal-survival model where flexibility is achieved in the longitudinal component via the use of a Gaussian process prior with a parameter that captures within-subject volatility in time-varying biomarkers. The survival component of our proposed method quantifies the association between the longitudinally measured biomarkers and the risk of mortality using a Dirichlet process mixture of Weibull distributions. The clustering mechanism of the Dirichlet process provides a framework for borrowing information when estimating subject-specific baseline hazards in the survival component. Estimation for the longitudinal and survival parameters is carried out simultaneously via Bayesian parameter posterior sampling approach. In contrast to most competing models, our proposed method provides the following advantages: 1) it avoids relying on restrictive parametric assumptions; 2) it properly accounts for variability in longitudinal measures; 3) it provides a natural framework to capture the association of both first and second moments of the distribution of the longitudinal covariate with survival; and 4) its underlying mechanism for clustering patients can help clinicians to design more personalized treatments.

The remainder of the manuscript is organized as follows: In Section 2 we present the underlying methodology for our proposed joint longitudinal-survival estimation framework. This section also provides examples of model formulations that allow for the estimation of associations between various summary measures of a longitudinal trajectory and the risk of an event. Section 3 provides details on posterior evaluation of the presented model. In Section 4 we consider the empirical performance of the proposed methods using simulation studies. Section 5 considers the application of our model to assess the relationship between longitudinally sampled serum albumin and the risk of death among ESRD patients. Specifically, we consider data from a nutritional sub-study cohort of N = 1112 hemodialysis patients included in the United States Renal Data System (USRDS) [21] . Using our model, we show that real-time albumin, the derivative of the albumin trajectory, and the volatility of albumin are all significantly associated with mortality in ESRD patients. We conclude in Section 6 with a discussion of the proposed methods and potential future directions.

2. Methodology

In this section, we provide the details of our proposed joint modeling framework for a longitudinal covariate, X , and a survival outcome, Y . Throughout this section, we consider n independent subjects where l i longitudinal measurements, X i j , are obtained for subject i at time points t i j , j = 1 , , l i . Also, associated with each subject, there is an observed survival time, Y i min { T i , C i } and event indicator δ i 1 [ Y i = E i ] , where T i and C i denote the true event and censoring time for subject i, respectively. Further, we make the common assumption that C i is independent of T i for all i, i = 1 , , n .

2.1. The Joint Model

Our interest lies in estimating the effect of patient-specific longitudinal measures on a survival outcome. As such, we begin by specifying the joint model likelihood using a similar approach as [16] , where we define the contribution of each subject to the joint model likelihood as the multiplication of the likelihood function of the longitudinal measures for that subject and her/his time-to-event likelihood that is conditioned on her/his longitudinal measures. Let f L ( i ) , f S | L ( i ) , and f L , S ( i ) denote the longitudinal likelihood contribution, the conditional survival likelihood contribution, and the joint likelihood contribution for subject i. One can write the joint longitudinal-survival likelihood function as

f L , S = i = 1 n f L , S ( i ) = i = 1 n ( f L ( i ) × f S | L ( i ) ) . (1)

2.2. Longitudinal Component

We motivate the development of the Gaussian process model for the longitudinal biomarker by first considering the following simple linear model for estimating the trend in the biomarker for a single subject i with an l i × 1 vector of measure biomarkers of X i which is of the form

X i = ( X i ( t i 1 ) X i ( t i 2 ) X i ( t i l i ) ) , (2)

where

X i | β i 0 ( L ) N ( β i 0 ( L ) , Σ i ) . (3)

with β i 0 ( L ) as the subject-specific intercept, β i 0 ( L ) is vector of repeated β i 0 ( L ) value that is of size l i × 1 , and Σ i = σ 2 I l i × l i .

By adding a stochastic component that is indexed by time in the model, one can extend the model to capture non-linear patterns over time. Specifically, we consider a stochastic vector, W , that is a realization from a Gaussian process prior, W ( t ) with mean zero and covariance function C ( t , t ) . Thus for subject i, W i N l i ( 0 , C l i × l i ) , where W i = ( W t i 1 , , W t i l i ) and the ( j , j ) element of C l i × l i is given by C ( t i j , t i j ) , j , j { 1, , l i } . We characterize the covariance function, C l i × l i , using the following squared exponential form

C l i × l i ( j , j ) = κ i 2 e ρ 2 ( t i j t i j ) 2 . (4)

In this setting, the hyperparameter ρ 2 controls the correlation length, and κ 2 controls the height of oscillations [22] , and t i j and t i j are two different time points. For notational simplicity, we define K i = e ρ 2 ( t i j t i j ) 2 ; j , j { 1, , l i } , and re-write our longitudinal model as

X i | β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 N ( β i 0 ( L ) , κ i 2 K i + σ 2 I l i × l i ) , (5)

where σ 2 is assumed to be common across all subjects. The correlation length parameter ρ 2 controls the maximum distance in time between two time-dependent measurements to be still correlated. This distance for GP models is often called the practical range. Diggle and Ribeiro defined the practical range for GP as the distance in time between two time-dependent measurements where the correlation between those two measurements is 0.05 [23] . With the squared exponential covariance function, that practical range distance is of the form 3 / ρ 2 . For ρ 2 = 0.1 , the practical range distance is 5.7 months, a reasonable range for albumin trajectories among patients with end-stage renal disease in the USRDS data motivating the present work. As such, we fix ρ 2 to 0.1 for the remainder of the empirical assessments presented in this paper. With the longitudinal model specified in (5), the subject-specific parameter κ i 2 will have the role of capturing within-subject volatility of the longitudinal measures. In the context of the motivating USRDS example, κ i 2 will be of primary scientific interest as it reflects the within-subject volatility (Figure 1) in serum albumin over time, which we hypothesize to be negatively correlated with longer survival time.

We complete the specification of the longitudinal component of our joint model by extending (5) include a Gaussian process component such that

X i | W i , β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 N ( β i 0 ( L ) + W i , σ 2 I l i × l i ) , (6)

where X i is a vector of longitudinal measures on subject i, W i is a Gaussian process stochastic vector, β i 0 ( L ) is subject specific intercept for subject i, κ i 2 is a subject-specific measure of volatility for subject i, ρ 2 is a fixed correlation length, σ 2 is a measurement error that is shared across all subjects, and finally I l i × l i represents the identity matrix of size l i where l i is the number of longitudinal measures on subject i. The Gaussian process stochastic vector W i is distributed as a Gaussian process,

W i | κ i 2 , t i i n d G P m i ( 0 , κ i 2 K i ) , (7)

where t i is a vector of the time points at which longitudinal measures on subject i were collected and K i = e ρ 2 ( t i j t i j ) 2 , with t i j and t i j are the j t h and j t h element of the time vector t i . We assume a Normal prior on the subject-specific random intercepts β i 0 ( L ) that is of the form

β i 0 ( L ) i . i . d N ( μ β 0 ( L ) , σ β 0 L 2 ) ,

Figure 1. With a fixed correlation length parameter ρ 2 , κ 2 parameter captures volatility in Gaussian process models with the squared exponential covariance function. In each plot, ten random realizations of the Gaussian process were selected (represented by the varying curves). Subfigure (a) has a κ 2 parameter of 0.01, subfigure (b) has a κ 2 value of 0.5, and subfigure (c) has a κ 2 value of 1.0. In all plots, correlation length ρ 2 is fixed to 0.1.

where μ β 0 ( L ) and σ β 0 L 2 are prior mean and prior variance respectively. κ i 2 , where i { 1, , n } with n as the number of subjects in the study, are assumed

to have a log-Normal prior with the prior mean μ κ 2 and the prior variance σ κ 2 that is of the form

κ i 2 i . i . d log-Normal ( μ κ 2 , σ κ 2 2 ) . (8)

As previously noted, the correlation length ρ 2 is assumed to be fixed and known in our model (taken to be 0.1 for the empirical results presented here). Finally, the measurement error σ 2 is assumed to have a log-Normal prior of the form

σ 2 log-Normal ( μ σ 2 , σ σ 2 2 ) , (9)

where μ σ 2 and σ σ 2 are the prior mean and the prior variance, respectively.

2.3. Survival Component

In order to quantify the association between a longitudinal biomarker and a time-to-event outcome, we define our survival component by using a multiplicative hazard model of the form:

λ ( T i | Z i ( s ) , Z i ( L ) ) = λ 0 ( T i ) e ζ ( s ) Z i ( s ) + ζ ( L ) Z i ( L ) ( t ) , (10)

where Z i ( s ) is a vector of baseline covariates, Z i ( L ) is a vector of longitudinal covariates from the longitudinal component of the model, λ 0 ( T i ) denotes the baseline hazard function, and ζ ( s ) and ζ ( L ) are regression coefficients for the baseline survival covariates and the longitudinal covariates, respectively.

We consider a Weibull distribution for the survival component to allow for log-linear changes in the baseline hazard function over time. Thus we assume

T i Weibull ( τ , λ i ) , (11)

where T i is the survival time, τ is the shape parameter of the Weibull distribution, and exp { λ i } is the scale parameter of the Weibull distribution. One can write the density function for the Weibull distribution above for the random variable T i as

f ( T i | τ , λ i ) = τ T i τ 1 e λ i exp ( λ i ) T i τ . (12)

In this case, the Weibull distribution is available in closed form providing greater computational efficiency. Under this parameterization, covariates can be incorporated into the model by defining λ i = ζ ( s ) Z i ( s ) + ζ ( L ) Z i ( L ) . In particular, we specify our model as

T i | τ , ζ ( s ) , ζ ( L ) , Z i ( s ) , Z i ( L ) Weibull ( τ , λ i = β i 0 ( s ) + ζ ( s ) Z i ( s ) + ζ ( L ) Z i ( L ) ) , (13)

where τ is a common shape parameter shared across all subjects. β i 0 ( s ) is a subject specific coefficient in the model which allows for a subject-specific baseline hazard. Z i ( s ) and ζ ( s ) are baseline covariates and their corresponding coefficients, respectively. Finally, Z i ( L ) and ζ ( L ) are coefficients linking the longitudinal parameters of interest to the hazard for mortality.

In order to avoid an explicit distributional assumption for the survival times, we specify our survival model as an infinite mixture of Weibull distributions that is mixed on the β i 0 ( s ) parameter. In particular, we use the Dirichlet process mixture of Weibull distributions that is defined as

β i 0 ( s ) | μ i , σ β 0 ( s ) 2 N ( μ i , σ β 0 ( s ) 2 ) , (14)

μ i | G G , (15)

G D P ( α ( S ) , G 0 ) , (16)

where σ β 0 ( s ) 2 is a fixed parameter, μ i is a subject-specific mean parameter from a distribution G with a DP prior, α ( S ) is the concentration parameter of the DP and G 0 is the base distribution. By placing a Dirichlet process prior on the distribution of β i 0 ( s ) , we allow patients with similar baseline hazards to cluster together which subsequently provides a stronger likelihood to estimate the baseline hazards. For other covariates in the model, we assume a multivariate normal prior of the form

( ζ ( s ) , ζ ( L ) ) M V N ( 0 , σ 0 2 I ) , (17)

where σ 0 2 is a prior variance and I is an identity matrix.

The shared scale parameter τ is considered to have a Log-Normal prior of the form

τ log-Normal ( a τ , b τ ) , (18)

where a τ and b τ are fixed prior location and prior scale parameters, respectively.

Finally, we assume that information about the concentration parameter of the Dirichlet process can be specified with the prior

α ( S ) Γ ( a α ( S ) , b α ( S ) ) , (19)

where a α ( S ) and b α ( S ) are fixed prior shape and prior scale parameters, respectively.

2.4. Linking the Two Components

The proposed modeling framework easily allows for associating multiple summaries of the longitudinal biomarker with the time-to-event outcome. Here we consider three alternative models that incorporate various summary measures of the longitudinal trajectory that are easily and flexibly estimated using the GP model presented in Section 2.2:

Model I: directly modeling longitudinal outcome at each event time t as a covariate in the survival model:

Z i ( L ) = X i ( t ) (20)

Model II: modeling both the value of the longitudinal covariate and also the average rate at which the biomarker changes for each subject. We define this average rate as a weighted area under the derivative curve of the biomarker trajectory

Z i ( L ) = ( X i ( t ) , X A U C τ 0 τ 1 ) with X A U C τ 0 τ 1 = τ 0 τ 1 Q ( u ) X ( u ) d u . (21)

X A U C τ 0 τ 1 is a time-dependent covariate that is a weighted average of the derivative of the biomarker trajectory, that is denoted by X ( u ) from τ 0 to τ 1 where τ 1 is the time of death for each subject. This average area under the derivative curve can be a weighted average with weights Q ( u ) .

Model III: modeling summary measures of the longitudinal trajectory. Motivated by our primary scientific question of interest, we lastly consider random intercepts and subject-specific volatility as summary measures of interest:

Z i ( L ) = ( β 0 i ( L ) , κ i 2 ) (22)

Below, we will explain these three models in more detail.

2.4.1. Model I: Associating Survival at Time t and the Longitudinal Biomarker at Time t

This model quantifies the association between a longitudinal biomarker of interest and the time-to-event outcome by directly adjusting for the biomarker measured values in the survival component. While biomarkers are usually measured at discrete times, the survival event of interest happens on a continuous basis. Frequentist models most commonly use the so-called last-observation-carried forward (LOCF) technique where the biomarker value at each even time is assumed to be the same as the last measured value for that biomarker. In contrast, our joint flexible longitudinal-survival model provides a proper imputation method for the biomarker values at each individual’s event time. In particular, in each iteration of the MCMC, given the sampled parameters for each individual and by using the flexible Gaussian process prior, there exists posterior trajectories of biomarker for that individual. Our method then considers the posterior mean of those trajectories as the proposed trajectory for that individual’s biomarker values over time at that iteration. The trajectory, then, can be used to impute the time-dependent biomarker covariate value inside the survival component. More specifically, consider the longitudinal biomarker X i of the form

X i | β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 N ( β i 0 ( L ) , κ i 2 K i + σ 2 I l i × l i ) , (23)

where β i 0 ( L ) is subject-specific random intercept for subject i, β i 0 ( L ) is a vector of repeated subject-specific intercept β i 0 ( L ) that is of size l i × 1 , κ i 2 is subject-specific measure of volatility in the longitudinal biomarker for individual i, ρ 2 is a fixed measure of correlation length, σ 2 is the measurement error shared across all subjects, K i is a an l i × l i matrix with it’s j j element as K i j j = e ρ 2 ( t i j t i j ) 2 where l i is the number of longitudinal biomarker measures on subject i, and I l i × l i is the identity matrix.

For a new time-point t * , predicted albumin biomarker for individual i is X * and can be written as

X * | X i , t , t * N ( μ * , Σ * ) , (24)

where the conditional posterior mean μ * is

μ * = β i 0 L + K ( t * , t ) K X 1 ( X i β i 0 ( L ) ) , (25)

and the conditional posterior variance Σ * is

Σ * = K ( t * , t * ) K ( t * , t ) K X 1 K ( t * , t ) , (26)

where K ( t * , t ) is defined as

K ( t * , t ) = κ i 2 e ρ 2 ( t * t ) 2 , (27)

and K X 1 is defined as

K X 1 = ( K ( t , t ) + σ 2 I l i × l i ) 1 . (28)

In order to relate the biomarker value at each time point t to the risk of the event of interest at that time point, we formulate the survival component of the model as

T i | τ , ζ ( s ) , ζ X i Weibull ( τ , λ i = β i 0 ( s ) + ζ ( s ) Z i ( s ) + ζ X i X i ( t ) ) , (29)

where T i is the survival time, τ is the shape parameter of the Weibull distribution, ζ ( s ) is a vector of coefficients relating baseline survival covariates to the risk of the occurrence of the event of interest, ζ X i is the coefficient that relates the biomarker value at time t and the risk of the event occurring at that time point, λ i is the natural log of the scale parameter in the Weibull distribution, β i 0 ( s ) is the subject-specific baseline hazard for subject i, Z i ( s ) is a vector of survival coefficients, and X i ( t ) is the biomarker value at time t.

2.4.2. Model II: Associating Survival at Time t with the Longitudinal Biomarker Value and the Derivative of the Biomarker Trajectory at Time t

It is natural to hypothesize that, beyond the biomarker value, the direction of local change in the biomarker value may be of value in predicting survival. To address this, we can extend our proposed model I by including a measure of the marginal slope of the biomarker over a specified interval of time. In particular, we define this average slope from time τ 0 to τ 1 as the area under the derivative of the trajectory curve of the biomarker from τ 0 to τ 1 . More generally, this area under the curve can be a weighted sum where weights are chosen according to the scientific question of interest. One may hypothesize that the area under the derivative curve that is closer to the event time should be weighted higher compared to times that are farther away from survival time t. In general, we define a weighted area under the derivative curve of the form

X A U C τ 0 τ 1 = τ 0 τ 1 Q ( u ) X ( u ) d u , (30)

where τ 0 and τ 1 are arbitrary time points chosen according to the scientific question of interest, Q ( t ) is a weight, and X ( t ) represents the derivative of the biomarker evaluated at time t. In particular, we consider two weighted approaches. The first specifies a weight of the form

Q ( t ) = 1 τ 1 τ 0 . (31)

The second specifies a weight of the form

Q ( t ) = ( 1, if t = T i 0, otherwise . (32)

Under the first weighting scheme, X A U C will be the area under the derivative with equal weights, whereas the second weighting scheme leads to the pointwise derivative value at the event time. Under this model, the survival component of our joint model will now include two longitudinal covariates: the biomarker value X i ( t ) , and the average derivative of the biomarker trajectory, X A U C .

The derivative of the Gaussian process is still a Gaussian process with the same hyperparameters ρ 2 and κ 2 . Therefore, using the same idea of modeling the trajectory of the biomarker, we can also model the derivative of the trajectory. In Model I, we proposed using the posterior mean of all plausible biomarker trajectories as the proposed trajectory for each subject in order to impute biomarker values at any time point t inside the survival component of the model. Similarly, we propose using the posterior mean of all plausible derivative trajectories for each subject in order to compute the average derivative up until time t. Given the fact that differentiation is a linear operation, one can easily compute the posterior mean of the derivative curve by simply switching the order of the differentiation and the expectation as

E ( X i ( t ) ) = E ( X i ( t ) t ) = ( E ( X i ( t ) ) ) t . (33)

Hence, by using Formula (25) and by taking the derivative of the posterior mean trajectory of the biomarker with respect to time t * , the posterior mean of the derivative of the biomarker trajectory is of the form

( E ( X i ( t * ) ) ) t * = 2 ρ 2 ( t * t ) ( K ( t * , t ) K X 1 ( X i β i 0 ( L ) ) ) , (34)

where E ( X i ( t * ) ) denotes the posterior mean of the biomarker trajectory as a function of time t * , ρ 2 is the correlation length, t * is the time-point at which we desire to impute the biomarker value and the average derivative of the biomarker trajectory, β i 0 ( L ) is subject-specific random intercept, K ( t * , t ) is defined as in (27), and K X 1 is defined as in (28).

Given the biomarker value X i ( t ) and the average derivative value X A U C , i ( t ) , the survival component of our proposed joint model is of the from

T i | τ , ζ ( s ) , ζ X i , ζ X i Weibull ( τ , λ i = β i 0 ( s ) + ζ ( s ) Z i ( s ) + ζ X i X i ( t ) + ζ X i X A U C , i ( t ) ) , (35)

where T i is the survival time, τ is the shape parameter of the Weibull distribution, ζ ( s ) is a vector of coefficients relating baseline survival covariates to the risk of the occurrence of the event of interest, ζ X i is the coefficient that relates the biomarker value at time t and the risk of the event at time t, ζ X i is the coefficient that relates the average derivative of the biomarker trajectory up until t and the risk of the event at that time point, λ i is the natural log of the scale parameter in the Weibull distribution, β i 0 ( s ) is the subject-specific baseline hazard for subject i, and Z i ( s ) is a vector of survival coefficients.

2.4.3. Model III: Associating Survival at Time t with the Volatility of a Subject-Specific Biomarker Marker Trajectory

The longitudinal model we have proposed provides a natural parameter for describing subject-specific volatility in a biomarker over time. Specifically, one can summarize the longitudinal trajectory of biomarker by using β 0 i ( L ) as a measure of subject-specific intercept of longitudinal biomarker as well as κ i 2 as a measure of volatility of those trajectories. The survival component of the model is then of the form

T i | τ , β i 0 ( s ) , ζ ( s ) , ζ β 0 i ( L ) , ζ κ i 2 ( L ) Weibull ( τ , λ i = β i 0 ( s ) + ζ ( s ) Z i ( s ) + ζ β 0 i ( L ) β 0 i ( L ) + ζ κ i 2 ( L ) κ i 2 ) , (36)

where T i is the survival time, τ is the shape parameter of the Weibull distribution, ζ ( s ) is a vector of coefficients relating baseline survival covariates to the risk of the occurrence of the event of interest, ζ β 0 i ( L ) is the coefficient that relates the subject specific random intercept β 0 i ( L ) and the risk of the event, ζ κ i 2 ( L ) is the coefficient that relates the subject-specific measure of volatility of the biomarker measure and the risk of the event, λ i is the natural log of the scale parameter in the Weibull distribution, β i 0 ( s ) is the subject-specific baseline hazard for subject i, and Z i ( s ) is a vector of survival coefficients.

3. Posterior Distribution

Consider the joint longitudinal-survival likelihood function, f L , S , introduced in Equation (1). Let ω be a vector of all model parameters with the joint prior distribution π ( ω ) . The posterior distribution of the parameter vector ω can be written as

π ( ω | X , Y ) f L , S × π ( ω ) , (37)

where X and Y denote longitudinal and time-to-event data respectively, and f L , S is the joint model likelihood function in (1).

The posterior distribution of the parameters in our proposed joint model is not available in closed form. Hence, samples from the posterior distribution of the model parameters are obtained via Markov Chain Monte Carlo (MCMC) methods. We use a hybrid sampling technique where in each iteration of the MCMC, we first sample subject-specific frailty terms in the survival model using Neal’s Algorithm 8. Then given the sampled frailty terms, we use the Hamiltonian Monte Carlo [24] to draw samples from the posterior distribution. Prior distributions on parameters of the joint model were explained in details in Sections 2.2 and 2.3, and we assume independence among model parameters in the prior (i.e. π ( ω ) is the product of the prior components specified previously). We provide further detail on less standard techniques for sampling from the posterior distribution when using a GP prior and issues in evaluating the survival portion of the likelihood function when time-varying covariates are incorporated into the model.

3.1. Evaluation of the Longitudinal Likelihood

The longitudinal component of our model uses the Gaussian process technique. Gaussian process models are typically computationally challenging to fit because in each iteration of the MCMC the evaluation of the log-posterior probability becomes computationally challenging as the number of measurements increases. In particular, consider our proposed longitudinal model introduced in Section 2.2 where

X i | W i , β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 N ( β i 0 ( L ) + W i , σ 2 I l i × l i ) , (38)

W i | κ i 2 , t i G P m i ( 0 , κ i 2 K i ) , (39)

with X i denoting a vector of longitudinal measures on subject i, W i a Gaussian process stochastic vector, β i 0 ( L ) a subject specific intercept for subject i, κ i 2 a subject-specific measure of volatility for subject i, ρ 2 a fixed correlation length, σ 2 a measurement error that is shared across all subjects, I l i × l i denoting the identity matrix of size l i with l i the number of longitudinal measures on subject i, t i a vector of the time points at which longitudinal measures

on subject i were collected, and K i = e ρ 2 ( t i j t i j ) 2 , where t i j and t i j are the j t h and j t h element of the time vector t i .

In order to sample from the posterior distribution of κ i 2 and σ 2 parameters, one can consider a marginal distribution of the following form

X i | β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 N ( β i 0 ( L ) , κ i 2 K i + σ 2 I l i × l i ) . (40)

The marginal distribution above has log-density of the form

log ( f ( X i | β i 0 ( L ) , κ i 2 , ρ 2 , σ 2 ) ) = constant 1 2 log | κ i 2 K i + σ 2 I l i × l i | 1 2 ( X i β i 0 ( L ) ) T ( κ i 2 K i + σ 2 I l i × l i ) 1 ( X i β i 0 ( L ) ) , (41)

that is the log contribution of subject i to the longitudinal likelihood (i.e. log ( f L ( i ) ) , where here and throughout, log ( ) denotes the natural log function).

Sampling from the posterior distribution of κ i 2 and σ 2 requires evaluation of the log-density in Equation (41) that involves evaluation of the determinant and the computation of the inverse of the covariance matrix at each iteration of the MCMC. This process requires O ( l i 2 ) memory space and a computation time of O ( l i 3 ) per subject, with l i as the number of within subject measurements.

For our proposed model, we defined K i = e ρ 2 ( t i j t i j ) with a fixed ρ 2 parameter. This means K i can be pre-computed before starting posterior sampling using MCMC. Furthermore, we propose using the eigenvalue decomposition technique for a faster log-posterior probability computation. Our proposed method was motivated by [25] and is as follows.

Consider the covariance matrix κ i 2 K i + σ 2 I l i × l i in the marginal log-density in Equation (41). Our goal is now to propose a method that makes computation of the inverse and the determinant of this covariance function as efficient as possible. As shown earlier, K i can be pre-computed before starting the MCMC process as it does not involve any parameter. Consider the eigenvalue decomposition of K i = Q Λ Q T , where Λ is a diagonal matrix with the eigenvalues of K i as the diagonal elements, and Q is the corresponding matrix of eigenvectors. κ i 2 is a scalar parameter that is sampled in each iteration of the MCMC. Multiplication of κ i 2 times the matrix K i implies the eigenvalues of this matrix will be κ i 2 times bigger where the eigenvectors remain the same. Hence, we can conclude that the eigenvalue decomposition of the matrix κ i 2 K i is of the form κ i 2 K i = Q ( κ i 2 Λ ) Q T , where Q and Λ are elements of the eigenvalue decomposition of the pre-computed matrix K i . Given the pre-computed eigenvalue decomposition of the matrix K i , at each iteration of the MCMC, the determinant of the covariance function of the marginal log-density in Equation (41) can be computed as

log | κ i 2 K i + σ 2 I l i × l i | = log | Q ( κ i 2 Λ ) Q T + σ 2 I l i × l i | = log | Q ( κ i 2 Λ + σ 2 I l i × l i ) Q T | = log ( k = 1 l i ( κ i 2 λ i k + σ 2 ) ) = k = 1 l i log ( κ i 2 λ i k + σ 2 ) . (42)

In Equation (42), λ i k ’s are pre-computed eigenvalues of the matrix K i whereas κ i and σ 2 are parameters sampled at each iteration of the MCMC.

Similarly and by using the same trick, we can compute the term ( X i β i 0 ( L ) ) T ( κ i 2 K i + σ 2 I l i × l i ) ( X i β i 0 ( L ) ) in a more computationally efficient as

( X i β i 0 ( L ) ) T ( κ i 2 K i + σ 2 I l i × l i ) 1 ( X i β i 0 ( L ) ) = ( X i β i 0 ( L ) ) T ( Q ( κ i 2 Λ ) Q T + σ 2 I l i × l i ) 1 ( X i β i 0 ( L ) ) = ( X i β i 0 ( L ) ) T ( Q ( κ i 2 Λ + σ 2 I l i × l i ) Q T ) 1 ( X i β i 0 ( L ) ) = ( X i β i 0 ( L ) ) T ( Q ( κ i 2 Λ + σ 2 I l i × l i ) 1 Q T ) ( X i β i 0 ( L ) ) . (43)

In Equation (43), X i is the data matrix and is fixed, Q and Λ are pre-computed eigenvector and diagonal eigenvalue matrices corresponding to the eigenvalue decomposition of the matrix K i . Finally, by utilizing an eigenvalue decomposition, instead of evaluating the term ( κ i 2 K i + σ 2 I l i × l i ) 1 , one can simply evaluate ( Q ( κ i 2 Λ + σ 2 I l i × l i ) 1 Q T ) , where the term ( κ i 2 Λ + σ 2 I l i × l i ) 1 in the middle is simply the inverse of a diagonal matrix.

3.2. Evaluation of the Survival Likelihood

Here we consider evaluation of the survival component of the decomposed joint likelihood. Consider the observed survival time for subject i that is denoted by t i and is distributed according to a Weibull distribution with shape parameter τ and scale parameter exp ( λ i ) , where λ i = ζ ( S ) Z i ( S ) + ζ ( L ) Z i ( L ) ( t ) , where Z i ( S ) and Z i ( L ) ( t ) are vectors of covariates for subject i, with potentially time-varying covariates, corresponding to the survival and the longitudinal covariates respectively, and ζ ( S ) and ζ ( L ) are vectors of survival and longitudinal coefficients respectively. One can write the hazard function h i ( t ) as

h i ( t ) = e λ i τ t τ 1 . (44)

The survival function S i ( t ) can be written as

S i ( t ) = e 0 t h i ( w ) d w = e exp ( λ i ) t τ . (45)

Consider survival data on n subjects, some of whom may have been censored. Let event indicator δ i that is 1 if the event is observed, and 0 otherwise. The survival likelihood contribution of subject i can be written in terms of the hazard function h i ( t ) and the survival function S i ( t ) as

f S | L ( i ) = h i ( t i ) δ i S i ( t i ) = h i ( t i ) δ i e 0 t i h i ( w ) d w . (46)

The overall survival log-likelihood can be written as

log ( L ) = i = 1 n log ( f S | L ( i ) ) = i = 1 n ( δ i log ( h i ( t i ) ) 0 t i h i ( w ) d w ) . (47)

The hazard function in the Equation (44) includes some time-varying covariates which often makes the integral of the hazard function non-tractable. In this case, one can estimate the integral using rectangular integration as follows Algorithm 1.

4. Simulation Studies

In this section, we evaluate our proposed models using a simulation study. We simulated 200 datasets that resembled the real data on end stage renal disease patients that was obtained from the United States Renal Data System (USRDS). To this end, we first simulated longitudinal trajectories with κ 2 ’s which are sampled from a uniform distribution from 0 to 1. We fixed ρ 2 = 0.1 for all subjects. The subject-specific intercepts for albumin trajectories were randomly sampled from the Normal distribution N ( μ = 5.0 , σ 2 = 0.5 ) . We simulate 9 to 12 longitudinal albumin values per subject. Using the simulated albumin trajectories, we generated survival times from the Weibull distribution in Equation (13) that is of the following form for each of the proposed models

• Model I:

T i | τ , X i ( t ) Weibull ( τ , λ i = β i 0 ( s ) + 0.5 X i ( t ) ) , (48)

• Model II:

T i | τ , X i ( t ) Weibull ( τ , λ i = β i 0 ( s ) + 0.3 X i ( t ) + 0.5 X A U C , i ( t ) ) , (49)

• Model III:

T i | τ , A g e , X i ( t ) , κ i 2 Weibull ( τ , λ i = β i 0 ( s ) + 0.5 A g e i + 0.3 β 0 i ( L ) + 0.7 κ i 2 ) . (50)

Algorithm 1. Integration of survival hazard with time-varying covariates.

In all simulations, β i 0 ( s ) are simulated from a mixture of two Normal distributions of the form θ i N ( μ = 1.5 , σ 2 = 1 ) + ( 1 θ i ) N ( μ = 1.5 , σ 2 = 1 ) , where θ i is distributed Bernoulli with parameter p = 0.5 .

Finally, the censoring times were sampled from a uniform distribution and independently from the simulated event times with an overall censoring rate of 20%.

All results are from 200 simulated datasets of size n = 300 subjects each. For each dataset, we fit our proposed joint models with 10,000 draws where the first 5000 considered as a burn-in period. Relatively diffuse priors were considered for all parameters. Details of the priors used in the simulations as well as the results follow.

4.1. Simulation Results for Model I

In order to compare our proposed joint longitudinal-survival model that is capable of flexibly modeling longitudinal trajectories with simpler models with explicit functional assumptions on the longitudinal trajectories, we simulated longitudinal data once from quadratic polynomial longitudinal trajectory curves and another time from random non-linear curves. We then fit our joint model with a Gaussian Process longitudinal component as well as a joint model with the explicit assumption that the longitudinal trajectories are from a quadratic polynomial curve. As a comparison model, we also fit a two-stage Cox model where in stage one longitudinal data are modeled using our proposed Gaussian process longitudinal model and in the second stage, given the posterior mean parameters from the longitudinal fit, a Cox proportional hazard will fit the survival data.

In particular, we generate synthetic longitudinal and survival data on 300 subjects, each with 9 to 12 within subject longitudinal albumin measures. Under the scenario where the longitudinal data are generated from quadratic polynomial longitudinal trajectories, we consider quadratic polynomial curves of the form

X i j = β 0 i ( L ) + β 1 i t + β 2 t 2 + ε i j , (51)

where the true value of β 0 i are simulated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) , β 1 i are simulated from the Normal distribution N ( μ = 0.5 , σ 2 = 0.01 ) , β 2 is set to be equal to −0.1, and finally ε i j is the measurement error that is independent across measures and across subjects and are simulated from the Normal distribution N ( μ = 0 , σ = 0.1 ) .

Under the second scenario, longitudinal albumin values are generated from random non-linear curves. In particular, we generate random non-linear albumin trajectories that are realizations of a Gaussian process that are centered around the subject-specific random intercepts β 0 i ( L ) that are generated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) .

For each simulation scenario, once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter of the Weibull distribution, is set to β i 0 ( S ) + β 1 X i ( t ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to −0.5, and X i ( t ) is the longitudinal value for subject i at time t that is already simulated in the longitudinal step of the data simulation.

Our proposed joint longitudinal-survival model assumes the Normal prior N ( μ = 5 , σ 2 = 4 ) on the random intercepts β 0 i , the log-Normal prior log-Normal ( 1,2 ) on κ i 2 , the log-Normal prior log-Normal ( 1,1 ) on σ 2 , the log-Normal prior log-Normal ( 0,1 ) on τ , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival shared intercept β 0 , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival coefficient β 1 , the Gamma prior Γ ( 3,3 ) on the concentration parameter of the Dirichlet distribution, and the Normal prior N ( μ = 0 , σ 2 = 25 ) as the base distribution of the Dirichlet distribution.

As the results in Table 1 show, when data are simulated with a longitudinal trajectories that are quadratic polynomial curves, the joint polynomial model performs better in terms of estimating the albumin coefficient in the survival model with a smaller mean squared error compared to our proposed joint longitudinal-survival. In real world settings, however, the true functional forms of the trajectories of the biomarkers are not known and are unlikely to follow a simple parametric trajectory. Under a general case where the biomarker trajectories can be any random non-linear curve (Scenario 2), our proposed joint model outperforms the joint polynomial model. Further, our joint modeling

Table 1. Model I simulation results - joint longitudinal-survival data were generated under two scenarios: one when longitudinal measures are sampled from the quadratic polynomial trajectories (Scenario 1) and another when longitudinal measures are sampled from random non-linear curves (Scenario 2). Under each scenario, we considered three estimation approaches: a two-stage Cox proportional model with longitudinal trajectories with parameters that set to the posterior mean of a Gaussian process longitudinal model that is fit separately, a joint longitudinal-survival model with the assumption that longitudinal trajectories are quadratic polynomial (Joint Polynomial Model), and our proposed joint longitudinal-survival with a flexible Gaussian process longitudinal component (Joint Model).

framework that is capable of estimating differential subject-specific log baseline hazards provides significantly better coefficient estimates compared to the proportional hazard Cox model. Estimates under the Cox model are marginalized over all subjects and due to the non-collapsibility aspect of this model [26] [27] , coefficient estimates shrink toward 0.

4.2. Simulation Results for Model II

In Model II, not only do we adjust for the current value of albumin value, but we also adjust for a weighted average slope of albumin over a specified time interval. This new model differentiates between the risks of death for a patient whose albumin value is improving compared to another patient with the same albumin level whose albumin is deteriorating. We consider the two weighted derivative averages given in (31) and (32). The first weighting scheme in (31) leads to the area under the derivative curve. The second weighting scheme in (32) results in a point-wise derivative of albumin at each event time.

We generate synthetic data for 300 subjects each with 9 to 12 longitudinal measurements where longitudinal albumin values are generated from a Gaussian process that is centered around the subject-specific random intercepts β 0 i ( L ) which are generated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) . Once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter in Weibull distribution, is set to β i 0 ( S ) + β 1 X i ( t ) + β 2 X A U C , i ( t ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to 0.3, β 2 is fixed to 0.5, X i ( t ) is the longitudinal value for subject i at time t and X A U C , i ( t ) is the average slope of albumin.

Our proposed joint longitudinal-survival model assumes the Normal prior N ( μ = 5 , σ 2 = 4 ) on the random intercepts β 0 i , the log-Normal prior log-Normal ( 1,2 ) on κ i 2 , the log-Normal prior log-Normal ( 1,1 ) on σ 2 , the log-Normal prior log-Normal ( 0,1 ) on τ , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival shared intercept β 0 , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival coefficient β 1 , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival coefficient β 2 , the Gamma prior Γ ( 3,3 ) on the concentration parameter of the Dirichlet distribution, and the Normal prior N ( μ = 0 , σ 2 = 25 ) as the base distribution of the Dirichlet distribution.

As a comparison to our proposed approach, we also fit a two-stage Cox model where the longitudinal curve of albumin and its derivative curve are estimated using hyper-parameters set as the posterior median of a Bayesian Gaussian Process model. As we can see from Table 2, our joint model provides closer

Table 2. Model II simulation results - joint longitudinal-survival data were generated for 300 subjects each with 9 to 12 within subject measurements where longitudinal albumin values are generated from a Gaussian process that is centered around the subject-specific random intercepts β 0 i ( L ) which are generated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) . Once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter in Weibull distribution, is set to β i 0 ( S ) + β 1 X i ( t ) + β 2 X A U C , i ( t ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to 0.3, β 2 is fixed to 0.5, X i ( t ) is the longitudinal value for subject i at time t and X A U C , i ( t ) is the average slope of albumin. We fit our proposed joint longitudinal-survival model as well as a two-stage Cox proportional hazard model as a comparison model.

estimates to the coefficient values with a smaller mean squared error compared with the two-stage Cox model. Our proposed model is capable of detecting differential subject-specific baseline hazards whereas the Cox model is not capable of differentiating between subjects and provides estimates that are marginalized across all subjects. Further, the simulation results show the capability of our method in detecting the true underlying longitudinal curves and the ability of our method on properly estimating the average derivative of those curves.

4.3. Simulation Results for Model III

In model III, we test the association between the summary measures of the longitudinal biomarker trajectories and the survival outcomes. In particular, we consider the relation between the summary measures of subject-specific random intercept β i 0 ( L ) and subject-specific measure of volatility κ i 2 and survival times.

We generate synthetic data for N = 300 subjects each with 9 to 12 longitudinal measurements where longitudinal albumin values are generated from a Gaussian process that is centered around the subject-specific random intercepts β 0 i ( L ) which are generated from the Normal distribution N ( μ = 5 , σ = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) . Once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter in Weibull distribution, is set to β i 0 ( S ) + β 1 A g e + β 2 β i 0 ( L ) + β 3 κ i 2 ( L ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to 0.5, β 2 is fixed to −0.3, β 3 is fixed to 0.7, A g e is a standardized covariate that is generated from the Normal distribution N ( μ = 0 , σ 2 = 1 ) , β i 0 ( L ) is subject-specific random intercepts of the longitudinal trajectories, and κ i 2 ( L ) are subject specific measure of volatility of the longitudinal trajectories.

Our proposed joint longitudinal-survival model assumes the Normal prior N ( μ = 5 , σ 2 = 4 ) on the random intercepts β 0 i , the log-Normal prior log-Normal ( 1,2 ) on κ i 2 , the log-Normal prior log-Normal ( 1,1 ) on σ 2 , the log-Normal prior log-Normal ( 0,1 ) on τ , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival shared intercept β 0 , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival coefficient β 1 , the Normal prior N ( μ = 0 , σ 2 = 25 ) on the survival coefficient β 2 , the Gamma prior Γ ( 3,3 ) on the concentration parameter of the Dirichlet distribution, and the Normal prior N ( μ = 0 , σ 2 = 25 ) as the base distribution of the Dirichlet distribution.

We fit our proposed joint survival-longitudinal model (model III) as well as a two-stage Cox proportional hazard model as a comparison model. The two-stage Cox model is a simple Cox proportional hazard model with covariate β 0 i ( L ) and κ i 2 ( L ) that are posterior medians from a separate longitudinal Gaussian process model. As the results in Table 3 show, our proposed joint model provides closer estimates to the true coefficients that also have significantly smaller mean squared error compared to the two-stage Cox model. Our proposed joint model is capable of detecting the differential subject-specific baseline hazards. Unlike our model, Cox model is blind to the subject-specific baseline hazards and hence, provides coefficient estimates that are marginalized over all subjects. These marginalized estimates from the Cox model shrink toward 0 as the Cox model with a multiplicative hazard function is non-collapsible.

As one can see in the joint model results in Table 3, the coefficient estimate for κ 2 ( L ) is not as close to the true coefficient value compared with other coefficient estimates. This is due to the fact that only 9 to 12 longitudinal measures per subject, there exists many plausible κ i 2 ( L ) values that flexibly characterize the trajectory of the measured albumin values. This additional variability in plausible κ i 2 ( L ) values has caused the coefficient estimate to shrink toward 0. Larger number of within subject longitudinal measures will provide more precision in estimating the true underlying κ i 2 ( L ) and will lead to a coefficient

Table 3. Model III simulation results - joint longitudinal-survival data were generated for 300 subjects each with 9 to 12 longitudinal measurements where longitudinal albumin values are generated from a Gaussian process that is centered around the subject-specific random intercepts β 0 i ( L ) which are generated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) . Once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter in Weibull distribution, is set to β i 0 ( S ) + β 1 A g e + β 2 β i 0 ( L ) + β 3 κ i 2 ( L ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to 0.5, β 2 is fixed to −0.3, β 3 is fixed to 0.7, A g e is a standardized covariate that is generated from the Normal distribution N ( μ = 0 , σ 2 = 1 ) , β i 0 ( L ) are subject-specific random intercepts of the longitudinal trajectories, and κ i 2 ( L ) are subject specific measures of volatility of the longitudinal trajectories. We fit our proposed joint longitudinal-survival model as well as a two-stage Cox proportional hazard model as a comparison model.

estimate closer to the true value. In order to confirm this fact, we simulated additional data once with 36 within subject measures and another time with 72 within subject measures. Table 4 shows the results of fitting our proposed joint longitudinal-survival model to datasets that include subjects with 9 to 12 within subject measurements, to datasets with subjects with 36 within subject measurements, and to datasets with subjects with 72 within subject measurements. As the results show, with larger number of within subject measurements, coefficient estimate for κ i 2 ( L ) is closer to the true value. This is due to the fact that with larger number of within subject albumin measurements, there exists a stronger likelihood to estimate the subject-specific volatility measures κ i 2 , and hence, there is less uncertainty about the estimated value of volatility measures.

5. Application of the Proposed Joint Models to Data from the USRDS

In this section, we apply our proposed joint longitudinal-survival models to data on n = 1112 end stage renal disease patients participating in the Dialysis Morbidity and Mortality Studies (DMMS) nutritional study that is obtained from the United States Renal Data System (USRDS) [21] . For every participating patient

Table 4. Model III simulation results with datasets with l i = 36 and l i = 72 within subject measurements. In order to test the sensitivity of the κ 2 ( L ) coefficient estimate to the number of within subject measurements, l i , we simulated joint longitudinal-survival data once when each subject has 36 within subject measurements and another time when each subject has 72 within subject measurements. Under each scenario, we simulated 200 datasets each with 300 subjects. Other simulation parameters remained the same as the simulation parameters used in Table 3. This means, we simulated longitudinal data from Gaussian process that is centered around the subject-specific random intercepts β 0 i ( L ) which are generated from the Normal distribution N ( μ = 5 , σ 2 = 1 ) . We consider a Gaussian process with the squared exponential covariance function with the correlation length of ρ 2 = 0.1 and the subject-specific measures of volatility κ i 2 that are generated from the uniform distribution U ( 0,1 ) . Once longitudinal measures are generated, we generate survival data where survival times are distributed according to the Weibull distribution Weibull ( τ , λ i ) , where the shape parameter τ is set to 1.5 and λ i , which is the log of the scale parameter in Weibull distribution, is set to β i 0 ( S ) + β 1 A g e + β 2 β i 0 ( L ) + β 3 κ i 2 ( L ) , where β i 0 ( S ) are generated from an equally weighted mixture of two Normal distributions of N ( μ = 1.5 , σ 2 = 1 ) and N ( μ = 1.5 , σ 2 = 1 ) , β 1 is fixed to 0.5, β 2 is fixed to −0.3, β 3 is fixed to 0.7, A g e is a standardized covariate that is generated from the Normal distribution N ( μ = 0 , σ 2 = 1 ) , β i 0 ( L ) are subject-specific random intercepts of the longitudinal trajectories, and κ i 2 ( L ) are subject specific measures of volatility of the longitudinal trajectories.

in the study, up to 12 albumin measurements were taken uniformly over two years of follow-up. The presented analyses are restricted to only the patients who had at least nine albumin measurements in order to provide sufficient data for modeling the trajectory and the volatility of albumin. The censoring rate in the data is at 43% over a maximal follow-up time of 4.5 years.

Using the same data, Fung et al. [1] showed that both baseline albumin level and the slope of albumin over time are significantly associated with mortality in ESRD patients. In this section, we recreate those results, but also implement the models described in Sections 2.4.2 and 2.4.3 to quantify the roles of the derivative of the albumin trajectory and the volatility of albumin as predictors of mortality.

In order to adjust for other potential confounding factors, our proposed models also include patient’s age, gender, race, smoking status, diabetes, an indicator of whether the patient appeared malnourished at baseline, BMI at baseline, baseline cholesterol, and baseline systolic blood pressure. The adjusted covariates are consistent with those originally presented in Fun et al. [1] . In the results that are presented, all joint models were run for 10,000 posterior samples where the initial 5000 samples are discarded as burn-in samples.

5.1. Association between Survival at Time t and Serum Albumin at Time t

Table 5 shows the estimated coefficients from our model and a last-observation carried forward (LOCF) Cox model, for comparison. The estimated relative risk

Table 5. Estimated relative risk and corresponding 95% credible region from our proposed joint model where we adjust for time-dependent albumin value that is imputed from the longitudinal component of the model. For comparison, we also fit a last-observation carried forward Cox proportional hazards model with last albumin value carried forward where we report coefficients estimates, 95% confidence interval, and p-value for the estimated coefficients. In both models, we adjust for potential confounding factors as reported by Fung et al. [1] .

associated with all time-invariant baseline survival covariates are similar between the two models. The relative risk associated with every one unit decrement in serum albumin is, however, larger under our proposed joint model compared to the last-observation carried forward Cox model. This is expected as our model is capable of estimating subject-specific albumin trajectories over time and is capable of accurately testing the association between albumin at time t and the risk of death at time t. The LOCF Cox model, on the other hand, uses the most recent albumin measure which in reality might be quite different than the albumin value at the time of death. In both models, albumin is identified as a significant risk factor of mortality. In particular, based on the results from our proposed joint, it is estimated that every 1 g/dL decrement in albumin is associated with a 4.5 times higher risk of death in ESRD patients.

5.2. Association between Survival at Time t and the Derivative of the Serum Albumin Trajectory

Other than the albumin value at the time of death, the average slope of albumin over time might also be a risk factor of mortality in ESRD patients. Here, we adjust for the area under the derivative curve of the albumin trajectory from the time that the follow-up starts until the survival time which is either the time of death or the censoring time. Table 6 shows the results from our proposed model. Based on the results, we estimate that every one g/dL decrement in albumin is associated with a 3.95-fold increase in the risk of death. Also, lower average slope of albumin, that is every 1 g/dL/month decrease in the average slope, is associated with a 2.3-fold higher risk of death. This is consistent with the results of Fung et al. [1] .

Table 6. Estimated relative risk and corresponding confidence intervals from our proposed joint models considering the relationship between survival and the derivative of albumin and the volatility of albumin. Potential confounding factors, as reported by Fung et al. [1] , were also adjusted in the model but we have been removed them from the table for brevity.

5.3. Association between Survival at Time t and the Volatility of Serum Albumin

Fung et al. [1] showed that the baseline albumin and the slope of albumin over time are two independent risk factors of mortality among the end-stage renal disease patients. It is natural to also hypothesize that the volatility of albumin could be a risk factor of mortality among these patients. Here we consider two summary measures of the trajectories of the longitudinal albumin values, one the baseline albumin measures ( β 0 i ( L ) ), and another the subject-specific volatility measure of albumin ( κ i 2 ( L ) ). The bottom rows of Table 6 show results from this model fit. These result indicate that both baseline serum albumin the volatility of albumin are also significant risk factors of mortality. Specifically, we estimate that a one unit increase in κ 2 , which indicates a higher volatility, is associated with 1.2 times higher risk of death. Figure 2 shows albumin trajectories of 10 randomly sampled individuals.

Figure 2. Actual longitudinal albumin trajectories of 10 randomly selected individuals with end-stage renal disease that were selected from the USRDS data. Hollow circles are the actual measured albumin values, red lines are the posterior median fitted curves from our proposed Model III, and the dashed blue lines are the corresponding 95% posterior prediction intervals for the fitted trajectories. The title of each plot shows the posterior median of the volatility measure κ 2 for the subject whose albumin measures are shown in the plot.

6. Discussion

Monitoring the health of patients often involves recording risk factors over time. In such situations, it is essential to evaluate the association between those longitudinal measurements and survival outcome. To this end, joint longitudinal-survival models provide an efficient inferential framework.

We proposed a joint longitudinal-survival framework that avoids some of the restrictive assumptions commonly used in the existing models. Further, our methods propose a stronger link between longitudinal and survival data through an introduction of new ways of adjusting for the biomarker value at time t, adjusting for the average derivative of the biomarker over time, and moving beyond the first-order trend and accounting for volatility of biomarker measures over time.

Our proposed method can be considered as an extension of the joint model proposed by [16] in that we use the same idea of dividing the joint likelihood into a marginal longitudinal likelihood and conditional survival likelihood. However, instead of fitting quadratic trajectories, we use a flexible longitudinal model based on the Gaussian processes. Further, for the survival outcome, instead of assuming a piecewise exponential model, we use a flexible survival model by incorporating the Dirichlet process mixture of Weibull distributions. Our proposed modeling framework is capable of modeling additional summary measures of longitudinally measured biomarkers and relating them to the survival outcome in a time-dependent fashion.

Despite its flexibility and novelty, our approach has some limitations. By using the Bayesian non-parameteric Dirichlet process and the Gaussian process techniques, we provide a flexible modeling framework that avoids common distributional assumptions; however, these techniques are generally not scalable when the number of subjects and the number of within subject measurements increase. Furthermore, the survival component of our model still relies on the proportional hazards assumption. In future work, our methodology can be extended to include a more general non-proportional hazards framework that can also include time-dependent coefficients inside the survival model. Additionally, the use of alternatives to common MCMC techniques, including parallel-MCMC methods and variational methods, may lead to greater computational efficiency and scalability for larger datasets.

Often times in monitoring the health of patients, multiple longitudinal risk factors are measured. One can use our proposed modeling framework in this paper in order to build a joint longitudinal-survival model with multiple longitudinal processes, where each process is modeled independently from other longitudinal processes. In reality, however, one expects a patient’s longitudinal risk factors to be correlated. A methodology that is capable of modeling multiple biomarkers simultaneously by taking the correlation between biomarkers into account would be beneficial and remains an area of open research.

Acknowledgements

Parts of this research were supported by the National Institutes of Health under Grant Nos. R01AG053555 and P30AG066519 (DLG).

Cite this paper: Masouleh, S. , Holsclaw, T. , Shahbaba, B. and Gillen, D. (2021) A Flexible Joint Longitudinal-Survival Model for Analyzing Longitudinally Sampled Biomarkers. Open Journal of Statistics, 11, 778-805. doi: 10.4236/ojs.2021.115046.
References

[1]   Fung, F., Sherrard, D.J., Gillen, D.L., Wong, C., Kestenbaum, B., Seliger, S., Ball, A. and Stehman-Breen, C. (2002) Increased Risk for Cardiovascular Mortality among Malnourished End-Stage Renal Disease Patients. American Journal of Kidney Diseases, 40, 307-314.
https://doi.org/10.1053/ajkd.2002.34509

[2]   Wong, C.S., Hingorani, S., Gillen, D.L., Sherrard, D.J., Watkins, S.L., Brandt, J.R., Ball, A. and Stehman-Breen, C.O. (2002) Hypoalbuminemia and Risk of Death in Pediatric Patients with End-Stage Renal Disease. Kidney International, 61, 630-637.
https://doi.org/10.1046/j.1523-1755.2002.00169.x

[3]   Prentice, R.L. (1982) Covariate Measurement Errors and Parameter Estimation in a Failure Time Regression Model. Biometrika, 69, 331-342.
https://doi.org/10.1093/biomet/69.2.331

[4]   Dafni, U.G. and Tsiatis, A.A. (1998) Evaluating Surrogate Markers of Clinical Outcome When Measured with Error. Biometrics, 54, 1445-1462.
https://doi.org/10.2307/2533670

[5]   Tsiatis, A.A., Degruttola, V. and Wulfsohn, M.S. (1995) Modeling the Relationship of Survival to Longitudinal Data Measured with Error. Applications to Survival and cd4 Counts in Patients with Aids. Journal of the American Statistical Association, 90, 27-37.
https://doi.org/10.1080/01621459.1995.10476485

[6]   Verbeke, G. and Molenberghs, G. (2000) Linear Mixed Models for Longitudinal Data. Springer, New York.
https://doi.org/10.1007/978-1-4419-0300-6

[7]   Pinheiro, J.C. and Bates, D.M. (2000) Mixed Effects Models in S and S-Plus. Springer, New York.
https://doi.org/10.1007/b98882

[8]   Christensen, R., Johnson, W., Branscum, A. and Hanson, T.E. (2010) Bayesian Ideas and Data Analysis: An Introduction for Scientists and Statisticians. CRC Texts in Statistical Science, CRC Press, Boca Raton.

[9]   Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2003) Bayesian Data Analysis. Chapman and Hall, London.
https://doi.org/10.1201/9780429258480

[10]   Shi, J.Q., Wang, B., Murray-Smith, R. and Titterington, D.M. (2007) Gaussian Process Functional Regression Modeling for Batch Data. Biometrics, 63, 714-723.
https://doi.org/10.1111/j.1541-0420.2007.00758.x

[11]   Liu, Q., Lin, K.K., Andersen, B., Smyth, P. and Ihler, A. (2010) Estimating Replicate Time Shifts Using Gaussian Process Regression. Bioinformatics, 26, 770-776.
https://doi.org/10.1093/bioinformatics/btq022

[12]   Bycott, P. and Taylor, J. (1998) A Comparison of Smoothing Techniques for cd4 Data Measured with Error in a Time-Dependent Cox Proportional Hazards Model. Statistics in Medicine, 17, 2061-2077.
https://doi.org/10.1002/(SICI)1097-0258(19980930)17:18%3C2061::AID-SIM896%3E3.0.CO;2-O

[13]   Hanson, T.E., Branscum, A.J. and Johnson, W.O. (2011) Predictive Comparison of Joint Longitudinal-Survival Modeling: A Case Study Illustrating Competing Approaches. Lifetime Data Analysis, 17, 3-28.
https://doi.org/10.1007/s10985-010-9162-0

[14]   Wang, Y. and Taylor, J.M.G. (2001) Jointly Modeling Longitudinal and Event Time Data with Application to Acquired Immunodeficiency Syndrome. Journal of the American Statistical Association, 96, 895-905.
https://doi.org/10.1198/016214501753208591

[15]   Faucett, C.L. and Thomas, D.C. (1996) Simultaneously Modelling Censored Survival Data and Repeatedly Measured Covariates: A Gibbs Sampling Approach. Statistics in Medicine, 15, 1663-1685.
https://doi.org/10.1002/(SICI)1097-0258(19960815)15:15%3C1663::AID-SIM294%3E3.0.CO;2-1

[16]   Brown, E. and Ibrahim, J. (2003) A Bayesian Semiparametric Joint Hierarchical Model for Longitudinal and Survival Data. Biometrics, 59, 221-228.
https://doi.org/10.1111/1541-0420.00028

[17]   Wulfsohn, M.S. and Tsiatis, A.A. (1997) A Joint Model for Survival and Longitudinal Data Measured with Error. Biometrics, 53, 330-339.
https://doi.org/10.2307/2533118

[18]   Song, X., Davidian, M. and Tsiatis, A.A. (2002) An Estimator for the Proportional Hazards Model with Multiple Longitudinal Covariates Measured with Error. Biostatistics, 3, 511-528.
https://doi.org/10.1093/biostatistics/3.4.511

[19]   Law, N.J., Taylor, J.M.G. and Sandler, H. (2002) The Joint Modeling of a Longitudinal Disease Progression Marker and the Failure Time Process in the Presence of cure. Biostatistics, 3, 547-563.
https://doi.org/10.1093/biostatistics/3.4.547

[20]   Erango, M.A. and Goshu, A.T. (2019) Bayesian Joint Modelling of Survival Time and Longitudinal CD4 Cell Counts Using Accelerated Failure Time and Generalized Error Distributions. Open Journal of Modelling and Simulation, 7, 79-95.
https://doi.org/10.4236/ojmsi.2019.71004

[21]   USRD (U S Renal Data System) (2010) USRDS 2010 Annual Data Report: Atlas of Chronic Kidney Disease and End-Stage Renal Disease in the United States. National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda.

[22]   Banerjee, S., Carlin, B.P. and Gelfand, A.E. (2014) Hierarchical Modeling and Analysis for Spatial Data. CRC Press, New York.
https://doi.org/10.1201/b17115

[23]   Diggle, P.J. and Ribeiro, P.J. (2007) Model-Based Geostatistics. 1st Edition, Springer Series in Statistics, Springer, New York.
https://doi.org/10.1007/978-0-387-48536-2

[24]   Neal, R.M. (2011) Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Press, Boca Raton.

[25]   Flaxman, S, Gelman, A., Neill, D., Smola, A., Vehtari, A. and Gordon Wilson, A. (2015) Fast Hierarchical Gaussian Processes. Manuscript in Preparation.

[26]   Struthers, C. and Kalbiesch, J. (1986) An Estimator for the Proportional Hazards Model with Multiple Longitudinal Covariates Measured with Error. Biometrika, 73, 363-369.
https://doi.org/10.1093/biomet/73.2.363

[27]   Martinussen, T. and Vansteelandt, S. (2013) On Collapsibility and Confounding Bias in Cox and Aalen Regression Models. Lifetime Data Analysis, 19, 279-296.
https://doi.org/10.1007/s10985-013-9242-z

 
 
Top