Parametric Survival Analysis

Adam J Sullivan
Assistant Professor of Biostatistics
Brown University

Estimation of a Survival Curves

Parametric Survival Models

For binary data we have used the binomial distribution and logistic regression.
For count data we have used the Poisson distribution and Poisson/Log-linear regression.
Survival data is positive and usually continuous

For this reason we use continuous distributions but ones other than normal. We will discuss a couple of these:

The exponential Distribution
The Weibull Distribution

The Exponential Distribution

This is the simplest situation

Assumption of Constant hazard
This may work over short time periods but is unlikely in the long run.
Survival analysis with the exponential distribution is equivalent to Poisson Regression.

The Exponential Distribution

We have a random variable \(T\) which is the failure time

Hazard Function: \(h(t)=\lambda\)
Survivor Function: \(S(t) = e^{-\lambda t}\)
Median: \(T_{med} = \dfrac{\log(2)}{\lambda}\)
Mean: \(E(T) = \dfrac{1}{\lambda}\)

If treatment cuts the hazard in half, the median survival time is doubled

Regression

If survival times follow an exponential distribution with the hazard \(\lambda\) then the number of events \(t\) follows a Poisson distribution with mortality rate \(\lambda\).
Thus when we use Poisson regression the estimated regression coefficients are interpreted as log-hazard ratios.

Regression

For example:

Let

\(X=1\) for subjects with diabetes
\(X=0\) for subjects without diabetes

Regression

Then the exponential regression model is

\[h(t|X=x) = \exp(\beta_0+\beta_1x) = \exp(\beta_0)\exp(\beta_1x)\] This means

\[\begin{aligned} \log\left[h(t|X=0)\right] &= \beta_0\\ &= \log\left( \text{ hazard among people without diabetes}\right)\\ \log\left[h(t|X=1)\right] &= \beta_0 + \beta_1\\ &= \log\left( \text{ hazard among people with diabetes}\right)\\ \beta_1 &= \log\left[h(t|X=1)\right]- \log\left[h(t|X=0)\right]\\ &= \dfrac{\log\left[h(t|X=1)\right]}{\log\left[h(t|X=0)\right]}\\ &= \log\left(\text{ hazard ratio, diabetes vs no diabetes}\right)\\ \end{aligned}\]

Checking if we can run exponential model?

We typically test this assumption with a graph

The cumulative hazard function for exponential is given by

\[\Lambda(t) = \int_0^th(u)du=\lambda t\] This implies that it is a straight line with slope \(\lambda\) and intercept 0.

We can then use the Nelson-Aalen estimator which gives a non parametric estimation of cumulative hazard.
If the model is exponential we can fit a straight line through this data.

Checking if we can run exponential model?

plot of chunk unnamed-chunk-2

Limitations of Exponential?

Assumption of constant hazard is not realistic
The hazard may be higher after an intervention then decrease after a certain time.
In healthy populations the hazards for most outcomes increase with age.
The aging process by nature has an increasing hazard of death.

Limitations of Exponential?

Recall our Colorectal Cancer data and smoking:

suppressMessages(library(foreign))
phscrc <- read.dta("https://drive.google.com/uc?export=download&id=0B8CsRLdwqzbzSno1bFF4SUpVQWs")
crc.haz <- survfit(Surv(cayrs, crc) ~ csmok, data= phscrc, type='fleming')

suppressMessages(library(survminer))
ggsurvplot(crc.haz, conf.int = TRUE,
           risk.table = TRUE, risk.table.col = "strata",
           fun = "cumhaz", legend.labs = c("Non-smoker", "<1 pack/day", ">1 pack/day"),
          break.time.by=1)

Limitations of Exponential?

plot of chunk unnamed-chunk-5

The Weibull Distribution

We use the Weibull many times as an alternative to the exponential

The weibull has

\(h(t) = \lambda\left(\gamma t^{\gamma-1}\right)\)
\(\Lambda(t) = \lambda t^{\gamma}\)
\(S(t) = e^{-\lambda t^\gamma}\)
Then
- If \(\gamma=1\) we have the exponential
- If \(\gamma>1\) the hazard increases over time.
- If \(\gamma<1\) the hazard decreases over time.

Regression

The model for weibull hazard is

\[h(t|X_1,\ldots,X_p ) = \lambda\gamma t ^{\gamma-1}\exp\left(\beta_0+\beta_1x_1+\cdots\beta_px_p\right)\]

Then

baseline hazard is \(\lambda\gamma t ^{\gamma-1}\) and depends on \(t\)
Covariates have a multiplicative effect on baseline hazard
- hazards for 2 covariate levels are proportional
Can be fit in R using survreg()

When can we do Weibull?

Weibull is not always applicable in some settings. For example

If hazard increases of time (\(\lambda>1\)) then hazard is 0 at time \(t=0\)
If hazard decreases of time (\(\lambda<1\)) then hazard is \(\infty\) at time \(t=0\)

Why do we use parametric models?

If the model for \(T\) is correct that we gain power with parametric models over KM and logrank.
If the model for \(T\) is incorrect, parametric methods will be biased and non-parametric models will not be
It is easier to control for confounding and to detect effect modification with parametric modeling.

Parametric Regression

Exponential

For a time to event \(T\) and covariates \(X_1, \ldots , X_p\) the exponential regression model is:

\[h(t|X_1,\ldots, X_p) = \exp\left(\beta_0 + \beta_1x_1 + \cdots + \beta_px_p\right)\]

The baseline hazard function is defined as the hazard function when all covariates are 0

Exponential

\[h(t|X_1=0,\ldots, X_p=0) = \exp\left(\beta_0 \right) = h_0(t) = h_0\]

Thus we can rewrite the model as: \[h(t|X_1,\ldots, X_p) = h_0\exp\left( \beta_1x_1 + \cdots + \beta_px_p\right)\]

This suggest that covariate effects are **multiplicative* on the constant baseline hazard, \(h_0\).

Weibull

For a time to event \(T\) and covariates \(X_1, \ldots , X_p\) the Weibull regression model is:

\[h(t|X_1,\ldots, X_p) = \gamma t^{\gamma-1}\exp\left(\beta_0 + \beta_1x_1 + \cdots + \beta_px_p\right)\] with baseline hazard \[h_0(t) = \exp(\beta_0)\gamma t^{\gamma-1} = \lambda\gamma t^{\gamma-1}\] We can rewrite this as \[h(t|X_1,\ldots, X_p) = h_0(t)\exp\left(\beta_1x_1 + \cdots + \beta_px_p\right)\]

Weibull

The general Proportional Hazards Model is \[h(t|X_1,\ldots, X_p) = h_0(t)\exp\left(\beta_1x_1 + \cdots + \beta_px_p\right)\]

\[\text{or}\]

\[\log\left[h(t|X_1,\ldots, X_p)\right] = \log\left[h_0(t)\right] + \beta_1x_1 + \cdots + \beta_px_p\]

where \(h_0(t)\) is the baseline hazard function and the "intercept" is \(\log\left[h_0(t)\right]\).

Semi-Parametric Regression

Weibull and Exponential are examples of parametric proportional hazards models, where \(h_0(t)\) is a specified function.
In 1972, Cox generalized these types of models so that we can make inferences on the \(\beta_1, \ldots,\beta_p\) without specifying \(h_0(t)\).
We call Cox a semi-parametric regression model
We fit this using something called Partial Likelihood Estimation
Once again we use an algorithm to maximize the partial likelihood.

Interpeting the Model

Let

\(X=0\) be the control group
\(X=1\) be the treatment group

Then

\[\begin{aligned} h(t|X=x) &= h_0(t)\exp(\beta x)\\ h(t|X=0) &= h_0(t)\\ &= \text{ baseline hazard for control group}\\ h(t|X=1) &= h_0(t)\exp(\beta)\\ &= \text{ hazard for treated group}\\ \exp(\beta) &= \dfrac{h(t|X=1)}{h(t|X=0)}\\ \end{aligned}\]

Interpeting the Model

This means that the hazard ratio is constant over time (Proportional Hazards)
\(\beta\) is the log hazard ratio or log-relative risk
According to the Cox model

\[\begin{aligned} \log\left[h]h(t|X=0)\right] &= \log\left[h_0(t)\right]\\ \log\left[h]h(t|X=1)\right] &= \log\left[h_0(t)\right] + \beta\\ \end{aligned}\]

This means the log of the hazard functions are parallel over time.
We make no assumptions about \(h_0(t)\).

Verifying Proportional Hazards Assumption

Recall \[S(t) = \exp\left(-\Lambda(t)\right)\] with a binary \(X\) we have that

\[\begin{aligned} \Lambda_1(t) &= \Lambda_0(t)\exp(\beta)\\ S_0(t) &= \exp(\Lambda_0(t))\\ -\log(S_0(t)) &= \Lambda_0(t)\\ \log(-\log(S_0(t))) &=\log(\Lambda_0(t))\\ \end{aligned}\]

Verifying Proportional Hazards Assumption

\[ \begin{aligned} S_1(t) = exp(-\Lambda_1(t)) &= \exp\left[\Lambda_0(t)\exp(\beta)\right]\\ -\log(S_1(t)) &= \Lambda_0(t)\exp(\beta)\\ \log(-\log(S_1(t))) &= \log(\Lambda_0(t)) + \beta\\ \end{aligned} \]

Verifying Proportional Hazards Assumption

Thus we can see that under the assumption of proportional hazards

\(\log(-\log(K-M))\) should be parallel over time.
We typically verify this graphically.
Recall the Leukemia study:

ggsurvplot(leukemia.surv,      
      legend.labs = c("Placebo", "Treatment"),break.time.by=5, 
            fun="cloglog")

Verifying Proportional Hazards Assumption

plot of chunk unnamed-chunk-7

Cox PH in R

cox.leukemia <- coxph(Surv(T.all, event.all) ~ group.all , data=leuk2 )
summary(cox.leukemia)

Cox PH in R

## Call:
## coxph(formula = Surv(T.all, event.all) ~ group.all, data = leuk2)
## 
##   n= 42, number of events= 30 
## 
##             coef exp(coef) se(coef)     z Pr(>|z|)    
## group.all -1.572     0.208    0.412 -3.81  0.00014 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## group.all     0.208       4.82    0.0925     0.466
## 
## Concordance= 0.69  (se = 0.041 )
## Rsquare= 0.322   (max possible= 0.988 )
## Likelihood ratio test= 16.4  on 1 df,   p=0.00005
## Wald test            = 14.5  on 1 df,   p=0.0001
## Score (logrank) test = 17.2  on 1 df,   p=0.00003

This would suggest that the hazard for those in a placebo group is 4.8 times that of those in the treated group.

Cox PH in R

For another example could look at Colorectal Cancer in the PHS:

crc.cox <- coxph(Surv(cayrs, crc) ~ csmok + age , data= phscrc)
summary(crc.cox)

Then we could say that for two people with the same smoking status a one year increase in age would lead to an 8.2% increase in the hazard of colorectal cancer with a 95% CI of 6.9% to 9.6%.

Cox PH in R

## Call:
## coxph(formula = Surv(cayrs, crc) ~ csmok + age, data = phscrc)
## 
##   n= 16018, number of events= 254 
##    (16 observations deleted due to missingness)
## 
##          coef exp(coef) se(coef)     z Pr(>|z|)    
## csmok 0.31715   1.37320  0.09767  3.25   0.0012 ** 
## age   0.07904   1.08224  0.00628 12.58   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##       exp(coef) exp(-coef) lower .95 upper .95
## csmok      1.37      0.728      1.13      1.66
## age        1.08      0.924      1.07      1.10
## 
## Concordance= 0.724  (se = 0.015 )
## Rsquare= 0.01   (max possible= 0.262 )
## Likelihood ratio test= 160  on 2 df,   p=<2e-16
## Wald test            = 163  on 2 df,   p=<2e-16
## Score (logrank) test = 177  on 2 df,   p=<2e-16

Weibull Regression

library(SurvRegCensCov)
crc.exp <- survreg(Surv(cayrs, crc) ~ csmok + age, data= phscrc, dist='weibull')
ConvertWeibull(crc.exp)

Weibull Regression

## $vars
##          Estimate         SE
## lambda 0.00000664 0.00000295
## gamma  1.30451991 0.07929445
## csmok  0.31804318 0.09766224
## age    0.07916154 0.00627915
## 
## $HR
##         HR   LB   UB
## csmok 1.37 1.13 1.66
## age   1.08 1.07 1.10
## 
## $ETR
##         ETR    LB    UB
## csmok 0.784 0.675 0.910
## age   0.941 0.930 0.952