Title: | Predictive Evaluation Metrics in Survival Analysis |
---|---|
Description: | An implementation of popular evaluation metrics that are commonly used in survival prediction including Concordance Index, Brier Score, Integrated Brier Score, Integrated Square Error, Integrated Absolute Error and Mean Absolute Error. For a detailed information, see (Ishwaran H, Kogalur UB, Blackstone EH and Lauer MS (2008) <doi:10.1214/08-AOAS169>) and (Moradian H, Larocque D and Bellavance F (2017) <doi:10.1007/s10985-016-9372-1>) for different evaluation metrics. |
Authors: | Hanpu Zhou [aut, cre], Xuewei Cheng [aut], Sizheng Wang [aut], Yi Zou [aut], Hong Wang [aut] |
Maintainer: | Hanpu Zhou <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.5.0 |
Built: | 2025-01-09 23:25:38 UTC |
Source: | https://github.com/cranhaven/cranhaven.r-universe.dev |
The Brier Score was proposed by Glenn W. Brier in 1950 which is a proper score function that measures the accuracy of probabilistic predictions, usually used to measure the accuracy of a model fit for survival data. Brier can calculate the value of Brier Score at any timepoint, regardless of whether it is the event time.
Brier(object, pre_sp, t_star = -1)
Brier(object, pre_sp, t_star = -1)
object |
object of class |
pre_sp |
If the input of the parameter |
t_star |
the timepoint at which the Brier Score you want to calculate.
If the input of the parameter |
The Brier Score is the mean square difference between the true classes and the predicted probabilities. So the Brier Score can be thought of as a cost function. Therefore, the lower the Brier Score is for a set of predictions, the better the predictions are calibrated. The Brier Score takes on a value between zero and one, since this is the square of the largest possible difference between a predicted probability and the actual outcome. As we all know, for the cencoring samples, we do not know the real time of death, so the residual cannot be directly calculated when making the prediction. So the Brier Score is widely used in survival analysis.
The Brier Score is a strictly proper score (Gneiting and Raftery, 2007), which means that it takes its minimal value only when the predicted probabilities match the empirical probabilities.
Judging from the sparse empirical evidence, predictions of duration of survival tend to be rather inaccurate. More precision is achieved by using patient-specific survival probabilities and the Brier score as predictions to discriminate future survivors from failures.
the Brier Score at time t_star between the true classes and the predicted probabilities.
Hanpu Zhou [email protected]
Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78.
Gneiting, T. , & Raftery, A. E. . (2007). Strictly Proper Scoring Rules, Prediction, and Estimation.
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_sp <- runif(50) t_star <- runif(1) Brier(Surv(time, status), pre_sp, t_star)
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_sp <- runif(50) t_star <- runif(1) Brier(Surv(time, status), pre_sp, t_star)
Concordance index is a rank correlation measures between a variable X and a possibly censored variable Y, with event/censoring indicator. In survival analysis, a pair of patients is called concordant if the risk of the event predicted by a model is lower for the patient who experiences the event at a later timepoint. The concordance probability (C-index) is the frequency of concordant pairs among all pairs of subjects. It can be used to measure and compare the discriminative power of a risk prediction models.
Cindex(object, predicted, t_star = -1)
Cindex(object, predicted, t_star = -1)
object |
object of class |
predicted |
If the input of the parameter |
t_star |
the timepoint at which the C-index you want to calculate.
If the input of the parameter |
Pairs with identical observed times, where one is uncensored and one is
censored, are always considered usuable (independent of the value of
tiedOutcomeIn
), as it can be assumed that the event occurs at a later
timepoint for the censored observation.
For uncensored response the result equals the one obtained with the
functions rcorr.cens
and rcorrcens
from the Hmisc
package (see examples).
Estimates of the C-index between the survival time and the predicted result.
Hanpu Zhou [email protected]
Ishwaran, H. , Kogalur, U. B. , Blackstone, E. H. , & Lauer, M. S. . (2008). Random survival forests. Journal of Thoracic Oncology Official Publication of the International Association for the Study of Lung Cancer, 2(12), 841-860.
Kang, L. , Chen, W. , Petrick, N. A. , & Gallas, B. D. . (2015). Comparing two correlated c indices with right-censored survival outcome: a one-shot nonparametric approach. Statistics in Medicine, 34(4).
TA Gerds, MW Kattan, M Schumacher, and C Yu. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Statistics in Medicine, Ahead of print:to appear, 2013. DOI = 10.1002/sim.5681
Wolbers, M and Koller, MT and Witteman, JCM and Gerds, TA (2013) Concordance for prognostic models with competing risks Research report 13/3. Department of Biostatistics, University of Copenhagen
Andersen, PK (2012) A note on the decomposition of number of life years lost according to causes of death Research report 12/2. Department of Biostatistics, University of Copenhagen
Paul Blanche, Michael W Kattan, and Thomas A Gerds. The c-index is not proper for the evaluation of-year predicted risks. Biostatistics, 20(2): 347–357, 2018.
library(survival) time <- c(1, 1, 2, 2, 2, 2, 2, 2) status <- c(0, 1, 1, 0, 1, 1, 0, 1) predicted <- c(2, 3, 3, 3, 4, 2, 4, 3) Cindex(Surv(time, status), predicted)
library(survival) time <- c(1, 1, 2, 2, 2, 2, 2, 2) status <- c(0, 1, 1, 0, 1, 1, 0, 1) predicted <- c(2, 3, 3, 3, 4, 2, 4, 3) Cindex(Surv(time, status), predicted)
The C-index (Concordance index) of the prognostic model in the presence of competing risks according to Marcel, W et al.(2014).
CindexCR(time, status, predicted, Cause_int = 1)
CindexCR(time, status, predicted, Cause_int = 1)
time |
minimum value of deletion time and survival time. |
status |
the status indicator, for models with competing risks, the status indicator is 0=censored, 1=event at |
predicted |
a vector of predicted values or the survival time of survival probabilities of each observation. |
Cause_int |
event type of interest, the default value is 1. |
Estimates of the C-index in the presence of competing risks.
HanPu Zhou [email protected]
Marcel, W. , Paul, B. , Koller, M. T. , Witteman, J. , & Gerds, T. A. . (2014).Concordance for prognostic models with competing risks. Biostatistics(3), 526.
Ishwaran, H. , Kogalur, U. B. , Blackstone, E. H. , & Lauer, M. S. . (2008). Random survival forests. Journal of Thoracic Oncology Official Publication of the International Association for the Study of Lung Cancer, 2(12), 841-860.
time <- c(4, 7, 5, 8) status <- rep(1, 4) predicted <- c(3, 5, 7, 10) Cause_int <- 1 CindexCR(time, status, predicted, Cause_int)
time <- c(4, 7, 5, 8) status <- rep(1, 4) predicted <- c(3, 5, 7, 10) Cause_int <- 1 CindexCR(time, status, predicted, Cause_int)
G(t)=P(C>t) denote the Kaplan-Meier estimate of the censoring distribution which is used to adjust for censoring. Gt is used to calculate G(t) at any timepoint you want.
Gt(object, timepoint)
Gt(object, timepoint)
object |
object of class |
timepoint |
any point in time you want to get the Kaplan–Meier estimate of the censoring. |
The Kaplan–Meier estimate of the censoring distribution and the value of G(t) is between 0 and 1.
Hanpu Zhou [email protected]
Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.
Kaplan, E. L. , & Meier, P. . (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457-481.
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_sp <- runif(50) timepoint <- runif(1) Gt(Surv(time, status), timepoint)
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_sp <- runif(50) timepoint <- runif(1) Gt(Surv(time, status), timepoint)
Two ways of the continuous-time approach to continuous-time identification based on least-squares and least-absolute errors are proposed. Integrate Absolute Error and Integrate Square Error.To evaluate the performance of survival models methods Lower values of IAE or ISE indicate better performances.
IAEISE(object, sp_matrix, IRange = c(-2, -1))
IAEISE(object, sp_matrix, IRange = c(-2, -1))
object |
object of class |
sp_matrix |
a matrix or data.frame of predicted values of survival probabilities for the testing set. rows denote different samples, columns denote different time points, and the values in row i and column j of the matrix denote the predicted survival probability of the ith sample at the time point corresponding to the jth column. |
IRange |
a vector contains all discrete time points corresponding to the predicted probability in sp_matrix. Or the scale you want to get the IAE and ISE; . |
Estimates of the Integrate Absolute Error and the Integrate Square Error of the predicted values of survival probabilities.
Hanpu Zhou [email protected]
Marron, J. S. , & Wand, M. P. . (1992). Exact mean integrated squared error. Annals of Statistics, 20(2), 712-736.
HooraMoradian, DenisLarocque, & FranoisBellavance. (2017). L1 splitting rules in survival forests. Lifetime Data Analysis, 23(4), 671–691.
Kowalczuk, & Z. (1998). Integrated squared error and integrated absolute error in recursive identification of continuous-time plants. Control 98 Ukacc International Conference on (Vol.1998, pp.693-698). IET.
library(survival) library(SurvMetrics) set.seed(123) N <- 100 mydata <- SDGM4(N, p = 20, c_step = -0.5) index.train <- sample(1:N, 2 / 3 * N) data.train <- mydata[index.train, ] data.test <- mydata[-index.train, ] time_interest <- sort(data.train$time[data.train$status == 1]) sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)), decreasing = TRUE ), nrow = nrow(data.test)) object <- Surv(data.test$time, data.test$status) # a vector for all the distinct time IAEISE(object, sp_matrix, time_interest) # a range IAEISE(object, sp_matrix, c(12, 350))
library(survival) library(SurvMetrics) set.seed(123) N <- 100 mydata <- SDGM4(N, p = 20, c_step = -0.5) index.train <- sample(1:N, 2 / 3 * N) data.train <- mydata[index.train, ] data.test <- mydata[-index.train, ] time_interest <- sort(data.train$time[data.train$status == 1]) sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)), decreasing = TRUE ), nrow = nrow(data.test)) object <- Surv(data.test$time, data.test$status) # a vector for all the distinct time IAEISE(object, sp_matrix, time_interest) # a range IAEISE(object, sp_matrix, c(12, 350))
IBS is an integrated version of the Brier which is used to calculate the integration of the Brier Score. The Brier Score is the mean square difference between the true classes and the predicted probabilities. Basically, the IBS is an integrated weighted squared distance between the estimated survival function and the empirical survival function. The inverse probability censoring weighting(IPCW) is used to adjust for censoring.
IBS(object, sp_matrix, IBSrange = c(-2, -1))
IBS(object, sp_matrix, IBSrange = c(-2, -1))
object |
object of class |
sp_matrix |
a matrix or data.frame of predicted values of survival probabilities for the testing set. Rows denote different samples, columns denote different time points, and the values in entry (i,j) of the matrix denote the predicted survival probability of the ith sample at the time point corresponding to the jth column. |
IBSrange |
a vector contains all discrete time points corresponding to the predicted probability in sp_matrix. Or the scale you want to get the IBS; and if it is a single point the return value will be the Brier Score at the timepoint. |
The percentage of censored observations increases in time, and this will surely affect the dispersion of the empirical Brier Score. The question of how censoring in finite samples acts on the distribution of our measures of inaccuracy is an interesting subject. Our recommendation is to choose t* in a way that censoring is not too heavy (for example, the median follow-up time). We also prefer measures with integrated loss functions since they will reflect inaccuracy over an interval rather than just at one point in time. In addition, the corresponding empirical measures are likely to have lower dispersion, because censored observations contribute their estimated event-free probabilities to the integrand until the censoring occurs.
The integration of the Brier score of the predicted values of survival probabilities on the discrete time points or the time scale of interest to users.
Hanpu Zhou [email protected]
HooraMoradian, DenisLarocque, & FranoisBellavance. (2017). \(l_1\) splitting rules in survival forests. Lifetime Data Analysis, 23(4), 671–691.
Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.
Brier, G. W. . (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78.
Gneiting, T. , & Raftery, A. E. . (2007). Strictly Proper Scoring Rules, Prediction, and Estimation.
library(survival) library(SurvMetrics) set.seed(123) N <- 100 mydata <- SDGM4(N, p = 20, c_step = -0.5) index.train <- sample(1:N, 2 / 3 * N) data.train <- mydata[index.train, ] data.test <- mydata[-index.train, ] time_interest <- sort(data.train$time[data.train$status == 1]) sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)), decreasing = TRUE ), nrow = nrow(data.test)) object <- Surv(data.test$time, data.test$status) # the default time points IBS(object, sp_matrix, time_interest) # a time range IBS(object, sp_matrix, c(18:100))
library(survival) library(SurvMetrics) set.seed(123) N <- 100 mydata <- SDGM4(N, p = 20, c_step = -0.5) index.train <- sample(1:N, 2 / 3 * N) data.train <- mydata[index.train, ] data.test <- mydata[-index.train, ] time_interest <- sort(data.train$time[data.train$status == 1]) sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)), decreasing = TRUE ), nrow = nrow(data.test)) object <- Surv(data.test$time, data.test$status) # the default time points IBS(object, sp_matrix, time_interest) # a time range IBS(object, sp_matrix, c(18:100))
A somewhat naive criterion that is sometimes used consists of simply omitting all censored cases from the data set. For survival analysis problems, the mean absolute error (MAE) can be defined as an average of the differences between the predicted time values and the actual observation time values. Only the samples for which the event occurs are being considered in this metric.
MAE(object, pre_time)
MAE(object, pre_time)
object |
object of class |
pre_time |
a vector of predicted values of survival time of each observation. |
Condition: MAE can only be used for the evaluation of survival models which can provide the event time as the predicted target value.
the value of the Mean Absolute Error between the survival time and the predicted result.
Hanpu Zhou [email protected]
Matsuo, K. , Purushotham, S. , Jiang, B. , Mandelbaum, R. S. , Takiuchi, T. , & Liu, Y. , et al. (2018). Survival outcome prediction in cervical cancer: cox models vs deep-learning model. American Journal of Obstetrics & Gynecology. Coyle, E. J. , & Lin, J. H. . (1988). Stack filters and the mean absolute error criterion. IEEE Trans Acoustics Speech Signal Processing, 36(8), 1244-1254.
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_time <- rexp(50) MAE(Surv(time, status), pre_time)
library(survival) time <- rexp(50) status <- sample(c(0, 1), 50, replace = TRUE) pre_time <- rexp(50) MAE(Surv(time, status), pre_time)
Function to extract survival probability predictions from survreg
modeling approach.
predictSurvProb2survreg(object, newdata, time_days)
predictSurvProb2survreg(object, newdata, time_days)
object |
A model fitted by |
newdata |
A data frame containing predictor variable combinations for which to compute predicted survival probabilities. |
time_days |
A vector of times in the range of the response variable, We.g. times when the response is a survival object, at which to return the survival probabilities. |
A matrix with as many rows as NROW(newdata) and as many columns as length(time_days). Each entry should be a probability and in rows the values should be decreasing.
Hanpu Zhou [email protected]
library(survival) set.seed(1234) mydata <- kidney[, -1] train_index <- sample(1:nrow(mydata), 0.7 * nrow(mydata)) train_data <- mydata[train_index, ] test_data <- mydata[-train_index, ] survregfit <- survreg(Surv(time, status) ~ ., dist = 'weibull', data = train_data) pre_sb <- predictSurvProb2survreg(survregfit, test_data, c(10, 20))
library(survival) set.seed(1234) mydata <- kidney[, -1] train_index <- sample(1:nrow(mydata), 0.7 * nrow(mydata)) train_data <- mydata[train_index, ] test_data <- mydata[-train_index, ] survregfit <- survreg(Surv(time, status) ~ ., dist = 'weibull', data = train_data) pre_sb <- predictSurvProb2survreg(survregfit, test_data, c(10, 20))
Survival data generation method. An example of the proportional hazards model where in the Cox model is expected to perform best.
SDGM1(N = 200, p = 15, c_mean = 0.4)
SDGM1(N = 200, p = 15, c_mean = 0.4)
N |
The sample size of the simulated dataset. |
p |
The covariate dimension of the simulated dataset. |
c_mean |
The parameter which is used to control the censoring rate. |
the simulated dataset
Hanpu Zhou [email protected]
Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.
Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.
Ishwaran, H. , Kogalur, U. B. , Gorodeski, E. Z. , Minn, A. J. , & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.
SDGM1(N = 200, p = 15, c_mean = 0.4)
SDGM1(N = 200, p = 15, c_mean = 0.4)
Survival data generation method. The dataset represents mild violations of the proportional hazards assumption.
SDGM2(N = 200, p = 15, u_max = 4)
SDGM2(N = 200, p = 15, u_max = 4)
N |
The sample size of the simulated dataset. |
p |
The covariate dimension of the simulated dataset. |
u_max |
The parameter which is used to control the censoring rate. |
the simulated dataset
Hanpu Zhou [email protected]
Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.
Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.
Ishwaran, H. , Kogalur, U. B. , Gorodeski, E. Z. , Minn, A. J. , & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.
SDGM2(N = 200, p = 15, u_max = 4)
SDGM2(N = 200, p = 15, u_max = 4)
Survival data generation method. The proportional hazards assumption is strongly violated in this dataset.
SDGM3(N = 200, p = 15, u_max = 7)
SDGM3(N = 200, p = 15, u_max = 7)
N |
The sample size of the simulated dataset. |
p |
The covariate dimension of the simulated dataset. |
u_max |
The parameter which is used to control the censoring rate. |
the simulated dataset
Hanpu Zhou [email protected]
Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.
Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.
Ishwaran, H. , Kogalur, U. B. , Gorodeski, E. Z. , Minn, A. J. , & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.
SDGM3(N = 200, p = 15, u_max = 7)
SDGM3(N = 200, p = 15, u_max = 7)
Survival data generation method. An example of the proportional hazards model where in the Cox model is expected to perform best.
SDGM4(N = 200, p = 15, c_step = 0.4)
SDGM4(N = 200, p = 15, c_step = 0.4)
N |
The sample size of the simulated dataset. |
p |
The covariate dimension of the simulated dataset. |
c_step |
The parameter which is used to control the censoring rate. |
the simulated dataset
Hanpu Zhou [email protected]
Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.
Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.
Ishwaran, H., Kogalur, U. B., Gorodeski, E.Z., Minn, A.J., & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.
SDGM4(N = 200, p = 15, c_step = 0.4)
SDGM4(N = 200, p = 15, c_step = 0.4)