Module auton_survival.metrics
Tools to compute metrics used to assess survival outcomes and survival model performance.
Functions
def treatment_effect(metric, outcomes, treatment_indicator, weights=None, horizons=None, risk_levels=None, interpolate=True, weights_clip=0.01, n_bootstrap=None, size_bootstrap=1.0, random_seed=0)-
Compute metrics for comparing population level survival outcomes across treatment arms.
Parameters
metric:str- The metric to evaluate for comparing survival outcomes. Options include:
- median
- tar
- hazard_ratio
- restricted_mean
- survival_at
outcomes:pd.DataFrame- A pandas dataframe with rows corresponding to individual samples and columns 'time' and 'event'.
treatment_indicator:np.array- Boolean numpy array of treatment indicators. True means individual was assigned a specific treatment.
weights:pd.Series, default=None- Treatment assignment propensity scores, \widehat{\mathbb{P}}(A|X=x). If None, all weights are set to 0.5.
horizons:float or int or array of floats or ints, default=None- Time horizon(s) at which to compute the metric. Must be specified for metrics 'restricted_mean' and 'survival_at'. For 'hazard_ratio' this is ignored.
risk_levels:float or array of floats, default=None- The risk level(s) (0-1) at which to compare times between treatment arms. Must be specified for metric 'tar'. Ignored for other metrics.
interpolate:bool, default=True- Whether to interpolate the survival curves.
weights_clip:float, default=0.01- Weights below this value are clipped to ensure that IPTW estimation is numerically stable, since large inverse weights can result in an estimator with high variance.
n_bootstrap:int, default=None- The number of bootstrap samples to use. If None, bootstrapping is not performed.
size_bootstrap:float, default=1.0- The fraction of the population to sample for each bootstrap sample.
random_seed:int, default=0- Controls the reproducibility of random sampling for bootstrapping.
Returns
float or list: The metric value(s) for the specified metric.
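As an illustration of what a metric such as 'survival_at' compares, the following is a minimal pure-Python sketch, not the library's implementation. It estimates a Kaplan-Meier curve per treatment arm and takes the difference in survival at one horizon. The helper names (`kaplan_meier`, `survival_at`, `split_arm`, `survival_at_difference`) are hypothetical, and the sketch ignores propensity weights, interpolation, and bootstrapping.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (Kaplan-Meier estimate)."""
    n_at_risk = len(times)
    survival = 1.0
    curve = []
    # Walk through the distinct observed times in increasing order.
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def survival_at(curve, horizon):
    """Step-function lookup of S(horizon); no interpolation (assumption)."""
    value = 1.0
    for t, s in curve:
        if t > horizon:
            break
        value = s
    return value


def split_arm(times, events, treated, flag):
    """Select the (times, events) of samples whose treatment indicator equals flag."""
    pairs = [(t, e) for t, e, a in zip(times, events, treated) if a == flag]
    return [t for t, _ in pairs], [e for _, e in pairs]


def survival_at_difference(times, events, treated, horizon):
    """Unweighted S_treated(horizon) - S_control(horizon)."""
    s_treated = survival_at(kaplan_meier(*split_arm(times, events, treated, True)), horizon)
    s_control = survival_at(kaplan_meier(*split_arm(times, events, treated, False)), horizon)
    return s_treated - s_control


# Toy data: 'time' and 'event' columns plus a boolean treatment indicator.
times = [2, 4, 6, 8, 1, 3, 5, 7]
events = [1, 0, 1, 1, 1, 1, 0, 1]
treated = [True, True, True, True, False, False, False, False]

effect = survival_at_difference(times, events, treated, horizon=5)
print(round(effect, 6))  # roughly 0.25: treated arm S(5)=0.75, control arm S(5)=0.5
```

A positive value here means the treated arm has higher estimated survival at the horizon. An IPTW variant would weight each sample's contribution to the at-risk and event counts by its inverse propensity score.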
def survival_regression_metric(metric, outcomes, predictions, times, outcomes_train=None)-
Compute metrics to assess survival model performance.
Parameters
metric:string- Measure used to assess the survival regression model performance. Options include:
- brs: Brier score
- ibs: integrated Brier score
- auc: cumulative dynamic area under the curve
- ctd: concordance index based on inverse probability of censoring weights (IPCW)
outcomes:pd.DataFrame- A pandas dataframe with rows corresponding to individual samples and columns 'time' and 'event' for evaluation data.
predictions:np.array- A numpy array of survival time predictions for the samples.
times:np.array- The time points at which to compute metric value(s).
outcomes_train:pd.DataFrame, default=None- A pandas dataframe with rows corresponding to individual samples and columns 'time' and 'event' for training data.
Returns
float: The metric value for the specified metric.
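To make the 'brs' and 'ibs' options more concrete, here is a simplified pure-Python sketch of a Brier score at one time point and its time-averaged (integrated) form. It is not the library's code: it assumes every sample is uncensored, so the inverse probability of censoring weights are omitted, and it assumes the predictions are survival probabilities at each evaluation time. The function names are hypothetical.

```python
def brier_score(event_times, predicted_survival, horizon):
    """Mean squared error between 1[T > horizon] and the predicted S(horizon | x).

    Simplifying assumption: no sample is censored, so no IPCW weights are applied.
    """
    errors = [
        ((1.0 if t > horizon else 0.0) - s) ** 2
        for t, s in zip(event_times, predicted_survival)
    ]
    return sum(errors) / len(errors)


def integrated_brier_score(event_times, survival_by_horizon, horizons):
    """Trapezoidal integral of the Brier score over horizons, divided by the time span."""
    scores = [
        brier_score(event_times, preds, h)
        for preds, h in zip(survival_by_horizon, horizons)
    ]
    area = sum(
        (h1 - h0) * (s0 + s1) / 2.0
        for h0, h1, s0, s1 in zip(horizons, horizons[1:], scores, scores[1:])
    )
    return area / (horizons[-1] - horizons[0])


# Toy data: four uncensored samples and predicted survival probabilities
# at two evaluation times (one inner list per time point).
event_times = [2.0, 5.0, 9.0, 12.0]
horizons = [3.0, 6.0]
survival_by_horizon = [
    [0.40, 0.70, 0.90, 0.95],  # predicted S(3.0 | x) per sample
    [0.20, 0.40, 0.70, 0.90],  # predicted S(6.0 | x) per sample
]

score = brier_score(event_times, survival_by_horizon[1], horizon=6.0)
integrated = integrated_brier_score(event_times, survival_by_horizon, horizons)
print(round(score, 6), round(integrated, 6))
```

Lower values indicate better calibrated predictions. With censored data, each squared error would additionally be reweighted by the estimated censoring distribution, which is typically fitted on the training outcomes.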
def phenotype_purity(phenotypes_train, outcomes_train, phenotypes_test=None, outcomes_test=None, strategy='instantaneous', horizons=None, bootstrap=None)-
Compute the Brier score to assess survival model performance for phenotypes.
Parameters
phenotypes_train:np.array- A numpy array containing an array of integers that define subgroups for the train set.
outcomes_train:pd.DataFrame- A pandas dataframe with rows corresponding to individual samples and columns 'time' and 'event' for the train set.
phenotypes_test:np.array- A numpy array containing an array of integers that define subgroups for the test set.
outcomes_test:pd.DataFrame- A pandas dataframe with rows corresponding to individual samples and columns 'time' and 'event' for the test set.
strategy:string, default='instantaneous'- Options include:
- instantaneous: Compute the Brier score.
- integrated: Compute the integrated Brier score.
horizons:float or int or an array of floats or ints, default=None- Event horizon(s) at which to compute the metric.
bootstrap:integer, default=None- The number of bootstrap iterations.
Returns
list: Columns are metric values computed for each event horizon. If bootstrapping, rows are bootstrap results.
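The row and column layout of the return value can be sketched as follows. This is an illustrative pure-Python approximation under stated assumptions, not the library's implementation: it assumes no censoring, estimates each phenotype's survival at a horizon as the fraction of training samples in that phenotype still event-free, scores the test set with a Brier score, and resamples the test set with replacement for each bootstrap row. All function names are hypothetical, and every test phenotype is assumed to appear in the training set.

```python
import random


def phenotype_survival(phenotypes, event_times, horizon):
    """Per-phenotype fraction of training samples surviving past the horizon."""
    groups = {}
    for p, t in zip(phenotypes, event_times):
        groups.setdefault(p, []).append(1.0 if t > horizon else 0.0)
    return {p: sum(v) / len(v) for p, v in groups.items()}


def purity(train_ph, train_t, test_ph, test_t, horizons):
    """One row: the Brier score of phenotype-level survival at each horizon."""
    row = []
    for h in horizons:
        surv = phenotype_survival(train_ph, train_t, h)
        errors = [
            ((1.0 if t > h else 0.0) - surv[p]) ** 2
            for p, t in zip(test_ph, test_t)
        ]
        row.append(sum(errors) / len(errors))
    return row


def bootstrap_purity(train_ph, train_t, test_ph, test_t, horizons, n_bootstrap, seed=0):
    """Rows are bootstrap resamples of the test set; columns are horizons."""
    rng = random.Random(seed)
    n = len(test_ph)
    rows = []
    for _ in range(n_bootstrap):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        rows.append(
            purity(
                train_ph, train_t,
                [test_ph[i] for i in idx], [test_t[i] for i in idx],
                horizons,
            )
        )
    return rows


# Toy data: two phenotypes (0 and 1), uncensored event times.
train_ph, train_t = [0, 0, 0, 1, 1, 1], [1, 2, 8, 6, 9, 10]
test_ph, test_t = [0, 0, 1, 1], [2, 3, 7, 9]
horizons = [5, 8.5]

single = purity(train_ph, train_t, test_ph, test_t, horizons)
rows = bootstrap_purity(train_ph, train_t, test_ph, test_t, horizons, n_bootstrap=3)
print(len(rows), len(rows[0]))  # 3 bootstrap rows, 2 horizon columns
```

A lower score suggests that phenotype membership alone explains more of the variation in survival at that horizon.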