---
title: "Model Evaluation and Comparison"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Model Evaluation and Comparison}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  warning = FALSE,
  message = FALSE,
  error = TRUE
)
```

```{r setup}
library(clinpubr)
library(dplyr)
library(survival)
library(ggplot2)
```

## Introduction

Evaluating clinical prediction models requires assessing discrimination, calibration, and clinical utility. This vignette demonstrates a comprehensive evaluation workflow:

1. **Model Comparison**: Compare multiple models using discrimination and calibration metrics
2. **Time-Dependent ROC**: Evaluate survival prediction models at specific time points
3. **C-Index**: Global discrimination measure for survival models
4. **Variable Importance**: Identify key predictive features

## Preparing Data

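We use the NCCTG lung cancer dataset from the survival package and fit three logistic regression models for death: a base model with age, sex, and physician-rated Karnofsky score (`ph.karno`), and two extensions that each add one predictor (`pat.karno` or `wt.loss`). The binary outcome `dead` ignores follow-up time and censoring; the survival sections later in this vignette use `time` and `event` instead.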

```{r prepare-data}
data(cancer, package = "survival")
cancer$dead <- cancer$status == 2
cancer$event <- ifelse(cancer$status == 2, 1, 0)

cancer_clean <- cancer %>%
  mutate(sex = factor(sex, labels = c("Male", "Female"))) %>%
  na.omit()

# Fit a base model and two extensions that each add one predictor
cancer_clean$pred_model1 <- predict(
  glm(dead ~ age + sex + ph.karno, data = cancer_clean, family = binomial),
  type = "response"
)
cancer_clean$pred_model2 <- predict(
  glm(dead ~ age + sex + ph.karno + pat.karno, data = cancer_clean, family = binomial),
  type = "response"
)
cancer_clean$pred_model3 <- predict(
  glm(dead ~ age + sex + ph.karno + wt.loss, data = cancer_clean, family = binomial),
  type = "response"
)

knitr::kable(head(cancer_clean[, c("dead", "pred_model1", "pred_model2", "pred_model3")]),
  caption = "Model Predictions vs Actual Outcomes (First 6 Rows)"
)
```

## Model Comparison

`classif_model_compare()` provides a comprehensive comparison using multiple metrics (AUC, Accuracy, Sensitivity, Specificity, PPV, NPV, PRAUC, Brier Score) and generates ROC, PR, calibration, and DCA plots:

```{r model-compare}
model_comparison <- classif_model_compare(
  data = cancer_clean,
  target_var = "dead",
  model_names = c("pred_model1", "pred_model2", "pred_model3"),
  save_output = FALSE
)

knitr::kable(model_comparison$metric_table, caption = "Model Performance Comparison")
```
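
To make two of the tabulated metrics concrete, the sketch below computes AUC and the Brier score by hand in base R on a four-observation toy example. The helper names `manual_auc()` and `manual_brier()` and the toy vectors are illustrative assumptions, not part of clinpubr, and `classif_model_compare()` may compute these quantities differently internally.

```{r manual-metrics}
# Hypothetical helpers for illustration; not part of clinpubr.
# AUC via the rank-sum (Mann-Whitney) identity; tied predictions get average ranks.
manual_auc <- function(y, p) {
  r <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Brier score: mean squared difference between predicted probability and outcome
manual_brier <- function(y, p) mean((p - y)^2)

# Toy data (assumed for illustration only)
toy_y <- c(0, 0, 1, 1)
toy_p <- c(0.1, 0.4, 0.35, 0.8)

manual_auc(toy_y, toy_p)   # 0.75: 3 of the 4 case-control pairs are ranked correctly
manual_brier(toy_y, toy_p) # 0.158125
```

The same helpers could be applied to `cancer_clean$dead` and any of the `pred_model*` columns as a rough cross-check of the table.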

### Visualization

```{r model-plots-roc}
model_comparison$roc_plot
```

```{r model-plots-pr}
model_comparison$pr_plot
```

```{r model-plots-calibration}
model_comparison$calibration_plot
```
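
A calibration plot compares predicted risk with the observed event rate. The sketch below shows the idea in tabular form, assuming five equal-sized risk groups; the object names (`calib_d`, `calib_fit`, `calib_table`) and the choice of five groups are illustrative assumptions and need not match how the plot above is constructed.

```{r calibration-bins}
library(survival)

# Refit the base model on a self-contained copy of the data
calib_d <- na.omit(survival::lung[, c("status", "age", "sex", "ph.karno")])
calib_d$dead <- as.integer(calib_d$status == 2)
calib_fit <- glm(dead ~ age + sex + ph.karno, data = calib_d, family = binomial)
calib_d$pred <- predict(calib_fit, type = "response")

# Assign each subject to one of five equal-sized groups by predicted risk
calib_d$bin <- ceiling(5 * rank(calib_d$pred, ties.method = "first") / nrow(calib_d))

# Mean predicted risk vs observed death rate within each group
calib_table <- aggregate(cbind(pred, dead) ~ bin, data = calib_d, FUN = mean)
names(calib_table) <- c("risk_group", "mean_predicted", "observed_rate")
calib_table
```

In a well-calibrated model the two columns track each other closely. Because the model is evaluated on the data it was fitted to, this check is optimistic; agreement in independent data is the more informative test.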

```{r model-plots-dca}
model_comparison$dca_plot
```
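
Decision curve analysis plots net benefit across threshold probabilities. As a minimal sketch of the quantity on the vertical axis, the chunk below computes net benefit at a single threshold as true positives per subject minus false positives per subject weighted by the odds of the threshold. The function `net_benefit()`, the toy data, and the `>=` classification rule are assumptions made for illustration; they are not taken from clinpubr.

```{r net-benefit-sketch}
# Hypothetical helper for illustration; not part of clinpubr.
# Net benefit = TP/n - FP/n * threshold / (1 - threshold)
net_benefit <- function(y, p, threshold) {
  n <- length(y)
  treat <- p >= threshold          # subjects classified as high risk
  tp <- sum(treat & y == 1)
  fp <- sum(treat & y == 0)
  tp / n - fp / n * threshold / (1 - threshold)
}

# Toy data (assumed for illustration only)
nb_y <- c(1, 1, 1, 0, 0, 0, 0, 0)
nb_p <- c(0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05)

nb_model <- net_benefit(nb_y, nb_p, threshold = 0.5)            # 0.125
nb_treat_all <- net_benefit(nb_y, rep(1, 8), threshold = 0.5)   # -0.25
c(model = nb_model, treat_all = nb_treat_all)
```

A model is useful at a given threshold only if its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.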

## Time-Dependent ROC Analysis

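For survival outcomes, whether a subject counts as a case depends on the time horizon, so time-dependent ROC evaluates discrimination at chosen time points. First, we fit a Cox proportional hazards model and extract risk scores (`type = "risk"` returns the exponentiated linear predictor, so higher values mean higher hazard):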

```{r fit-survival-model}
# Fit Cox model for survival prediction
cox_model <- coxph(Surv(time, event) ~ age + sex + ph.karno, data = cancer_clean)
cancer_clean$risk_score <- predict(cox_model, type = "risk")
```

Now we can evaluate the model using time-dependent ROC at specific time points:

```{r time-roc}
time_roc_result <- time_roc_plot(
  data = cancer_clean,
  event_var = "event",
  time_var = "time",
  marker_var = "risk_score",
  times = c(200, 365),
  time_unit = "days",
  save_plot = FALSE
)

time_roc_result
```
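
To illustrate what an AUC at a fixed time point means, the sketch below uses the cumulative/dynamic definition: cases are subjects who died by time `t`, and controls are subjects still under observation after `t`. It is deliberately naive. Subjects censored before `t` are simply dropped, which can bias the estimate, whereas dedicated estimators correct for censoring. The helper `naive_auc_at()` is hypothetical, and nothing here is meant to describe how `time_roc_plot()` works internally.

```{r naive-time-auc}
library(survival)

# Self-contained copy of the data and the Cox model
troc_d <- na.omit(survival::lung[, c("time", "status", "age", "sex", "ph.karno")])
troc_d$event <- as.integer(troc_d$status == 2)
troc_fit <- coxph(Surv(time, event) ~ age + sex + ph.karno, data = troc_d)
troc_d$risk <- predict(troc_fit, type = "risk")

# Hypothetical helper: naive cumulative/dynamic AUC at time t (no censoring correction)
naive_auc_at <- function(time, event, marker, t) {
  case <- time <= t & event == 1   # died by t
  control <- time > t              # still at risk after t
  keep <- case | control           # drops subjects censored before t
  r <- rank(marker[keep])
  n1 <- sum(case)
  n0 <- sum(control)
  (sum(r[case[keep]]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

naive_auc_365 <- naive_auc_at(troc_d$time, troc_d$event, troc_d$risk, t = 365)
naive_auc_365
```

The value will generally differ from a censoring-adjusted estimate, and the gap tends to grow as more subjects are censored before the chosen time point.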

## C-Index

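The C-index (concordance index) measures discrimination for survival models: the proportion of comparable subject pairs in which the subject with the higher predicted risk experiences the event first. A value of 0.5 indicates no discrimination and 1.0 perfect discrimination. As a rough rule of thumb, 0.7 to 0.8 is often considered acceptable and above 0.8 excellent, although what counts as adequate depends on the clinical context.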

```{r c-index}
c_index <- calc_cindex(cancer_clean, "time", "event", "risk_score")
c_index
```
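
As a cross-check that does not rely on clinpubr, the survival package's `concordance()` can compute the same kind of statistic. The sketch below refits the Cox model on a self-contained copy of the data and computes the C-index two ways; object names are illustrative. `reverse = TRUE` is needed in the formula interface because a higher risk score is expected to go with shorter survival. Whether the result matches `calc_cindex()` exactly depends on how that function handles ties and score direction, which is not assumed here.

```{r concordance-check}
library(survival)

cidx_d <- na.omit(survival::lung[, c("time", "status", "age", "sex", "ph.karno")])
cidx_d$event <- as.integer(cidx_d$status == 2)
cidx_fit <- coxph(Surv(time, event) ~ age + sex + ph.karno, data = cidx_d)
cidx_d$risk <- predict(cidx_fit, type = "risk")

# From the risk score: higher score should mean shorter survival, hence reverse = TRUE
cidx_from_score <- concordance(Surv(time, event) ~ risk,
  data = cidx_d, reverse = TRUE
)$concordance

# Directly from the fitted Cox model
cidx_from_fit <- concordance(cidx_fit)$concordance

c(from_score = cidx_from_score, from_fit = cidx_from_fit)
```

The two values agree because the risk score is a monotone transformation of the model's linear predictor, and concordance depends only on ranks.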

## Variable Importance

Variable importance scores can come from many model types, such as random forests or gradient boosting machines.
Here we use illustrative, hard-coded scores rather than output from a fitted model.

```{r importance-plot}
importance_scores <- c(
  Age = 0.25, Sex = 0.15, Karnofsky = 0.30,
  WeightLoss = 0.12, Calories = 0.08, PatKarnofsky = 0.10
)

importance_plot(x = importance_scores, x_lab = "Variable Importance")
```

Use `top_n` and `color` to show only the top-ranked variables with custom styling:

```{r custom-importance}
importance_plot(
  x = importance_scores,
  x_lab = "Relative Importance",
  top_n = 4,
  color = "steelblue"
)
```
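
The scores plotted above are placeholders. One model-agnostic way to obtain real scores is permutation importance: shuffle one predictor at a time and record how much a performance metric drops. The sketch below applies this to a logistic model using the drop in AUC, with 50 shuffles per variable. The helper `rank_auc()`, the object names, the metric, and the number of shuffles are all assumptions chosen for illustration, and the scores are computed on the training data, so they are optimistic.

```{r permutation-importance}
library(survival)

perm_d <- na.omit(survival::lung[, c("status", "age", "sex", "ph.karno", "wt.loss")])
perm_d$dead <- as.integer(perm_d$status == 2)
perm_fit <- glm(dead ~ age + sex + ph.karno + wt.loss, data = perm_d, family = binomial)

# Hypothetical helper: AUC via the rank-sum identity
rank_auc <- function(y, p) {
  r <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(1)
perm_baseline <- rank_auc(perm_d$dead, predict(perm_fit, type = "response"))
perm_vars <- c("age", "sex", "ph.karno", "wt.loss")

# Mean drop in AUC after shuffling each predictor in turn
perm_importance <- sapply(perm_vars, function(v) {
  drops <- replicate(50, {
    shuffled <- perm_d
    shuffled[[v]] <- sample(shuffled[[v]])
    perm_baseline - rank_auc(
      perm_d$dead,
      predict(perm_fit, newdata = shuffled, type = "response")
    )
  })
  mean(drops)
})

perm_importance
```

The result is a named numeric vector, the same shape as `importance_scores`, so it could be passed to `importance_plot()` in the same way.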

## Summary

### Key Functions

- **`classif_model_compare()`**: Comprehensive model comparison with ROC, PR, calibration, and DCA plots
- **`time_roc_plot()`**: Time-dependent ROC analysis for survival outcomes
- **`calc_cindex()`**: Calculate concordance index for survival models
- **`importance_plot()`**: Visualize variable importance

### Evaluation Checklist

1. **Discrimination**: AUC (binary) or C-index (survival), sensitivity/specificity at clinical thresholds
2. **Calibration**: Calibration plots, Brier score
3. **Clinical utility**: Decision curve analysis, net benefit at relevant thresholds
4. **External validation**: Performance in independent populations