Analysing EQ-5D data

Introduction

This vignette demonstrates how eq5dsuite supports the full EQ-5D analytical workflow recommended by Devlin et al. (2020), using the NHS Patient Reported Outcome Measures (PROMs) dataset bundled with the package. The dataset includes patients undergoing hip replacement, knee replacement, groin hernia repair, and varicose vein surgery, who completed the EQ-5D-3L before and after their procedure.

# Load eq5dsuite; the example dataset is bundled with the package
library(eq5dsuite)
data(example_data)
head(example_data)
#>    id    time mo sc ua pd ad vas providercode       procedure    year ageband
#> 1  86  Pre-op  2  2  3  3  1  85        NT213 Hip Replacement 2013/14    <NA>
#> 2  86 Post-op  1  1  1  1  1  88        NT213 Hip Replacement 2013/14    <NA>
#> 3 121  Pre-op  2  2  2  2  2  85        NT218 Hip Replacement 2013/14    <NA>
#> 4 121 Post-op  1  1  2  2  2  70        NT218 Hip Replacement 2013/14    <NA>
#> 5 123  Pre-op  2  2  2  2  1  35        NT218 Hip Replacement 2013/14    <NA>
#> 6 123 Post-op  2  2  2  2  2  65        NT218 Hip Replacement 2013/14    <NA>
#>   gender
#> 1   <NA>
#> 2   <NA>
#> 3   <NA>
#> 4   <NA>
#> 5   <NA>
#> 6   <NA>

Worked example 1: Hip replacement outcomes

This example illustrates a typical single-group longitudinal workflow, addressing three practical questions: which problems do patients report before surgery, how does health change following the procedure, and how large is the overall health gain?

Data preparation

dim_names <- c("mo", "sc", "ua", "pd", "ad")

# Subset to hip replacement patients
hip_data <- example_data[example_data$procedure == "Hip Replacement", ]

# Add profile code and EQ-5D value
hip_data$profile_code <- toEQ5Dindex(
  x         = hip_data,
  dim.names = dim_names
)
hip_data$value <- eq5d3l(
  hip_data[, dim_names],
  country   = "UK",
  dim.names = dim_names
)

# Pre-operative subset
hip_preop <- hip_data[hip_data$time == "Pre-op", ]
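The profile code stored above is the five-digit health state label (for example 21221) seen in the later output. As a rough base R sketch of how such a label relates to the five dimension levels, the snippet below weights each level by a power of ten. It assumes the code is the plain concatenation of the levels in the order mo, sc, ua, pd, ad; it is an illustration and does not claim to reproduce the internals of toEQ5Dindex. The three rows are patients 86 (pre-op), 86 (post-op) and 121 (pre-op) from the head() output.

```r
# Levels for three rows copied from head(example_data)
levels_df <- data.frame(
  mo = c(2, 1, 2),
  sc = c(2, 1, 2),
  ua = c(3, 1, 2),
  pd = c(3, 1, 2),
  ad = c(1, 1, 2)
)

# Assumption: the profile code is the digit-wise concatenation of the levels
place_values  <- 10^(4:0)   # 10000, 1000, 100, 10, 1
profile_codes <- as.vector(as.matrix(levels_df) %*% place_values)

profile_codes
```

Under that assumption the three rows map to 22331, 11111 and 22222.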

Pre-operative profile distribution

eq5d_profile_level_summary(
  df           = hip_preop,
  names_eq5d   = dim_names,
  eq5d_version = "3L"
)
#> No ordering of time suppled. The time variable will be factorised according to the order in the data frame.
#> Warning in .prep_eq5d(df = df, names = names_eq5d, eq5d_version =
#> eq5d_version): 346 observations were coerced to NAs as they were not
#> interpretable as integer values in the range allowed by the EQ-5D descriptive
#> system.
#>                                        level n_All_mo freq_All_mo n_All_sc
#> 1                                          1      158 0.086956522      870
#> 2                                          2     1646 0.905888828      915
#> 3                                          3       13 0.007154651       23
#> 4                                      Total     1817 1.000000000     1808
#> 5 Number reporting any problems (levels 2+3)     1659 0.913043478      938
#> 6                               Missing data       61 0.032481363       70
#>   freq_All_sc n_All_ua freq_All_ua n_All_pd freq_All_pd n_All_ad freq_All_ad
#> 1  0.48119469      128  0.07067918       17  0.00942873     1107  0.61329640
#> 2  0.50608407     1360  0.75096632     1044  0.57903494      610  0.33795014
#> 3  0.01272124      323  0.17835450      742  0.41153633       88  0.04875346
#> 4  1.00000000     1811  1.00000000     1803  1.00000000     1805  1.00000000
#> 5  0.51880531     1683  0.92932082     1786  0.99057127      698  0.38670360
#> 6  0.03727370       67  0.03567625       75  0.03993610       73  0.03887114

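Nearly all patients report problems with pain/discomfort (99%), usual activities (93%), and mobility (91%), whereas about half report problems with self-care (52%) and fewer with anxiety/depression (39%). This is consistent with the profile expected in patients awaiting hip replacement.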
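To make the denominators in the table explicit, the following sketch recomputes two of the mobility figures from the printed counts. It suggests that the level frequencies are shares of valid responses, while the missing-data share is taken over all pre-operative rows. The variable names are illustrative only.

```r
# Mobility counts copied from the n_All_mo column of the output above
mo_counts <- c("1" = 158, "2" = 1646, "3" = 13)
n_missing <- 61

n_valid      <- sum(mo_counts)                      # the "Total" row
any_problems <- sum(mo_counts[c("2", "3")])         # levels 2 and 3 combined
prop_any     <- any_problems / n_valid              # share of valid responses
prop_missing <- n_missing / (n_valid + n_missing)   # share of all pre-op rows

round(c(any_problems = prop_any, missing = prop_missing), 3)
```

Both values agree with the freq_All_mo column (0.913 and 0.032), which would not be the case if the two rows shared a denominator.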

Most common health states

eq5d_profile_top_states(
  df           = hip_preop,
  names_eq5d   = dim_names,
  eq5d_version = "3L",
  n            = 5
)
#> Warning in .prep_eq5d(df = df, names = names_eq5d, eq5d_version = eq5d_version,
#> : 346 observations were coerced to NAs as they were not interpretable as
#> integer values in the range allowed by the EQ-5D descriptive system.
#>   Health state Frequency  Percentage Cumulative percentage
#> 1        21221       325 0.184031710             0.1840317
#> 2        22221       221 0.125141563             0.3091733
#> 3        22231       144 0.081540204             0.3907135
#> 4        22232       138 0.078142695             0.4688562
#> 5        22222       116 0.065685164             0.5345413
#> 6          ...        NA          NA                    NA
#> 7        33333         3 0.001698754             1.0000000
#> 8      Missing       112 0.059637913                    NA
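Although the columns are labelled as percentages, the printed values are proportions. The sketch below recomputes them from the frequencies, assuming 1766 complete profiles (the 1878 pre-operative rows minus the 112 rows reported as missing). Variable names are illustrative.

```r
# Frequencies of the five most common pre-operative health states (from the output)
freq <- c("21221" = 325, "22221" = 221, "22231" = 144,
          "22232" = 138, "22222" = 116)
n_missing <- 112
n_total   <- 1878
n_valid   <- n_total - n_missing   # complete profiles

share        <- freq / n_valid       # "Percentage" column
cum_share    <- cumsum(share)        # "Cumulative percentage" column
prop_missing <- n_missing / n_total  # the "Missing" row uses all rows

round(cbind(share, cum_share), 4)
```

The five most common states account for just over half (53%) of complete pre-operative profiles.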

Change between time points: PCHC

The Paretian Classification of Health Change (PCHC) classifies each patient as Better, Worse, Mixed, or No Change between time points. All plot functions in eq5dsuite return ggplot2 objects that can be customised directly:

eq5d_profile_pchc_by_group_plot(
  df         = hip_data,
  name_id    = "id",
  names_eq5d = dim_names,
  name_fu    = "time",
  levels_fu  = c("Pre-op", "Post-op")
)$p +
  ggplot2::labs(
    title = "PCHC: hip replacement patients",
    x     = NULL,
    y     = "Percentage of respondents"
  ) +
  ggplot2::theme_minimal()
#> Warning in .prep_eq5d(df = df, names = names_eq5d): 535 observations were
#> coerced to NAs as they were not interpretable as integer values in the range
#> allowed by the EQ-5D descriptive system.
PCHC classification for hip replacement patients.
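The classification rule itself can be sketched in a few lines of base R. The helper classify_pchc below is hypothetical and not part of eq5dsuite: it compares levels dimension by dimension, assumes complete data, and ignores refinements such as treating patients with no problems at both time points separately. The example rows are patients 86, 121 and 123 from the head() output.

```r
# Hypothetical helper for illustration; not an eq5dsuite function.
# pre and post hold one row per patient and one column per dimension.
classify_pchc <- function(pre, post) {
  pre  <- as.matrix(pre)
  post <- as.matrix(post)
  improved <- rowSums(post < pre) > 0   # at least one dimension at a better level
  worsened <- rowSums(post > pre) > 0   # at least one dimension at a worse level
  unname(
    ifelse(improved & worsened, "Mixed",
           ifelse(improved, "Better",
                  ifelse(worsened, "Worse", "No change")))
  )
}

# Pre- and post-operative levels for patients 86, 121 and 123
pre_levels <- data.frame(mo = c(2, 2, 2), sc = c(2, 2, 2), ua = c(3, 2, 2),
                         pd = c(3, 2, 2), ad = c(1, 2, 1))
post_levels <- data.frame(mo = c(1, 1, 2), sc = c(1, 1, 2), ua = c(1, 2, 2),
                          pd = c(1, 2, 2), ad = c(1, 2, 2))

pchc <- classify_pchc(pre_levels, post_levels)
pchc
```

Patients 86 and 121 improve on at least one dimension and worsen on none, so they are classified as Better; patient 123 worsens on anxiety/depression only and is classified as Worse.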

Dimensions driving improvement

eq5d_profile_better_dimensions_by_group_plot(
  df         = hip_data,
  name_id    = "id",
  names_eq5d = dim_names,
  name_fu    = "time",
  levels_fu  = c("Pre-op", "Post-op")
)$p +
  ggplot2::theme_minimal()
#> Warning in .prep_eq5d(df = df, names = names_eq5d): 535 observations were
#> coerced to NAs as they were not interpretable as integer values in the range
#> allowed by the EQ-5D descriptive system.
Dimensions improved among patients classified as Better.

EQ-5D value summary

eq5d_utility_summary(
  df           = hip_data,
  name_fu      = "time",
  levels_fu    = c("Pre-op", "Post-op"),
  names_eq5d   = dim_names,
  eq5d_version = "3L",
  country      = "UK"
)
#> Warning in .prep_eq5d(df = df, names = names_eq5d, add_state = TRUE,
#> add_utility = TRUE, : 535 observations were coerced to NAs as they were not
#> interpretable as integer values in the range allowed by the EQ-5D descriptive
#> system.
#>                  name       Post-op        Pre-op
#> 1                Mean  7.927837e-01  3.654281e-01
#> 2      Standard error  5.868664e-03  7.785437e-03
#> 3              Median  8.480001e-01  5.160000e-01
#> 4                Mode  1.000000e+00  6.910000e-01
#> 5  Standard deviation  2.491937e-01  3.271736e-01
#> 6            Kurtosis  6.093719e+00  1.642925e+00
#> 7            Skewness -1.650594e+00 -2.961227e-01
#> 8             Minimum -3.490000e-01 -5.940000e-01
#> 9             Maximum  1.000000e+00  1.000000e+00
#> 10              Range  1.349000e+00  1.594000e+00
#> 11       Observations  1.803000e+03  1.766000e+03
#> 12        Missing (n)  7.500000e+01  1.120000e+02
#> 13       Total sample  1.878000e+03  1.878000e+03
#> 14        Missing (%)  3.993610e-02  5.963791e-02
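As a quick check on the table, and to put a number on the overall health gain, the sketch below recomputes the standard errors from the printed standard deviations and sample sizes and takes the difference in mean values. This is a difference between two cross-sectional means with slightly different numbers of valid observations; a paired analysis would restrict to patients with a value at both time points. Variable names are illustrative.

```r
# Summary statistics copied from the output above
value_post <- list(mean = 0.7927837, sd = 0.2491937, n = 1803)
value_pre  <- list(mean = 0.3654281, sd = 0.3271736, n = 1766)

# Standard error of the mean: sd / sqrt(n)
se_post <- value_post$sd / sqrt(value_post$n)
se_pre  <- value_pre$sd  / sqrt(value_pre$n)

# Unadjusted difference in mean EQ-5D value, post-op minus pre-op
mean_gain <- value_post$mean - value_pre$mean

round(c(se_post = se_post, se_pre = se_pre, mean_gain = mean_gain), 4)
```

The recomputed standard errors match the table, and the mean EQ-5D value is about 0.43 higher after surgery than before.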

EQ-VAS summary

eq5d_vas_summary(
  df        = hip_data,
  name_vas  = "vas",
  name_fu   = "time",
  levels_fu = c("Pre-op", "Post-op")
)
#> Warning in .prep_vas(df = df, name = name_vas): TRUE observations were coerced
#> to NAs as they were not interpretable as integer values in the range allowed by
#> the EQ-5D descriptive system.
#>                  name       Post-op       Pre-op
#> 1                Mean   77.01895206   65.5425849
#> 2      Standard error    0.42801380    0.5326207
#> 3              Median   80.00000000   70.0000000
#> 4                Mode   90.00000000   80.0000000
#> 5  Standard deviation   18.12879736   21.8244527
#> 6            Kurtosis    4.92332191    2.9313944
#> 7            Skewness   -1.33560730   -0.7415802
#> 8             Minimum    0.00000000    0.0000000
#> 9             Maximum  100.00000000  100.0000000
#> 10              Range  100.00000000  100.0000000
#> 11       Observations 1794.00000000 1679.0000000
#> 12        Missing (n)   84.00000000  199.0000000
#> 13       Total sample 1878.00000000 1878.0000000
#> 14        Missing (%)    0.04472843    0.1059638

Worked example 2: Comparing two procedures

This example demonstrates a cross-sectional group comparison using pre-operative data from knee replacement and groin hernia patients, two groups with very different levels of pre-operative health burden.

Data preparation

# Subset to two procedures, pre-operative only
procs <- c("Knee Replacement", "Groin Hernia")
comparison_data <- example_data[
  example_data$procedure %in% procs &
  example_data$time == "Pre-op", ]

# Add profile code and EQ-5D value
comparison_data$profile_code <- toEQ5Dindex(
  x         = comparison_data,
  dim.names = dim_names
)
comparison_data$value <- eq5d3l(
  comparison_data[, dim_names],
  country   = "UK",
  dim.names = dim_names
)

Profile comparison by group

eq5d_profile_level_summary_by_group(
  df           = comparison_data,
  names_eq5d   = dim_names,
  name_cat     = "procedure",
  eq5d_version = "3L"
)
#> No ordering of time suppled. The time variable will be factorised according to the order in the data frame.
#> Warning in .prep_eq5d(df = df, names = names_eq5d, eq5d_version =
#> eq5d_version): 427 observations were coerced to NAs as they were not
#> interpretable as integer values in the range allowed by the EQ-5D descriptive
#> system.
#>                                        level n_Knee Replacement_mo
#> 1                                          1                   133
#> 2                                          2                  1789
#> 3                                          3                     9
#> 4                                      Total                  1931
#> 5 Number reporting any problems (levels 2+3)                  1798
#> 8                               Missing data                    65
#>   freq_Knee Replacement_mo n_Groin Hernia_mo freq_Groin Hernia_mo
#> 1              0.068876230               735          0.823068309
#> 2              0.926462973               158          0.176931691
#> 3              0.004660798                NA                   NA
#> 4              1.000000000               893          1.000000000
#> 5              0.931123770               158          0.176931691
#> 8              0.032565130                 6          0.006674082
#>   n_Knee Replacement_sc freq_Knee Replacement_sc n_Groin Hernia_sc
#> 1                  1353              0.702492212               853
#> 2                   559              0.290238837                32
#> 3                    14              0.007268951                 3
#> 4                  1926              1.000000000               888
#> 5                   573              0.297507788                35
#> 8                    70              0.035070140                11
#>   freq_Groin Hernia_sc n_Knee Replacement_ua freq_Knee Replacement_ua
#> 1          0.960585586                   164               0.08523909
#> 2          0.036036036                  1518               0.78898129
#> 3          0.003378378                   242               0.12577963
#> 4          1.000000000                  1924               1.00000000
#> 5          0.039414414                  1760               0.91476091
#> 8          0.012235818                    72               0.03607214
#>   n_Groin Hernia_ua freq_Groin Hernia_ua n_Knee Replacement_pd
#> 1               665          0.746352413                    18
#> 2               209          0.234567901                  1130
#> 3                17          0.019079686                   756
#> 4               891          1.000000000                  1904
#> 5               226          0.253647587                  1886
#> 8                 8          0.008898776                    92
#>   freq_Knee Replacement_pd n_Groin Hernia_pd freq_Groin Hernia_pd
#> 1              0.009453782               287           0.32283465
#> 2              0.593487395               560           0.62992126
#> 3              0.397058824                42           0.04724409
#> 4              1.000000000               889           1.00000000
#> 5              0.990546218               602           0.67716535
#> 8              0.046092184                10           0.01112347
#>   n_Knee Replacement_ad freq_Knee Replacement_ad n_Groin Hernia_ad
#> 1                  1205               0.63023013               759
#> 2                   624               0.32635983               118
#> 3                    83               0.04341004                13
#> 4                  1912               1.00000000               890
#> 5                   707               0.36976987               131
#> 8                    84               0.04208417                 9
#>   freq_Groin Hernia_ad
#> 1           0.85280899
#> 2           0.13258427
#> 3           0.01460674
#> 4           1.00000000
#> 5           0.14719101
#> 8           0.01001112

EQ-5D value comparison

eq5d_utility_summary_by_group(
  df            = comparison_data,
  names_eq5d    = dim_names,
  name_groupvar = "procedure",
  eq5d_version  = "3L",
  country       = "UK"
)
#> Warning in .prep_eq5d(df = df, names = names_eq5d, add_state = TRUE,
#> add_utility = TRUE, : 427 observations were coerced to NAs as they were not
#> interpretable as integer values in the range allowed by the EQ-5D descriptive
#> system.
#>             name Groin Hernia Knee Replacement   All groups
#> 1           Mean 7.877045e-01     4.039469e-01 5.262618e-01
#> 2 Standard error 6.945801e-03     7.163948e-03 6.355527e-03
#> 3         Median 7.960000e-01     5.870000e-01 6.910000e-01
#> 4           25th 7.270000e-01     8.799999e-02 1.590000e-01
#> 5           75th 1.000000e+00     6.910000e-01 7.600000e-01
#> 6              N 8.730000e+02     1.866000e+03 2.739000e+03
#> 7        Missing 2.600000e+01     1.300000e+02 1.560000e+02

eq5d_utility_by_group_plot(
  df            = comparison_data,
  names_eq5d    = dim_names,
  name_groupvar = "procedure",
  eq5d_version  = "3L",
  country       = "UK"
)$p +
  ggplot2::theme_minimal()
#> Warning in .prep_eq5d(df = df, names = names_eq5d, add_state = TRUE,
#> add_utility = TRUE, : 427 observations were coerced to NAs as they were not
#> interpretable as integer values in the range allowed by the EQ-5D descriptive
#> system.
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> ℹ The deprecated feature was likely used in the eq5dsuite package.
#>   Please report the issue to the authors.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Mean pre-operative EQ-5D values by procedure group.
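The "All groups" column in the summary table is dominated by the larger group. The sketch below, using only the printed means and sample sizes, suggests that the pooled mean is the sample-size-weighted average of the two group means rather than their simple average. Variable names are illustrative.

```r
# Group means and numbers of valid observations copied from the summary table
group_mean <- c("Groin Hernia" = 0.7877045, "Knee Replacement" = 0.4039469)
group_n    <- c("Groin Hernia" = 873,       "Knee Replacement" = 1866)

pooled_mean <- weighted.mean(group_mean, w = group_n)   # weights are the group sizes
simple_mean <- mean(group_mean)                         # ignores group sizes
mean_diff   <- group_mean[["Groin Hernia"]] - group_mean[["Knee Replacement"]]

round(c(pooled = pooled_mean, simple = simple_mean, difference = mean_diff), 4)
```

The weighted mean reproduces the 0.526 shown under "All groups", and the gap between the two procedures is about 0.38 on the EQ-5D value scale.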

LSS analysis by group

The Level Sum Score (LSS) summarises overall severity as the sum of level scores across the five dimensions (range 5–15 for 3L, 5–25 for 5L). The plots below show how EQ-5D values relate to LSS within each procedure group:

hernia_data <- comparison_data[
  comparison_data$procedure == "Groin Hernia", ]

eq5d_profile_lss_utility_plot(
  hernia_data,
  names_eq5d   = dim_names,
  eq5d_version = "3L",
  country      = "UK"
)$p +
  ggplot2::labs(
    title = "Groin Hernia",
    x     = "Level Sum Score (LSS)",
    y     = "EQ-5D value"
  ) +
  ggplot2::scale_x_continuous(limits = c(5, 15),
                               breaks = seq(5, 15, 2)) +
  ggplot2::theme_minimal()
#> Warning in .prep_eq5d(df = df, names = names_eq5d, add_state = TRUE, add_lss =
#> TRUE, : 44 observations were coerced to NAs as they were not interpretable as
#> integer values in the range allowed by the EQ-5D descriptive system.
#> Scale for x is already present.
#> Adding another scale for x, which will replace the existing scale.
#> Warning: Removed 3 rows containing missing values or values outside the scale range
#> (`geom_segment()`).
EQ-5D values by LSS: Groin Hernia.

knee_data <- comparison_data[
  comparison_data$procedure == "Knee Replacement", ]

eq5d_profile_lss_utility_plot(
  knee_data,
  names_eq5d   = dim_names,
  eq5d_version = "3L",
  country      = "UK"
)$p +
  ggplot2::labs(
    title = "Knee Replacement",
    x     = "Level Sum Score (LSS)",
    y     = "EQ-5D value"
  ) +
  ggplot2::scale_x_continuous(limits = c(5, 15),
                               breaks = seq(5, 15, 2)) +
  ggplot2::theme_minimal()
#> Warning in .prep_eq5d(df = df, names = names_eq5d, add_state = TRUE, add_lss =
#> TRUE, : 383 observations were coerced to NAs as they were not interpretable as
#> integer values in the range allowed by the EQ-5D descriptive system.
#> Scale for x is already present.
#> Adding another scale for x, which will replace the existing scale.
#> Warning: Removed 3 rows containing missing values or values outside the scale range
#> (`geom_segment()`).
#> Warning: Removed 3 rows containing missing values or values outside the scale range
#> (`geom_segment()`).
EQ-5D values by LSS: Knee Replacement.
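For reference, the LSS itself is a simple row sum. The sketch below computes it in base R for the six rows shown in the head() output at the start of this vignette; it is an illustration of the definition, not the code eq5dsuite runs internally.

```r
# Dimension levels for the six rows printed by head(example_data)
head_levels <- data.frame(
  mo = c(2, 1, 2, 1, 2, 2),
  sc = c(2, 1, 2, 1, 2, 2),
  ua = c(3, 1, 2, 2, 2, 2),
  pd = c(3, 1, 2, 2, 2, 2),
  ad = c(1, 1, 2, 2, 1, 2)
)

# LSS: sum of the five levels, so 5 (state 11111) to 15 (state 33333) for the 3L
lss <- unname(rowSums(head_levels))
lss
```

Different health states can share the same LSS (for example 22221 and 12222 both sum to 9), which is why a single LSS can correspond to a range of EQ-5D values in the plots above.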

Further reading

References

Devlin N, Parkin D, Janssen B (2020). Methods for Analysing and Reporting EQ-5D Data. Springer, Cham. https://doi.org/10.1007/978-3-030-47622-9