vtreat: A Statistically Sound 'data.frame' Processor/Conditioner

A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <doi:10.5281/zenodo.1173313>.
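The problems listed above can be made concrete with a small sketch. This is not the 'vtreat' API or its implementation; it is a stdlib-only Python illustration of the general idea, and the helper names (`fit_plan`, `encode_level`, `clean_number`) and the `min_count` threshold are assumptions invented for this example. A plan is learned on training data, then applied to new data so that missing values, rare levels, and never-seen levels all map to well-defined numeric outputs.

```python
import math

def fit_plan(values, min_count=2):
    """Hypothetical helper: keep only levels seen at least min_count times in training."""
    counts = {}
    for v in values:
        if v is not None:
            counts[v] = counts.get(v, 0) + 1
    return sorted(level for level, n in counts.items() if n >= min_count)

def encode_level(plan, value):
    """Indicator-encode one categorical value against a fitted plan.

    Missing values get their own flag; rare and novel levels encode as all zeros,
    so application-time data never produces an unknown column.
    """
    row = {f"lev_{level}": int(value == level) for level in plan}
    row["lev_NA"] = int(value is None)
    return row

def clean_number(x, fill):
    """Replace None/NaN/Inf with a fill value and return an 'is bad' indicator."""
    bad = x is None or math.isnan(x) or math.isinf(x)
    return (fill if bad else x, int(bad))

# Usage: fit on training data, then apply to values not seen in training.
train = ["a", "a", "b", "b", "c", None]
plan = fit_plan(train)            # "c" is rare, so it gets no indicator column
print(encode_level(plan, "a"))    # a common level
print(encode_level(plan, "z"))    # a novel level
print(encode_level(plan, None))   # a missing value
print(clean_number(float("inf"), fill=0.0))
```

The design choice illustrated here is that every decision about which columns exist is made at training time, which is what keeps the output shape stable when the plan is applied in production.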

Version: 1.6.4
Depends: R (≥ 3.4.0), wrapr (≥ 2.0.9)
Imports: stats, digest
Suggests: rquery (≥ 1.4.9), rqdatatable (≥ 1.3.2), data.table (≥ 1.12.2), isotone, lme4, knitr, rmarkdown, parallel, DBI, RSQLite, datasets, R.rsp, tinytest
Published: 2023-08-19
Author: John Mount [aut, cre], Nina Zumel [aut], Win-Vector LLC [cph]
Maintainer: John Mount <jmount at win-vector.com>
BugReports: https://github.com/WinVector/vtreat/issues
License: GPL-2 | GPL-3
URL: https://github.com/WinVector/vtreat/, https://winvector.github.io/vtreat/
NeedsCompilation: no
Materials: README NEWS
CRAN checks: vtreat results

Documentation:

Reference manual: vtreat.pdf
Vignettes: Multi Class vtreat, Saving Treatment Plans, vtreat Variable Importance, vtreat package, vtreat cross frames, vtreat grouping example, vtreat overfit, vtreat Rare Levels, vtreat scale mode, vtreat significance, vtreat data splitting, Variable Types, vtreat Formal Article

Downloads:

Package source: vtreat_1.6.4.tar.gz
Windows binaries: r-devel: vtreat_1.6.4.zip, r-release: vtreat_1.6.4.zip, r-oldrel: vtreat_1.6.4.zip
macOS binaries: r-release (arm64): vtreat_1.6.4.tgz, r-oldrel (arm64): vtreat_1.6.4.tgz, r-release (x86_64): vtreat_1.6.4.tgz
Old sources: vtreat archive

Reverse dependencies:

Reverse imports: crispRdesignR
Reverse suggests: mlr3pipelines

Linking:

Please use the canonical form https://CRAN.R-project.org/package=vtreat to link to this page.