This package aims to be compatible with any machine learning task;
even time series and image classification can be supported. Linear
regression and logistic regression are also possible, with some extra
steps: a heavily customized optimizer and loss function. The
train_nn() function (available in versions >v0.3.x) exposes both
through its optimizer (paired with optimizer_args) and loss
arguments. In both cases, the key is to remove all hidden layers and
rely entirely on the output layer and the appropriate loss function to
recover the classical model’s behavior.
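The reasoning behind both cases can be written out as a short worked statement. The notation here (\(w\), \(b\), \(g\), \(L\)) is introduced for illustration and is not taken from the package; the results themselves are standard. With no hidden layers, the network output is an affine map of the inputs followed by the output activation \(g\):

```latex
\[
\hat{y}(x) = g\left(w^\top x + b\right)
\]
% Case 1: identity output + mean squared error
\[
g(z) = z, \qquad
L = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}(x_i)\right)^2
\]
% Case 2: sigmoid output + binary cross-entropy
\[
g(z) = \frac{1}{1 + e^{-z}}, \qquad
L = -\frac{1}{n}\sum_{i=1}^{n}\left[\, y_i \log \hat{y}(x_i)
    + (1 - y_i)\log\left(1 - \hat{y}(x_i)\right) \right]
\]
```

In the first case \(L\) is the residual sum of squares divided by \(n\), so it has the same minimizer as ordinary least squares. In the second case \(L\) is the negative log-likelihood of a logistic regression model divided by \(n\), so minimizing it is maximum likelihood estimation for that model.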
A standard linear regression model predicts a continuous outcome as a weighted sum of inputs, with no nonlinearity and no hidden layers. A neural network recovers this exactly when there are no hidden layers (hidden_neurons = integer(0), or simply omit it) and the loss is mean squared error (loss = "mse").
Under these conditions, gradient descent minimizes the same objective as ordinary least squares, and the learned weights converge to the OLS solution given sufficient epochs and a small learning rate.
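The convergence claim can be checked without any neural network library. The following is a minimal base-R sketch, not the package's implementation: it runs plain gradient descent on the MSE of a single linear layer and compares the result with the closed-form least-squares fit. The choice of two predictors (wt, hp), the learning rate, and the number of epochs are assumptions made for illustration.

```r
# Standardize two predictors so plain gradient descent is well conditioned.
X <- cbind(intercept = 1, scale(mtcars[, c("wt", "hp")]))
y <- mtcars$mpg

w <- rep(0, ncol(X))   # weights of the single linear layer (bias first)
learn_rate <- 0.1      # illustrative value, not a package default

for (epoch in seq_len(2000)) {
  residual <- drop(X %*% w) - y
  grad <- 2 * drop(crossprod(X, residual)) / nrow(X)  # gradient of the MSE
  w <- w - learn_rate * grad
}

# Closed-form least squares on the same design matrix.
ols <- lm.fit(X, y)$coefficients

print(round(rbind(gradient_descent = w, ols = ols), 4))
```

The two rows should agree to the printed precision. With unscaled predictors the same loop needs a much smaller learning rate and many more epochs, which is one reason the predictors are standardized first.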
We use mtcars to predict fuel efficiency
(mpg) from the other variables.
To create a network with no hidden units, the hidden_neurons argument
of train_nn() accepts either of the following:
NULL
c()
In this example, the empty vector c() is used, which collapses the
network to a single linear layer from inputs to output. The choice of
optimizer = "rmsprop" with a small learn_rate stands in for classical
gradient descent on the OLS objective.
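The two spellings listed above are the same object in base R, which can be confirmed independently of train_nn(). This sketch only inspects the values themselves; how train_nn() handles them internally is not verified here, and the variable names are illustrative.

```r
no_hidden_a <- NULL
no_hidden_b <- c()

identical(no_hidden_a, no_hidden_b)  # TRUE: c() with no arguments returns NULL
length(no_hidden_b)                  # 0
length(integer(0))                   # 0: a typed but equally empty vector
```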
For comparison, fit the same model with lm():
lm_fit = lm(mpg ~ ., data = train)
tibble(
  truth = test$mpg,
  estimate = predict(lm_fit, newdata = test)
) |>
  metric_set(rmse, rsq)(truth = truth, estimate = estimate)
The two models should produce very similar RMSE and \(R^2\) values. Any small gap reflects that
gradient descent is an iterative approximation, while lm()
solves for the exact OLS coefficients directly. Increasing
epochs or switching to optimizer = "lbfgs" (if
supported) will close the gap further.
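To illustrate why a quasi-Newton optimizer narrows the gap, here is a hedged base-R sketch that minimizes the same MSE objective with optim() and method = "L-BFGS-B". This is base R's optimizer, used as a stand-in; it is not the package's "lbfgs" option, whose availability and behavior are not confirmed here. The predictors and helper names (mse, mse_grad) are assumptions for illustration.

```r
X <- cbind(intercept = 1, scale(mtcars[, c("wt", "hp")]))
y <- mtcars$mpg

# MSE of a single linear layer and its gradient with respect to the weights.
mse <- function(w) mean((drop(X %*% w) - y)^2)
mse_grad <- function(w) 2 * drop(crossprod(X, drop(X %*% w) - y)) / nrow(X)

fit <- optim(
  par = rep(0, ncol(X)),
  fn = mse,
  gr = mse_grad,
  method = "L-BFGS-B"
)

ols <- lm.fit(X, y)$coefficients

print(rbind(lbfgs = fit$par, ols = ols))
print(fit$counts)  # number of function and gradient evaluations used
```

Because the objective is quadratic, a quasi-Newton method typically reaches the OLS coefficients in a handful of evaluations, whereas first-order methods need many epochs to get equally close.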
Logistic regression models a binary or multiclass outcome by passing a linear combination of inputs through a sigmoid or softmax activation. A neural network with no hidden layers and cross-entropy as the loss function (loss = "cross_entropy") is mathematically equivalent to logistic regression.
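The equivalence can be checked on a small problem in base R. The sketch below is an illustration under stated assumptions, not the package's code: it fits a single sigmoid output unit by gradient descent on the mean cross-entropy and compares the weights with glm(). It uses mtcars (am as the outcome, standardized wt as the only predictor) because that fit is small and well behaved; the learning rate and epoch count are arbitrary illustrative choices.

```r
x <- as.numeric(scale(mtcars$wt))  # one standardized predictor
X <- cbind(intercept = 1, wt = x)
y <- mtcars$am                     # 0/1 outcome

sigmoid <- function(z) 1 / (1 + exp(-z))

w <- rep(0, ncol(X))   # weights of the single output unit (bias first)
learn_rate <- 1        # illustrative value

for (epoch in seq_len(20000)) {
  p <- sigmoid(drop(X %*% w))
  grad <- drop(crossprod(X, p - y)) / nrow(X)  # gradient of mean cross-entropy
  w <- w - learn_rate * grad
}

# Maximum likelihood fit of the same model.
glm_coef <- coef(glm(y ~ x, family = binomial()))

print(round(rbind(gradient_descent = w, glm = glm_coef), 4))
```

Both rows should match closely, since glm() maximizes the likelihood whose negative is the cross-entropy loss. On high-dimensional data with few rows the maximum likelihood estimate may not exist (perfect separation), in which case neither method converges to finite weights.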
We use the Sonar dataset from {mlbench} to
distinguish rocks from mines (binary outcome).
data("Sonar", package = "mlbench")
sonar = Sonar
set.seed(42)
split_s = initial_split(sonar, prop = 0.8, strata = Class)
train_s = training(split_s)
test_s = testing(split_s)
rec_s = recipe(Class ~ ., data = train_s) |>
  step_normalize(all_numeric_predictors())
For comparison, fit the classical model with glm() / nnet::multinom():
box::use(nnet[multinom])
glm_fit = glm(Class ~ ., data = train_s, family = binomial())
tibble(
  truth = test_s$Class,
  estimate = {
    # glm() models the probability of the second factor level ("R")
    preds = predict(glm_fit, newdata = test_s, type = "response")
    factor(ifelse(preds < 0.5, "M", "R"), levels = levels(test_s$Class))
  }
) |>
  accuracy(truth = truth, estimate = estimate)
Again, accuracy should be comparable between the two approaches. The neural network version converges iteratively, so the match is not guaranteed to be exact, but both are optimizing the same cross-entropy objective over a linear model.