Cross-Validation

Lab 4C

Directions: Follow along with the slides and answer the questions in red font in your journal.

Predictions

Why cross-validate?

Splitting the data

set.seed(123)
train_rows <- sample(1:____, size = 85)
train <- slice(arm_span, ____)
test <- slice(____, -____)
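
As a rough illustration of the same pattern on made-up data (not the arm_span data, and not the answers to the blanks above), the sketch below splits a simulated 40-row data frame into 30 training rows and 10 testing rows. The data frame toy, its columns x and y, and the sizes 40 and 30 are all assumptions chosen for the example.

```r
library(dplyr)

# Simulated data for illustration only (hypothetical, not from the lab).
set.seed(123)
toy <- data.frame(x = runif(40, min = 0, max = 10))
toy$y <- 2 * toy$x + rnorm(40)

# Randomly pick 30 of the 40 row numbers for training.
train_rows <- sample(1:nrow(toy), size = 30)

# Keep the sampled rows for training; the minus sign drops them for testing.
toy_train <- slice(toy, train_rows)
toy_test  <- slice(toy, -train_rows)

nrow(toy_train)  # 30 rows
nrow(toy_test)   # 10 rows
```

Because the testing rows are exactly the rows that were not sampled, every row ends up in one set and only one set.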

set.seed then split

Whenever you split data into training and testing sets, always call set.seed() first so that the same random split can be reproduced later.
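
To see why the seed matters, here is a small sketch using only base R. The numbers 1:100 and the sample size of 5 are arbitrary choices for the demonstration.

```r
# Setting the same seed before sampling gives the same "random" rows.
set.seed(123)
first_draw <- sample(1:100, size = 5)

set.seed(123)
second_draw <- sample(1:100, size = 5)

identical(first_draw, second_draw)  # TRUE: both draws match

# Without resetting the seed, the next draw continues the random sequence,
# so it will usually not match the earlier draws.
third_draw <- sample(1:100, size = 5)
```

If two people run the split with the same seed on the same data, they should get the same training and testing rows, which makes their results comparable.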

Building on training

Predicting on testing

test <- mutate(test, ____ = predict(best_train, newdata = ____))
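
The sketch below walks through the whole build-then-predict sequence on simulated data. It assumes a simple linear model fit with lm(); the slides do not show how best_train was built, so toy_model, the column name predicted, and the toy data are all hypothetical stand-ins.

```r
library(dplyr)

# Simulated data for illustration only (hypothetical, not from the lab).
set.seed(123)
toy <- data.frame(x = runif(40, min = 0, max = 10))
toy$y <- 2 * toy$x + rnorm(40)

# Split into training and testing rows.
train_rows <- sample(1:nrow(toy), size = 30)
toy_train <- slice(toy, train_rows)
toy_test  <- slice(toy, -train_rows)

# Build the model using the training rows only.
toy_model <- lm(y ~ x, data = toy_train)

# Add a column of predictions for the rows the model has never seen.
toy_test <- mutate(toy_test, predicted = predict(toy_model, newdata = toy_test))

head(toy_test)
```

The key design choice is that the testing rows play no part in fitting the model, so comparing predicted with y in toy_test gives a fairer picture of how the model handles new data.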

Avoiding being too specific