Lab 4C
Directions: Follow along with the slides, completing
the questions in blue on your
computer, and answering the questions in red in your
journal.
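In the previous labs, we learned how to:
Predict height from the arm_span data (4A).
Evaluate predictions of height on the arm_span data by computing mean squared error (MSE) (4B).
In this lab, we will:
Split the data into training and test sets.
Fit a model on the training set.
Make predictions on the test set.
Split the arm_span data into training and test sets using the following two
steps.
Step 1: Choose which rows of arm_span will go into the training set.
Step 2: Use the slice function to create two dataframes: one called training
consisting of the training_rows, and another called test consisting of the
remaining rows of arm_span.
Look at the resulting training and test datasets.
Because the rows are chosen at random, the split can differ from person to
person unless we use set.seed.
By using set.seed, we’re able to reproduce the random splitting so that each
person’s model outputs the same results.
Whenever you split data into training and test, always use set.seed first.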
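The two splitting steps above can be sketched as follows. This is only an illustration under stated assumptions: arm_span_demo is a made-up stand-in for the real arm_span data, the 85% training share is an assumed value, and base R row indexing is used in place of slice so the sketch runs without extra packages.

```r
# Hypothetical stand-in for the arm_span data: 80 made-up rows.
set.seed(1)
arm_span_demo <- data.frame(armspan = runif(80, min = 55, max = 75))
arm_span_demo$height <- 5 + 0.92 * arm_span_demo$armspan + rnorm(80, sd = 2)

# Always set the seed before the random split so the split is reproducible.
set.seed(123)

# Step 1: choose which rows go into the training set (assumed 85% share).
n <- nrow(arm_span_demo)
training_rows <- sample(1:n, size = round(0.85 * n))

# Step 2: build the two dataframes.
# Base R indexing stands in for slice(); negative indices drop rows.
training <- arm_span_demo[training_rows, ]
test     <- arm_span_demo[-training_rows, ]

nrow(training)
nrow(test)
```

Rerunning set.seed(123) followed by the same sample call gives the same training_rows, which is why everyone who uses the same seed ends up with the same split.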
When splitting data into training and test sets, we need to have enough
observations in our data so that we can build a good model.
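That is why most of the observations go into the training data, with the
remainder kept to test with.
Fit a linear model of height and armspan using the training data, and assign
it the name best_training.
Next, use the model fit on the training data to make predictions on the test
data.
Use the predict() function we introduced in the last lab to make predictions.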
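A minimal sketch of fitting on the training data and predicting on the test data is shown here. It assumes the model is fit with lm, uses a made-up stand-in (arm_span_demo) for the real data, and assumes a 68/12 split; only the names best_training, training, and test come from the lab itself.

```r
# Hypothetical stand-in for the arm_span data (made-up values).
set.seed(123)
arm_span_demo <- data.frame(armspan = runif(80, min = 55, max = 75))
arm_span_demo$height <- 5 + 0.92 * arm_span_demo$armspan + rnorm(80, sd = 2)

# Assumed split: 68 training rows, 12 test rows.
training_rows <- sample(1:80, size = 68)
training <- arm_span_demo[training_rows, ]
test     <- arm_span_demo[-training_rows, ]

# Fit the line of best fit using the training data only.
best_training <- lm(height ~ armspan, data = training)

# Without newdata, predict() returns predictions for the training rows.
training_predictions <- predict(best_training)

# With newdata = test, predict() returns predictions for the test rows.
test_predictions <- predict(best_training, newdata = test)

length(training_predictions)
length(test_predictions)
```

Comparing the two lengths is a quick way to check which dataset the predictions were made on.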
Make predictions on the test data.
Note: the predict function without the argument newdata will output
predictions on the training data. To output predictions on the test data,
supply the test data to the newdata argument.
Recall that the data were split into training and test sets.
Use the model to predict on the test set, and use these predictions to
compute test MSE.
Suppose we take some points from the arm_span dataset and fit two models: a
linear model and a polynomial model.
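A plot shows the training points, and two curves representing the value of
height each model would predict given a value of armspan.
Which model does a better job of predicting the training points?
Another plot shows the arm_span dataset, along with the predictions each
model would make.
Which model does a better job of predicting the arm_span dataset?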
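The comparison between the two models can also be made with numbers by computing MSE on the training points and on the held-out points. The sketch below is an assumption-laden illustration: arm_span_demo is made-up data, the training set size of 7 and the polynomial degree of 4 are arbitrary choices, and mse is a hypothetical helper written for this example.

```r
# Hypothetical stand-in for the arm_span data (made-up values).
set.seed(123)
arm_span_demo <- data.frame(armspan = runif(80, min = 55, max = 75))
arm_span_demo$height <- 5 + 0.92 * arm_span_demo$armspan + rnorm(80, sd = 2)

# A small training set (7 rows, an arbitrary choice); the rest is held out.
training_rows <- sample(1:80, size = 7)
training <- arm_span_demo[training_rows, ]
test     <- arm_span_demo[-training_rows, ]

# Two models fit to the same training points.
linear_model <- lm(height ~ armspan, data = training)
poly_model   <- lm(height ~ poly(armspan, 4), data = training)  # degree 4 is arbitrary

# MSE: the mean of the squared differences between actual and predicted values.
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Training MSE: how closely each model follows the points it was fit to.
train_mse_linear <- mse(training$height, predict(linear_model))
train_mse_poly   <- mse(training$height, predict(poly_model))

# Test MSE: how well each model predicts rows it has not seen.
test_mse_linear <- mse(test$height, predict(linear_model, newdata = test))
test_mse_poly   <- mse(test$height, predict(poly_model, newdata = test))

c(train_mse_linear, train_mse_poly, test_mse_linear, test_mse_poly)
```

The polynomial model can never have a larger training MSE than the linear model here, because a straight line is a special case of a degree 4 polynomial. Whether it also does better on the held-out rows is what the test MSE reveals.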