Lab 4C
Directions: Follow along with the slides, completing the questions in blue on your computer and answering the questions in red in your journal.
[...] height from the arm_span data (4A).
[...] height on the arm_span data by computing mean squared error (MSE) (4B).
[...] training and test sets.
[...] training set.
[...] test set.

Split the arm_span data into training and test sets using the following two steps.
[...] arm_span will go into the training set.
Use the slice function to create two dataframes: one called training, consisting of the training_rows, and another called test, consisting of the remaining rows of arm_span.
[...] training and test datasets.

[...] set.seed.
By using set.seed, we’re able to reproduce the random splitting so that each person’s model outputs the same results. Whenever you split data into training and test, always use set.seed first.
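The two-step split described above can be sketched as follows. This is only an illustration under stated assumptions: arm_span_sim is a simulated stand-in for the arm_span data (its values are made up), the seed value 123 is arbitrary, and putting 85 of 100 rows in the training set is an assumed proportion rather than one prescribed here. The slice function comes from the dplyr package.

```r
library(dplyr)  # provides slice()

# Call set.seed first so the random split is reproducible.
set.seed(123)  # assumption: any fixed number works

# Hypothetical stand-in for the arm_span data (simulated values).
arm_span_sim <- data.frame(armspan = rnorm(100, mean = 65, sd = 4))
arm_span_sim$height <- 0.9 * arm_span_sim$armspan + 7 + rnorm(100, sd = 2)

# Step 1: randomly select which rows will go into the training set.
training_rows <- sample(1:nrow(arm_span_sim), size = 85)

# Step 2: slice() keeps the selected rows; negative indices drop them.
training <- slice(arm_span_sim, training_rows)
test <- slice(arm_span_sim, -training_rows)

nrow(training)  # 85
nrow(test)      # 15
```

Because sample() draws without replacement by default, no row can land in both training and test.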
When splitting data into training and test sets, we need to have enough observations in our data so that we can build a good model.
[...] training data.
[...] test with.
[...] height and armspan using the training data.
[...] training data and assign it the name best_training.
[...] training data to make predictions on the test data.
[...] predict() function we introduced in the last lab to make predictions.
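As a rough sketch of fitting on the training data and then predicting on the test data, the code below uses base R's lm() for the linear model, which is an assumption about how the model is fit. arm_span_sim is again a simulated stand-in for arm_span, and the split is written with base R indexing (equivalent to the slice calls) so the sketch needs no extra packages.

```r
set.seed(123)  # assumption: arbitrary seed for a reproducible split

# Hypothetical stand-in for the arm_span data (simulated values).
arm_span_sim <- data.frame(armspan = rnorm(100, mean = 65, sd = 4))
arm_span_sim$height <- 0.9 * arm_span_sim$armspan + 7 + rnorm(100, sd = 2)

# Same kind of split as with slice(), using base R row indexing.
training_rows <- sample(1:nrow(arm_span_sim), size = 85)
training <- arm_span_sim[training_rows, ]
test <- arm_span_sim[-training_rows, ]

# Fit a linear model on the training data only.
best_training <- lm(height ~ armspan, data = training)

# Supplying newdata makes predict() return predictions for the test rows.
test_predictions <- predict(best_training, newdata = test)

length(test_predictions)  # 15, one prediction per test row
```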
[...] test data:
The predict function without the argument newdata will output predictions on the training data. To output predictions on the test data, supply the test data to the newdata argument.
[...] training and test sets.
[...] test set, and use these predictions to compute test MSE.
[...] arm_span dataset and fit two models: a linear model and a polynomial model.
[...] training points, and two curves representing the value of height each model would predict given a value of armspan.
[...] training points?
[...] arm_span dataset, along with the predictions each model would make.
[...] arm_span dataset?
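One way to explore the linear-versus-polynomial comparison numerically is to compute MSE on the training points and on the remaining rows for both models. The sketch below is illustrative only: the data are simulated, the training sample size of 15 and the polynomial degree of 6 are arbitrary assumptions, and mse() is a hypothetical helper written for this example.

```r
set.seed(123)  # assumption: arbitrary seed

# Hypothetical stand-in for the arm_span data (simulated values).
arm_span_sim <- data.frame(armspan = rnorm(100, mean = 65, sd = 4))
arm_span_sim$height <- 0.9 * arm_span_sim$armspan + 7 + rnorm(100, sd = 2)

# Take a small training sample; the remaining rows act as the test set.
training_rows <- sample(1:nrow(arm_span_sim), size = 15)
training <- arm_span_sim[training_rows, ]
test <- arm_span_sim[-training_rows, ]

# Fit both models on the training sample only.
linear_model <- lm(height ~ armspan, data = training)
poly_model <- lm(height ~ poly(armspan, 6), data = training)  # degree is an assumption

# Hypothetical helper: mean of squared differences between actual and predicted height.
mse <- function(model, data) {
  mean((data$height - predict(model, newdata = data))^2)
}

# Training MSE for each model.
mse(linear_model, training)
mse(poly_model, training)

# Test MSE for each model.
mse(linear_model, test)
mse(poly_model, test)
```

The polynomial model's training MSE can never be larger than the linear model's, because a straight line is a special case of the polynomial. The test MSE carries no such guarantee, which is why it is the number to compare.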