Lab 1F
Directions: Follow along with the slides, completing the questions in blue on your computer and answering the questions in red in your journal.
caseid: Anonymous ID of survey taker.
V1: The age of the respondent.
V2: The gender of the respondent.
V3: Whether the person is employed full-time or part-time.
V4: Whether the person has a physical difficulty.
V5: How long the person sleeps, in minutes.
V6: How long the survey taker spent on homework, in minutes.
V7: How long the respondent spent socializing, in minutes.
Use the rename function to give the variables in atu_dirty more descriptive names.
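As a rough illustration of renaming, the sketch below assumes the rename function is the one from the dplyr package and uses a small made-up stand-in for atu_dirty with only three of its columns. The new name age is an illustrative choice; gender and atu_cleaner match the names used later in this lab.

```r
library(dplyr)  # assumption: rename() here is dplyr's

# Toy stand-in for atu_dirty (made-up values, only three columns)
atu_dirty <- data.frame(
  caseid = c("a1", "a2", "a3"),
  V1 = c("34", "51", "19"),
  V2 = c("01", "02", "02"),
  stringsAsFactors = FALSE
)

# rename() takes pairs written as new_name = old_name
atu_cleaner <- rename(atu_dirty, age = V1, gender = V2)

names(atu_cleaner)
```

The remaining V columns would be renamed the same way, one new_name = old_name pair per variable.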
R will treat values that look like numbers as if they were strings.
Yes/No variables are coded as "1"/"0".
Look at the structure (str) of your data and the variable descriptions from a few slides back:
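One way to inspect the structure is sketched below on a small made-up data frame standing in for atu_dirty (the values are invented for illustration). Columns reported as chr are being stored as strings, even when their contents look like numbers.

```r
# Toy stand-in for atu_dirty: every value was read in as a string
atu_dirty <- data.frame(
  caseid = c("a1", "a2", "a3"),
  V1 = c("34", "51", "19"),      # ages, stored as strings
  V5 = c("480", "420", "555"),   # minutes of sleep, stored as strings
  stringsAsFactors = FALSE
)

# str() prints one line per variable with its type and first few values
str(atu_dirty)

# The same information as a named vector of classes
col_classes <- sapply(atu_dirty, class)
col_classes
```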
We need to tell R to think of our “numeric” variables as numeric variables. We can do this with the as.numeric function.
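A minimal sketch of the conversion, using a single made-up string rather than a column of the lab data (the names pi_string and pi_number are illustrative):

```r
# A value that looks like a number but is stored as a string
pi_string <- "3.14"
is.character(pi_string)   # TRUE: R sees a string, not a number

# as.numeric() converts the string into a number
pi_number <- as.numeric(pi_string)
is.numeric(pi_number)     # TRUE: R can now do arithmetic with it

pi_number
```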
## [1] 3.14
We started with the string "3.14", but as.numeric was able to turn it back into a number.
The gender variable uses "01" and "02" for "Male" and "Female", respectively. It would be more helpful if it simply used "Male" and "Female".
R has a special name for categorical variables, called factors.
R also has a special name for the different categories of a categorical variable: levels.
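To make factors and levels concrete, here is a small base R sketch on made-up gender codes. It is only an illustration; the command the lab itself uses to display levels and counts may differ.

```r
# Made-up gender codes in the style of this lab's data
gender_codes <- c("01", "02", "02", "01", "02")

# as.factor() turns a character vector into a factor
gender_factor <- as.factor(gender_codes)
class(gender_factor)      # "factor"

# levels() lists the distinct categories
levels(gender_factor)     # "01" "02"

# table() counts how many values fall in each level
gender_counts <- table(gender_factor)
gender_counts
```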
To see the levels of gender and their counts type:
If ‘01’ means ‘Male’ and ‘02’ means ‘Female’ then we can use the following code to recode the levels of gender.
The code breaks down as follows:
atu_cleaner …
gender variable’s levels …
"01" will now be "Male" …
"02" will now be "Female".
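The recoding steps described above might look like the following sketch. It assumes the mutate and recode functions from the dplyr package and uses a made-up stand-in for atu_cleaner, so treat it as an illustration rather than the lab's exact code.

```r
library(dplyr)  # assumption: mutate() and recode() are dplyr's

# Toy stand-in for atu_cleaner (made-up values)
atu_cleaner <- data.frame(
  caseid = c("a1", "a2", "a3", "a4"),
  gender = c("01", "02", "02", "01"),
  stringsAsFactors = FALSE
)

table(atu_cleaner$gender)   # counts for "01" and "02" before recoding

# In atu_cleaner, change the gender variable:
# "01" becomes "Male" and "02" becomes "Female"
atu_cleaner <- mutate(atu_cleaner,
                      gender = recode(gender, "01" = "Male", "02" = "Female"))

table(atu_cleaner$gender)   # counts for "Female" and "Male" after recoding
```

Assigning the result back to atu_cleaner is what keeps the change; without the assignment, R would only print the recoded data.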
Recode the categorical variable about whether the person surveyed had a physical challenge or not. The coding is currently:
"01": Person surveyed did not have a physical challenge.
"02": Person surveyed did have a physical challenge.
Write a script that:
atu_dirty dataset
NOTE: You can watch this video to learn about RScripts:
The last few lines of your script are extremely important because they will save all of your work.
Be sure to View your data and check its structure (str) to make sure it looks clean and tidy before saving.
Run the code below:
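A minimal sketch of such a copy step, assuming atu_cleaner already holds your cleaned data. A made-up stand-in is built here only so the example runs on its own.

```r
# Toy stand-in for the cleaned data (made-up values)
atu_cleaner <- data.frame(
  age = c(34, 51, 19),
  gender = c("Male", "Female", "Female"),
  stringsAsFactors = FALSE
)

# Assigning to a new name creates a second data frame with the same contents
atu_clean <- atu_cleaner

identical(atu_clean, atu_cleaner)   # TRUE
```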
This code will create a new data frame in your Environment called atu_clean, which is a final copy of atu_cleaner.
If atu_clean is swept from your Environment, all of the changes you made will NOT be saved.
To permanently save your changes you need to save the file as an R data file, or .Rda.
Run the code below:
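One way to write an .Rda file is base R's save function, sketched below with a made-up atu_clean. Saving into a temporary folder is an assumption made only to keep the example tidy; in the lab you would more likely save into your working directory.

```r
# Toy stand-in for atu_clean (made-up values)
atu_clean <- data.frame(
  age = c(34, 51, 19),
  gender = c("Male", "Female", "Female"),
  stringsAsFactors = FALSE
)

# Assumption: write to a temporary folder for this sketch
rda_path <- file.path(tempdir(), "atu_clean.Rda")
save(atu_clean, file = rda_path)

# Simulate a cleared Environment, then bring the data back from the file
rm(atu_clean)
load(rda_path)
str(atu_clean)
```

The load call restores the object under the name it had when it was saved, so atu_clean reappears in the Environment.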
Running it should create the atu_clean.Rda file.
Now that you have learned some data-cleaning basics, it’s time to revisit the food data.
Run the code below:
Use the as.factor() function to convert healthy_level into a categorical variable and re-run the histogram function.
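The conversion step might look like the sketch below. It assumes healthy_level is stored as numeric codes and uses a made-up stand-in for the food data; the histogram call itself is left out because its exact form is not shown here.

```r
# Toy stand-in for the food data (made-up values)
food <- data.frame(
  healthy_level = c(1, 3, 5, 3, 2)
)

is.numeric(food$healthy_level)   # TRUE: R treats the codes as numbers

# Replace the numeric column with a factor version of itself
food$healthy_level <- as.factor(food$healthy_level)

is.factor(food$healthy_level)    # TRUE: now a categorical variable
levels(food$healthy_level)       # "1" "2" "3" "5"
```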
The healthy_level categories are now numbers as opposed to tick-marks. This is an improvement, but an even better solution would be to recode the categories.
Recode the healthy_level categories and re-run the histogram function.
If your food data is cleared from your Environment, the changes that you made to the healthy_level variable will not be saved.
To save your changes permanently, save your food file as an R data file.