Lab 2A

- Most of the labs thus far have covered how to visualize, summarize,
and manipulate data.
- We used visualizations to explore how your class spends their time.
- We also learned how to clean data to prepare it for analysis.

- Starting with this lab, we’ll learn to use R to answer statistical questions that can be answered by calculating the mean, median and MAD.

- When we make plots of our data, we usually want to know:
  - Where is the *bulk* of the data?
  - Where is the data more *sparse*, or *thin*?
  - What values are *typical*?
  - How much does the data *vary*?

- To answer these questions, we want to look at the *distribution* of our data.
  - We describe *distributions* by talking about where the *center* of the data is, how *spread* out the data are, and what sort of *shape* the data has.

- *Export*, *upload* and *import* your class’ *Personality Color* data.
  - Name your data `colors` when you load it.
- Before analyzing a new dataset, it’s often helpful to get familiar with it. So:
  - **Write down the** `names` of the 4 variables that contain the point-totals, or *scores*, for each personality color.
  - **Write down the** `names` of the variables that tell us an observation’s introvert/extrovert designation and whether they are involved in *sports*.
  - **How many variables are in the dataset?**
  - **How many observations are in the dataset?**
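
As a rough illustration of "getting familiar" with a dataset, the sketch below uses base R on a tiny made-up data frame. The name `colors_demo` and every column name and value in it are assumptions for illustration only; your class's `colors` data will have its own variable names and many more rows.

```r
# Hypothetical stand-in for the class data (NOT the real survey export).
colors_demo <- data.frame(
  blue        = c(12, 9, 15, 7),
  gold        = c(8, 14, 6, 11),
  green       = c(10, 5, 9, 13),
  orange      = c(6, 12, 11, 4),
  intro_extro = c("Introvert", "Extrovert", "Extrovert", "Introvert"),
  sports      = c("Yes", "No", "Yes", "No")
)

names(colors_demo)  # the variable names
ncol(colors_demo)   # how many variables
nrow(colors_demo)   # how many observations
```

Running the same three functions on `colors` should give you what you need for the questions above.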

- Create a `dotPlot` of the scores for your *predominant color*.
  - Pro-tip: If the `dotPlot` comes out looking wonky, include the `nint` and `cex` options.
- Based on your `dotPlot`:
  - **Which values came up the most frequently? About how many people in your class had a score similar to yours?**
  - **What, would you say, was a *typical* score for a person in your class for your predominant color? How does your own score for this color compare?**
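
If you want to double-check what you read off the plot, the counts that a dot plot stacks up can also be tabulated. This is a minimal base R sketch on made-up scores; the vector `scores`, the value of `my_score`, and the "within 1 point" cutoff are all assumptions chosen for illustration.

```r
# Made-up scores for one color (an assumption, not class data).
scores <- c(8, 10, 10, 11, 12, 12, 12, 13, 15, 18)

# How often each score occurs: the height of each stack of dots.
counts <- table(scores)
counts

# The score with the tallest stack.
most_common <- as.numeric(names(counts)[which.max(counts)])
most_common

# Hypothetical personal score, and how many scores fall within 1 point of it.
my_score <- 11
similar  <- sum(abs(scores - my_score) <= 1)
similar
```

What counts as a "similar" score is a judgment call, so try a few cutoffs and see how the answer changes.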

- *Means* and *medians* are usually good ways to describe the *typical* value of our data.
- Fill in the blank to calculate the `mean` value of your predominant color score:

- Use a similar line of code to calculate the `median` value of *your* predominant color.
- **Are the** `mean` and `median` roughly the same? If not, use the `dotPlot` you made in the last slide to describe why.
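
To see why the `mean` and `median` can disagree, here is a small base R sketch on made-up numbers. It works on a plain vector rather than the lab's fill-in-the-blank line, and the values (including the extreme score of 40) are assumptions for illustration.

```r
# Made-up scores (an assumption, not class data).
scores <- c(8, 10, 10, 11, 12, 12, 12, 13, 15, 18)

mean(scores)    # the balancing point of the data
median(scores)  # the middle value once the scores are sorted

# Add one unusually high score and compare again.
skewed <- c(scores, 40)
mean(skewed)    # pulled toward the extreme value
median(skewed)  # barely affected by it
```

When a plot has a long tail on one side, the mean tends to be dragged toward that tail while the median stays near the bulk of the data.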

- Now that we know how to describe our data’s *typical* value, we might also like to describe how close the rest of the data are to this *typical* value.
  - We often refer to this as the **variability** of the data.
  - Variability is seen in a `histogram` or `dotPlot` as the horizontal *spread*.
- Re-create a `dotPlot` of the scores for your *predominant color* and then run the code below, filling in the blank with the name of your predominant color:

**Look at the spread of the scores from the mean score then complete the sentence below:**

*Data points in my plot will usually fall within* ____
*units of the center.*

- The **mean absolute deviation** finds how far away, on average, the data are from the mean.
  - We often write *mean absolute deviation* as *MAD*.
- Calculate the MAD of your *predominant color* by filling in the blank:

**How close was your estimate of the spread for your predominant color (from the previous slide) to the actual value?**
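
To make the definition concrete, the mean absolute deviation can be computed step by step in base R. This is a sketch on made-up scores, not the lab's own fill-in-the-blank line; the names `center`, `deviations` and `mean_abs_dev` are chosen here for illustration.

```r
# Made-up scores (an assumption, not class data).
scores <- c(8, 10, 10, 11, 12, 12, 12, 13, 15, 18)

center     <- mean(scores)          # the typical value
deviations <- abs(scores - center)  # each score's distance from the mean
mean_abs_dev <- mean(deviations)    # the average of those distances
mean_abs_dev
```

Note that base R's built-in `mad()` computes a scaled *median* absolute deviation by default, which is a different statistic and will usually give a different number.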

- Do introverts and extroverts differ in their typical scores for your
predominant color?
- Answer this investigative question using a dotPlot and numerical summaries.

- Make a `dotPlot` of your *predominant color* again; but this time, facet the plot by the introvert/extrovert variable. Include the `layout` option to stack the plots as well as the `nint` and `cex` options.
  - **Describe the shape of the distribution of scores for the extroverts. Do the same for the introverts.**
- Using similar syntax to how you facet plots, *calculate* either the `mean` or `median` to describe the *center* of your predominant color for introverts and extroverts.
  - **Do introverts and extroverts differ in their typical scores for your predominant color?**
  - **Based on the MAD, which group (introverts or extroverts) has more variability for your predominant color’s scores?**
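
One way to picture the group comparison is to split the scores by group and summarize each piece. The base R sketch below uses `tapply()` on a made-up data frame instead of the faceting-style syntax the lab asks for; the data frame `demo`, its columns `score` and `type`, and the helper `mean_abs_dev()` are all assumptions for illustration.

```r
# Made-up scores for two groups (an assumption, not class data).
demo <- data.frame(
  score = c(8, 10, 12, 12, 15, 9, 11, 12, 13, 18),
  type  = c(rep("Introvert", 5), rep("Extrovert", 5))
)

# Hypothetical helper: average distance of the values from their mean.
mean_abs_dev <- function(x) mean(abs(x - mean(x)))

# Apply each summary separately within each group.
group_means   <- tapply(demo$score, demo$type, mean)
group_medians <- tapply(demo$score, demo$type, median)
group_mads    <- tapply(demo$score, demo$type, mean_abs_dev)

group_means
group_medians
group_mads
```

In this toy data the two groups share a median but differ in mean and in spread, which is why it helps to look at more than one summary before answering.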

- Do introverts and extroverts in your class differ in their color
scores?
  - Perform an analysis that produces *numerical summaries* and *graphs*.
  - **Then, write a few sentences that address this statistical question and consider the *shape*, *center* and *spread* of the distributions in the graphs you create.**