Scraping web data

Lab 3E

Directions: Follow along with the slides and answer the questions in red font in your journal.

The web as a data source

Our first web scraper

https://labs.idsucla.org/extras/webdata/mountains.html

HTML

<TABLE>
  <TR>
    <TH>peak</TH>
    <TH>range</TH>
    <TH>state</TH>
    <TH>long</TH>
    <TH>lat</TH>
    <TH>elev_ft</TH>
    <TH>elev_m</TH>
    <TH>prominence_ft</TH>
    <TH>prominence_m</TH>
    <TH>rank</TH>
  </TR>
  <TR>
    <TD>Denali (Mount McKinley)</TD>
    <TD>Alaska Range</TD>
    <TD>Alaska</TD>
    <TD>-151.0063</TD>
    <TD>63.0690</TD>
    <TD>20236</TD>
    <TD>6168</TD>
    <TD>20174</TD>
    <TD>6149</TD>
    <TD>1</TD>
  </TR>
</TABLE>
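
To see how this markup turns into data, here is a minimal sketch that parses a shortened copy of the table above (three of the ten columns) straight from a text string, so it runs without a network connection. It assumes the XML package is installed; the names html_text, doc and mountains are made up for this illustration and are not part of the lab.

```r
library(XML)  # assumed to be installed; provides htmlParse() and readHTMLTable()

# A shortened copy of the mountains table, kept as plain text (illustration only).
html_text <- "
<TABLE>
  <TR>
    <TH>peak</TH>
    <TH>state</TH>
    <TH>elev_ft</TH>
  </TR>
  <TR>
    <TD>Denali (Mount McKinley)</TD>
    <TD>Alaska</TD>
    <TD>20236</TD>
  </TR>
</TABLE>"

# Parse the text as HTML, then pull out every <TABLE> on the "page".
doc <- htmlParse(html_text, asText = TRUE)
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# readHTMLTable() returns a list with one data frame per table it finds.
length(tables)

# Take the first (and only) table: <TH> cells become the column names,
# and each later <TR> becomes one row of <TD> values.
mountains <- tables[[1]]
names(mountains)
mountains

# Scraped values arrive as text, so convert before doing arithmetic.
as.numeric(mountains$elev_ft)
```

On a real page the list usually holds more than one table, so checking length(tables) and looking at each element is a reasonable way to find the one you want.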

Get to scraping!

library(XML)  # readHTMLTable() comes from the XML package
tables <- readHTMLTable(____)

Find our data

Saving tables

Check, save and use!

save(____, file = "____.Rda")
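
As a rough illustration of what saving does, the sketch below writes a small made-up data frame to an .Rda file in a temporary folder, removes it from the workspace, and loads it back. The names peaks, rda_path and loaded_names are hypothetical; in the lab you would save your own scraped data frame under a file name you choose.

```r
# A tiny stand-in data frame (illustration only, not the full scraped data).
peaks <- data.frame(peak = "Denali (Mount McKinley)", elev_ft = 20236)

# Write to a temporary folder so the example leaves no files behind in your project.
rda_path <- file.path(tempdir(), "peaks_example.Rda")
save(peaks, file = rda_path)

# Remove the object, then restore it from the file.
rm(peaks)
loaded_names <- load(rda_path)

# load() returns the names of the objects it restored.
loaded_names
peaks
```

Note that load() brings the object back under the name it had when it was saved, so it is worth choosing a clear name before calling save().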