Scraping Links from a List of Web Pages

My colleague Sherrie just asked me if I could extract the links from a specific set of pages and store them in a spreadsheet. She only needs the links from the main content area in the center of each page, not those in the “boilerplate” areas at the top and bottom. I also plan to put the results for each page into a separate worksheet of the spreadsheet. I will use several functions from the Text Analytics Toolbox that I have used before.

Features covered in this code-along style video include:

  • webread
  • htmlTree, extractHTMLText, findElement, getAttribute
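
As a rough idea of how these functions can fit together, here is a minimal sketch rather than the exact code from the video. The URLs, the output file name, and the "div#content" selector for the central area are all hypothetical placeholders; the real pages will need a selector that matches their own markup. writetable is a base MATLAB function that is not in the list above; it is used here only as one possible way to get each page into its own worksheet.

```matlab
% Hypothetical list of pages to scrape (placeholders, not real targets)
urls = ["https://www.example.com/page1", ...
        "https://www.example.com/page2"];
outFile = "links.xlsx";   % hypothetical output spreadsheet

for k = 1:numel(urls)
    % Download the raw HTML of the page and parse it into a tree
    html = webread(urls(k));
    tree = htmlTree(html);

    % Assumption: the central content lives in <div id="content">
    main = findElement(tree, "div#content");
    if isempty(main)
        warning("No main content element found on %s", urls(k));
        continue
    end

    % Collect the anchor elements inside the main content only
    anchors = findElement(main, "a");
    if isempty(anchors)
        warning("No links found in the main content of %s", urls(k));
        continue
    end

    % Link targets and their visible text
    hrefs = getAttribute(anchors, "href");
    texts = extractHTMLText(anchors);

    % One worksheet per page, named Page1, Page2, ...
    T = table(texts(:), hrefs(:), 'VariableNames', {'Text', 'Link'});
    writetable(T, outFile, 'Sheet', "Page" + k);
end
```

Restricting the second findElement call to the main element is what keeps header and footer links out of the results. Links are stored exactly as they appear in each href attribute, so relative links stay relative in this sketch.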

