Stuart’s MATLAB Videos

Watch and Learn

Handling Multiple Match Tokens in My Web App

Posted by Stuart McGarrity

Now I want to modify the MATLAB app I recently made to scrape web pages so that it can handle multi-part patterns, i.e., regular expression match patterns that contain more than one token.

As an example, I want to find the href attribute and the link text of every link on a web page (and do this for multiple pages).

So, for the example string:

 <a href="/academia.html?s_tid=gn_acad">Academia</a>

I want to extract:

  1. /academia.html?s_tid=gn_acad
  2. Academia

So I will use this regex pattern:

<a href="([^"]*)">([^<]*)</a>
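Here is a minimal sketch of how this pattern could be applied with regexp and its 'tokens' option. It is not the code from the app in the video. The second link in the test string is made up so that there is more than one match, and the variable names (str, pattern, tokens, pairs) are my own choices for illustration.

```matlab
% Test string: the first link is the example above; the second link is a
% made-up addition so the pattern matches more than once.
str = ['<a href="/academia.html?s_tid=gn_acad">Academia</a> ' ...
       '<a href="/support.html">Support</a>'];

% Two tokens: (1) the href value, (2) the link text.
pattern = '<a href="([^"]*)">([^<]*)</a>';

% With 'tokens', regexp returns one cell per match. Each of those cells
% holds a 1-by-2 cell array with one entry per token.
tokens = regexp(str, pattern, 'tokens');

% Stack the matches into an N-by-2 cell array: column 1 = href, column 2 = text.
pairs = vertcat(tokens{:});

for k = 1:size(pairs, 1)
    fprintf('%s -> %s\n', pairs{k, 2}, pairs{k, 1});
end
```

Because every match yields the same number of tokens, stacking them with vertcat gives one row per link, which is convenient for showing the results in a table. Note that the pattern only matches anchors whose sole attribute is href, so links with extra attributes such as class or target would be skipped.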

Features covered in this code-along style video include:

Follow me on Twitter (@stuartmcgarrity) if you want to be notified when I post.

Play the video in full screen mode for a better viewing experience.