Handling Multiple Match Tokens in My Web App
Now I want to modify the MATLAB app I recently made to scrape web pages so that it can handle multi-part patterns, i.e., patterns where there is more than one token in the regular expression.
As an example, I want to find the href attribute and the link text for every link on a web page, and to do that across multiple pages.
So, for the example string:
<a href="/academia.html?s_tid=gn_acad">Academia</a>
I want to extract:
- /academia.html?s_tid=gn_acad
- Academia
So I will use this regex pattern:
<a href="([^"]*)">([^<]*)</a>
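Here is a minimal sketch of how that pattern behaves with `regexp` and the `'tokens'` option. It is not the app's actual code. The variable names and the second link (`/products.html`) are made up for illustration, and the HTML is a hard-coded string rather than a page fetched from the web.

```matlab
% Sample HTML: the first link is the example above; the second is hypothetical.
html = ['<a href="/academia.html?s_tid=gn_acad">Academia</a> ' ...
        '<a href="/products.html">Products</a>'];

% Two parenthesized groups, so each match yields two tokens.
pattern = '<a href="([^"]*)">([^<]*)</a>';

% tokens is a 1-by-N cell array with one cell per match;
% each cell holds a 1-by-2 cell array {href, linkText}.
tokens = regexp(html, pattern, 'tokens');

% Stack the per-match cells into an N-by-2 cell array
% (column 1 = href, column 2 = link text).
% This assumes at least one match was found.
links = vertcat(tokens{:});

hrefs = links(:, 1);
texts = links(:, 2);
```

With more than one token, each match comes back as its own nested cell, so the results need to be unpacked one level further than in the single-token case. Note that this pattern only matches anchors where `href` is the sole attribute and the link text contains no nested tags.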
Features covered in this code-along style video include:
Play the video in full screen mode for a better viewing experience.