{"id":4946,"date":"2021-06-10T11:34:17","date_gmt":"2021-06-10T15:34:17","guid":{"rendered":"https:\/\/blogs.mathworks.com\/videos\/?p=4946"},"modified":"2021-06-21T19:46:45","modified_gmt":"2021-06-21T23:46:45","slug":"using-parfor-to-make-many-web-requests","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/videos\/2021\/06\/10\/using-parfor-to-make-many-web-requests\/","title":{"rendered":"Using parfor to Make Many Web Requests"},"content":{"rendered":"<p>My colleague asked me to access all the pages on a web server in order to populate its cache. I plan to use <tt>parfor<\/tt> to get through the more than 250k pages in a timely manner. I will also need to not go too fast and overload the server.<\/p>\n<p>I&#8217;ve used <tt>parfor<\/tt> before for web page access and found that it is one of the rare situations where you can use more MATLAB workers than available physical or even logical processors, which is not normally recommended. It works because web requests require a lot of waiting, and often the processing I need to do in MATLAB is minimal.<\/p>\n<p>I&#8217;ll admit this video gets a little boring near the end as I try out different numbers of workers. 
Remember that you can increase the playback speed of the video in the lower right corner of the player.<\/p>\n<p>Features covered in this <a href=\"https:\/\/blogs.mathworks.com\/videos\/2015\/10\/29\/matlab-code-along-videos\/\">code-along<\/a> style video include:<\/p>\n<ul>\n<li><tt>parfor<\/tt><\/li>\n<\/ul>\n<p><div class=\"row\"><div class=\"col-xs-12 containing-block\"><div class=\"bc-outer-container add_margin_20\"><videoplayer><div class=\"video-js-container\"><video data-video-id=\"6258235887001\" data-video-category=\"blog\" data-autostart=\"false\" data-account=\"62009828001\" data-omniture-account=\"mathwgbl\" data-player=\"rJ9XCz2Sx\" data-embed=\"default\" id=\"mathworks-brightcove-player\" class=\"video-js\" controls><\/video><script src=\"\/\/players.brightcove.net\/62009828001\/rJ9XCz2Sx_default\/index.min.js\"><\/script><script>if (typeof(playerLoaded) === 'undefined') {var playerLoaded = false;}(function isVideojsDefined() {if (typeof(videojs) !== 'undefined') {videojs(\"mathworks-brightcove-player\").on('loadedmetadata', function() {playerLoaded = true;});} else {setTimeout(isVideojsDefined, 10);}})();<\/script><\/div><\/videoplayer><\/div><\/div><\/div><\/p>\n<p>Play the video in full screen mode for a better viewing experience.\u00a0Final code is here:<\/p>\n<pre>\n%% Make Requests to Set of URLs<\/p>\n<p>% Assumes a spreadsheet with a \"urls\" variable\/column<br \/>\npagesFileName=\"FILEPATH\\all-aem-pages.xlsx\";<br \/>\nenvironments=[\"dev2\" \"dev3\"];<br \/>\nenvironment=environments(2);<br \/>\noptions=weboptions('Timeout',60);<br \/>\ntotalStartTime=clock;<br \/>\n%% Get List of Pages<br \/>\n% Re-use table if already in base workspace<br \/>\nif ~exist('pages','var')<br \/>\n    pages=readtable(pagesFileName,'TextType','string');<br \/>\nend<br \/>\n%% Create list of URLs<br \/>\n% Convert environment<br \/>\nurls=replace(pages.urls,\".mathworks\",\"-\" + environment + \".mathworks\");<br \/>\n%% Start Workers<br \/>\n% 12 for dev 
server<br \/>\nstartPool(12);<br \/>\n%% Make Requests<br \/>\nsuccess=false(height(pages),1);<br \/>\nparfor k=1:height(pages)<br \/>\n    startTime=[]; % Initialize for parfor<br \/>\n    url=urls(k);<br \/>\n    try<br \/>\n        startTime=clock;<br \/>\n        content=webread(url,options);<br \/>\n        success(k)=true;<br \/>\n        fprintf('Succeeded accessing (%d of %d): %s(%2.1f sec).\\n',k,height(pages),url,etime(clock,startTime));<br \/>\n    catch<br \/>\n        fprintf('Failed accessing (%d of %d): %s(%2.1f sec).\\n',k,height(pages),url,etime(clock,startTime));<br \/>\n    end<br \/>\nend<\/p>\n<p>%% Finish<br \/>\nfprintf('Finished %s\\n',myETimeStr(totalStartTime))<\/p>\n<p>%%  Local functions<br \/>\nfunction p=startPool(numWorkers)<br \/>\n    p = gcp('nocreate');<br \/>\n    if isempty(p)<br \/>\n        p=parpool(numWorkers);<br \/>\n    elseif p.NumWorkers~=numWorkers<br \/>\n        delete(p);<br \/>\n        p=parpool(numWorkers);<br \/>\n    end<br \/>\nend<\/p>\n<p>function y=myETimeStr(startTime)<br \/>\n% Return the time elapsed since startTime (a clock vector) as an mm:ss string.<\/p>\n<p>y= char(duration(0,0,etime(clock,startTime),'Format','mm:ss'));<\/p>\n<p>end<\/p>\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<div class=\"thumbnail thumbnail_asset asset_overlay video\"><a href=\"https:\/\/blogs.mathworks.com\/videos\/2021\/06\/10\/using-parfor-to-make-many-web-requests\/?dir=autoplay\"><img decoding=\"async\" src=\"https:\/\/cf-images.us-east-1.prod.boltdns.net\/v1\/static\/62009828001\/2217f11d-3f32-42d2-9ecf-a12e42027df9\/5a0313b3-0ec4-49d2-9099-c0b07e825a5f\/1280x720\/match\/image.jpg\" onError=\"this.style.display ='none';\"\/><\/p>\n<div class=\"overlay_container\">\n      <span class=\"icon-video icon_color_null\"><time class=\"video_length\">47:27<\/time><\/span>\n      <\/div>\n<p>      <\/a><\/div>\n<p>My colleague asked me to access all the pages on a web server in order to populate its cache. 
I plan to use parfor to get through the more than 250k pages in a timely manner. I will also need&#8230; <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/videos\/2021\/06\/10\/using-parfor-to-make-many-web-requests\/\">read more >><\/a><\/p>\n","protected":false},"author":133,"featured_media":4979,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[27,4],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4946"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/users\/133"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/comments?post=4946"}],"version-history":[{"count":8,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4946\/revisions"}],"predecessor-version":[{"id":4970,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4946\/revisions\/4970"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/media\/4979"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/media?parent=4946"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/categories?post=4946"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/tags?post=4946"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}