{"id":933,"date":"2014-01-07T07:00:35","date_gmt":"2014-01-07T12:00:35","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/?p=933"},"modified":"2019-11-01T09:43:20","modified_gmt":"2019-11-01T13:43:20","slug":"automating-data-extraction-2","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2014\/01\/07\/automating-data-extraction-2\/","title":{"rendered":"Automating the extraction of real data from an image of the data &#8211; part 2"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p><i>I'd like to welcome back my fellow MATLAB Central blogger Brett Shoelson for the second in a three-part series on extracting curve values from a plot. You can find Brett over at the <a href=\"https:\/\/blogs.mathworks.com\/pick\/\">File Exchange Pick of the Week blog<\/a>, or you can check out his many <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/911\">File Exchange contributions<\/a>. -Steve<\/i><\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#0c8880f2-9ed9-40d0-9c71-c98faa4c0524\">Quick recap<\/a><\/li><li><a href=\"#efd8f65b-b2dc-4c12-b251-5b95cb3005e3\">Next up, segmentation<\/a><\/li><li><a href=\"#5bd5924a-4d03-46f0-88e4-dd06d6f5b522\">Morphologically clean the image<\/a><\/li><li><a href=\"#40b82ca5-afe0-42d1-bee1-0ebb71e54fb3\">Extracting the curve of interest<\/a><\/li><li><a href=\"#19495711-3d63-4c8c-9160-6dad69196383\">BWTRACEBOUNDARY<\/a><\/li><li><a href=\"#dd339f62-7b39-4616-9f82-6e715af8578e\">Using <tt>regionprops<\/tt><\/a><\/li><li><a href=\"#d1926b4f-e6ae-497e-9220-924f5a7945be\">Next up: fitting a curve<\/a><\/li><li><a href=\"#1ca56367-2370-4747-ac82-9b9abe5a8062\">The complete series<\/a><\/li><\/ul><\/div><h4>Quick recap<a name=\"0c8880f2-9ed9-40d0-9c71-c98faa4c0524\"><\/a><\/h4><p>Continuing from last week's guest post, we're heading down the path of extracting real data from the \"efficiency curve\" in a graphical representation of those data. To recap, here's the original image:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/pick\/files\/OriginalChart.png\" alt=\"\"> <\/p><p>Here are the steps we need to repeat from part 1<\/p><pre class=\"codeinput\">url = <span class=\"string\">'https:\/\/blogs.mathworks.com\/images\/steve\/2013\/example-pump-perf-curve.jpg'<\/span>;\r\nchart = imread(url);\r\nbw = im2bw(imcomplement(rgb2gray(chart)));\r\nfilled = imfill(bw,<span class=\"string\">'holes'<\/span>);\r\ncc = bwconncomp(filled);\r\nstats = regionprops(cc,<span class=\"string\">'Area'<\/span>);\r\nA = [stats.Area];\r\n[~,biggest] = max(A);\r\nfilled(labelmatrix(cc)~=biggest) = 0;\r\nbb = regionprops(filled,<span class=\"string\">'BoundingBox'<\/span>); <span class=\"comment\">%We'll use this later!<\/span>\r\nbb = bb.BoundingBox;\r\nchart(~repmat(filled,[1 1 3])) = 0;\r\ncurveOfInterest = rgb2ycbcr(chart);\r\ncurveOfInterest = curveOfInterest(:,:,2);\r\n<\/pre><p>By the end of the first post, I had managed to reduce the SNR so that I ended up with an image (curveOfInterest) that contains a single region...that partially isolates the two blue curves in the top axes:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/pick\/files\/ExploreRGBChart4.png\" alt=\"\"> <\/p><h4>Next up, segmentation<a name=\"efd8f65b-b2dc-4c12-b251-5b95cb3005e3\"><\/a><\/h4><p>Many (perhaps most) image processing problems require some approach to \"segmenting\" the image--creating a binary mask that shows white (\"true,\" or 1) where we have signal, and black (\"false,\" or 0) elsewhere. Now that I've masked and converted the original image to YCbCr (\"luminance\/chrominance\/chrominance\"), and selected the second (Cb) plane of the resulting image, segmentation is relatively easy. I'm going to use <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/38484-segmenttool--an-interactive-gui-for-segmenting-images\">SegmentTool<\/a> to explore options for doing that:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/pick\/files\/SegmentToolChart.png\" alt=\"\"> <\/p><p>Simple thresholding works well here (try it!), but I'm going to use a different approach--one that sometimes succeeds where thresholding fails.<\/p><pre class=\"codeinput\">curveOfInterest = imextendedmax(im2double(curveOfInterest),0.30,8);\r\nimshow(curveOfInterest)\r\ntitle(<span class=\"string\">'Binary Curves'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_01.png\" alt=\"\"> <h4>Morphologically clean the image<a name=\"5bd5924a-4d03-46f0-88e4-dd06d6f5b522\"><\/a><\/h4><p>For subsequent analysis, and extraction of data, it will be useful to <a href=\"https:\/\/www.mathworks.com\/help\/images\/morphological-filtering.html\">morphologically alter<\/a> the image. <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/23697-image-morphology\">MorphTool<\/a> will be your best friend in that regard! ;)<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/pick\/files\/MorphToolChart.png\" alt=\"\"> <\/p><p>After playing around a bit, I decided to dilate the image, and then to skeletonize it and remove spurs. (<a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/bwmorph.html\"><tt>bwmorph<\/tt><\/a> makes both of those latter processes easy.):<\/p><pre class=\"codeinput\">curveOfInterest = imdilate(curveOfInterest, strel(<span class=\"string\">'Disk'<\/span>,6));\r\ncurveOfInterest = bwmorph(curveOfInterest, <span class=\"string\">'skeleton'<\/span>, Inf);\r\ncurveOfInterest = bwmorph(curveOfInterest, <span class=\"string\">'spur'<\/span>, 11);\r\nimshow(curveOfInterest)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_02.png\" alt=\"\"> <p>If the curves in that image don't look contiguous, recognize that that's just a screen-resolution issue. There's now exactly one region in this image:<\/p><pre class=\"codeinput\">cc = bwconncomp(curveOfInterest);\r\ncc.NumObjects\r\n<\/pre><pre class=\"codeoutput\">\r\nans =\r\n\r\n     1\r\n\r\n<\/pre><h4>Extracting the curve of interest<a name=\"40b82ca5-afe0-42d1-bee1-0ebb71e54fb3\"><\/a><\/h4><p><b>And now for the difficult part<\/b><\/p><p>Okay, we're well on our way. We've eliminated the black and red curves that we don't need, and we've got the curve of interest reduced to a skeleton that contains the minimum information that we <i>do<\/i> need. But we still have to find a way to extract <i>only<\/i> the efficiency curve, and to discard the curve representing \"334-mm\" of head. We can't use color; the curves in the original image are identical. So we're left with differentiating the curves based on their morphological (shape) characteristics.<\/p><p><b>I'm going to do this two different ways<\/b><\/p><p>Sometimes it's useful to be able to trace the boundary of a region, or (as we do here), to trace along a particular path. <tt>bwtraceboundary<\/tt> will allow us to do just that.<\/p><h4>BWTRACEBOUNDARY<a name=\"19495711-3d63-4c8c-9160-6dad69196383\"><\/a><\/h4><p>If you've never used it before, take a moment to read the <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/bwtraceboundary.html\">documentation for <tt>bwtraceboundary<\/tt><\/a>. You'll find that it allows you to operate on a binary image by specifying a starting point (<i>P<\/i>), an initial trajectory (<i>fstep<\/i>) along a path in the image, and a direction (<i>dir<\/i>) in which you want to trace the path. (\"Direction\" here allows you to specify the overall orientation of the boundary you want to trace.)<\/p><p><b>Automating the determination of the starting point<\/b><\/p><p>Recalling that the curve of interest always starts in the lower left of the upper axes, we can readily automate the determination of the starting point. (It will be the point closest to the bottom left of the masked axes.) I'm going to use <tt>regionprops<\/tt> to determine the bounding box of the masked axes (currently contained in variable \"filled\"). However, the doc for regionprops indicates that BoundingBox is measured from the upper left; I need the lower left so I'll have to manipulate the calculation of the bounding box a bit:<\/p><pre class=\"codeinput\">targetPoint = regionprops(filled,<span class=\"string\">'BoundingBox'<\/span>);\r\ntargetPoint = targetPoint.BoundingBox(1:2) + [0 targetPoint.BoundingBox(4)];\r\n[ys,xs] = find(curveOfInterest);\r\n<\/pre><p>Now it remains to get the distance from all points to the lower left corner of the bounding box, and to find the point that minimizes that distance. (I'm going to use a Statistics Toolbox function here, but this is a relatively easy function to write if you don't have that Toolbox.)<\/p><pre class=\"codeinput\">[~,ind] = min(pdist2(targetPoint,[xs,ys]));\r\nP = [ys(ind),xs(ind)]; <span class=\"comment\">%(Careful here: P is represented as [row,col], not [x,y]!)<\/span>\r\n<\/pre><p><b>Now we need to <i>orient<\/i> the trace<\/b><\/p><p>Consider the results using two different orientations:<\/p><pre class=\"codeinput\">N = 1000; <span class=\"comment\">%If I leave this unbounded (Inf), all points on the curve will be detected!<\/span>\r\n<span class=\"comment\">% Counterclockwise<\/span>\r\nBccw = bwtraceboundary(curveOfInterest,P,<span class=\"string\">'NE'<\/span>,8,N,<span class=\"string\">'counterclockwise'<\/span>);\r\nimshow(curveOfInterest);\r\nhold <span class=\"string\">on<\/span>\r\nplot(P(2),P(1),<span class=\"string\">'go'<\/span>,<span class=\"string\">'MarkerSize'<\/span>,12)\r\nplot(Bccw(:,2),Bccw(:,1),<span class=\"string\">'r'<\/span>,<span class=\"string\">'LineWidth'<\/span>,5);\r\ntitle(<span class=\"string\">'Counterclockwise'<\/span>)\r\nhold <span class=\"string\">off<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_03.png\" alt=\"\"> <p>Now clockwise.<\/p><pre class=\"codeinput\">Bcw = bwtraceboundary(curveOfInterest,P,<span class=\"string\">'NE'<\/span>,8,N,<span class=\"string\">'clockwise'<\/span>);\r\nimshow(curveOfInterest);\r\nhold <span class=\"string\">on<\/span>\r\nplot(P(2),P(1),<span class=\"string\">'go'<\/span>,<span class=\"string\">'MarkerSize'<\/span>,12)\r\nplot(Bcw(:,2),Bcw(:,1),<span class=\"string\">'r'<\/span>,<span class=\"string\">'LineWidth'<\/span>,5);\r\ntitle(<span class=\"string\">'Clockwise'<\/span>)\r\nhold <span class=\"string\">off<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_04.png\" alt=\"\"> <p>That's interesting; I see that if I specify a clockwise orientation for my curve (and if I use a sufficiently large <i>N<\/i>), I can constrain the boundary-tracing algorith to <i>roughly<\/i> follow the efficiency curve I'm after. However, there's a significant spur on the data that may present difficulties when we try to fit the data to a curve that reasonably represents the efficiency-versus-flow rate. If I break this up into smaller pieces and examine the output at each stage, I find that I can significantly improve this result. Here, for instance, I \"step\" in an easterly direction, initially specifying a counterclockwise orientation. Then I iterate over small chunks, and evaluate the results after each iteration. When the curve stops increasing in the _x-_direction (i.e., when the largest column value isn't equal to the step size), I change the orientation to keep the trace following the desired curve. (Note that there are lots of combinations of parameters that will work here; this step took some experimentation!)<\/p><pre class=\"language-matlab\">sz = 25;\r\nind = sz;\r\nnIters = 100;\r\nallBs = nan(sz*nIters,2);\r\nPnew = P; <span class=\"comment\">%reset<\/span>\r\nh = text(200,385,<span class=\"string\">'Direction:'<\/span>,<span class=\"string\">'FontSize'<\/span>,14,<span class=\"keyword\">...<\/span>\r\n    <span class=\"string\">'FontWeight'<\/span>,<span class=\"string\">'bold'<\/span>,<span class=\"string\">'color'<\/span>,<span class=\"string\">'w'<\/span>,<span class=\"string\">'HorizontalAlignment'<\/span>,<span class=\"string\">'left'<\/span>)\r\n<span class=\"keyword\">for<\/span> ii = 1:nIters\r\n    <span class=\"keyword\">if<\/span> ind == sz\r\n        direction = <span class=\"string\">'counterclockwise'<\/span>;\r\n    <span class=\"keyword\">else<\/span>\r\n        direction = <span class=\"string\">'clockwise'<\/span>;\r\n    <span class=\"keyword\">end<\/span>\r\n    set(h, <span class=\"string\">'string'<\/span>,[<span class=\"string\">'Direction: '<\/span>,direction])\r\n    <span class=\"comment\">%B yields [ROW,COL]:<\/span>\r\n    B = bwtraceboundary(curveOfInterest,Pnew,<span class=\"string\">'E'<\/span>,8,sz,direction);\r\n    hold <span class=\"string\">on<\/span>; plot(B(:,2),B(:,1),<span class=\"string\">'r.'<\/span>);drawnow;\r\n    allBs((ii-1)*sz+1:(ii-1)*sz+sz,:) = B;\r\n    [~,ind] = max(B(:,2));\r\n    <span class=\"comment\">% Update the starting point, Pnew:<\/span>\r\n    Pnew = B(ind,:);\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/pick\/files\/BWTRACEBOUNDARY.gif\" alt=\"\"> <\/p><p>That result looks great, and will facilitate a good curve fit!<\/p><h4>Using <tt>regionprops<\/tt><a name=\"dd339f62-7b39-4616-9f82-6e715af8578e\"><\/a><\/h4><p>Anyone who has ever heard me lecture about image processing has likely heard me mention that <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/regionprops.html\"><tt>regionprops<\/tt><\/a> can make difficult problems easy. Let's first use <tt>bwmorph<\/tt> to find the intersection of the two curves; it will present as a branch point in the skeletonized curve.<\/p><pre class=\"codeinput\">branchPoints = bwmorph(curveOfInterest,<span class=\"string\">'branchpoints'<\/span>);\r\n<\/pre><p>Now dilate that point and use it to break the curves at their branches.<\/p><pre class=\"codeinput\">branchPoints = imdilate(branchPoints,strel(<span class=\"string\">'disk'<\/span>,3));\r\ncurveOfInterest = curveOfInterest &amp; ~branchPoints;\r\nimshow(curveOfInterest)\r\naxis([100 420 180 400])\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_05.png\" alt=\"\"> <p>In this zoomed view, we can see that we've now broken the curve into branches. We can also verify that we now have multiple ROIs...<\/p><pre class=\"codeinput\">cc = bwconncomp(curveOfInterest)\r\n<\/pre><pre class=\"codeoutput\">\r\ncc = \r\n\r\n    Connectivity: 8\r\n       ImageSize: [738 1067]\r\n      NumObjects: 11\r\n    PixelIdxList: {1x11 cell}\r\n\r\n<\/pre><p>...and we can use their shape properties to select the ones we want. (I will select the ones comprising more than 10 connected pixels, and that have a positive orientation (angle from the horizontal).<\/p><pre class=\"codeinput\">stats = regionprops(cc, <span class=\"string\">'Area'<\/span>,<span class=\"string\">'Orientation'<\/span>,<span class=\"string\">'Centroid'<\/span>);\r\nidx = find([stats.Area] &gt; 10 &amp; [stats.Orientation] &gt; 0);\r\ncurveOfInterest = ismember(labelmatrix(cc),idx);\r\nclf\r\nimshow(curveOfInterest)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_06.png\" alt=\"\"> <pre class=\"codeinput\">figure\r\n[ys,xs] = find(curveOfInterest);\r\nplot(xs,ys,<span class=\"string\">'r.'<\/span>,<span class=\"string\">'MarkerSize'<\/span>,7)\r\naxis <span class=\"string\">ij<\/span>\r\naxis <span class=\"string\">equal<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_07.png\" alt=\"\"> <h4>Next up: fitting a curve<a name=\"d1926b4f-e6ae-497e-9220-924f5a7945be\"><\/a><\/h4><p>So that essentially wraps up the \"image processing aspect\" of the problem, and from here, it becomes a problem of fitting a curve. For completeness, I'll hijack Steve's blog one more time to show how the Curve-Fitting Toolbox facilitates that process. See you next week!<\/p><h4>The complete series<a name=\"1ca56367-2370-4747-ac82-9b9abe5a8062\"><\/a><\/h4><div><ul><li><a href=\"https:\/\/blogs.mathworks.com\/steve\/2013\/12\/31\/automating-data-extraction-1\/\">Part 1<\/a><\/li><li><a href=\"https:\/\/blogs.mathworks.com\/steve\/2014\/01\/07\/automating-data-extraction-2\/\">Part 2<\/a><\/li><li><a href=\"https:\/\/blogs.mathworks.com\/steve\/2014\/01\/14\/automating-data-extraction-3\/\">Part 3<\/a><\/li><\/ul><\/div><script language=\"JavaScript\"> <!-- \r\n    function grabCode_5619775b69ae47678e596dab7c40962e() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='5619775b69ae47678e596dab7c40962e ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 5619775b69ae47678e596dab7c40962e';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2014 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_5619775b69ae47678e596dab7c40962e()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2013b<br><\/p><p class=\"footer\"><br>\r\n      Published with MATLAB&reg; R2013b<br><\/p><\/div><!--\r\n5619775b69ae47678e596dab7c40962e ##### SOURCE BEGIN #####\r\n%% Automating the extraction of real data from an image of the data, Part 2\r\n%\r\n% _I'd like to welcome back my fellow MATLAB Central blogger Brett Shoelson\r\n% for the second in a three-part series on extracting curve values from a\r\n% plot. You can find Brett over at the <https:\/\/blogs.mathworks.com\/pick\/ \r\n% File Exchange Pick of the Week blog>, or you \r\n% can check out his many <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/911 \r\n% File Exchange contributions>. -Steve_\r\n%\r\n%% Quick recap\r\n% Continuing from\r\n% <https:\/\/blogs.mathworks.com\/steve\/2013\/12\/13\/automating-data-extraction-1\/\r\n% last week's guest post>, we're heading down the path of extracting real\r\n% data from the \"efficiency curve\" in a graphical representation of those\r\n% data. To recap, here's the original image:\r\n\r\n%%\r\n% \r\n% <<https:\/\/blogs.mathworks.com\/pick\/files\/OriginalChart.png>>\r\n% \r\n\r\n%%\r\n% Here are the steps we need to repeat from\r\n% <https:\/\/blogs.mathworks.com\/steve\/2013\/12\/13\/automating-data-extraction-1\/\r\n% part 1>\r\nurl = 'https:\/\/blogs.mathworks.com\/images\/steve\/2013\/example-pump-perf-curve.jpg';\r\nchart = imread(url);\r\nbw = im2bw(imcomplement(rgb2gray(chart)));\r\nfilled = imfill(bw,'holes');\r\ncc = bwconncomp(filled);\r\nstats = regionprops(cc,'Area');\r\nA = [stats.Area];\r\n[~,biggest] = max(A);\r\nfilled(labelmatrix(cc)~=biggest) = 0;\r\nbb = regionprops(filled,'BoundingBox'); %We'll use this later!\r\nbb = bb.BoundingBox;\r\nchart(~repmat(filled,[1 1 3])) = 0;\r\ncurveOfInterest = rgb2ycbcr(chart);\r\ncurveOfInterest = curveOfInterest(:,:,2);\r\n\r\n%%\r\n% By the end of the first post, I had managed to reduce the SNR so that I\r\n% ended up with an image (curveOfInterest) that contains a single\r\n% region...that partially isolates the two blue curves in the top axes:\r\n\r\n%%\r\n% \r\n% <<https:\/\/blogs.mathworks.com\/pick\/files\/ExploreRGBChart4.png>>\r\n% \r\n\r\n%% Next up, segmentation\r\n% Many (perhaps most) image processing problems require some approach to\r\n% \"segmenting\" the imageREPLACE_WITH_DASH_DASHcreating a binary mask that shows white (\"true,\"\r\n% or 1) where we have signal, and black (\"false,\" or 0) elsewhere. Now that\r\n% I've masked and converted the original image to YCbCr\r\n% (\"luminance\/chrominance\/chrominance\"), and selected the second (Cb) plane\r\n% of the resulting image, segmentation is relatively easy. I'm going to use\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/38484-segmenttool--an-interactive-gui-for-segmenting-images SegmentTool>\r\n% to explore options for doing that:\r\n\r\n%%\r\n% \r\n% <<https:\/\/blogs.mathworks.com\/pick\/files\/SegmentToolChart.png>>\r\n% \r\n%%\r\n% Simple thresholding works well here (try it!), but I'm going to use a\r\n% different approachREPLACE_WITH_DASH_DASHone that sometimes succeeds where thresholding fails.\r\ncurveOfInterest = imextendedmax(im2double(curveOfInterest),0.30,8);\r\nimshow(curveOfInterest)\r\ntitle('Binary Curves')\r\n\r\n%% Morphologically clean the image\r\n% For subsequent analysis, and extraction of data, it will be useful to\r\n% <https:\/\/www.mathworks.com\/help\/images\/morphological-filtering.html morphologically alter>\r\n% the image.\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/23697-image-morphology MorphTool>\r\n% will be your best friend in that regard! ;)\r\n\r\n%%\r\n% \r\n% <<https:\/\/blogs.mathworks.com\/pick\/files\/MorphToolChart.png>>\r\n% \r\n\r\n%%\r\n% After playing around a bit, I decided to dilate the image, and then to\r\n% skeletonize it and remove spurs.\r\n% (<https:\/\/www.mathworks.com\/help\/images\/ref\/bwmorph.html |bwmorph|>\r\n% makes both of those latter processes easy.):\r\ncurveOfInterest = imdilate(curveOfInterest, strel('Disk',6));\r\ncurveOfInterest = bwmorph(curveOfInterest, 'skeleton', Inf);\r\ncurveOfInterest = bwmorph(curveOfInterest, 'spur', 11);\r\nimshow(curveOfInterest)\r\n\r\n%%\r\n% If the curves in that image don't look contiguous, recognize that that's\r\n% just a screen-resolution issue. There's now exactly one region in this\r\n% image:\r\ncc = bwconncomp(curveOfInterest);\r\ncc.NumObjects\r\n\r\n%% Extracting the curve of interest\r\n% *And now for the difficult part*\r\n%\r\n% Okay, we're well on our way. We've eliminated the black and red curves\r\n% that we don't need, and we've got the curve of interest reduced to a\r\n% skeleton that contains the minimum information that we _do_ need. But we\r\n% still have to find a way to extract _only_ the efficiency curve, and to\r\n% discard the curve representing \"334-mm\" of head. We can't use color; the\r\n% curves in the original image are identical. So we're left with\r\n% differentiating the curves based on their morphological (shape)\r\n% characteristics.\r\n\r\n%% \r\n% *I'm going to do this two different ways*\r\n% \r\n% Sometimes it's useful to be able to trace the boundary of a region, or\r\n% (as we do here), to trace along a particular path. |bwtraceboundary| will\r\n% allow us to do just that.\r\n\r\n%% BWTRACEBOUNDARY\r\n% If you've never used it before, take a moment to read the\r\n% <https:\/\/www.mathworks.com\/help\/images\/ref\/bwtraceboundary.html documentation for |bwtraceboundary|>.\r\n% You'll find that it allows you to operate on a binary image by specifying \r\n% a starting point (_P_), an initial trajectory (_fstep_) along a path in\r\n% the image, and a direction (_dir_) in which you want to trace the path.\r\n% (\"Direction\" here allows you to specify the overall orientation of the\r\n% boundary you want to trace.)\r\n\r\n%%\r\n% *Automating the determination of the starting point*\r\n% \r\n% Recalling that the curve of interest always starts in the lower left of\r\n% the upper axes, we can readily automate the determination of the starting\r\n% point. (It will be the point closest to the bottom left of the masked\r\n% axes.) I'm going to use |regionprops| to determine the bounding box of\r\n% the masked axes (currently contained in variable \"filled\"). However, the\r\n% doc for regionprops indicates that BoundingBox is measured from the upper\r\n% left; I need the lower left so I'll have to manipulate the calculation of\r\n% the bounding box a bit:\r\ntargetPoint = regionprops(filled,'BoundingBox');\r\ntargetPoint = targetPoint.BoundingBox(1:2) + [0 targetPoint.BoundingBox(4)];\r\n[ys,xs] = find(curveOfInterest);\r\n\r\n%%\r\n% Now it remains to get the distance from all points to the lower left corner\r\n% of the bounding box, and to find the point that minimizes that distance.\r\n% (I'm going to use a Statistics Toolbox function here, but this is a \r\n% relatively easy function to write if you don't have that Toolbox.)\r\n[~,ind] = min(pdist2(targetPoint,[xs,ys]));\r\nP = [ys(ind),xs(ind)]; %(Careful here: P is represented as [row,col], not [x,y]!)\r\n\r\n%% \r\n% *Now we need to _orient_ the trace*\r\n%\r\n% Consider the results using two different orientations:\r\nN = 1000; %If I leave this unbounded (Inf), all points on the curve will be detected!\r\n% Counterclockwise\r\nBccw = bwtraceboundary(curveOfInterest,P,'NE',8,N,'counterclockwise');\r\nimshow(curveOfInterest);\r\nhold on\r\nplot(P(2),P(1),'go','MarkerSize',12)\r\nplot(Bccw(:,2),Bccw(:,1),'r','LineWidth',5);\r\ntitle('Counterclockwise')\r\nhold off\r\n\r\n%%\r\n% Now clockwise.\r\nBcw = bwtraceboundary(curveOfInterest,P,'NE',8,N,'clockwise');\r\nimshow(curveOfInterest);\r\nhold on\r\nplot(P(2),P(1),'go','MarkerSize',12)\r\nplot(Bcw(:,2),Bcw(:,1),'r','LineWidth',5);\r\ntitle('Clockwise')\r\nhold off\r\n\r\n%%\r\n% That's interesting; I see that if I specify a clockwise orientation for\r\n% my curve (and if I use a sufficiently large _N_), I can constrain the\r\n% boundary-tracing algorith to _roughly_ follow the efficiency curve I'm\r\n% after. However, there's a significant spur on the data that may present\r\n% difficulties when we try to fit the data to a curve that reasonably\r\n% represents the efficiency-versus-flow rate. If I break this up into\r\n% smaller pieces and examine the output at each stage, I find that I can\r\n% significantly improve this result. Here, for instance, I \"step\" in an\r\n% easterly direction, initially specifying a counterclockwise orientation.\r\n% Then I iterate over small chunks, and evaluate the results after each\r\n% iteration. When the curve stops increasing in the _x-_direction (i.e.,\r\n% when the largest column value isn't equal to the step size), I change the\r\n% orientation to keep the trace following the desired curve. (Note that\r\n% there are lots of combinations of parameters that will work here; this\r\n% step took some experimentation!)\r\n\r\n%% \r\n%   sz = 25; \r\n%   ind = sz;\r\n%   nIters = 100;\r\n%   allBs = nan(sz*nIters,2); \r\n%   Pnew = P; %reset\r\n%   h = text(200,385,'Direction:','FontSize',14,...\r\n%       'FontWeight','bold','color','w','HorizontalAlignment','left')\r\n%   for ii = 1:nIters\r\n%       if ind == sz\r\n%           direction = 'counterclockwise';\r\n%       else\r\n%           direction = 'clockwise';\r\n%       end\r\n%       set(h, 'string',['Direction: ',direction])\r\n%       %B yields [ROW,COL]:\r\n%       B = bwtraceboundary(curveOfInterest,Pnew,'E',8,sz,direction);\r\n%       hold on; plot(B(:,2),B(:,1),'r.');drawnow;\r\n%       allBs((ii-1)*sz+1:(ii-1)*sz+sz,:) = B;\r\n%       [~,ind] = max(B(:,2));\r\n%       % Update the starting point, Pnew:\r\n%       Pnew = B(ind,:);\r\n%   end\r\n\r\n%%\r\n% \r\n% <<https:\/\/blogs.mathworks.com\/pick\/files\/BWTRACEBOUNDARY.gif>>\r\n% \r\n\r\n%% \r\n% That result looks great, and will facilitate a good curve fit!\r\n\r\n%% Using |regionprops|\r\n% Anyone who has ever heard me lecture about image processing has likely\r\n% heard me mention that\r\n% <https:\/\/www.mathworks.com\/help\/images\/ref\/regionprops.html |regionprops|>\r\n% can make difficult problems easy. Let's first use |bwmorph| to find the\r\n% intersection of the two curves; it will present as a branch point in the\r\n% skeletonized curve.\r\nbranchPoints = bwmorph(curveOfInterest,'branchpoints');\r\n\r\n%%\r\n% Now dilate that point and use it to break the curves at their branches.\r\nbranchPoints = imdilate(branchPoints,strel('disk',3));\r\ncurveOfInterest = curveOfInterest & ~branchPoints;\r\nimshow(curveOfInterest)\r\naxis([100 420 180 400])\r\n\r\n%%\r\n% In this zoomed view, we can see that we've now broken the curve into\r\n% branches. We can also verify that we now have multiple ROIs...\r\ncc = bwconncomp(curveOfInterest)\r\n\r\n%%\r\n% ...and we can use their shape properties to select the ones we want. (I\r\n% will select the ones comprising more than 10 connected pixels, and that\r\n% have a positive orientation (angle from the horizontal).\r\n%\r\nstats = regionprops(cc, 'Area','Orientation','Centroid');\r\nidx = find([stats.Area] > 10 & [stats.Orientation] > 0);\r\ncurveOfInterest = ismember(labelmatrix(cc),idx);\r\nclf\r\nimshow(curveOfInterest)\r\n\r\n%%\r\nfigure\r\n[ys,xs] = find(curveOfInterest);\r\nplot(xs,ys,'r.','MarkerSize',7)\r\naxis ij\r\naxis equal\r\n\r\n%% Next up: fitting a curve\r\n% So that essentially wraps up the \"image processing aspect\" of the\r\n% problem, and from here, it becomes a problem of fitting a curve. For\r\n% completeness, I'll hijack Steve's blog one more time to show how the\r\n% Curve-Fitting Toolbox facilitates that process. See you next week!\r\n\r\n%% The complete series\r\n%\r\n% * <https:\/\/blogs.mathworks.com\/steve\/2013\/12\/31\/automating-data-extraction-1\/ Part 1> \r\n% * <https:\/\/blogs.mathworks.com\/steve\/2014\/01\/07\/automating-data-extraction-2\/ Part 2>\r\n% * <https:\/\/blogs.mathworks.com\/steve\/2014\/01\/14\/automating-data-extraction-3\/ Part 3>\r\n\r\n##### SOURCE END ##### 5619775b69ae47678e596dab7c40962e\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2013\/extractEfficiencyWriteup2_07.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p><i>I'd like to welcome back my fellow MATLAB Central blogger Brett Shoelson for the second in a three-part series on extracting curve values from a plot. You can find Brett over at the <a href=\"https:\/\/blogs.mathworks.com\/pick\/\">File Exchange Pick of the Week blog<\/a>, or you can check out his many <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/911\">File Exchange contributions<\/a>. -Steve<\/i>... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2014\/01\/07\/automating-data-extraction-2\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[50,561,86,442,178,725,348,90,84,390,146,124,142,136,76,36,422,563,122,380,575,1057,68,168,116,104,1055,106,78,52],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/933"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=933"}],"version-history":[{"count":12,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/933\/revisions"}],"predecessor-version":[{"id":2303,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/933\/revisions\/2303"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=933"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=933"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=933"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}