{"id":9246,"date":"2018-02-02T09:00:08","date_gmt":"2018-02-02T14:00:08","guid":{"rendered":"https:\/\/blogs.mathworks.com\/pick\/?p=9246"},"modified":"2018-02-16T16:47:09","modified_gmt":"2018-02-16T21:47:09","slug":"new-mathworks-tools","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/pick\/2018\/02\/02\/new-mathworks-tools\/","title":{"rendered":"New MathWorks Tools"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\n   <introduction><\/p>\n<p><a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/3208495\">Sean<\/a>&#8216;s pick this week is to revisit three prior Picks of the Week.  While reading through the 2017 review and reviews for the previous years, I saw a few picks where MathWorks has now incorporated similar functionality into the product.\n      <\/p>\n<p>   <\/introduction><\/p>\n<h3>Contents<\/h3>\n<div>\n<ul>\n<li><a href=\"#1\">Extract Text from PDF Documents<\/a><\/li>\n<li><a href=\"#3\">Base 64 Encoding<\/a><\/li>\n<li><a href=\"#5\">Word Cloud<\/a><\/li>\n<li><a href=\"#8\">Comments<\/a><\/li>\n<\/ul><\/div>\n<h3>Extract Text from PDF Documents<a name=\"1\"><\/a><\/h3>\n<p>Jiro&#8217;s original pick is here: <a href=\"https:\/\/blogs.mathworks.com\/pick\/2017\/07\/21\/extract-text-from-pdf-documents\/\">https:\/\/blogs.mathworks.com\/pick\/2017\/07\/21\/extract-text-from-pdf-documents\/<\/a>.\n   <\/p>\n<p>This functionality was added in the <a href=\"https:\/\/www.mathworks.com\/products\/text-analytics.html\">Text Analytics Toolbox<\/a>, released in R2017b.  The function to use is <tt><a href=\"https:\/\/www.mathworks.com\/help\/textanalytics\/ref\/extractfiletext.html\">extractFileText<\/a><\/tt>.  
Note that this is a generic text reading function that can read from PDF, Microsoft Word, or text files.\n   <\/p>\n<p>Here, I&#8217;ll read the third page of the 2017 Pick of the Week index exported to PDF.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">txt = extractFileText(<span style=\"color: #A020F0\">'2017review.pdf'<\/span>, <span style=\"color: #A020F0\">'Pages'<\/span>, 3);\r\nlines = splitlines(txt);\r\nlines(strlength(lines) &gt; 0)<\/pre>\n<pre style=\"font-style:oblique\">ans = \r\n  46&times;1 string array\r\n    \"1\/3\/2018 Looking back: 2017 in review &raquo; File Exchange Pick of the Week\"\r\n    \"https:\/\/blogs.mathworks.com\/pick\/2017\/12\/29\/looking-back-2017-in-review\/ 3\/5\"\r\n    \"Deep Learning Tutorial Series\"\r\n    \"Johanna Pingel\"\r\n    \"Download code and watch video series to learn and implement deep learning techniques\"\r\n    \"__________________________________________________________________________\"\r\n    \"Process Manager\"\r\n    \"Brian Lau\"\r\n    \"Matlab class for launching and managing asynchronous processes\"\r\n    \"__________________________________________________________________________\"\r\n    \"CatStruct\"\r\n    \"Jos (10584)\"\r\n    \"Concatenate\/merge structures (v4.1, feb 2015).\"\r\n    \"__________________________________________________________________________\"\r\n    \"Source Control Information Block\"\r\n    \"Gavin Walker\"\r\n    \"Display Simulink project source control information in the Simulink editor\"\r\n    \"__________________________________________________________________________\"\r\n    \"CNN for Old Japanese Character Classification\"\r\n    \"Akira Agata\"\r\n    \"Create Simple Deep Learning Network for Old Japanese Character Classification\"\r\n    \"__________________________________________________________________________\"\r\n    \"Fidget Spinner (Simscape Multibody)\"\r\n    \"Pavel Roslovets\"\r\n    \"3DOF gyro 
psysical model of fidger spinner\"\r\n    \"__________________________________________________________________________\"\r\n    \"Signature Tool\"\r\n    \"McSCert\"\r\n    \"The Signature Tool extracts the interface of a Simulink subsystem.\"\r\n    \"__________________________________________________________________________\"\r\n    \"&#8220;Read text from a PDF document&#8221;\"\r\n    \"Derek Wood\"\r\n    \"Read the text from a simple PDF document into MATLAB as a string\"\r\n    \"__________________________________________________________________________\"\r\n    \"Real-Time Pacer for Simulink\"\r\n    \"Gautam Vallabha\"\r\n    \"Simulink block for forcing a simulation to run in real (wall clock) time\"\r\n    \"__________________________________________________________________________\"\r\n    \"impressionism\"\r\n    \"David Mills\"\r\n    \"impressionism takes an RGB image and &#8220;paints&#8221; it as though it were an impressionist painting.\"\r\n    \"__________________________________________________________________________\"\r\n    \"OOP example\"\r\n    \"per isakson\"\r\n    \"tracer4m traces calls to methods and functions.\"\r\n    \"__________________________________________________________________________\"\r\n<\/pre>\n<h3>Base 64 Encoding<a name=\"3\"><\/a><\/h3>\n<p>Jiro&#8217;s original pick is here: <a href=\"https:\/\/blogs.mathworks.com\/pick\/2016\/12\/23\/encode-images-as-base64\/\">https:\/\/blogs.mathworks.com\/pick\/2016\/12\/23\/encode-images-as-base64\/<\/a><\/p>\n<p>We were unaware that this was added as part of the HTTP interface in R2016b. 
The functions to use are <tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.net.base64encode.html\">matlab.net.base64encode<\/a><\/tt> and <tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.net.base64decode.html\">matlab.net.base64decode<\/a><\/tt> to encode and decode images.\n   <\/p>\n<p>Here I will encode and decode an image of two cars on my desk.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">import <span style=\"color: #A020F0\">matlab.net.*<\/span>\r\nI = imread(<span style=\"color: #A020F0\">'lambos.jpg'<\/span>);\r\nbase64 = base64encode(I(:));\r\ncars = base64decode(base64);\r\nimshow(reshape(cars, size(I)))<\/pre>\n<p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/pick\/Sean\/mainReview1plain\/mainReview1plain_01.png\"> <\/p>\n<h3>Word Cloud<a name=\"5\"><\/a><\/h3>\n<p>My original pick is here: <a href=\"https:\/\/blogs.mathworks.com\/pick\/2015\/10\/09\/word-data-visualization\/\">https:\/\/blogs.mathworks.com\/pick\/2015\/10\/09\/word-data-visualization\/<\/a>.\n   <\/p>\n<p>This capability was added to MATLAB in R2017b as the <tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/wordcloud.html\">wordcloud<\/a><\/tt> function.  The Text Analytics Toolbox further <a href=\"https:\/\/www.mathworks.com\/help\/textanalytics\/ref\/ldamodel.wordcloud.html\">enhances it<\/a> and provides <a href=\"https:\/\/www.mathworks.com\/help\/textanalytics\/display-and-presentation.html\">other ways to display text<\/a> as well.  
Let&#8217;s see the word cloud of the Pick of the Week review.\n   <\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">txt = extractFileText(<span style=\"color: #A020F0\">'2017review.pdf'<\/span>);\r\nwordcloud(txt);<\/pre>\n<p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/pick\/Sean\/mainReview1plain\/mainReview1plain_02.png\"> <\/p>\n<p>It makes me happy to see &#8220;Cannibals&#8221; in there!<\/p>\n<h3>Comments<a name=\"8\"><\/a><\/h3>\n<p>Give the MathWorks versions a try and let us know what you think <a href=\"https:\/\/blogs.mathworks.com\/pick\/?p=9246#respond\">here<\/a>.\n   <\/p>\n<p><script language=\"JavaScript\">\n<!--\n\n    function grabCode_21fb33cad34547f3ac2034989cd1fc1e() {\n        \/\/ Remember the title so we can use it in the new page\n        title = document.title;\n\n        \/\/ Break up these strings so that their presence\n        \/\/ in the Javascript doesn't mess up the search for\n        \/\/ the MATLAB code.\n        t1='21fb33cad34547f3ac2034989cd1fc1e ' + '##### ' + 'SOURCE BEGIN' + ' #####';\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 21fb33cad34547f3ac2034989cd1fc1e';\n    \n        b=document.getElementsByTagName('body')[0];\n        i1=b.innerHTML.indexOf(t1)+t1.length;\n        i2=b.innerHTML.indexOf(t2);\n \n        code_string = b.innerHTML.substring(i1, i2);\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\n\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \n        \/\/ in the XML parser.\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\n        \/\/ doesn't go ahead and substitute the less-than character. 
\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\n\n        author = 'Sean de Wolski';\n        copyright = 'Copyright 2018 The MathWorks, Inc.';\n\n        w = window.open();\n        d = w.document;\n        d.write('\n\n<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\n\n\\n');\n      \n      d.title = title + ' (MATLAB code)';\n      d.close();\n      }   \n      \n-->\n<\/script><\/p>\n<p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><a href=\"javascript:grabCode_21fb33cad34547f3ac2034989cd1fc1e()\"><span style=\"font-size: x-small;        font-style: italic;\">Get<br \/>\n            the MATLAB code<br \/>\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><\/p>\n<p>      Published with MATLAB&reg; R2018a<\/p>\n<\/div>\n<p><!--\n21fb33cad34547f3ac2034989cd1fc1e ##### SOURCE BEGIN #####\n%% New MathWorks Tools\n% <https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/3208495 Sean>'s pick \n% this week is to revisit three prior Picks of the Week.  
While reading through \n% the 2017 review and reviews for the previous years, I saw a few picks where \n% MathWorks has now incorporated similar functionality into the product.\n%% Extract Text from PDF Documents\n% Jiro's original pick is here: <https:\/\/blogs.mathworks.com\/pick\/2017\/07\/21\/extract-text-from-pdf-documents\/ \n% https:\/\/blogs.mathworks.com\/pick\/2017\/07\/21\/extract-text-from-pdf-documents\/>.\n% \n% This functionality was added in the <https:\/\/www.mathworks.com\/products\/text-analytics.html \n% Text Analytics Toolbox>, released in R2017b.  The function to use is |<https:\/\/www.mathworks.com\/help\/textanalytics\/ref\/extractfiletext.html \n% extractFileText>|.  Note that this is a generic text reading function that can \n% read from PDF, Microsoft Word, or text files.  There is also |<https:\/\/www.mathworks.com\/help\/textanalytics\/ref\/readpdfformdata.html \n% readPDFFormData>| that will pull out the contents of a PDF form.   \n% \n% Here, I'll read the third page of the 2017 Pick of the Week index exported \n% to PDF.\n%%\ntxt = extractFileText('2017review.pdf', 'Pages', 3);\nlines = splitlines(txt);\nlines(strlength(lines) > 0)\n%% Base 64 Encoding\n% Jiro's original pick is here: <https:\/\/blogs.mathworks.com\/pick\/2016\/12\/23\/encode-images-as-base64\/ \n% https:\/\/blogs.mathworks.com\/pick\/2016\/12\/23\/encode-images-as-base64\/>\n% \n% We were unaware that this was added as part of the HTTP interface in R2016b.  
\n% The functions to use are |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.net.base64encode.html \n% matlab.net.base64encode>| and |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.net.base64decode.html \n% matlab.net.base64decode>| to encode and decode images.\n% \n% Here I will encode and decode an image of two cars on my desk.\n%%\nimport matlab.net.*\nI = imread('lambos.jpg');\nbase64 = base64encode(I(:));\ncars = base64decode(base64);\nimshow(reshape(cars, size(I)))\n%% Word Cloud\n% My original pick is here: <https:\/\/blogs.mathworks.com\/pick\/2015\/10\/09\/word-data-visualization\/ \n% https:\/\/blogs.mathworks.com\/pick\/2015\/10\/09\/word-data-visualization\/>.\n% \n% This capability was added to MATLAB in R2017b as the |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/wordcloud.html \n% wordcloud>| function.  The Text Analytics Toolbox further <https:\/\/www.mathworks.com\/help\/textanalytics\/ref\/ldamodel.wordcloud.html \n% enhances it> and provides <https:\/\/www.mathworks.com\/help\/textanalytics\/display-and-presentation.html \n% other ways to display text> as well.  Let's see the word cloud of the Pick of \n% the Week review.\n%%\ntxt = extractFileText('2017review.pdf');\nwordcloud(txt);\n%% \n% It makes me happy to see \"Cannibals\" in there!\n%% Comments\n% Give the MathWorks versions a try and let us know what you think <https:\/\/blogs.mathworks.com\/pick\/?p=9246#respond \n% here>.\n##### SOURCE END ##### 21fb33cad34547f3ac2034989cd1fc1e\n--><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/pick\/Sean\/mainReview1plain\/mainReview1plain_01.png\" onError=\"this.style.display ='none';\" \/><\/div>\n<p>Sean&#8216;s pick this week is to revisit three prior Picks of the Week.  
While reading through the 2017 review and reviews for the previous years, I saw a few picks where MathWorks has now&#8230; <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/pick\/2018\/02\/02\/new-mathworks-tools\/\">read more >><\/a><\/p>\n","protected":false},"author":87,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/9246"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/users\/87"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/comments?post=9246"}],"version-history":[{"count":7,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/9246\/revisions"}],"predecessor-version":[{"id":9484,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/9246\/revisions\/9484"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/media?parent=9246"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/categories?post=9246"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/tags?post=9246"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}