{"id":4292,"date":"2019-07-03T11:04:18","date_gmt":"2019-07-03T16:04:18","guid":{"rendered":"https:\/\/blogs.mathworks.com\/videos\/?p=4292"},"modified":"2019-07-03T11:04:18","modified_gmt":"2019-07-03T16:04:18","slug":"using-detectimportoptions-with-large-text-files","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/videos\/2019\/07\/03\/using-detectimportoptions-with-large-text-files\/","title":{"rendered":"Using detectImportOptions with Large Text Files"},"content":{"rendered":"<p>Yesterday, I was loading a CSV file of about 1 million rows and 300 columns, including lots of string variables. It took a while to load. Then I remembered that I only needed 1 or 2 columns, and that <tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/detectimportoptions.html\">detectImportOptions<\/a><\/tt> helps you specify which columns to load. It lets you select columns by the variable names in the header, which is much easier than specifying column indices, especially if columns move around.<\/p>\n<p>In fact, <tt>detectImportOptions<\/tt> combined with <tt>readtable<\/tt> is now my main method of loading subsets of data from large text files. Gone are the days of trying to calculate format strings with <tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/textscan.html\">textscan<\/a><\/tt>. 
In the past, I even submitted <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/16075-textscantool\">textscantool<\/a> to the File Exchange to calculate the format strings for text files with many columns.<\/p>\n<p>Features covered in this <a href=\"https:\/\/blogs.mathworks.com\/videos\/2015\/10\/29\/matlab-code-along-videos\/\">code-along<\/a> style video include:<\/p>\n<ul>\n<li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/detectimportoptions.html\">detectImportOptions<\/a><\/tt><\/li>\n<li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/readtable.html\">readtable<\/a><\/tt><\/li>\n<\/ul>\n<p><div class=\"row\"><div class=\"col-xs-12 containing-block\"><div class=\"bc-outer-container add_margin_20\"><videoplayer><div class=\"video-js-container\"><video data-video-id=\"6055173583001\" data-video-category=\"blog\" data-autostart=\"false\" data-account=\"62009828001\" data-omniture-account=\"mathwgbl\" data-player=\"rJ9XCz2Sx\" data-embed=\"default\" id=\"mathworks-brightcove-player\" class=\"video-js\" controls><\/video><script src=\"\/\/players.brightcove.net\/62009828001\/rJ9XCz2Sx_default\/index.min.js\"><\/script><script>if (typeof(playerLoaded) === 'undefined') {var playerLoaded = false;}(function isVideojsDefined() {if (typeof(videojs) !== 'undefined') {videojs(\"mathworks-brightcove-player\").on('loadedmetadata', function() {playerLoaded = true;});} else {setTimeout(isVideojsDefined, 10);}})();<\/script><\/div><\/videoplayer><\/div><\/div><\/div><\/p>\n<p>Play the video in full screen mode for a better viewing experience.<\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"thumbnail thumbnail_asset asset_overlay video\"><a href=\"https:\/\/blogs.mathworks.com\/videos\/2019\/07\/03\/using-detectimportoptions-with-large-text-files\/?dir=autoplay\"><img decoding=\"async\" 
src=\"https:\/\/cf-images.us-east-1.prod.boltdns.net\/v1\/static\/62009828001\/bd7f7c18-dc6c-4a0a-889a-0a16ffbcb0fd\/b4180cd6-ea10-44d8-8274-362543f03857\/1280x720\/match\/image.jpg\" onError=\"this.style.display ='none';\"\/><\/p>\n<div class=\"overlay_container\">\n      <span class=\"icon-video icon_color_null\"><time class=\"video_length\">12:15<\/time><\/span>\n      <\/div>\n<p>      <\/a><\/div>\n<p>Yesterday, I was loading a CSV file of about 1 million rows and 300 columns, comprising lots of string variables. It took a while to load, then I remembered I only needed 1 or 2 columns and how&#8230; <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/videos\/2019\/07\/03\/using-detectimportoptions-with-large-text-files\/\">read more >><\/a><\/p>\n","protected":false},"author":133,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[27,4],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4292"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/users\/133"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/comments?post=4292"}],"version-history":[{"count":12,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4292\/revisions"}],"predecessor-version":[{"id":4316,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/posts\/4292\/revisions\/4316"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/media?parent=4292"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/categories?post=4292"},{"taxonomy":"post_tag","embeddable":true,"href":"h
ttps:\/\/blogs.mathworks.com\/videos\/wp-json\/wp\/v2\/tags?post=4292"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}