{"id":207,"date":"2009-11-26T20:31:31","date_gmt":"2009-11-26T20:31:31","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2009\/11\/26\/unique-values-without-rearrangement\/"},"modified":"2018-01-08T15:24:52","modified_gmt":"2018-01-08T20:24:52","slug":"unique-values-without-rearrangement","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2009\/11\/26\/unique-values-without-rearrangement\/","title":{"rendered":"Unique Values Without Rearrangement"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>In MATLAB, the simplest form of the function <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2009b\/techdoc\/ref\/unique.html\"><tt>unique<\/tt><\/a> returns the unique values contained in a numeric vector, with the results sorted.  This is often acceptable, but sometimes\r\n         <a>a user<\/a> prefers the results in the order originally found in the data.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Algorithm for unique<\/a><\/li>\r\n         <li><a href=\"#2\">Avoid the Sorted Output<\/a><\/li>\r\n         <li><a href=\"#4\">Code in Action<\/a><\/li>\r\n         <li><a href=\"#9\">Do You Unique Data Values Unsorted?<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Algorithm for unique<a name=\"1\"><\/a><\/h3>\r\n   <p>The results are sorted because of the algorithm that <tt>unique<\/tt> uses.  Conceptually, the input data is sorted, and then adjacent elements are compared.  Within each run of equal elements, all elements\r\n      except the first or the last are removed (depending on how you call the function).  
Hence, the output is sorted.\r\n   <\/p>\r\n   <h3>Avoid the Sorted Output<a name=\"2\"><\/a><\/h3>\r\n   <p>To avoid the sorted output, you can <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2009b\/techdoc\/ref\/sort.html\"><tt>sort<\/tt><\/a> the data yourself, retaining the indices from the sorting operation so that you can map the results back to their original positions.  Study the examples for <tt>sort<\/tt> to see how to use the second output of indices.\r\n   <\/p>\r\n   <p>A couple of solutions were posted with similar ideas but different implementations.  I'll walk you through the one\r\n      posted by <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/15233\">Jan Simon<\/a>. The idea Jan uses is to take the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2009b\/techdoc\/ref\/diff.html\">difference<\/a> of the sorted results and find where the differences are not zero (i.e., they <b>are<\/b> different values).  Because <tt>sort<\/tt> keeps equal elements in their original relative order, each nonzero difference marks the first occurrence of a value.  Map these now <i>unique<\/i> values back to their original positions in the logical vector <tt>UV<\/tt>. Finally, use this set of logical indices to extract the required values from the original data.  
Notice that this solution\r\n      doesn't call the function <tt>unique<\/tt> and calls the function <tt>sort<\/tt> only once.\r\n   <\/p>\r\n   <h3>Code in Action<a name=\"4\"><\/a><\/h3>\r\n   <p>Let's create <tt>X<\/tt> and see what happens in the code.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">myString = <span style=\"color: #A020F0\">'now is the time for cheering, tgif!'<\/span>;\r\nX = double(myString)<\/pre><pre style=\"font-style:oblique\">X =\r\n  Columns 1 through 13\r\n   110   111   119    32   105   115    32   116   104   101    32   116   105\r\n  Columns 14 through 26\r\n   109   101    32   102   111   114    32    99   104   101   101   114   105\r\n  Columns 27 through 35\r\n   110   103    44    32   116   103   105   102    33\r\n<\/pre><p>Next, sort the data.  <tt>Xs<\/tt> holds the sorted values of <tt>X<\/tt>, and <tt>SortVec<\/tt> tracks the original locations of the values.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">[Xs, SortVec] = sort(X(:))<\/pre><pre style=\"font-style:oblique\">Xs =\r\n    32\r\n    32\r\n    32\r\n    32\r\n    32\r\n    32\r\n    33\r\n    44\r\n    99\r\n   101\r\n   101\r\n   101\r\n   101\r\n   102\r\n   102\r\n   103\r\n   103\r\n   104\r\n   104\r\n   105\r\n   105\r\n   105\r\n   105\r\n   109\r\n   110\r\n   110\r\n   111\r\n   111\r\n   114\r\n   114\r\n   115\r\n   116\r\n   116\r\n   116\r\n   119\r\nSortVec =\r\n     4\r\n     7\r\n    11\r\n    16\r\n    20\r\n    30\r\n    35\r\n    29\r\n    21\r\n    10\r\n    15\r\n    23\r\n    24\r\n    17\r\n    34\r\n    28\r\n    32\r\n     9\r\n    22\r\n     5\r\n    13\r\n    26\r\n    33\r\n    14\r\n     1\r\n    27\r\n     2\r\n    18\r\n    19\r\n    25\r\n     6\r\n     8\r\n    12\r\n    31\r\n     3\r\n<\/pre><p>Now flag the first element of each run of equal values (where <tt>diff<\/tt> isn't 0), and use <tt>SortVec<\/tt> to place each flag at that element's original position in a logical vector.\r\n   <\/p><pre style=\"background: 
#F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">UV(SortVec) = ([1; diff(Xs)] ~= 0)<\/pre><pre style=\"font-style:oblique\">UV =\r\n  Columns 1 through 13\r\n     1     1     1     1     1     1     0     1     1     1     0     0     0\r\n  Columns 14 through 26\r\n     1     0     0     1     0     1     0     1     0     0     0     0     0\r\n  Columns 27 through 35\r\n     0     1     1     0     0     0     0     0     1\r\n<\/pre><p>Use the logical vector to index into the original data, which returns the unique values in the order in which they first appear.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">Y = X(UV)<\/pre><pre style=\"font-style:oblique\">Y =\r\n  Columns 1 through 13\r\n   110   111   119    32   105   115   116   104   101   109   102   114    99\r\n  Columns 14 through 16\r\n   103    44    33\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">finalString = char(Y)<\/pre><pre style=\"font-style:oblique\">finalString =\r\nnow isthemfrcg,!\r\n<\/pre><h3>Do You Unique Data Values Unsorted?<a name=\"9\"><\/a><\/h3>\r\n   <p>Do you need unsorted unique values as part of your data processing?  I'd love to hear more <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=207#respond\">here<\/a>. 
In the meantime, perhaps you could create a cryptic signature of the day by running your thoughts through this algorithm!\r\n   <\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>\r\n      Published with MATLAB&reg; 7.9<br><\/p>\r\n<\/div>","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      In MATLAB, the simplest form of the function unique returns the unique values contained in a numeric vector, with the results sorted.  This is often acceptable, but sometimes\r\n         a... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2009\/11\/26\/unique-values-without-rearrangement\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[33,4],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/207"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=207"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/207\/revisions"}],"predecessor-version":[{"id":2578,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/207\/revisions\/2578"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=207"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=207"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=207"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}