{"id":41,"date":"2006-06-14T12:32:08","date_gmt":"2006-06-14T17:32:08","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=41"},"modified":"2017-05-01T19:32:23","modified_gmt":"2017-05-02T00:32:23","slug":"inverse-mapping-from-values-to-indices","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2006\/06\/14\/inverse-mapping-from-values-to-indices\/","title":{"rendered":"Inverse Mapping from Values to Indices"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>I am aware of frequent customer requests for replacing multiple values in an array.  And the result should be compact code\r\n         and execute quickly, right?!  It came up again on a <a href=\"\">recent thread<\/a> on the MATLAB newsgroup so it seemed timely to write an article on this topic for the archives.  In this recent thread, the\r\n         question of timing came up - which method is fastest?  Your mileage may vary and it would be interesting to compare results,\r\n         especially if yours are qualitatively different than those run from my laptop. You can post your results in the comments below.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#2\">Sample Input and Results<\/a><\/li>\r\n         <li><a href=\"#4\">Algorithms<\/a><\/li>\r\n         <li><a href=\"#9\">Time the Algorithms<\/a><\/li>\r\n         <li><a href=\"#10\">Timing Results<\/a><\/li>\r\n         <li><a href=\"#11\">Comments<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <p>Tidy up to start.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">clear\r\nformat <span style=\"color: #A020F0\">short<\/span> <span style=\"color: #A020F0\">g<\/span><\/pre><h3>Sample Input and Results<a name=\"2\"><\/a><\/h3>\r\n   <p>Supppose I have an array <tt>a<\/tt> containing values that I expect to find in another array, <tt>Minit<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">a = [1 3 5 7 8]\r\nMinit = [1 3 5; <span style=\"color: #0000FF\">...<\/span>\r\n    3 5 7; <span style=\"color: #0000FF\">...<\/span>\r\n    3 7 8]<\/pre><pre style=\"font-style:oblique\">a =\r\n     1     3     5     7     8\r\nMinit =\r\n     1     3     5\r\n     3     5     7\r\n     3     7     8\r\n<\/pre><p>I would like to see a replacement matrix for <tt>Minit<\/tt> in which the values are replaced with their indices in the list <tt>a<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">Mwanted = [1 2 3; <span style=\"color: #0000FF\">...<\/span>\r\n     2 3 4; <span style=\"color: #0000FF\">...<\/span>\r\n     2 4 5]<\/pre><pre style=\"font-style:oblique\">Mwanted =\r\n     1     2     3\r\n     2     3     4\r\n     2     4     5\r\n<\/pre><h3>Algorithms<a name=\"4\"><\/a><\/h3>\r\n   <p>There are at least three ways to accomplish the task, one, a brute force loop, and two that require knowledge of other very\r\n      useful MATLAB functions, <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/histc.html\">histc<\/a> and <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/ismember.html\">ismember<\/a>.\r\n   <\/p>\r\n   <p>Let's first look at the <tt>for<\/tt>-loop solution.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">type <span style=\"color: #A020F0\">loopFR<\/span><\/pre><pre style=\"font-style:oblique\">\r\nfunction M1 = loopFR(M,A)\r\n% loopFR replaces M with its indices from A.\r\n\r\n% First set up the inverse mapping array.  Initialize it to zero, and the\r\n% size is the maximum number the array.  Note that I get away with this\r\n% here because all of my numbers in both A and M are positive integers.\r\nimap = zeros(1,max(A));  % inverse mapping array\r\n% Now fill the inverse map, but placing in the location specified in each\r\n% element of A, the index of that element in A.\r\n% If A = [1 5 4], then imap = [1 0 0 3 2].\r\nimap(A) = 1:length(A);  \r\nM1 = M;\r\nfor k = 1:numel(M)\r\n    M1(k) = imap(M(k));\r\nend\r\n<\/pre><p>Here, I set up <tt>imap<\/tt>, an array that allows me to do the inverse mapping from values in <tt>a<\/tt> back to indices in <tt>a<\/tt>.\r\n   <\/p>\r\n   <p>And check that <tt>loopFR<\/tt> gets the right result.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">isequal(loopFR(Minit,a), Mwanted)<\/pre><pre style=\"font-style:oblique\">ans =\r\n     1\r\n<\/pre><p>Now let's check both <tt>ismember<\/tt> and <tt>histc<\/tt> before spending effort timing the results.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">[mem, mem] = ismember(Minit,a);\r\n[his, his] = histc(Minit,a);\r\nisequal(Mwanted, mem, his)<\/pre><pre style=\"font-style:oblique\">ans =\r\n     1\r\n<\/pre><h3>Time the Algorithms<a name=\"9\"><\/a><\/h3><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">N = [1 10 100 1000 10000 100000];  <span style=\"color: #228B22\">% repetitions to test<\/span>\r\nt = zeros(length(N),3); <span style=\"color: #228B22\">% set up the timing collection array<\/span>\r\n<span style=\"color: #0000FF\">for<\/span> n = 1:length(N)\r\n    <span style=\"color: #228B22\">% Make a larger matrix to search.<\/span>\r\n    <span style=\"color: #228B22\">% loopFR<\/span>\r\n    M = repmat(Minit,N(n),1);\r\n    tic\r\n    M1 = loopFR(M,a);\r\n    t(n,1) = toc;\r\n    <span style=\"color: #228B22\">% ismember<\/span>\r\n    tic\r\n    [M2,M2] = ismember(M,a);\r\n    t(n,2) = toc;\r\n    <span style=\"color: #228B22\">% histc<\/span>\r\n    tic\r\n    [M3,M3] = histc(M,a);\r\n    t(n,3) = toc;\r\n    <span style=\"color: #0000FF\">if<\/span> ~isequal(M1,M2,M3)\r\n        disp(<span style=\"color: #A020F0\">'oops'<\/span>)\r\n    <span style=\"color: #0000FF\">end<\/span>\r\n<span style=\"color: #0000FF\">end<\/span><\/pre><h3>Timing Results<a name=\"10\"><\/a><\/h3><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">disp(<span style=\"color: #A020F0\">'       Counter     numel(M)     loop       ismember     histc'<\/span>)\r\ndisp([N' 9*N' t]);\r\nsubplot(1,2,1)\r\nplot(N', t, <span style=\"color: #A020F0\">'--*'<\/span>)\r\nxlabel(<span style=\"color: #A020F0\">'Number of repeated arrays'<\/span>)\r\nylabel(<span style=\"color: #A020F0\">'Execution time'<\/span>)\r\nlegend({<span style=\"color: #A020F0\">'loop'<\/span>,<span style=\"color: #A020F0\">'ismember'<\/span>,<span style=\"color: #A020F0\">'histc'<\/span>},<span style=\"color: #A020F0\">'Location'<\/span>, <span style=\"color: #A020F0\">'NorthWest'<\/span>)\r\nsubplot(1,2,2)\r\nsemilogx(N', t, <span style=\"color: #A020F0\">'--*'<\/span>)\r\nxlabel(<span style=\"color: #A020F0\">'Number of repeated arrays'<\/span>)\r\nylabel(<span style=\"color: #A020F0\">'Execution time'<\/span>)\r\nlegend({<span style=\"color: #A020F0\">'loop'<\/span>,<span style=\"color: #A020F0\">'ismember'<\/span>,<span style=\"color: #A020F0\">'histc'<\/span>},<span style=\"color: #A020F0\">'Location'<\/span>, <span style=\"color: #A020F0\">'NorthWest'<\/span>)<\/pre><pre style=\"font-style:oblique\">       Counter     numel(M)     loop       ismember     histc\r\n            1            9  4.5537e-005   0.00012627  1.2292e-005\r\n           10           90  2.8775e-005  8.8838e-005  1.3689e-005\r\n          100          900  6.6489e-005   0.00022042  5.8108e-005\r\n         1000         9000   0.00046542    0.0017497   0.00051291\r\n        10000        90000     0.004678     0.015192    0.0059801\r\n       1e+005       9e+005     0.065959      0.20439     0.063979\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/41\/CSSMinvmap_01.png\"> <h3>Comments<a name=\"11\"><\/a><\/h3>\r\n   <p>Two important things to point out.  First, the matrix <tt>M<\/tt> that I use throughout is probably not representative of such arrays as the array size grows since I created it via replication.\r\n       Often in such a case, the number of unique elements in <tt>a<\/tt>, the inverse mapping matrix, would also grow, at least a bit.  Second, I used preallocation for the loop code to make sure\r\n      I didn't get a time hit from growing the array, but I was <b>not<\/b> expecting the <b><tt>for<\/tt><\/b> loop results to be so competitive with those from <tt>histc<\/tt>.  Please add your thoughts or results below.\r\n   <\/p>\r\n   <p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>\r\n      Published with MATLAB&reg; 7.2<br><\/p>\r\n<\/div>\r\n<!--\r\n##### SOURCE BEGIN #####\r\n%% Inverse Mapping from Values to Indices\r\n% I am aware of frequent customer requests for replacing multiple values in\r\n% an array.  And the result should be compact code and execute quickly,\r\n% right?!  It came up again on a\r\n% < recent thread> \r\n% on the MATLAB newsgroup so it seemed timely to write an article on this\r\n% topic for the archives.  In this recent thread, the question of timing\r\n% came up - which method is fastest?  Your mileage may vary and it would be\r\n% interesting to compare results, especially if yours are qualitatively\r\n% different than those run from my laptop.  \r\n% You can post your results <http:?p=41\/Respond here>.\r\n%%\r\n% Tidy up to start.\r\nclear\r\nformat short g\r\n%% Sample Input and Results\r\n% Supppose I have an array |a| containing values that I expect to find in\r\n% another array, |Minit|.\r\na = [1 3 5 7 8]\r\nMinit = [1 3 5; ...\r\n    3 5 7; ...\r\n    3 7 8]\r\n%%\r\n% I would like to see a replacement matrix for |Minit| in which the values\r\n% are replaced with their indices in the list |a|.\r\nMwanted = [1 2 3; ...\r\n     2 3 4; ...\r\n     2 4 5]\r\n%% Algorithms\r\n% There are at least three ways to accomplish the task, one, a brute force\r\n% loop, and two that require knowledge of other very useful MATLAB\r\n% functions,\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/histc.html histc>\r\n% and <https:\/\/www.mathworks.com\/help\/matlab\/ref\/ismember.html ismember>.\r\n%%\r\n% Let's first look at the |for|-loop solution.\r\ntype loopFR\r\n%%\r\n% Here, I set up |imap|, an array that allows me to do the inverse mapping\r\n% from values in |a| back to indices in |a|.\r\n%%\r\n% And check that |loopFR| gets the right result.\r\nisequal(loopFR(Minit,a), Mwanted)\r\n%%\r\n% Now let's check both |ismember| and |histc| before spending effort timing\r\n% the results.\r\n[mem, mem] = ismember(Minit,a);\r\n[his, his] = histc(Minit,a);\r\nisequal(Mwanted, mem, his)\r\n\r\n%% Time the Algorithms\r\nN = [1 10 100 1000 10000 100000];  % repetitions to test\r\nt = zeros(length(N),3); % set up the timing collection array\r\nfor n = 1:length(N)\r\n    % Make a larger matrix to search.\r\n    % loopFR\r\n    M = repmat(Minit,N(n),1);\r\n    tic\r\n    M1 = loopFR(M,a);\r\n    t(n,1) = toc;\r\n    % ismember\r\n    tic\r\n    [M2,M2] = ismember(M,a);\r\n    t(n,2) = toc;\r\n    % histc\r\n    tic\r\n    [M3,M3] = histc(M,a);\r\n    t(n,3) = toc;\r\n    if ~isequal(M1,M2,M3)\r\n        disp('oops')\r\n    end\r\nend\r\n\r\n%% Timing Results\r\ndisp('       Counter     numel(M)     loop       ismember     histc')\r\ndisp([N' 9*N' t]);\r\nsubplot(1,2,1)\r\nplot(N', t, 'REPLACE_WITH_DASH_DASH*')\r\nxlabel('Number of repeated arrays')\r\nylabel('Execution time')\r\nlegend({'loop','ismember','histc'},'Location', 'NorthWest')\r\nsubplot(1,2,2)\r\nsemilogx(N', t, 'REPLACE_WITH_DASH_DASH*')\r\nxlabel('Number of repeated arrays')\r\nylabel('Execution time')\r\nlegend({'loop','ismember','histc'},'Location', 'NorthWest')\r\n%% Comments\r\n% Two important things to point out.  First, the matrix |M| that I use\r\n% throughout is probably not representative of such arrays as the array\r\n% size grows since I created it via replication.  Often in such a case, the\r\n% number of unique elements in |a|, the inverse mapping matrix, would also\r\n% grow, at least a bit.  Second, I used preallocation for the loop code to\r\n% make sure I didn't get a time hit from growing the array, but I was *not*\r\n% expecting the *|for|* loop results to be so competitive with those from\r\n% |histc|.  Please add your thoughts or results\r\n% <http:?p=41\/Respond here>.\r\n##### SOURCE END #####\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      I am aware of frequent customer requests for replacing multiple values in an array.  And the result should be compact code\r\n         and execute quickly, right?!  It came up again on... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/06\/14\/inverse-mapping-from-values-to-indices\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[15,12],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/41"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=41"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/41\/revisions"}],"predecessor-version":[{"id":2328,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/41\/revisions\/2328"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=41"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=41"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=41"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}