{"id":6411,"date":"2020-10-09T08:27:16","date_gmt":"2020-10-09T12:27:16","guid":{"rendered":"https:\/\/blogs.mathworks.com\/cleve\/?p=6411"},"modified":"2020-10-09T08:27:16","modified_gmt":"2020-10-09T12:27:16","slug":"two-dubious-ways-to-solve-ax-xb-part-1","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/cleve\/2020\/10\/09\/two-dubious-ways-to-solve-ax-xb-part-1\/","title":{"rendered":"Two Dubious Ways to Solve A*X = X*B, part 1"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p>Recently, I had email from a student in Italy.<\/p><p>\r\n<p style=\"margin-left:3ex;\">\r\nI am doing a project on Hand-Eye calibration.  I have found A,\r\nthe transformation from camera to world, and B, the transformation\r\nfrom gripper to base.  I want to find the transformation from\r\ngripper to camera.  This is the solution of the equation AX = XB.\r\nCan you me any advice for solving this equation?\r\n<\/p>\r\n<\/p><p>Later, my correspondent found the solution himself; his matrices were only 4-by-4 and there is an analytic solution using quaternions.<\/p><p>But the general question stuck with me.  How do you solve <tt>A*X = X*B<\/tt>? The equation is ill-posed.  In part 2, I will describe two methods for solving it.  Both methods have serious drawbacks.  First, some background.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#071ea3bd-fd7a-45c5-bece-adc700ceef87\">Inhomogeneous Sylvester Equation<\/a><\/li><li><a href=\"#d497d1c7-0283-4c38-8c18-c11784a1a8c7\">Homogeneous Sylvester Equation<\/a><\/li><li><a href=\"#071808c7-eb1c-4267-8271-7780009989ae\">Moler's Laws<\/a><\/li><li><a href=\"#72138642-07e1-47b8-8c95-c2d31815692e\"><tt>B = 0<\/tt><\/a><\/li><li><a href=\"#dedd8087-8e74-47d5-9fe9-0d6e6b93dc48\"><tt>B<\/tt> is diagonal<\/a><\/li><li><a href=\"#99a8aaef-84ca-4a90-89f9-fd9f55e81ad4\"><tt>B = A'<\/tt><\/a><\/li><li><a href=\"#911c9cf4-ee85-40dd-8f44-9f7df11569f6\"><tt>A<\/tt> is symmetric<\/a><\/li><li><a href=\"#7ef50afa-83bd-4ff8-874f-7ab10fd1ba1e\">Preview of part 2<\/a><\/li><li><a href=\"#cda0b727-6dc0-4cec-8718-6b6fd1bc25d5\">Further reading<\/a><\/li><\/ul><\/div><h4>Inhomogeneous Sylvester Equation<a name=\"071ea3bd-fd7a-45c5-bece-adc700ceef87\"><\/a><\/h4><p>The standard, inhomogeneous, Sylvester equation involves <i>three<\/i> nonzero matrices, <tt>A<\/tt>, <tt>B<\/tt>, and <tt>C<\/tt>.<\/p><p><tt>A*X + X*B = C<\/tt><\/p><p>This equation is fundamental in control theory and the function <tt>sylvester(A,B,C)<\/tt> has been part of MATLAB for many years. The help entry for <tt>sylvester<\/tt> does not mention any conditions, other than compatible dimensions, on <tt>A<\/tt>, <tt>B<\/tt>, and <tt>C<\/tt>.  Near the end of the documentation for <tt>sylvester<\/tt> is this single sentence:<\/p><p><tt>The equation has a unique solution when the eigenvalues of A and -B are distinct.<\/tt><\/p><p>If we deliberately violate this condition by taking <tt>B<\/tt> to be <tt>-A'<\/tt>, we find that the computed solution blows up. Let's see this with the matrix<\/p><pre class=\"language-matlab\">A = gallery(3)\r\n<\/pre><pre class=\"language-matlab\">A =\r\n<\/pre><pre>   -149   -50  -154\r\n    537   180   546\r\n    -27    -9   -25<\/pre><p>It isn't obvious, but this matrix was constructed so that its eigenvalues are 1, 2, and 3.<\/p><pre class=\"language-matlab\">e = eig(A)\r\n<\/pre><pre class=\"language-matlab\">e =\r\n<\/pre><pre>   1.0000\r\n   2.0000\r\n   3.0000<\/pre><p>Now solve the Sylvester equation, deliberately violating the condition that <tt>A<\/tt> and <tt>-B<\/tt> not have any eigenvalues in common.<\/p><pre class=\"language-matlab\">X = sylvester(A,-A',eye(3))\r\n<\/pre><p>The solution wants to be infinite, but roundoff gives us <tt>1\/eps<\/tt>.<\/p><pre class=\"language-matlab\">X =\r\n<\/pre><pre>   1.0e+15 *<\/pre><pre>   -0.8519    1.1016    0.4883\r\n    1.1016   -1.4822   -0.6330\r\n    0.4883   -0.6330   -0.2726<\/pre><p>I will take advantage of this behavior one of the methods.<\/p><h4>Homogeneous Sylvester Equation<a name=\"d497d1c7-0283-4c38-8c18-c11784a1a8c7\"><\/a><\/h4><p>If I take <tt>C<\/tt> to be zero, and flip the sign of <tt>B<\/tt>, the Sylvester equation becomes<\/p><p><tt>(1)      A*X = X*B<\/tt><\/p><p>This is our <i>homogeneous<\/i> Sylvester equation and I've labeled it <tt>(1)<\/tt>. It turns out that <tt>(1)<\/tt> has a nonzero solution if and only if <tt>A<\/tt> and <tt>B<\/tt> have at least one eigenvalue in common.  Nothing is required of the <i>rank<\/i> of the solution <tt>X<\/tt>, but it is clear that this rank cannot exceed the number of eigenvalues that <tt>A<\/tt> and <tt>B<\/tt> have in common.<\/p><p>Equation <tt>(1)<\/tt> is a <i>linear<\/i> equation.  If <tt>X<\/tt> and <tt>Y<\/tt> are solutions, so is <tt>alpha*X + beta*Y<\/tt> for any scalars <tt>alpha<\/tt> and <tt>beta<\/tt>.<\/p><p>In order to have compatible dimensions, <tt>A<\/tt> and <tt>B<\/tt> have to be square, but they do not have to be the same size.  If <tt>A<\/tt> is <tt>m-by-m<\/tt> and <tt>B<\/tt> is <tt>n-by-n<\/tt>, then <tt>X<\/tt> is <tt>m-by-n<\/tt> .<\/p><h4>Moler's Laws<a name=\"071808c7-eb1c-4267-8271-7780009989ae\"><\/a><\/h4><p>My fundamental Laws of Scientific Computing are:<\/p><div><ul><li>The hardest things in the world to compute are things that do not exist.<\/li><li>The next hardest are things that are not unique.<\/li><\/ul><\/div><p>Problems which fail to have unique solutions are <i>ill-posed<\/i>. Even when solutions exist, they are wildly sensitive to perturbations.<\/p><p>Equation <tt>(1)<\/tt> is a poster child for ill-posed.  Without restrictions on <tt>A<\/tt> and <tt>B<\/tt>, the only solution is zero.  On the other hand, if <tt>A<\/tt> and <tt>B<\/tt> share at least one eigenvalue, there is at least one solution, but it is not unique because it can be renormalized.  And the number of linearly independent solutions depends upon elusive algebraic and geometric multiplicities.<\/p><h4><tt>B = 0<\/tt><a name=\"72138642-07e1-47b8-8c95-c2d31815692e\"><\/a><\/h4><p>Let's look at some special versions of <tt>(1)<\/tt>.  If we were to write some general purpose software, it will have to deal with all these cases.<\/p><p>If <tt>B<\/tt> is zero, <tt>(1)<\/tt> becomes <tt>A*X = 0<\/tt> and any vector in the null space of <tt>A<\/tt> is a solution.  The best way to find the null space of a matrix uses its SVD.<\/p><h4><tt>B<\/tt> is diagonal<a name=\"dedd8087-8e74-47d5-9fe9-0d6e6b93dc48\"><\/a><\/h4><p>If <tt>B<\/tt> is diagonal with scalar elements <tt>d<\/tt>, then <tt>(1)<\/tt> becomes <tt>A*x = d*x<\/tt>. There are solutions if <tt>d<\/tt> is an eigenvalue, but they aren't unique.<\/p><h4><tt>B = A'<\/tt><a name=\"99a8aaef-84ca-4a90-89f9-fd9f55e81ad4\"><\/a><\/h4><p>Any matrix and its transpose have the same eigenvalues, so the equation becomes<\/p><p><tt>(2)          A*X = X*A'<\/tt><\/p><p>This is the most common form of the homogeneous Sylvester equation. If the solution has full rank, then <tt>X\\A*X = A'<\/tt> and we are asking when is a matrix <i>similar<\/i> to its transpose.<\/p><p>If <tt>A<\/tt> has a full set of linearly independent eigenvectors, then so does <tt>A'<\/tt>.  If <tt>U\\A*U<\/tt> = D = <tt>V\\A'*V<\/tt> where D is diagonal, then <tt>X = V\/U<\/tt> is a solution to <tt>(2)<\/tt> .<\/p><p>If <tt>A = J<\/tt> is a Jordan block for any eigenvalue,  let <tt>X = fliplr(eye(n))<\/tt> .  Then <tt>X<\/tt> is a solution to <tt>(2)<\/tt> .<\/p><h4><tt>A<\/tt> is symmetric<a name=\"911c9cf4-ee85-40dd-8f44-9f7df11569f6\"><\/a><\/h4><p>If <tt>A = A'<\/tt>, <tt>(2)<\/tt> has many solutions, including the identity matrix, any other polynomial in <tt>A<\/tt>, and even functions like <tt>exp(t*A)<\/tt>.<\/p><h4>Preview of part 2<a name=\"7ef50afa-83bd-4ff8-874f-7ab10fd1ba1e\"><\/a><\/h4><p>That's enough for now.  In part two, I will investigate two methods, one using Kronecker products and another using inverse iteration. Here's a preview of the Kronecker products.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/cleve\/files\/sparsity2.png\" alt=\"\"> <\/p><h4>Further reading<a name=\"cda0b727-6dc0-4cec-8718-6b6fd1bc25d5\"><\/a><\/h4><p>An article by Nick Higham about the sensitivity of the Sylvester equation. <a href=\"https:\/\/nhigham.com\/2020\/09\/01\/what-is-the-sylvester-equation\">link<\/a>.<\/p><p>A paper by Olga Taussky and Hans Zassenhuas about when the solution of <tt>X\\A*X = A'<\/tt> is symmetric. <a href=\"https:\/\/projecteuclid.org\/download\/pdf_1\/euclid.pjm\/1103039127\">link<\/a>.<\/p><p>A paper and Fortran software by Alan Laub, me, and three of our students for <tt>A*X*B' + C*X*D' = E<\/tt>. <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/146847.146929\">link<\/a>, <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/146847.146930\">link<\/a>.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_f6a0c24cb9134b75bd7b904e85973e33() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='f6a0c24cb9134b75bd7b904e85973e33 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' f6a0c24cb9134b75bd7b904e85973e33';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2020 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_f6a0c24cb9134b75bd7b904e85973e33()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2020b<br><\/p><\/div><!--\r\nf6a0c24cb9134b75bd7b904e85973e33 ##### SOURCE BEGIN #####\r\n%% Two Dubious Ways to Solve A*X = X*B, part 1\r\n% Recently, I had email from a student in Italy.\r\n%\r\n% <html>\r\n% <p style=\"margin-left:3ex;\">\r\n% I am doing a project on Hand-Eye calibration.  I have found A,\r\n% the transformation from camera to world, and B, the transformation\r\n% from gripper to base.  I want to find the transformation from\r\n% gripper to camera.  This is the solution of the equation AX = XB.\r\n% Can you me any advice for solving this equation?\r\n% <\/p>\r\n% <\/html>\r\n%\r\n% Later, my correspondent found the solution himself; his matrices were\r\n% only 4-by-4 and there is an analytic solution using quaternions.\r\n%\r\n% But the general question stuck with me.  How do you solve |A*X = X*B|? \r\n% The equation is ill-posed.  In part 2, I will describe two methods for\r\n% solving it.  Both methods have serious drawbacks.  First, some\r\n% background.\r\n\r\n%% Inhomogeneous Sylvester Equation\r\n% The standard, inhomogeneous, Sylvester equation involves _three_\r\n% nonzero matrices, |A|, |B|, and |C|.\r\n%\r\n% |A*X + X*B = C|\r\n%\r\n% This equation is fundamental in control theory and the function\r\n% |sylvester(A,B,C)| has been part of MATLAB for many years.\r\n% The help entry for |sylvester| does not mention any conditions, other\r\n% than compatible dimensions, on |A|, |B|, and |C|.  Near the end of\r\n% the documentation for |sylvester| is this single sentence:\r\n%\r\n% |The equation has a unique solution when the eigenvalues of A and -B are\r\n% distinct.|\r\n%\r\n% If we deliberately violate this condition by taking |B| to be |-A'|,\r\n% we find that the computed solution blows up.  \r\n% Let's see this with the matrix\r\n%\r\n%   A = gallery(3)\r\n%\r\n%   A =\r\n%\r\n%     -149   -50  -154\r\n%      537   180   546\r\n%      -27    -9   -25\r\n%\r\n% It isn't obvious, but this matrix was constructed so that its\r\n% eigenvalues are 1, 2, and 3.\r\n%\r\n%   e = eig(A)\r\n%\r\n%   e =\r\n%\r\n%     1.0000\r\n%     2.0000\r\n%     3.0000\r\n%\r\n% Now solve the Sylvester equation, deliberately violating the condition\r\n% that |A| and |-B| not have any eigenvalues in common.\r\n%\r\n%   X = sylvester(A,-A',eye(3))\r\n%\r\n% The solution wants to be infinite, but roundoff gives us |1\/eps|.\r\n%\r\n%   X =\r\n%\r\n%     1.0e+15 *\r\n%\r\n%     -0.8519    1.1016    0.4883\r\n%      1.1016   -1.4822   -0.6330\r\n%      0.4883   -0.6330   -0.2726\r\n%\r\n% I will take advantage of this behavior one of the methods.\r\n\r\n%% Homogeneous Sylvester Equation\r\n% If I take |C| to be zero, and flip the sign of |B|, the Sylvester\r\n% equation becomes\r\n%\r\n% |(1)      A*X = X*B|\r\n%\r\n% This is our _homogeneous_ Sylvester equation and I've labeled it |(1)|.\r\n% It turns out that |(1)| has a nonzero solution if and only if\r\n% |A| and |B| have at least one eigenvalue in common.  Nothing is\r\n% required of the _rank_ of the solution |X|, but it is clear that\r\n% this rank cannot exceed the number of eigenvalues that |A| and |B|\r\n% have in common.\r\n%\r\n% Equation |(1)| is a _linear_ equation.  If |X| and |Y| are solutions, so is\r\n% |alpha*X + beta*Y| for any scalars |alpha| and |beta|.\r\n%\r\n% In order to have compatible dimensions, |A| and |B| have to be square,\r\n% but they do not have to be the same size.  If |A| is |m-by-m|\r\n% and |B| is |n-by-n|, then |X| is |m-by-n| .\r\n\r\n%% Moler's Laws\r\n% My fundamental Laws of Scientific Computing are:\r\n%\r\n% * The hardest things in the world to compute are things that do not exist.\r\n% * The next hardest are things that are not unique.\r\n%\r\n% Problems which fail to have unique solutions are _ill-posed_.\r\n% Even when solutions exist, they are wildly sensitive to perturbations.\r\n%\r\n% Equation |(1)| is a poster child for ill-posed.  Without restrictions\r\n% on |A| and |B|, the only solution is zero.  On the other hand, if |A|\r\n% and |B| share at least one eigenvalue, there is at least one solution,\r\n% but it is not unique because it can be renormalized.  And the number \r\n% of linearly independent solutions depends upon elusive algebraic and \r\n% geometric multiplicities.\r\n\r\n%% |B = 0|\r\n% Let's look at some special versions of |(1)|.  If we were to write some\r\n% general purpose software, it will have to deal with all these cases.\r\n%\r\n% If |B| is zero, |(1)| becomes |A*X = 0| and any vector in the null space\r\n% of |A| is a solution.  The best way to find the null space of a matrix\r\n% uses its SVD.\r\n\r\n%% |B| is diagonal\r\n% If |B| is diagonal with scalar elements |d|, then |(1)| becomes |A*x = d*x|.\r\n% There are solutions if |d| is an eigenvalue, but they aren't unique.\r\n\r\n%% |B = A'|\r\n% Any matrix and its transpose have the same eigenvalues, so the equation\r\n% becomes\r\n%\r\n% |(2)          A*X = X*A'|\r\n%\r\n% This is the most common form of the homogeneous Sylvester equation.\r\n% If the solution has full rank, then\r\n% |X\\A*X = A'| and we are asking when is a matrix _similar_ to\r\n% its transpose.\r\n%\r\n% If |A| has a full set of linearly independent eigenvectors, then so does\r\n% |A'|.  If |U\\A*U| = D = |V\\A'*V| where D is\r\n% diagonal, then |X = V\/U| is a solution to |(2)| .\r\n%\r\n% If |A = J| is a Jordan block for any eigenvalue,  let \r\n% |X = fliplr(eye(n))| .  Then |X| is a solution to |(2)| .\r\n\r\n%%  |A| is symmetric\r\n% If |A = A'|, |(2)| has many solutions, including the identity matrix,\r\n% any other polynomial in |A|, and even functions like |exp(t*A)|.\r\n\r\n%% Preview of part 2\r\n% That's enough for now.  In part two, I will investigate two methods,\r\n% one using Kronecker products and another using inverse iteration.\r\n% Here's a preview of the Kronecker products.\r\n%\r\n% <<sparsity2.png>>\r\n\r\n%% Further reading\r\n% An article by Nick Higham about the sensitivity of the Sylvester equation.\r\n% <https:\/\/nhigham.com\/2020\/09\/01\/what-is-the-sylvester-equation\r\n% link>.\r\n%\r\n% A paper by Olga Taussky and Hans Zassenhuas about when the solution of\r\n% |X\\A*X = A'| is symmetric.\r\n% <https:\/\/projecteuclid.org\/download\/pdf_1\/euclid.pjm\/1103039127 link>.\r\n%\r\n% A paper and Fortran software by Alan Laub, me, and three of our\r\n% students for |A*X*B' + C*X*D' = E|.\r\n% <https:\/\/dl.acm.org\/doi\/10.1145\/146847.146929 link>,\r\n% <https:\/\/dl.acm.org\/doi\/10.1145\/146847.146930 link>.\r\n\r\n##### SOURCE END ##### f6a0c24cb9134b75bd7b904e85973e33\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/cleve\/files\/sparsity2.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><!--introduction--><p>Recently, I had email from a student in Italy.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/cleve\/2020\/10\/09\/two-dubious-ways-to-solve-ax-xb-part-1\/\">read more >><\/a><\/p>","protected":false},"author":78,"featured_media":6415,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[11,24,13,6,16],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/posts\/6411"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/users\/78"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/comments?post=6411"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/posts\/6411\/revisions"}],"predecessor-version":[{"id":6413,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/posts\/6411\/revisions\/6413"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/media\/6415"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/media?parent=6411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/categories?post=6411"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/cleve\/wp-json\/wp\/v2\/tags?post=6411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}