
I would like to solve for $X$ in the matrix equation $$ XCX + AX = I $$ where all the matrices are $n\times n$ with real components, $X$ is positive semidefinite, and $C$ is symmetric. My (possibly optimistic) hunch is that $X$ will be unique because of the positive semidefinite requirement; if not, I only care about finding a single solution.

My searches have turned up a lot of work on solving non-symmetric Riccati equations, but unfortunately I don't meet the requirements for any of them. For example, everything I've seen requires that the entries of $C$ all be nonnegative, which is not true in my case.


3 Answers


Both existence and uniqueness can fail spectacularly.

In the special case $A=0$ we get $XCX=I$, which is equivalent to $C = X^{-2}$. Thus we must have $\det C > 0$; otherwise $C=X^{-2}$ has no real solution at all, even without requiring $X$ to be symmetric, let alone positive-definite. If $X$ is symmetric then $C$ must moreover be positive-definite. Note that $\det C$ can be zero or negative even when all entries of $C$ are positive.
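
For instance, when $C$ is symmetric positive-definite, the unique positive-definite solution of $XCX=I$ is $X=C^{-1/2}$. A minimal NumPy check (my own sketch, not part of the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
C = B @ B.T + 4 * np.eye(4)               # a symmetric positive-definite C

w, V = np.linalg.eigh(C)                  # C = V diag(w) V^T with w > 0
X = V @ np.diag(w ** -0.5) @ V.T          # X = C^{-1/2}

print(np.allclose(X @ C @ X, np.eye(4)))  # True: X solves XCX = I
```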

When a solution does exist, there might even be a positive-dimensional family of positive-definite $X$ satisfying the equation $XCX+AX=I$. For example, let $A=3I$ and $C=-2I$. Then the equation becomes $2X^2-3X+I=0$, i.e. $(X-I)(2X-I)=0$. So for any orthogonal decomposition ${\bf R}^n = V_1 \oplus V_{1/2}$ we get a solution by making $V_1$ the $1$-eigenspace and $V_{1/2}$ the $(1/2)$-eigenspace, i.e. letting $X$ be the transformation that takes any $v_1+v_2$ ($v_i \in V_i$) to $v_1 + \frac12 v_2$.
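
A quick numerical illustration of this family (a sketch of my own; any orthogonal $Q$ and any split $k$ below give a positive-definite solution):

```python
import numpy as np

n = 4
A, C, I = 3 * np.eye(n), -2 * np.eye(n), np.eye(n)

def residual(X):
    return np.linalg.norm(X @ C @ X + A @ X - I)

# Any orthonormal basis Q gives a solution: eigenvalue 1 on the span of the
# first k columns, eigenvalue 1/2 on the remaining ones.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
for k in range(n + 1):
    eigs = np.array([1.0] * k + [0.5] * (n - k))
    X = Q @ np.diag(eigs) @ Q.T           # positive-definite by construction
    print(k, residual(X))                 # ~0 for every k and every Q
```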


Not a complete answer, but a long comment. Using standard results for Riccati equations, one can parametrize all (symmetric and non-symmetric) solutions. One can rewrite the equation in the form $$ \mathcal{H} \begin{bmatrix} I\\ X \end{bmatrix} = \begin{bmatrix} I\\X \end{bmatrix}CX, \quad \mathcal{H}=\begin{bmatrix} 0 & C\\ I & -A \end{bmatrix}, $$ hence each solution is associated to an invariant subspace of the matrix $\mathcal{H}$. We can reverse that relation:

1. Compute eigenvalues and eigenvectors of $\mathcal{H}$.
2. If the eigenvalues are all distinct, simply pick any $n$ out of the $2n$ eigenvectors, and call them $u_1, u_2,\dots, u_n$ (with eigenvalues $\lambda_1,\lambda_2,\dots,\lambda_n$); otherwise the same holds with Jordan chain vectors, but you have to be careful not to take a vector in a chain unless you also take all the preceding ones.
3. Stack them horizontally to build $\begin{bmatrix}U_1 \\ U_2\end{bmatrix}=\begin{bmatrix}u_1 & u_2 & \dots & u_n\end{bmatrix}$, with $U_1,U_2\in\mathbb{C}^{n\times n}$.
4. If $U_1$ is invertible, then $X=U_2U_1^{-1}$ is a solution (which may or may not be symmetric), and $CX$ has eigenvalues $\lambda_1,\lambda_2,\dots,\lambda_n$.
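
As an illustration, here is a minimal NumPy sketch of steps 1–4 in the distinct-eigenvalue case; the function name and tolerances are my own choices, not from any library:

```python
import numpy as np
from itertools import combinations

def all_riccati_solutions(A, C, tol=1e-9):
    """Enumerate real solutions X of XCX + AX = I from invariant subspaces
    of H (distinct-eigenvalue case only; brute force, so small n only)."""
    n = A.shape[0]
    H = np.block([[np.zeros((n, n)), C],
                  [np.eye(n),        -A]])
    lam, U = np.linalg.eig(H)                    # step 1
    sols = []
    for idx in combinations(range(2 * n), n):    # step 2: pick n of the 2n
        cols = list(idx)
        U1, U2 = U[:n, cols], U[n:, cols]        # step 3: stack and split
        if abs(np.linalg.det(U1)) > tol:         # step 4
            X = U2 @ np.linalg.inv(U1)
            if np.linalg.norm(X.imag) < tol:     # keep only real solutions
                sols.append(X.real)
    return sols

# Example with random placeholder data: verify each X against the equation.
rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
Cs = rng.standard_normal((n, n)); Cs = Cs + Cs.T   # symmetric C
for X in all_riccati_solutions(A, Cs):
    assert np.linalg.norm(X @ Cs @ X + A @ X - np.eye(n)) < 1e-6
```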

This gives all the solutions, of which there are at most $\binom{2n}{n}$ (in the nonderogatory case; if the matrix is derogatory, there are infinitely many, since there is arbitrariness in choosing the eigenvectors).

Typically the choice of these eigenvalues is "driven" by knowing which eigenvalues you want $CX$ to have. In your case you can at least say something: if $X$ has to be positive-definite then $CX$ is similar to $X^{1/2}CX^{1/2}$, so it has all real eigenvalues and the same signature as $C$, and this restricts how you can choose the $\lambda_i$.

However, I am not sure that your matrix $\mathcal{H}$ has enough structure to say anything general about its eigenvalues (let alone to prove that there must be a symmetric solution; as far as I know, this holds true only for some special classes of matrices, such as Hamiltonian or symplectic ones).

Can you run some experiments and tell us what the eigenvalues of $\mathcal{H}$ look like in your real-life cases? This could tell us something: for instance, if the matrix were symplectic, you would see them in pairs $(\lambda,1/\lambda)$.
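
For concreteness, a throwaway experiment of the kind suggested here might look like this (random placeholder data, my own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))                      # placeholder for the real A
C = rng.standard_normal((n, n)); C = (C + C.T) / 2   # symmetric placeholder C

H = np.block([[np.zeros((n, n)), C],
              [np.eye(n),        -A]])
print(np.sort_complex(np.linalg.eigvals(H)))
# Things to look for: pairs (lam, -lam) would suggest Hamiltonian structure,
# pairs (lam, 1/lam) symplectic structure.
```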

  • I doubt experiments will help. $A$ has the form $A=A_0 + aa^T$ where $A_0$ is probably going to be the identity matrix, and $C$ has the form $C=(cc^T)(cc^T)$. The $a$ and $c$ vectors can be completely arbitrary, so it seems like I probably can't say much about the eigenvectors of $\mathcal{H}$. The impression I'm getting is that this equation is not going to be solvable unless I can rework it to fit the form of the Riccati equations described in the paper you linked. Is that more or less true? – Oct 19, 2013 at 17:42

In fact, your problem reduces to a Riccati equation in dimension $n-1$ that is easy to solve. Here, if I correctly understood your problem, $A=I_n+aa^T$ and $C=(cc^T)(cc^T)=uu^T$ (since $rank(C)=1$; indeed $u=||c||c$), where $a,u$ are known vectors. $A,X$ are symmetric $>0$ and $C$ is symmetric $\geq 0$; then $AX=XA$, because $AX=I-XCX$ is symmetric. After an orthonormal change of basis we may assume $A=diag(1+\alpha,I_{n-1})$ where $\alpha=a^Ta\geq 0$. Then $X=diag(\beta,S_{n-1})$ where $\beta>0$ and $S$ is symmetric $>0$. Put $u=[v,w_{n-1}]^T$, where $v$ is a scalar and $w$ has dimension $n-1$; $\alpha,v,w$ are known and $\beta,S$ are unknown. The equation can be rewritten $$\begin{pmatrix}\beta^2v^2&\beta v\,w^TS\\ \beta v\,Sw&Sww^TS\end{pmatrix}+diag((1+\alpha)\beta,S)=I_n.$$ That implies $\beta^2v^2+(1+\alpha)\beta=1$, $vSw=0$, and $Sww^TS+S-I_{n-1}=0$.

Case 1 (easy): $v\not=0$ and $Sw=0$. Then $\beta^2v^2+(1+\alpha)\beta-1=0$ and $S=I_{n-1}$; thus necessarily $w=0$, and for every $v$ there is a unique solution $\beta>0$ (the positive root of the quadratic).

Case 2 (more interesting): $v=0$. Then $(1+\alpha)\beta=1$ and $Sww^TS+S-I_{n-1}=0$. You obtain $\beta=1/(1+\alpha)>0$ since $\alpha\geq 0$, and finally you must solve a standard symmetric Riccati equation in dimension $n-1$. As above, after an orthonormal change of basis we may assume $ww^T=diag(\gamma,0_{n-2})$ where $\gamma=w^Tw\geq 0$. Then put $S=\begin{pmatrix}e&f^T\\f&H\end{pmatrix}$, where $e>0$ and $H$ is symmetric $>0$ of dimension $n-2$. The equation can be rewritten $(\gamma e+1)e=1$, $(\gamma e+1)f=0$, $\gamma ff^T+H=I_{n-2}$. Thus $f=0$ and $H=I_{n-2}$. For every value of $\gamma\geq 0$, there is a unique solution $e>0$.
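
Putting the two cases together, here is a NumPy sketch of the whole reduction (my own code, with hypothetical helper names; it assumes $\alpha>0$, per the edit below, and returns None when $v\neq 0$ and $w\neq 0$, where the analysis above gives no block-diagonal solution):

```python
import numpy as np

def orth_with_first_col(x):
    """Orthogonal matrix whose first column is parallel to x/||x||
    (hypothetical helper; the sign of the first column is irrelevant here)."""
    M = np.eye(x.size)
    M[:, 0] = x
    return np.linalg.qr(M)[0]

def solve_reduced(a, c, tol=1e-12):
    """Solve X C X + (I + aa^T) X = I for symmetric positive-definite X,
    where C = (cc^T)(cc^T) = uu^T with u = ||c|| c.  Follows the two cases
    above; assumes alpha = a^T a > 0.  Returns None if no such X exists."""
    n = a.size
    alpha = a @ a
    u = np.linalg.norm(c) * c
    P = orth_with_first_col(a)           # P^T A P = diag(1 + alpha, I_{n-1})
    ut = P.T @ u
    v, w = ut[0], ut[1:]

    if abs(v) > tol and np.linalg.norm(w) < tol:
        # Case 1: positive root of v^2 b^2 + (1+alpha) b - 1 = 0, S = I.
        beta = (-(1 + alpha) + np.sqrt((1 + alpha)**2 + 4 * v**2)) / (2 * v**2)
        S = np.eye(n - 1)
    elif abs(v) <= tol:
        # Case 2: beta = 1/(1+alpha); rotate w onto e_1 and solve for e.
        beta = 1.0 / (1 + alpha)
        gamma = w @ w
        if gamma > tol:
            e = (-1 + np.sqrt(1 + 4 * gamma)) / (2 * gamma)  # (gamma e + 1) e = 1
            Q = orth_with_first_col(w)
            S = Q @ np.diag(np.r_[e, np.ones(n - 2)]) @ Q.T
        else:
            S = np.eye(n - 1)
    else:
        return None          # v != 0 and w != 0: the system above is infeasible

    X = np.zeros((n, n))
    X[0, 0] = beta
    X[1:, 1:] = S
    return P @ X @ P.T

# Example (Case 2: c orthogonal to a), checked against the original equation.
a = np.array([2.0, 0.0, 0.0])
c = np.array([0.0, 1.0, 1.0])
A = np.eye(3) + np.outer(a, a)
C = np.outer(c, c) @ np.outer(c, c)
X = solve_reduced(a, c)
print(np.linalg.norm(X @ C @ X + A @ X - np.eye(3)))   # ~0
```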

EDIT: In fact, I assumed $\alpha>0$ (this is what forces $X$ to be block diagonal from $AX=XA$). It remains to study the case $\alpha=0$, i.e. $a=0$.

  • This looks really promising, but I'm struggling to understand some of the details. In particular, I've never seen a function $diag$ that's applied to two arguments. What exactly does this mean? – Oct 22, 2013 at 5:47
  • Hi Mike, it is a block diagonal of matrices (a standard notation): the first block has dimension $1$ and the second dimension $n-1$. $\alpha=trace(aa^T)=a^Ta$ is a real number and $I_{n-1}$ is the identity matrix of dimension $n-1$. In the same way one has $X=diag(\beta,S_{n-1})$; in the sequel $S_{n-1}$ becomes $S$ (I drop the index) and the vector $w_{n-1}$ of dimension $n-1$ becomes $w$.
    – loup blanc
    Oct 22, 2013 at 23:28
  • Why can we assume that $A$ has that form? For example, if $a = [1,1]$, then $A$ will not be a diagonal of matrices. – Oct 23, 2013 at 8:07
  • Mike, we may assume that $A$ is in this form after an orthonormal change of basis: $aa^T$ is symmetric $\geq 0$ and has rank $1$, so there is an orthogonal matrix $P$ s.t. $aa^T=Pdiag(a^Ta,0_{n-1})P^{-1}$. The columns of $P$ consist of $a/||a||$ and an orthonormal basis of the orthogonal complement of $a$.
    – loup blanc
    Oct 23, 2013 at 9:29
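
A one-line numerical check of this construction (my sketch: QR of a matrix whose first column is $a$ produces such a $P$, up to the sign of the first column, which does not matter here):

```python
import numpy as np

a = np.array([1.0, 1.0])          # the example from the previous comment
M = np.eye(a.size)
M[:, 0] = a
P = np.linalg.qr(M)[0]            # orthogonal; first column is ±a/||a||

print(np.round(P.T @ np.outer(a, a) @ P, 12))   # diag(a^T a, 0) = diag(2, 0)
```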
