A problem I am trying to solve has led me to the following system of equations:

$$A(x^2) + Bx + c = 0$$

where $A$ and $B$ are known matrices, $c$ is a known vector, $x$ is the vector of unknowns for which we want to solve, and $(x^2)$ is the vector of squared elements of $x$, i.e. $(x^2) = (x_1^2, x_2^2, \ldots, x_N^2)$.

I have been reading a lot about solving simultaneous quadratic equations, quadratic matrix equations, etc., but it seems to be a rather large field with few general solutions known. Still, the form above seems particularly simple, since there are no "cross terms" (no $x_1 \cdot x_2$ and so on). So I wonder whether it is solvable? I think my matrices are all real and invertible, but if a solution exists that requires further special properties, I'd still be interested. Actually, I am even interested in efficient numerical solutions, since I ended up with these equations as part of an attempt to avoid performing a numerical minimisation in high dimensions (around 100, so the matrices will be of that sort of size).

  • Do you need one solution, or all of them? – Mar 21, 2019 at 12:52
  • Well, for my cases there should usually be only one solution, I'm pretty sure. There is probably some extra structure that ensures this, but I'm afraid I'm not sure what it is. Is there some sort of equivalent of a determinant that could be computed to show this?
    – Ben Farmer
    Mar 22, 2019 at 13:24
  • Bézout's theorem tells you that generically you can expect a lot of solutions, $2^N$ of them (possibly complex). Are you sure you have only one in your case? Have you tried small-dimensional cases to check what happens there? – Mar 22, 2019 at 15:15
  • I've tried it numerically and not found other solutions, but I haven't been very thorough about it. These are a set of extremisation conditions for finding the maximum likelihood estimator in a statistics problem, so it will be the regularity of the parameter space, plus some sort of asymptotic theory guaranteeing the (asymptotic) uniqueness of the maximum likelihood estimator, that means there should be only one solution, at least for sufficiently large amounts of data. I'm not sure how that property will manifest in these equations, though.
    – Ben Farmer
    Mar 22, 2019 at 17:40
  • What are the dimensions of your $A$ and $B$ matrices and the vector $c$? I mean, if $A$ and $B$ are $m\times n$ and $c$ is $m\times 1$ with $m=2n$, then you can "linearize" the system and solve it as a linear system. – Apr 18, 2020 at 18:22

3 Answers

The first numerical method to try might be a multivariate Newton's method. The Jacobian of $F(x) = A (x^2) + B x + c$ is $J(x) = 2 A\; \text{diag}(x) + B$, and you iterate the map $$ x \mapsto x - J(x)^{-1} F(x). $$ As usual, this is not guaranteed to succeed, but when it does work (if you start "sufficiently close" to a solution) it is efficient. Of course, you probably won't actually compute $J(x)^{-1}$; rather, you compute the solution $y$ of $J(x)\, y = F(x)$ and take $x - y$.
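For concreteness, here is a minimal sketch of this iteration in Python/NumPy (the function name, tolerance, and iteration cap are my own choices; the Jacobian is formed by broadcasting, using the fact that `A * x` in NumPy equals $A\,\text{diag}(x)$):

```python
import numpy as np

def newton_solve(A, B, c, x0, tol=1e-12, max_iter=50):
    """Multivariate Newton iteration for F(x) = A(x*x) + Bx + c = 0.
    Fast near a root, but may diverge from a poor starting point."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        F = A @ (x * x) + B @ x + c
        if np.linalg.norm(F) < tol:
            break
        # J(x) = 2 A diag(x) + B; note A * x scales column j of A by x[j]
        J = 2.0 * A * x + B
        x -= np.linalg.solve(J, F)  # solve J y = F rather than forming J^{-1}
    return x
```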

We have the following system of quadratic equations in $\mathrm x \in \mathbb R^n$

$$\mathrm A (\mathrm x \circ \mathrm x) + \mathrm B \mathrm x + \mathrm c = 0_m$$

where $\mathrm A \in \mathbb R^{m \times n}$, $\mathrm B \in \mathbb R^{m \times n}$ and $\mathrm c \in \mathbb R^m$ are given, and $\circ$ denotes the entry-wise product.

Let $\mathrm a_i, \mathrm b_i \in \mathbb R^n$ denote the $i$-th rows of matrices $\rm A$ and $\rm B$, respectively. Let $c_i$ denote the $i$-th entry of vector $\rm c$. Hence, the system of quadratic equations can be rewritten as follows

$$\mathrm x^\top \mbox{diag} (\mathrm a_i) \, \mathrm x + \mathrm b_i^\top \mathrm x + c_i = 0, \qquad i \in [m]$$

Note that

$$\begin{aligned} \mathrm x^\top \mbox{diag} (\mathrm a_i) \, \mathrm x + \mathrm b_i^\top \mathrm x + c_i &= \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}^\top \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}\\ &= \mbox{tr} \left( \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}^\top \right)\\ &= \mbox{tr} \left( \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x \mathrm x^\top & \mathrm x\\ \mathrm x^\top & 1\end{bmatrix} \right)\end{aligned}$$

Let

$$\mathrm M_i := \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix}$$

The original system of $m$ quadratic equations in $\mathrm x \in \mathbb R^n$ can then be replaced by the following system of $m$ linear equations in a symmetric positive semidefinite matrix $\mathrm Y \in \mathbb R^{(n+1) \times (n+1)}$

$$\langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]$$

plus the constraint $\mbox{rank}(\mathrm Y) = 1$ and the equality constraint $Y_{n+1,n+1} = 1$.

The rank constraint is a major difficulty. Let us discard that constraint and minimize the rank instead

$$\begin{array}{ll} \text{minimize} & \mbox{rank}(\mathrm Y)\\ \text{subject to} & \langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]\\ & Y_{n+1,n+1} = 1\\ & \mathrm Y \succeq \mathrm O_{n+1}\end{array}$$

Since rank is non-convex, let us use the nuclear norm, which is a convex proxy for rank (and which, for $\mathrm Y \succeq \mathrm O_{n+1}$, reduces to the trace). Hence, we obtain the following convex optimization problem

$$\begin{array}{ll} \text{minimize} & \| \mathrm Y \|_*\\ \text{subject to} & \langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]\\ & Y_{n+1,n+1} = 1\\ & \mathrm Y \succeq \mathrm O_{n+1}\end{array}$$

Let $\bar{\rm Y}$ denote an optimal solution of the convex program above. There are two cases to consider:

  • If $\mbox{rank} (\bar{\rm Y}) = 1$, find $\bar{\mathrm x} \in \mathbb R^n$ such that $$\begin{bmatrix} \bar{\mathrm x}\\ 1\end{bmatrix} \begin{bmatrix} \bar{\mathrm x}\\ 1\end{bmatrix}^\top = \bar{\mathrm Y}$$ Note that $\bar{\mathrm x}$ is a solution of the original system of quadratic equations.

  • If $\mbox{rank} (\bar{\rm Y}) \neq 1$, try another approach.
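To prototype the convex program above, here is a minimal CVXPY sketch (all names are my own; per the note above, for $\mathrm Y \succeq \mathrm O_{n+1}$ the nuclear norm is just $\mbox{tr}(\mathrm Y)$, which is what is minimized below):

```python
import numpy as np
import cvxpy as cp

def sdp_relaxation(A, B, c):
    """Trace-minimization relaxation of A(x*x) + Bx + c = 0 (a sketch)."""
    m, n = A.shape
    Y = cp.Variable((n + 1, n + 1), PSD=True)
    constraints = [Y[n, n] == 1]
    for i in range(m):
        # M_i = [[diag(a_i), b_i/2], [b_i^T/2, c_i]]
        M = np.zeros((n + 1, n + 1))
        M[:n, :n] = np.diag(A[i])
        M[:n, n] = M[n, :n] = 0.5 * B[i]
        M[n, n] = c[i]
        constraints.append(cp.trace(M @ Y) == 0)
    cp.Problem(cp.Minimize(cp.trace(Y)), constraints).solve()
    Y_bar = Y.value
    # if Y_bar is (numerically) rank one, its last column is (x, 1)
    return Y_bar[:n, n], Y_bar
```

Whether the relaxation is tight (rank one) has to be checked, e.g. by inspecting the eigenvalues of the returned $\bar{\mathrm Y}$.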

Too long for a comment:

Here is another plausible approach. Introduce $\lambda \in \mathbb R^n$ (standing in for $x^2$) and consider the following convex sets in $(\lambda, x) \in \mathbb R^{2n}$:
$$ A\lambda + Bx + c = 0, \qquad \lambda \geq 0, $$
and
$$ x^2_k \leq \lambda_k \quad \text{for all } k. $$
The first set is a polytope in $\mathbb R^{2n}$, while each inequality in the second set defines the closed interior of a parabolic cylinder in $\mathbb R^{2n}$. A point of their intersection at which every cylinder constraint is active (i.e. $\lambda_k = x_k^2$ for all $k$) is a desired solution, and such points lie on the boundary of the intersection. Towards that end, one can possibly use the alternating projections method, starting from a point which is not inside any of the parabolic cylinders. Finding such a point can be done this way: (i) choose a random point $(\lambda^0, x^0)$ with $\lambda^0 \geq 0$; (ii) choose a large enough $K > 0$ such that $(\lambda^0, K x^0)$ is outside all of the parabolic cylinders. Also, note that each projection step can be done in polynomial time, as the sets are all convex.

Finally, the hope is that the process converges, at a reasonable rate (this needs further diligence), onto a boundary point of the intersection of the two convex sets. Of course, one has to get a bit lucky here.
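A minimal sketch of this scheme, assuming CVXPY is used for the two Euclidean projections (the projection onto the polytope is a small QP, the projection onto the cylinders a convex QCQP; every name below is my own choice):

```python
import numpy as np
import cvxpy as cp

def project(A, B, c, lam0, x0, onto):
    """Euclidean projection of (lam0, x0) onto one of the two convex sets."""
    n = x0.size
    lam, x = cp.Variable(n), cp.Variable(n)
    if onto == "polytope":   # {A lam + B x + c = 0, lam >= 0}
        cons = [A @ lam + B @ x + c == 0, lam >= 0]
    else:                    # {x_k^2 <= lam_k for all k}
        cons = [cp.square(x) <= lam]
    cp.Problem(cp.Minimize(cp.sum_squares(lam - lam0)
                           + cp.sum_squares(x - x0)), cons).solve()
    return lam.value, x.value

def alternating_projections(A, B, c, n_iter=100, seed=0):
    n = A.shape[1]
    rng = np.random.default_rng(seed)
    lam = np.abs(rng.standard_normal(n))   # lambda^0 >= 0
    x = 10.0 * rng.standard_normal(n)      # scaled to start outside the cylinders
    for _ in range(n_iter):
        lam, x = project(A, B, c, lam, x, "polytope")
        lam, x = project(A, B, c, lam, x, "cylinders")
    return x  # hopefully near a boundary point of the intersection
```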
