Bounding Inequalities for Eigenvalues of Principal Submatrices


1. Introduction

The Cauchy-Poincaré interlacing theorems and the Ky Fan trace theorems are important tools for characterizing the eigenvalues of Hermitian matrices. These classical theorems are closely related and form the basis of several useful observations, e.g., [1] - [24] . In this note, we extend these assertions to handle non-Hermitian matrices. The interest in such extensions arises in Rayleigh-Ritz methods for calculating a few exterior eigenvalues of a large sparse matrix, e.g., [18] [20] [23] . Consider, for example, a restarted Krylov method that is applied to some matrix $A\in {\mathbb{C}}^{n\times n}$. Each iteration builds a new matrix $V\in {\mathbb{C}}^{n\times k}$ , $k\ll n$ , whose columns form an orthonormal basis of the current Krylov subspace. The eigenvalues of the related Rayleigh-quotient matrix, ${V}^{*}AV$ , are called Ritz values and are used to approximate those of A. If A is Hermitian the interlacing theorems apply, the Ritz values interlace the eigenvalues of A, and it is possible to establish monotonic convergence toward the desired exterior eigenvalues, e.g., [6] [7] . However, if A happens to be non-Hermitian then the behavior of the Ritz values is not that straightforward [2] [3] [8] . This drawback raises the question of whether it is possible to extend the above theorems, and what meaning the modified ones have.

The plan of the paper is as follows. The idea of replacing eigenvalues and diagonal entries with their moduli is illustrated in the next section, in which we derive the absolute trace theorem. This observation can be viewed as a robust extension of the Ky Fan maximum principle that holds for any normal matrix. Further extensions provide useful majorization relations between the eigenvalues of a matrix and those of a principal submatrix. The third section provides some background material on existing extensions of the interlacing theorems. This includes the concepts of diagonal similarity, transformed-Hermitian matrices, normal embedding, and principally normal matrices. Then, in Section 4, we introduce the “bounding inequalities” and derive conditions that ensure their viability.

2. The Absolute Trace of a Matrix

We shall start by stating the Ky Fan trace theorems [9] [10] [11] [16] . Let $H\in {\mathbb{C}}^{n\times n}$ be a given Hermitian matrix whose eigenvalues are sorted to satisfy

${\lambda}_{1}\ge {\lambda}_{2}\ge \cdots \ge {\lambda}_{n}\mathrm{,}$ (2.1)

and let k be a positive integer that is smaller than n. Let the set

${\mathbb{Q}}_{k}=\left\{Q \mid Q\in {\mathbb{C}}^{n\times k}\ \text{and}\ {Q}^{*}Q=I\right\}$ (2.2)

contain all the $n\times k$ matrices that have orthonormal columns. Then the Ky Fan maximum principle asserts that

$\sum_{j=1}^{k}{\lambda}_{j}=\max_{Q\in {\mathbb{Q}}_{k}}\operatorname{trace}\left({Q}^{*}HQ\right).$ (2.3)

That is, the optimal value of the above maximum problem equals the sum of the k largest eigenvalues of H. Moreover, the optimal value is obtained for a matrix Q whose columns are eigenvectors of the k largest eigenvalues.
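The maximum principle lends itself to a direct numerical check. The sketch below (illustrative only, not part of the original text, and assuming NumPy is available) builds a random Hermitian matrix, evaluates the trace at the optimal eigenvector choice of Q, and compares it against an arbitrary orthonormal Q.

```python
import numpy as np

# Numerical check of Ky Fan's maximum principle (2.3): for Hermitian H,
# trace(Q*HQ) is maximized over orthonormal Q by the eigenvectors of the
# k largest eigenvalues, and the maximum equals their sum.
rng = np.random.default_rng(0)
n, k = 6, 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (B + B.conj().T) / 2                      # Hermitian test matrix

w, U = np.linalg.eigh(H)                      # eigenvalues in ascending order
w_desc = w[::-1]                              # sorted as in (2.1)
Q = U[:, ::-1][:, :k]                         # eigenvectors of the k largest
opt = np.trace(Q.conj().T @ H @ Q).real
assert np.isclose(opt, w_desc[:k].sum())

# Any other orthonormal Q gives a trace that is no larger.
Q2, _ = np.linalg.qr(rng.standard_normal((n, k)))
assert np.trace(Q2.conj().T @ H @ Q2).real <= opt + 1e-12
```

The minimum principle (2.4) can be verified the same way with the eigenvectors of the k smallest eigenvalues.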

Similarly, the Ky Fan minimum principle says that

$\sum_{j=1}^{k}{\lambda}_{n+1-j}=\min_{Q\in {\mathbb{Q}}_{k}}\operatorname{trace}\left({Q}^{*}HQ\right),$ (2.4)

and the optimal value is obtained for a matrix Q whose columns are eigenvectors of the k smallest eigenvalues.

The modification of the maximum principle is achieved by introducing a new type of objective function. This change enables us to handle all types of normal matrices.

Definition 1 Let $A=\left({a}_{ij}\right)$ be some matrix from ${\mathbb{C}}^{n\times n}$ with singular values ${\sigma}_{j}\left(A\right)$ , $j=1,\cdots ,n$. Then the scalar functions $abstr\left(A\right)$ and $singtr\left(A\right)$ are defined by the equalities

$abstr\left(A\right)={\displaystyle \underset{j=1}{\overset{n}{\sum}}}\left|{a}_{jj}\right|$ (2.5)

and

$singtr\left(A\right)={\displaystyle \underset{j=1}{\overset{n}{\sum}}}\text{\hspace{0.05em}}{\sigma}_{j}\left(A\right)\mathrm{.}$ (2.6)

The function $abstr\left(A\right)$ is called the absolute trace of A, while $singtr\left(A\right)$ is called the singular trace of A. This pair of functions is known to satisfy

$abstr\left(A\right)\le singtr\left(A\right)$ (2.7)

for any $A\in {\mathbb{C}}^{n\times n}$. A detailed proof of the last inequality can be found, for example, in ( [5] : p. 1237), ( [14] : p. 154), and ( [24] : pp. 261-263).
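The two trace functions of Definition 1 and the inequality (2.7) are easy to exercise numerically. The following sketch (an illustration assuming NumPy, not part of the original argument) defines both functions and checks the inequality on a random complex matrix.

```python
import numpy as np

# abstr(A): sum of the moduli of the diagonal entries, as in (2.5).
def abstr(A):
    return np.abs(np.diag(A)).sum()

# singtr(A): sum of the singular values, as in (2.6).
def singtr(A):
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
# Inequality (2.7): the absolute trace never exceeds the singular trace.
assert abstr(A) <= singtr(A) + 1e-12
```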

Theorem 1 (The absolute trace theorem) Let $N\in {\mathbb{C}}^{n\times n}$ be a normal matrix with eigenvalues ${\nu}_{j}$ , $j=1,\cdots ,n$ , that satisfy

$\left|{\nu}_{1}\right|\ge \left|{\nu}_{2}\right|\ge \cdots \ge \left|{\nu}_{n}\right|\mathrm{,}$ (2.8)

and let ${\mathbb{Q}}_{k}$ be defined by (2.2). Then

$\sum_{j=1}^{k}\left|{\nu}_{j}\right|=\max_{Q\in {\mathbb{Q}}_{k}}abstr\left({Q}^{*}NQ\right),$ (2.9)

and the optimal value is obtained for a matrix Q which is composed of eigenvectors of N that correspond to ${\nu}_{1},\cdots ,{\nu}_{k}$.

Proof. Since N is normal the singular values of N satisfy

${\sigma}_{j}=\left|{\nu}_{j}\right|\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}j=1,\cdots ,n.$ (2.10)

Let Q be some matrix from ${\mathbb{Q}}_{k}$ and let

${\tilde{\sigma}}_{1}\ge {\tilde{\sigma}}_{2}\ge \cdots \ge {\tilde{\sigma}}_{k}$

denote the singular values of ${Q}^{\mathrm{*}}NQ$. Then (2.7) implies that

$abstr\left({Q}^{*}NQ\right)\le singtr\left({Q}^{*}NQ\right)=\sum_{j=1}^{k}{\tilde{\sigma}}_{j}.$ (2.11)

On the other hand, the interlacing relations for singular values give the inequalities

${\tilde{\sigma}}_{j}\le {\sigma}_{j}\quad \text{for}\ j=1,\cdots ,k.$ (2.12)

See (3.8) in the next section. Hence by combining (2.11), (2.12), and (2.10) we see that the inequality

$abstr\left({Q}^{\mathrm{*}}NQ\right)\le {\displaystyle \underset{j=1}{\overset{k}{\sum}}}\left|{\nu}_{j}\right|$ (2.13)

holds for all $Q\in {\mathbb{Q}}_{k}$. Yet equality is obtained when Q is composed of orthonormal eigenvectors of N that correspond to ${\nu}_{1},\cdots ,{\nu}_{k}$. □
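Theorem 1 can be confirmed on a synthetic normal matrix. The sketch below (illustrative, assuming NumPy) constructs $N = U\,\mathrm{diag}(\nu)\,U^{*}$ with a random unitary U, so it is normal by construction, and checks both the attained optimum and the bound (2.13).

```python
import numpy as np

# Numerical check of Theorem 1: max_Q abstr(Q*NQ) is attained by the
# eigenvectors of the k largest-modulus eigenvalues of a normal N.
rng = np.random.default_rng(2)
n, k = 6, 3
nu = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)                        # random unitary matrix
N = U @ np.diag(nu) @ U.conj().T              # normal by construction

order = np.argsort(-np.abs(nu))               # sort moduli as in (2.8)
Q = U[:, order[:k]]                           # eigenvectors of nu_1,...,nu_k
attained = np.abs(np.diag(Q.conj().T @ N @ Q)).sum()
assert np.isclose(attained, np.abs(nu)[order[:k]].sum())

# Any other orthonormal Q obeys the bound (2.13).
Q2, _ = np.linalg.qr(rng.standard_normal((n, k)))
assert np.abs(np.diag(Q2.conj().T @ N @ Q2)).sum() <= attained + 1e-12
```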

The inequality (2.7) is part of a more general observation by Ky Fan [10] . Let $A=\left({a}_{ij}\right)\in {\mathbb{C}}^{n\times n}$ be an arbitrary complex matrix with singular values

${\sigma}_{1}\left(A\right)\ge {\sigma}_{2}\left(A\right)\ge \cdots \ge {\sigma}_{n}\left(A\right)\ge 0.$ (2.14)

It is also possible to assume that the diagonal entries of A are ordered to satisfy

$\left|{a}_{11}\right|\ge \left|{a}_{22}\right|\ge \cdots \ge \left|{a}_{nn}\right|\ge 0.$ (2.15)

Then Ky Fan [10] has shown that

$\sum_{j=1}^{\ell}\left|{a}_{jj}\right|\le \sum_{j=1}^{\ell}{\sigma}_{j}\left(A\right)\quad \text{for}\ \ell =1,\cdots ,n.$ (2.16)

The last observation can be modified by replacing the diagonal entries with eigenvalues. Let ${\lambda}_{j}\left(A\right)$ , $j=1,\cdots ,n$ , denote the eigenvalues of A and assume that

$\left|{\lambda}_{1}\left(A\right)\right|\ge \left|{\lambda}_{2}\left(A\right)\right|\ge \cdots \ge \left|{\lambda}_{n}\left(A\right)\right|\ge 0.$ (2.17)

Then Schur’s triangularization theorem implies the existence of a unitary matrix $U\in {\mathbb{C}}^{n\times n}$ such that ${U}^{*}AU$ is an upper triangular matrix, say $T=\left({t}_{ij}\right)\in {\mathbb{C}}^{n\times n}$ , with ${t}_{jj}={\lambda}_{j}\left(A\right)$ for $j=1,\cdots ,n$. Therefore, since T has the same singular values as A, applying (2.16) to T gives

$\sum_{j=1}^{\ell}\left|{\lambda}_{j}\left(A\right)\right|\le \sum_{j=1}^{\ell}{\sigma}_{j}\left(A\right)\quad \text{for}\ \ell =1,\cdots ,n.$ (2.18)

See, for example, ( [14] : p. 176; [15] : p. 318; [24] : p. 262). The inequalities (2.16) and (2.18) describe majorization relations between the related vectors. For a detailed survey of majorization relations see [15] . See also [1] [14] [24] . The next theorem extends (2.18) to handle principal submatrices.
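Before moving on, the majorization relations (2.16) and (2.18) can be verified numerically; the sketch below is illustrative only and assumes NumPy.

```python
import numpy as np

# Check of (2.16) and (2.18): partial sums of the sorted diagonal moduli
# and eigenvalue moduli are bounded by partial sums of singular values.
rng = np.random.default_rng(3)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
diag = np.sort(np.abs(np.diag(A)))[::-1]          # sorted as in (2.15)
lam = np.sort(np.abs(np.linalg.eigvals(A)))[::-1] # sorted as in (2.17)
sig = np.linalg.svd(A, compute_uv=False)          # descending, as in (2.14)
for l in range(1, n + 1):
    assert diag[:l].sum() <= sig[:l].sum() + 1e-10   # (2.16)
    assert lam[:l].sum() <= sig[:l].sum() + 1e-10    # (2.18)
```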

Theorem 2 Let $A\in {\mathbb{C}}^{n\times n}$ be as above, satisfying (2.14) - (2.18). Let $\tilde{A}\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of A. That is, $1\le k<n$ , and $\tilde{A}$ is obtained from A by deleting $n-k$ rows and the same $n-k$ columns. Let the singular values and the eigenvalues of $\tilde{A}$ be ordered to satisfy

${\sigma}_{1}\left(\tilde{A}\right)\ge \cdots \ge {\sigma}_{k}\left(\tilde{A}\right)$ (2.19)

and

$\left|{\lambda}_{1}\left(\tilde{A}\right)\right|\ge \cdots \ge \left|{\lambda}_{k}\left(\tilde{A}\right)\right|$ (2.20)

respectively. Then

$\sum_{j=1}^{\ell}\left|{\lambda}_{j}\left(\tilde{A}\right)\right|\le \sum_{j=1}^{\ell}{\sigma}_{j}\left(\tilde{A}\right)\le \sum_{j=1}^{\ell}{\sigma}_{j}\left(A\right)\quad \text{for}\ \ell =1,\cdots ,k.$ (2.21)

Proof. The first inequality is obtained from (2.18) by replacing A with $\tilde{A}$. The second one follows from the singular value inequalities

${\sigma}_{j}\left(\tilde{A}\right)\le {\sigma}_{j}\left(A\right)\quad \text{for}\ j=1,\cdots ,k.$

See (3.8) in the next section. □
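A quick sanity check of (2.21) on a random principal submatrix (an illustrative sketch assuming NumPy, not part of the original proof):

```python
import numpy as np

# Check of Theorem 2: eigenvalue-modulus partial sums of a principal
# submatrix are bounded by its singular-value partial sums, which are
# in turn bounded by those of the full matrix.
rng = np.random.default_rng(4)
n, k = 7, 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
idx = np.sort(rng.choice(n, size=k, replace=False))
A_sub = A[np.ix_(idx, idx)]                   # principal submatrix of A

lam_sub = np.sort(np.abs(np.linalg.eigvals(A_sub)))[::-1]
sig_sub = np.linalg.svd(A_sub, compute_uv=False)
sig = np.linalg.svd(A, compute_uv=False)
for l in range(1, k + 1):
    assert lam_sub[:l].sum() <= sig_sub[:l].sum() + 1e-10
    assert sig_sub[:l].sum() <= sig[:l].sum() + 1e-10
```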

When handling a normal matrix the last results can be refined as follows.

Theorem 3 Let $N\in {\mathbb{C}}^{n\times n}$ be a normal matrix with eigenvalues as in (2.8). Let $P\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of N with eigenvalues

$\left|{\eta}_{1}\right|\ge \cdots \ge \left|{\eta}_{k}\right|\mathrm{,}$ (2.22)

and singular values

${\tilde{\sigma}}_{1}\ge \cdots \ge {\tilde{\sigma}}_{k}.$ (2.23)

Then

$\sum_{j=1}^{\ell}\left|{\eta}_{j}\right|\le \sum_{j=1}^{\ell}{\tilde{\sigma}}_{j}\le \sum_{j=1}^{\ell}\left|{\nu}_{j}\right|\quad \text{for}\ \ell =1,\cdots ,k.$ (2.24)

Proof. Since N is normal, the singular values of N are $\left|{\nu}_{j}\right|$ , $j=1,\cdots ,n$. Therefore (2.24) is obtained from (2.21) by replacing A with N, and $\tilde{A}$ with P. □

Theorem 4 Let N be a normal matrix as above. Let the matrix $V\in {\mathbb{C}}^{n\times k}$ have k orthonormal columns, and consider the $k\times k$ matrix

$P={V}^{*}NV.$ (2.25)

Let the eigenvalues and the singular values of P satisfy (2.22) and (2.23), respectively. Then (2.24) holds.

Proof. Let the unitary matrix $U\in {\mathbb{C}}^{n\times n}$ be obtained by completing the columns of V to an orthonormal basis of ${\mathbb{C}}^{n}$. Then P is a principal submatrix of the normal matrix ${U}^{*}NU$ , which has the same eigenvalues and singular values as N. □
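Theorem 4 can also be tested directly with a compression $P = V^{*}NV$; the sketch below is illustrative only and assumes NumPy.

```python
import numpy as np

# Check of Theorem 4 / (2.24): for a normal N and orthonormal V, the
# eigenvalue moduli of P = V*NV are majorized by the singular values of
# P, which are majorized by the moduli |nu_j|.
rng = np.random.default_rng(7)
n, k = 6, 3
nu = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)
N = U @ np.diag(nu) @ U.conj().T              # normal by construction

W = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
V, _ = np.linalg.qr(W)                        # orthonormal columns
P = V.conj().T @ N @ V
eta = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
sig = np.linalg.svd(P, compute_uv=False)
mod = np.sort(np.abs(nu))[::-1]               # |nu_j| sorted as in (2.8)
for l in range(1, k + 1):
    assert eta[:l].sum() <= sig[:l].sum() + 1e-10
    assert sig[:l].sum() <= mod[:l].sum() + 1e-10
```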

The next sections sharpen the relations between eigenvalues of a matrix and a principal submatrix.

3. The Interlacing Theorems of Cauchy and Poincaré

In this section, we briefly review the interlacing theorems of Cauchy and Poincaré, e.g., [13] [18] [24] , and consider some ways to extend these results. Let $G\in {\mathbb{C}}^{n\times n}$ be a Hermitian matrix with eigenvalues

${\lambda}_{1}\ge {\lambda}_{2}\ge \cdots \ge {\lambda}_{n}.$ (3.1)

Let the Hermitian matrix $H\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of G with eigenvalues

${\mu}_{1}\ge {\mu}_{2}\ge \cdots \ge {\mu}_{k}\mathrm{.}$ (3.2)

Then the Cauchy interlacing theorem asserts that

${\lambda}_{j}\ge {\mu}_{j}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.05em}}j=1,\cdots ,k,$ (3.3)

and

${\mu}_{k+1-j}\ge {\lambda}_{n+1-j}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.05em}}j=1,\cdots ,k.$ (3.4)

In particular, for $k=n-1$ we have the interlacing relations

${\lambda}_{1}\ge {\mu}_{1}\ge {\lambda}_{2}\ge {\mu}_{2}\ge {\lambda}_{3}\ge \cdots \ge {\lambda}_{n-1}\ge {\mu}_{n-1}\ge {\lambda}_{n}\mathrm{.}$ (3.5)
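The relations (3.3)-(3.5) are easy to confirm numerically for the case $k=n-1$; this sketch is illustrative and assumes NumPy.

```python
import numpy as np

# Check of Cauchy interlacing (3.5) for a Hermitian G and the principal
# submatrix H obtained by deleting the last row and column (k = n-1).
rng = np.random.default_rng(5)
n = 6
B = rng.standard_normal((n, n))
G = (B + B.T) / 2                             # real symmetric (Hermitian)
H = G[:n-1, :n-1]
lam = np.sort(np.linalg.eigvalsh(G))[::-1]    # sorted as in (3.1)
mu = np.sort(np.linalg.eigvalsh(H))[::-1]     # sorted as in (3.2)
for j in range(n - 1):
    assert lam[j] >= mu[j] - 1e-12            # (3.3)
    assert mu[j] >= lam[j + 1] - 1e-12        # interlacing, as in (3.5)
```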

The Cauchy interlacing theorem can be used to derive analogous results for singular values. Let $A\in {\mathbb{C}}^{n\times n}$ have the singular values

${\sigma}_{1}\ge {\sigma}_{2}\ge \cdots \ge {\sigma}_{n}\mathrm{,}$ (3.6)

and let $\tilde{A}\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of A with singular values

${\tilde{\sigma}}_{1}\ge {\tilde{\sigma}}_{2}\ge \cdots \ge {\tilde{\sigma}}_{k}.$ (3.7)

Then

${\sigma}_{j}\ge {\tilde{\sigma}}_{j}\quad \text{for}\ j=1,\cdots ,k.$ (3.8)

See, for example, [5] [19] [22] [24] .

Another related corollary is the Poincaré separation theorem, in which H is defined as

$H={V}^{*}GV,$ (3.9)

where $V\in {\mathbb{C}}^{n\times k}$ has orthonormal columns. Then, assuming that the eigenvalues of G and H satisfy (3.1) and (3.2), the inequalities (3.3)-(3.5) remain valid. The last statement is sometimes summarized by saying that the eigenvalues of H interlace those of G.

Let the unitary matrix U be obtained by extending the columns of V to an orthonormal basis of ${\mathbb{C}}^{n}$. Then Poincaré theorem is concluded from Cauchy theorem by comparing the eigenvalues of ${V}^{*}GV$ and ${U}^{*}GU$. Conversely, if V is composed of some columns of the identity matrix then Poincaré theorem coincides with Cauchy theorem. Note that the Ky Fan trace theorems are direct consequences of Poincaré theorem.

One way to extend Cauchy theorem is based on diagonal similarity. Let A and B be two matrices in ${\mathbb{C}}^{n\times n}$. Then A is diagonally similar to B if there exists a nonsingular diagonal matrix $D\in {\mathbb{C}}^{n\times n}$ such that $B={D}^{-1}AD$. Clearly, if A is diagonally similar to B then every principal submatrix of A is diagonally similar to the corresponding principal submatrix of B. Therefore, since similarity transformations preserve eigenvalues, the Cauchy interlacing theorem applies to any matrix which is diagonally similar to a Hermitian matrix. Let us consider, for example, a tridiagonal matrix $T=\left({t}_{ij}\right)\in {\mathbb{R}}^{n\times n}$ such that

${t}_{i,i+1}{t}_{i+1,i}>0\quad \text{for}\ i=1,\cdots ,n-1.$ (3.10)

Then it is easy to verify that T is diagonally similar to a symmetric tridiagonal matrix, which means that Cauchy theorem holds for T. In the general case, however, when given an arbitrary matrix, the question of whether it is diagonally similar to some (unknown) Hermitian matrix is not easy to answer.
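One standard choice of the scaling is $d_{1}=1$ and $d_{i+1}=d_{i}\sqrt{t_{i+1,i}/t_{i,i+1}}$, which turns both off-diagonals of ${D}^{-1}TD$ into $\sqrt{t_{i,i+1}t_{i+1,i}}$. The sketch below (illustrative, assuming NumPy; the helper name is ours) demonstrates this construction.

```python
import numpy as np

# Diagonal-similarity construction for a tridiagonal T satisfying (3.10):
# with d_1 = 1 and d_{i+1} = d_i * sqrt(t_{i+1,i} / t_{i,i+1}), the matrix
# D^{-1} T D is symmetric, so Cauchy's theorem applies to T.
def symmetrize_tridiagonal(T):
    n = T.shape[0]
    d = np.ones(n)
    for i in range(n - 1):
        d[i + 1] = d[i] * np.sqrt(T[i + 1, i] / T[i, i + 1])
    D = np.diag(d)
    return np.linalg.inv(D) @ T @ D

T = np.array([[2.0, 3.0, 0.0],
              [1.0, 5.0, 4.0],
              [0.0, 1.0, 7.0]])               # satisfies (3.10)
S = symmetrize_tridiagonal(T)
assert np.allclose(S, S.T)                    # similar symmetric matrix
assert np.allclose(np.sort(np.linalg.eigvals(T).real),
                   np.sort(np.linalg.eigvalsh(S)))
```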

A second extension regards the following matrices.

Definition 2 A matrix $N\in {\mathbb{C}}^{n\times n}$ is called Transformed-Hermitian, or T-H in brief, if it has the form

$N=\alpha H+\beta I,$ (3.11)

where $\alpha $ and $\beta $ are complex numbers and H is Hermitian.

An equivalent way to write (3.11) is

$N={\text{e}}^{i\theta}H+\gamma I,$ (3.12)

where $\theta $ is a real number, $\gamma $ is a complex number, and H is Hermitian. Observe that the eigenvalues of a T-H matrix lie on some line in the complex plane. This shows that the Cauchy-Poincaré interlacing theorems can be extended to T-H matrices: in this case, the interlacing relations take place on the relevant line.

It is also easy to verify that any T-H matrix is normal, and that any normal matrix whose eigenvalues lie on a line in the complex plane is a T-H matrix. Another feature that characterizes T-H matrices is that they have the normal embedding property which is defined as follows.

Definition 3 A normal matrix $N\in {\mathbb{C}}^{n\times n}$ has the normal embedding property if for any matrix $V\in {\mathbb{C}}^{n\times k}$ with orthonormal columns the matrix ${V}^{*}NV$ is normal.

The relation between normal embedding and T-H matrices was noted by Ky Fan and Pall ( [12] : p. 299): A normal matrix has the normal embedding property if and only if its eigenvalues lie on a line in the complex plane. In other words, a matrix has the normal embedding property if and only if it is a T-H matrix.

A larger set of matrices is gained by moving from T-H matrices to block-diagonal matrices of this kind.

Definition 4 A matrix is called a block-diagonal T-H matrix if it is block-diagonal and each block is a T-H matrix.

Note that a block-diagonal T-H matrix may include $1\times 1$ blocks, and it is always a normal matrix. Moreover, let $N\in {\mathbb{C}}^{n\times n}$ be a block-diagonal T-H matrix, and let $P\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of N. Then, clearly, P is also a block-diagonal T-H matrix, which shows that P is normal. Recently Sherman and Smith [21] have connected these observations with the following useful concept.

Definition 5 A normal matrix N is called principally-normal, or P-N in brief, if every principal submatrix of N is a normal matrix.

It is shown in [21] that if N is a P-N irreducible matrix then N is a T-H matrix. Otherwise, when N is a reducible P-N matrix, there exists a permutation matrix S such that ${S}^{\mathrm{*}}NS$ is a block-diagonal T-H matrix. That is, a matrix is P-N if and only if it can be permuted into a block-diagonal T-H matrix. The eigenvalues of such matrices lie on the union of finitely many lines and singletons. Therefore, although local interlacing relations hold on each line, it is not possible to extend these relations to eigenvalues that lie on different lines. Consider, for example, a diagonal matrix, D, whose diagonal entries are random complex numbers. Then D has the P-N property but it fails to satisfy the interlacing relations (or to have the normal embedding property). It is also worthwhile noting that although a unitary matrix is always normal, it is not necessarily a P-N matrix. For example, consider the matrix

$\left(\begin{array}{ccc}\alpha & \alpha & 0\\ 0& 0& 1\\ \alpha & -\alpha & 0\end{array}\right)$ (3.13)

where $\alpha =1/\sqrt{2}$.
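This example is easy to verify numerically: the matrix (3.13) is (real) orthogonal, yet its leading $2\times 2$ principal submatrix fails to be normal. The following sketch (illustrative, assuming NumPy) performs the check.

```python
import numpy as np

# The unitary matrix (3.13) is normal, but it is not P-N: its leading
# 2x2 principal submatrix is not a normal matrix.
a = 1 / np.sqrt(2)
U = np.array([[a,   a,   0.0],
              [0.0, 0.0, 1.0],
              [a,  -a,   0.0]])
assert np.allclose(U @ U.T, np.eye(3))        # U is orthogonal (unitary)

P = U[:2, :2]                                 # leading principal submatrix
assert not np.allclose(P @ P.T, P.T @ P)      # P*P != PP*, so not normal
```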

The fact that the Cauchy interlacing theorem holds for real diagonal matrices enables us to extend this theorem to upper triangular matrices that have real diagonal entries: Let $T\in {\mathbb{C}}^{n\times n}$ be such a matrix, and let $\tilde{T}\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of T. Then $\tilde{T}$ is also an upper triangular matrix, and each diagonal entry (eigenvalue) of $\tilde{T}$ is also a diagonal entry (eigenvalue) of T. Let the diagonal matrices $D\in {\mathbb{R}}^{n\times n}$ and $\tilde{D}\in {\mathbb{R}}^{k\times k}$ be obtained from the diagonal entries of T and $\tilde{T}$ , respectively. Then $\tilde{D}$ is a principal submatrix of D, and the eigenvalues of $\tilde{D}$ (the eigenvalues of $\tilde{T}$ ) interlace those of D (those of T ). Similar arguments show that the interlacing relations remain valid when the diagonal entries of T lie on a line in the complex plane. Note that lower triangular matrices have a similar property. This example shows that Cauchy interlacing may hold for non-normal matrices.

Summarizing the attempts to extend the interlacing theorems, we see that extension is possible only for certain types of matrices whose eigenvalues lie on a line in the complex plane. If, however, the eigenvalues of a matrix are not real or collinear, then the interlacing inequalities become meaningless. In the next section, we introduce a new kind of inequalities between the eigenvalues of a matrix and those of a principal submatrix. The new inequalities have a clear meaning for any square matrix, and are valid for a larger class of matrices.

4. The Bounding Inequalities

In this section, we introduce the bounding inequalities and discuss their viability.

Definition 6 (The Bounding inequalities) Let the matrix $N\in {\mathbb{C}}^{n\times n}$ have eigenvalues ${\nu}_{j}$ , $j=1,\cdots ,n$ , that satisfy

$\left|{\nu}_{1}\right|\ge \left|{\nu}_{2}\right|\ge \cdots \ge \left|{\nu}_{n}\right|\mathrm{.}$ (4.1)

Let $P\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of N with eigenvalues ${\eta}_{j}$ , $j=1,\cdots ,k$ , that satisfy

$\left|{\eta}_{1}\right|\ge \left|{\eta}_{2}\right|\ge \cdots \ge \left|{\eta}_{k}\right|\mathrm{.}$ (4.2)

If the eigenvalues of the two matrices satisfy

$\left|{\nu}_{j}\right|\ge \left|{\eta}_{j}\right|\quad \text{for}\ j=1,\cdots ,k,$ (4.3)

then we say that the bounding inequalities hold.

Definition 7 (The Bounding Property) Let N be as above. If the bounding inequalities hold for any principal submatrix of N, then we say that N has the bounding property.

Below we will show that the bounding property is shared by several types of matrices.

Theorem 5 Any P-N matrix has the bounding property.

Proof. Let N and P be as in Definition 6. Since N and P are normal matrices, $\left|{\nu}_{j}\right|$ , $j=1,\cdots ,n$ , are the singular values of N, and $\left|{\eta}_{j}\right|$ , $j=1,\cdots ,k$ , are the singular values of P. Hence the interlacing relations for singular values imply (4.3). □

We have seen that unitary matrices may fail to be principally normal. Yet the next assertion resolves this difficulty.

Theorem 6 Any unitary matrix has the bounding property.

Proof. Let N and P be as in Definition 6 and assume that N is unitary. In this case, all the singular values of N equal 1. That is, $\left|{\nu}_{j}\right|=1$ for $j=1,\cdots ,n$. Let ${\tilde{\sigma}}_{1}$ denote the largest singular value of P. Then it is easy to verify that

$\left|{\nu}_{1}\right|\ge {\tilde{\sigma}}_{1}\ge \left|{\eta}_{1}\right|.$ (4.4)

Consequently, the j-th eigenvalues of N and P satisfy

$\left|{\nu}_{j}\right|=1=\left|{\nu}_{1}\right|\ge {\tilde{\sigma}}_{1}\ge \left|{\eta}_{1}\right|\ge \left|{\eta}_{j}\right|\quad \text{for}\ j=1,\cdots ,k,$ (4.5)

which are the bounding relations. □
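Theorem 6 can be confirmed on a random unitary matrix; the sketch below is illustrative only and assumes NumPy.

```python
import numpy as np

# Check of Theorem 6: all eigenvalue moduli of a principal submatrix of
# a unitary matrix are at most 1 = |nu_j|, which is (4.3) here.
rng = np.random.default_rng(6)
Z = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
U, _ = np.linalg.qr(Z)                        # random unitary matrix
P = U[:4, :4]                                 # a principal submatrix
eta = np.abs(np.linalg.eigvals(P))
assert np.all(eta <= 1 + 1e-12)               # |eta_j| <= 1 = |nu_j|
```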

Corollary 7 Any permutation matrix has the bounding property.

As with Cauchy interlacing, the range of matrices that share the bounding property can be extended by applying diagonal similarity.

Theorem 8 If A is diagonally similar to a matrix B that has the bounding property, then A has the bounding property.

Proof. Since A is similar to B both matrices have the same set of eigenvalues. Moreover, let $\tilde{A}$ be a principal submatrix of A and let $\tilde{B}$ denote the corresponding principal submatrix of B. Then $\tilde{A}$ is diagonally similar to $\tilde{B}$ , and the two submatrices have the same eigenvalues. Therefore, since B and $\tilde{B}$ satisfy the bounding inequalities, the same is true for A and $\tilde{A}$. □

Similar arguments can be used to verify the following assertions.

Theorem 9 Let the matrix $A\in {\mathbb{C}}^{n\times n}$ have the bounding property. Then the following matrices share this property.

• The matrices ${A}^{\text{T}}$ and ${A}^{\mathrm{*}}$.

• The matrix cA where c is a complex number.

• The matrix $PA{P}^{\text{T}}$ where $P\in {\mathbb{R}}^{n\times n}$ is a permutation matrix.

We have proved that the interlacing relations hold for upper (or lower) triangular matrices whose diagonal entries are restricted to lie on the real axis (or another line). Now we will show that the bounding property holds for such matrices even if their diagonal entries are not real or collinear.

Theorem 10 Any upper (or lower) triangular matrix has the bounding property.

Proof. Let $T\in {\mathbb{C}}^{n\times n}$ be an upper triangular matrix, and let $\tilde{T}\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of T. Then $\tilde{T}$ is also an upper triangular matrix. Let the diagonal matrices $D\in {\mathbb{C}}^{n\times n}$ and $\tilde{D}\in {\mathbb{C}}^{k\times k}$ be obtained from the diagonals of T and $\tilde{T}$ , respectively. Then $\tilde{D}$ has the same eigenvalues as $\tilde{T}$. Therefore, since D is a P-N matrix and $\tilde{D}$ is a principal submatrix of D, the eigenvalues of T and $\tilde{T}$ satisfy the bounding inequalities. □
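Since the eigenvalues of a triangular matrix are its diagonal entries, Theorem 10 reduces to a statement about subsets of the diagonal; the sketch below (illustrative, assuming NumPy) checks it on a random upper triangular matrix.

```python
import numpy as np

# Check of Theorem 10: the sorted eigenvalue moduli of a principal
# submatrix of an upper triangular matrix are bounded entrywise by the
# sorted eigenvalue moduli of the full matrix, as in (4.3).
rng = np.random.default_rng(8)
n, k = 6, 4
T = np.triu(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
idx = np.sort(rng.choice(n, size=k, replace=False))
T_sub = T[np.ix_(idx, idx)]                   # principal submatrix

nu = np.sort(np.abs(np.diag(T)))[::-1]        # eigenvalues = diagonal
eta = np.sort(np.abs(np.diag(T_sub)))[::-1]
assert np.all(nu[:k] >= eta - 1e-12)          # bounding inequalities (4.3)
```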

One corollary of the last theorem is that any matrix in Jordan form has the bounding property. Yet a Jordan block of order larger than one is not a normal matrix. This shows that the bounding property is not restricted to normal matrices. Another type of matrix that shares this property is revealed below.

Theorem 11 Let $T=\left({t}_{ij}\right)\in {\mathbb{C}}^{n\times n}$ be a tridiagonal matrix that has real diagonal entries, and whose off-diagonal entries satisfy

${t}_{i,i+1}{t}_{i+1,i}\ge 0\quad \text{for}\ i=1,\cdots ,n-1.$ (4.6)

Then T has the bounding property.

Proof. Condition (4.6) allows the possibility that

${t}_{i,i+1}{t}_{i+1,i}=0\quad \text{but}\ {t}_{i,i+1}\ne 0\ \left(\text{or}\ {t}_{i+1,i}\ne 0\right).$

This possibility prevents the use of diagonal similarity, and the proof needs a different argument. Let $\tilde{T}\in {\mathbb{C}}^{k\times k}$ be a principal submatrix of T. Then $\tilde{T}$ has the same form as T. Let the symmetric tridiagonal matrix $S=\left({s}_{ij}\right)\in {\mathbb{R}}^{n\times n}$ be obtained from T in the following way: ${s}_{ii}={t}_{ii}$ for $i=1,\cdots ,n$ , and ${s}_{i,i+1}={s}_{i+1,i}={\left({t}_{i,i+1}{t}_{i+1,i}\right)}^{1/2}$ for $i=1,\cdots ,n-1$. Let $\tilde{S}\in {\mathbb{R}}^{k\times k}$ be a principal submatrix of S that is obtained in the same way that $\tilde{T}$ is obtained from T. Recall that the principal minors of a tridiagonal matrix can be computed in a recursive way, e.g., ( [13] : p. 29). This determinantal expansion shows that T has the same eigenvalues as S, and that $\tilde{T}$ has the same eigenvalues as $\tilde{S}$. Moreover, the symmetry of S implies that the eigenvalues of S and $\tilde{S}$ satisfy both the bounding inequalities and the interlacing relations. Hence the same is true for the eigenvalues of T and $\tilde{T}$. □
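The symmetrization used in the proof can be demonstrated on a matrix where one off-diagonal product vanishes while the entry itself does not, precisely the case that defeats diagonal similarity. The sketch below is illustrative only and assumes NumPy.

```python
import numpy as np

# Check of the construction in Theorem 11: the symmetrized S has the
# same eigenvalues as T, even with t_{i,i+1} * t_{i+1,i} = 0 while
# t_{i,i+1} != 0, where diagonal similarity is unavailable.
T = np.array([[1.0, 2.0, 0.0, 0.0],
              [3.0, 4.0, 5.0, 0.0],
              [0.0, 0.0, 6.0, 7.0],
              [0.0, 0.0, 0.0, 8.0]])          # products: 6, 0, 0 (all >= 0)
n = T.shape[0]
S = np.diag(np.diag(T))                       # s_ii = t_ii
for i in range(n - 1):
    s = np.sqrt(T[i, i + 1] * T[i + 1, i])    # s_{i,i+1} = s_{i+1,i}
    S[i, i + 1] = S[i + 1, i] = s
assert np.allclose(np.sort(np.linalg.eigvals(T).real),
                   np.sort(np.linalg.eigvalsh(S)))
```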

A further extension is gained by moving to block-diagonal matrices. For this purpose, we need the following lemmas.

Lemma 12 Let $A\in {\mathbb{C}}^{p\times p}$ and $B\in {\mathbb{C}}^{q\times q}$ be two square matrices. Assume that B has the bounding property, and let $\tilde{B}\in {\mathbb{C}}^{\ell \times \ell}$ be a principal submatrix of B. Let the matrices $N\in {\mathbb{C}}^{n\times n}$ and $P\in {\mathbb{C}}^{k\times k}$ be defined by the equalities

$N=\left(\begin{array}{cc}A& 0\\ 0& B\end{array}\right)\quad \text{and}\quad P=\left(\begin{array}{cc}A& 0\\ 0& \tilde{B}\end{array}\right),$ (4.7)

and let the eigenvalues of these matrices satisfy (4.1) and (4.2). Then P is a principal submatrix of N, and the bounding inequalities (4.3) hold.

Proof. Let the eigenvalues of $A$ , $B$ , and $\tilde{B}$ be denoted as

$\left|{\alpha}_{1}\right|\ge \cdots \ge \left|{\alpha}_{p}\right|,\quad \left|{\beta}_{1}\right|\ge \cdots \ge \left|{\beta}_{q}\right|\quad \text{and}\quad \left|{\tilde{\beta}}_{1}\right|\ge \cdots \ge \left|{\tilde{\beta}}_{\ell}\right|,$ (4.8)

respectively. Then from (4.7) we see that the eigenvalues of N are the union of the eigenvalues of A and B. That is,

$\left\{{\nu}_{1}\mathrm{,}\cdots \mathrm{,}{\nu}_{n}\right\}=\left\{{\alpha}_{1}\mathrm{,}\cdots \mathrm{,}{\alpha}_{p}\right\}\cup \left\{{\beta}_{1}\mathrm{,}\cdots \mathrm{,}{\beta}_{q}\right\}\mathrm{.}$ (4.9)

Similarly, the eigenvalues of P are obtained from those of A and $\tilde{B}$. That is,

$\left\{{\eta}_{1},\cdots ,{\eta}_{k}\right\}=\left\{{\alpha}_{1},\cdots ,{\alpha}_{p}\right\}\cup \left\{{\tilde{\beta}}_{1},\cdots ,{\tilde{\beta}}_{\ell}\right\}.$ (4.10)

Let ${\alpha}_{i}$ be some eigenvalue of A. Then there exist two indices, ${i}^{\mathrm{*}}$ and ${i}^{\mathrm{**}}$ say, such that

${\alpha}_{i}={\nu}_{{i}^{*}}={\eta}_{{i}^{**}}$. (4.11)

Moreover, the bounding inequalities

$\left|{\beta}_{j}\right|\ge \left|{\tilde{\beta}}_{j}\right|\quad \text{for}\ j=1,\cdots ,\ell ,$ (4.12)

imply that when moving from N to P either ${\alpha}_{i}$ stays in its place or moves “forward”. That is,

${i}^{\mathrm{**}}\le {i}^{\mathrm{*}}\mathrm{.}$ (4.13)

Now (4.1) gives

$\left|{\nu}_{{i}^{\mathrm{**}}}\right|\ge \left|{\nu}_{{i}^{\mathrm{*}}}\right|$ (4.14)

and from (4.11) we obtain

$\left|{\nu}_{{i}^{**}}\right|\ge \left|{\nu}_{{i}^{*}}\right|=\left|{\alpha}_{i}\right|=\left|{\eta}_{{i}^{**}}\right|,$ (4.15)

which shows that ${\alpha}_{i}$ satisfies the related bounding inequality.

Next we show that an eigenvalue of P which is an eigenvalue of $\tilde{B}$ satisfies the related bounding inequality. Let ${\tilde{\beta}}_{j}$ be an eigenvalue of $\tilde{B}$ such that

${\tilde{\beta}}_{j}={\eta}_{{j}^{**}}\quad \text{and}\quad {\beta}_{j}={\nu}_{{j}^{*}}.$ (4.16)

Then (4.12) implies that when moving from N to P the related eigenvalue can’t move forward. That is,

${j}^{\mathrm{*}}\le {j}^{\mathrm{**}}\mathrm{.}$ (4.17)

If ${j}^{*}={j}^{**}$ then (4.12) and (4.16) give

$\left|{\nu}_{{j}^{**}}\right|=\left|{\nu}_{{j}^{*}}\right|=\left|{\beta}_{j}\right|\ge \left|{\tilde{\beta}}_{j}\right|=\left|{\eta}_{{j}^{**}}\right|,$ (4.18)

which is the related bounding inequality.

It is possible, therefore, to assume that

${j}^{**}={j}^{*}+r$ (4.19)

where r is a positive integer. The last equality means that r eigenvalues of A have modulus between $\left|{\beta}_{j}\right|$ and $\left|{\stackrel{\u02dc}{\beta}}_{j}\right|$. Let ${\alpha}_{i}$ be the one with the smallest modulus, and let ${i}^{\mathrm{*}}$ and ${i}^{\mathrm{**}}$ be as in (4.11). Then

$\left|{\alpha}_{i}\right|\ge \left|{\stackrel{\u02dc}{\beta}}_{j}\right|$ (4.20)

and

${j}^{**}={i}^{*},$ (4.21)

which shows that

$\left|{\nu}_{{j}^{**}}\right|=\left|{\nu}_{{i}^{*}}\right|=\left|{\alpha}_{i}\right|\ge \left|{\stackrel{\u02dc}{\beta}}_{j}\right|=\left|{\eta}_{{j}^{**}}\right|.$ (4.22)

□
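A small numerical check may help to visualize the lemma's conclusion. The sketch below (an illustration only, not part of the proof, and assuming numpy is available) takes a Hermitian block B, so that Thompson's singular-value interlacing guarantees the hypothesis (4.12) for a principal submatrix, and verifies that the eigenvalue moduli of N = diag(A, B), sorted in decreasing order, bound those of P = diag(A, B̃):

```python
import numpy as np

rng = np.random.default_rng(0)

# A: an arbitrary square block; B: real symmetric (Hermitian), so the
# singular-value interlacing relations ensure (4.12) for a principal
# submatrix B~ (for Hermitian B, eigenvalue moduli are singular values).
A = rng.standard_normal((4, 4))
C = rng.standard_normal((5, 5))
B = C + C.T                      # Hermitian block
Bt = B[:3, :3]                   # leading principal submatrix B~

def block_diag(X, Y):
    """Assemble the block-diagonal matrix diag(X, Y)."""
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

N = block_diag(A, B)             # N = diag(A, B)
P = block_diag(A, Bt)            # P = diag(A, B~)

# Sort eigenvalue moduli in decreasing order, as in (4.1) and (4.2).
nu  = np.sort(np.abs(np.linalg.eigvals(N)))[::-1]
eta = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]

# Bounding inequalities (4.3): |nu_j| >= |eta_j| for j = 1, ..., k.
assert np.all(nu[:len(eta)] >= eta - 1e-9)
```

The check passes for any Hermitian choice of B, since merging the α-sequence with a termwise-dominated β̃-sequence preserves the termwise domination of the sorted lists, which is the content of the lemma.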

Lemma 13 Let the matrices $A\mathrm{,}B\mathrm{,}\stackrel{\u02dc}{B}$ and N be as in Lemma 12, and let $\stackrel{\u02dc}{A}$ be a principal submatrix of A. If, in addition, A has the bounding property then the matrices N and

$\stackrel{\u02dc}{P}=\left(\begin{array}{cc}\stackrel{\u02dc}{A}& 0\\ 0& \stackrel{\u02dc}{B}\end{array}\right)$ (4.23)

satisfy the bounding inequalities.

Proof. From Lemma 12 we know that the eigenvalues of N bound those of P. Also, since $\stackrel{\u02dc}{P}$ is a principal submatrix of P, a second use of Lemma 12 shows that the eigenvalues of P bound those of $\stackrel{\u02dc}{P}$. Hence by combining these relations, we obtain that the eigenvalues of N bound those of $\stackrel{\u02dc}{P}$. □

Lemma 14 Let the matrices A, B and N be as in Lemma 12. If both A and B have the bounding property then N inherits this property.

Proof. This statement is a direct corollary of Lemma 13. □

Theorem 15 Let N be a square block-diagonal matrix. If each block has the bounding property then N inherits this property.

Proof. The proof is concluded from Lemma 14 by using induction on the number of blocks. □

The Poincaré versions of the bounding inequalities are derived below. The first assertion uses the normal embedding property.

Theorem 16 Let $N\in {\u2102}^{n\times n}$ be a normal matrix that has the normal embedding property. Let $V\in {\u2102}^{n\times k}$ be some matrix with orthonormal columns, and let the matrix P be defined by the equality

$P={V}^{*}NV.$ (4.24)

Let the eigenvalues of N and P satisfy (4.1) and (4.2), respectively. Then the bounding inequalities (4.3) hold.

Proof. Let $U\in {\u2102}^{n\times n}$ be a unitary matrix whose first k columns are those of V. Then ${V}^{\mathrm{*}}NV$ is a principal submatrix of ${U}^{\mathrm{*}}NU$ , and both matrices are normal. From this point, the proof continues as in Theorem 5. □

Theorem 17 Let $N\in {\u2102}^{n\times n}$ be a unitary matrix with eigenvalues ${\nu}_{j}$ , $j=1,\cdots ,n$. Let the matrices V and P be as in Theorem 16, and let the eigenvalues of P satisfy (4.2). Then the bounding relations (4.3) hold.

Proof. Let the unitary matrix $U\in {\u2102}^{n\times n}$ be obtained from V as above. Then ${V}^{\mathrm{*}}NV$ is a principal submatrix of ${U}^{\mathrm{*}}NU$ , and the latter matrix is unitary. Hence the proof follows from Theorem 6. □
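For a unitary N the bounding relations of Theorem 17 are easy to verify numerically: every eigenvalue of N has modulus one, while the compression P = V*NV is a contraction, so every Ritz-value modulus is at most one. A minimal sketch, assuming numpy (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3

# Random unitary N via QR of a complex Gaussian matrix.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
N, _ = np.linalg.qr(G)

# V: orthonormal columns spanning a random k-dimensional subspace.
V, _ = np.linalg.qr(rng.standard_normal((n, k))
                    + 1j * rng.standard_normal((n, k)))

P = V.conj().T @ N @ V           # the compression (4.24)

nu  = np.sort(np.abs(np.linalg.eigvals(N)))[::-1]   # all equal to 1
eta = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]

assert np.allclose(nu, 1.0)           # eigenvalues of N lie on the unit circle
assert np.all(eta <= 1.0 + 1e-9)      # bounding inequalities (4.3)
```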

The question of whether it is possible to achieve further extensions is interesting and challenging. For example, the majorization results of Theorem 3 tempt one to conjecture that any normal matrix has the bounding property. Indeed, it is difficult to find a normal matrix that violates this rule, but the validity of the conjecture remains an open question.

Similarly, Theorem 2 suggests that the bounding inequalities may hold for various non-normal matrices. In this context, it is worth mentioning the following empirical observation. Experiments that we have done indicate that the bounding inequalities are valid for several types of random matrices. For example, let $A\in {\mathbb{R}}^{n\times n}$ be a random matrix whose entries are uniformly distributed in the interval $\left[-\mathrm{1,1}\right]$. Then the bounding inequalities are likely to hold whenever $n\ge 100$ and $k\le n/2$. A similar behavior is observed in other types of random matrices. However, the explanation of this phenomenon is left to future research.
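The experiment described above can be reproduced along the following lines. This is a sketch assuming numpy; the thresholds n ≥ 100 and k ≤ n/2 are the empirical observation, not a theorem, so the final check is likely but not guaranteed to succeed:

```python
import numpy as np

def bounding_holds(A, k, tol=1e-9):
    """Check the bounding inequalities (4.3) for P = V^T A V,
    where V has orthonormal columns spanning a random subspace."""
    n = A.shape[0]
    rng = np.random.default_rng()
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    P = V.T @ A @ V
    nu = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    eta = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return bool(np.all(nu[:k] >= eta - tol))

rng = np.random.default_rng(2)
n = 100
A = rng.uniform(-1.0, 1.0, size=(n, n))   # entries uniform in [-1, 1]

# Empirically this tends to return True for k <= n/2.
print(all(bounding_holds(A, k) for k in (10, 25, 50)))
```

For a Hermitian input the function is guaranteed to return True, since in that case the eigenvalue moduli are singular values and singular values do not increase under compression.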

5. Concluding Remarks

The classical theorems that we discuss are restricted to Hermitian matrices. Yet by introducing minor modifications, the theorems become meaningful for non-Hermitian matrices. The current extensions are gained by replacing eigenvalues and diagonal entries with their moduli. This enables us to replace Ky Fan maximum principle with the absolute trace theorem, which is applicable to all normal matrices.

The bounding inequalities and the bounding property are entirely new concepts. The geometric interpretation of the new bounds is quite different from that of the interlacing relations, which makes them meaningful for any square matrix. We have shown that the range of matrices that satisfy these bounds is considerably larger than that of the interlacing relations. Yet the question of whether it is possible to achieve further extensions remains open.

As noted in the introduction, the interlacing relations are used to establish monotonic convergence in restarted Krylov methods for Hermitian matrices. The current results pave the way for extending these properties to non-Hermitian matrices.

References

[1] Bhatia, R. (1997) Matrix Analysis. Springer, New York.

https://doi.org/10.1007/978-1-4612-0653-8

[2] Carden, R. and Embree, M. (2012) Ritz Value Localization for Non-Hermitian Matrices. SIAM Journal on Matrix Analysis and Applications, 33, 1320-1338.

https://doi.org/10.1137/120872693

[3] Carden, R. and Hansen, D. (2013) Ritz Values of Normal Matrices and Ceva’s Theorem. Linear Algebra and Its Applications, 438, 4114-4129.

https://doi.org/10.1016/j.laa.2012.12.030

[4] Carlson, D. (1983) Minimax and Interlacing Theorems for Matrices. Linear Algebra and Its Applications, 54, 153-172.

https://doi.org/10.1016/0024-3795(83)90211-2

[5] Dax, A. (2010) On Extremum Properties of Orthogonal Quotient Matrices. Linear Algebra and Its Applications, 432, 1234-1257.

https://doi.org/10.1016/j.laa.2009.10.034

[6] Dax, A. (2017) A New Type of Restarted Krylov Methods. Advances in Linear Algebra & Matrix Theory, 7, 18-28.

https://doi.org/10.4236/alamt.2017.71003

[7] Dax, A. (2018) A Restarted Krylov Method with Inexact Inversions. Numerical Linear Algebra with Applications, 26, e2213.

https://doi.org/10.1002/nla.2213

[8] Embree, M. (2009) The Arnoldi Eigenvalue Iteration with Exact Shifts Can Fail. SIAM Journal on Matrix Analysis and Applications, 31, 1-10.

https://doi.org/10.1137/060669097

[9] Fan, K. (1949) On a Theorem of Weyl Concerning Eigenvalues of Linear Transformations I. Proceedings of the National Academy of Sciences of the United States, 35, 652-655.

https://doi.org/10.1073/pnas.35.11.652

[10] Fan, K. (1951) Maximum Properties and Inequalities for the Eigenvalues of Completely Continuous Operators. Proceedings of the National Academy of Sciences of the United States, 37, 760-766.

https://doi.org/10.1073/pnas.37.11.760

[11] Fan, K. (1953) A Minimum Property of the Eigenvalues of a Hermitian Transformation. The American Mathematical Monthly, 60, 48-50.

https://doi.org/10.2307/2306486

[12] Fan, K. and Pall, G. (1957) Embedding Conditions for Hermitian and Normal Matrices. Canadian Journal of Mathematics, 9, 298-304.

https://doi.org/10.4153/CJM-1957-036-1

[13] Horn, R.A. and Johnson, C.R. (1985) Matrix Analysis. Cambridge University Press, Cambridge.

https://doi.org/10.1017/CBO9780511810817

[14] Horn, R.A. and Johnson, C.R. (1991) Topics in Matrix Analysis. Cambridge University Press, Cambridge.

https://doi.org/10.1017/CBO9780511840371

[15] Marshall, A.W., Olkin, I. and Arnold, B.C. (2011) Inequalities: Theory of Majorization and Its Applications. Springer Series in Statistics, 2nd Edition, Springer, New York.

https://doi.org/10.1007/978-0-387-68276-1

[16] Moslehian, M.S. (2012) Ky Fan Inequalities. Linear and Multilinear Algebra, 60, 1313-1325.

https://doi.org/10.1080/03081087.2011.641545

[17] Noschese, S., Pasquini, L. and Reichel, L. (2013) Tridiagonal Toeplitz Matrices: Properties and Novel Applications. Numerical Linear Algebra with Applications, 20, 302-326.

https://doi.org/10.1002/nla.1811

[18] Parlett, B.N. (1980) The Symmetric Eigenvalue Problem. Prentice-Hall, Englewood Cliffs.

[19] Queiró, J.F. (1987) On the Interlacing Property for Singular Values and Eigenvalues. Linear Algebra and Its Applications, 97, 23-28.

https://doi.org/10.1016/0024-3795(87)90136-4

[20] Saad, Y. (2011) Numerical Methods for Large Eigenvalue Problems. Revised Edition, SIAM, Philadelphia.

https://doi.org/10.1137/1.9781611970739

[21] Sherman, M.D. and Smith, R.L. (2013) Principally Normal Matrices. Linear Algebra and Its Applications, 438, 2617-2627.

https://doi.org/10.1016/j.laa.2012.10.017

[22] Thompson, R.C. (1972) Principal Submatrices IX: Interlacing Inequalities for Singular Values of Submatrices. Linear Algebra and Its Applications, 5, 1-12.

https://doi.org/10.1016/0024-3795(72)90013-4

[23] Watkins, D.S. (2007) The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods. SIAM, Philadelphia.

https://doi.org/10.1137/1.9780898717808

[24] Zhang, F. (1999) Matrix Theory: Basic Results and Techniques. Springer-Verlag, New York.