In this paper, we present explicit similarity transformations to decompose block tridiagonal matrices of the following form:
for some special pairs of , where , into block diagonal matrices. The pairs considered in this paper are (1,1), (1,2), and (2,2). We shall show that the transformations for these three pairs all lead to block diagonal matrices of the following single unified form:
where the operation symbol denotes the matrix direct sum, and the diagonal submatrices , in which , , are explicitly known, although they depend on the values of and . Our decomposition method is closely related to the classical fast Poisson solver  using Fourier analysis.
The block decomposition scheme to be addressed was presented by the author in , where a formal proof was given for . The decompositions for the other two cases were mentioned without derivations. Unfortunately, that paper contains two errors. First, the eigenvectors used to form the transformation matrix for decomposing and the decomposed submatrices are incorrect; this will be addressed in Theorem 2 of Section 2 of this paper. Second, the transformation matrix Q for is not orthogonal, although the eigenvectors and the decomposed submatrices presented there are correct, meaning that the expression should read . The main purposes of this paper include the following tasks.
1) We show that the transformation matrix for decomposing is orthogonal in Theorem 1.
2) We take this opportunity to correct the errors made in our previous paper by providing formal proof with the transformation matrix formed by the correct eigenvectors for decomposing in Theorem 2.
3) We also present a formal derivation for the decomposition of into a block diagonal matrix in Theorem 3.
4) Numerical examples and experiments for all three cases will be given in Section 3 to demonstrate the validity and the advantages of the decompositions.
The block decompositions are all based on similarity transformations with known eigenvectors of certain tridiagonal matrices, and they all yield a single unified form of block diagonal matrices. Since similarity transformations preserve all eigenvalues, the eigenvalues of the original matrix , which is of size pq by pq, can be obtained from the q diagonal blocks , each of size only p by p. This block decomposition scheme provides a much more efficient means of solving eigenvalue problems with coefficient matrices of this type. It can also be employed for solving linear systems efficiently because the transformation matrices are explicitly known. In addition, the decoupled structure of the transformed matrix lends itself to parallel computation with coarse-grain parallelism.
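This decoupling of the spectrum can be checked numerically. The sketch below assumes the simplest structure of this class, K = I_q ⊗ A + T ⊗ B with T = tridiag(1, 0, 1) of order q (the Dirichlet-Dirichlet case); the symmetric blocks A and B are random stand-ins and need not commute:

```python
import numpy as np

p, q = 3, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((p, p)); A = A + A.T   # symmetric stand-in blocks;
B = rng.standard_normal((p, p)); B = B + B.T   # A and B need not commute

# Block tridiagonal K = I_q (x) A + T (x) B with T = tridiag(1, 0, 1) of order q
T = np.diag(np.ones(q - 1), 1) + np.diag(np.ones(q - 1), -1)
K = np.kron(np.eye(q), A) + np.kron(T, B)

# Eigenvalues of the q decoupled p-by-p blocks D_k = A + 2*cos(k*pi/(q+1))*B
blocks = [A + 2 * np.cos(k * np.pi / (q + 1)) * B for k in range(1, q + 1)]
ev_blocks = np.sort(np.concatenate([np.linalg.eigvalsh(D) for D in blocks]))
ev_full = np.sort(np.linalg.eigvalsh(K))
print(np.allclose(ev_full, ev_blocks))   # True
```

The pq eigenvalues of the full matrix coincide with those of the q small blocks, because the similarity transformation Q ⊗ I_p (with Q the orthogonal eigenvector matrix of T) block-diagonalizes K.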
In the following, we present our key observations that lead to the proposed block decomposition method for this class of matrices, using transformation matrices whose entries are inherent in the special block tridiagonal form of . Whenever there is no confusion, we shall simply use K to denote . Throughout the paper, the operation symbols and are used to denote the matrix direct sum and the Kronecker product, respectively.
Theorem 1. When , the block tridiagonal matrix is orthogonally similar to the block diagonal matrix ,
, where , .
Proof. Let and ,
, where is the identity matrix of dimension p and the symbol denotes the Kronecker product. It has been shown in  that
by asserting that is orthonormal. We skip that part of the proof here and provide only the details showing that is indeed orthonormal. To this end, we need the following formula :
Let and denote and , respectively. We have
Note that, by L’Hospital’s rule,
if . This will be the case when for any integer k. Now since , we have . The
denominator of will be equal to zero only if . When , we have
When , can be simplified as
since , where we have used the fact that . Therefore,
Likewise, since , we have
The denominator of will never be equal to zero. Accordingly,
Finally from (3) and (4), we obtain
indicating the vectors are orthonormal. Accordingly, we have
This completes the proof.
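The orthogonality established above can also be verified directly. The following pure-Python sketch assumes the classical Dirichlet-Dirichlet eigenvector components sin(jkπ/(q+1)) and checks that the resulting vectors are mutually orthogonal with squared norm (q+1)/2:

```python
import math

q = 7
s = lambda j, k: math.sin(j * k * math.pi / (q + 1))

# Check <v_k, v_l> = (q+1)/2 when k = l and 0 otherwise,
# where v_k = (sin(j*k*pi/(q+1)))_{j=1..q}
for k in range(1, q + 1):
    for l in range(1, q + 1):
        ip = sum(s(j, k) * s(j, l) for j in range(1, q + 1))
        expected = (q + 1) / 2 if k == l else 0.0
        assert abs(ip - expected) < 1e-12
print("orthogonal; squared norm =", (q + 1) / 2)
```

Dividing each vector by the square root of (q+1)/2 therefore yields an orthonormal set, which is what makes the transformation matrix orthogonal.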
Theorem 2. When , the block tridiagonal matrix is similar to the block diagonal matrix , ,
where , .
Proof. This block diagonalization was mentioned previously in Corollary 2 in  without a proof. Unfortunately, the eigenvectors used to form the transformation matrix Q and the decomposed submatrices Dk contain errors,
in which 1) the vector as stated in that paper should be replaced by with ; and 2) the expression in Dk should read .
In this paper, we give a formal proof with the correct eigenvectors and provide the explicit form of the inverse of the transformation matrix Q for .
Let . Let also
, ; and . We now show that the similarity transformation block-diagonalizes K into D. It deserves mentioning that Q in this case is not orthogonal. , however, exists and is explicitly known. Therefore, it suffices to show that for all .
Applying the identities:
and noting that:
where we have used the fact that , we obtain
Equation (5) holds for all j, . Accordingly, by arranging all together to form the matrix Q, we obtain . In other words, the transformation block-diagonalizes the matrix :
where and .
This is a similarity transformation and, therefore, all eigenvalues of are preserved in the decomposed matrix D. It is worth mentioning that obtaining all the eigenvalues from D is far more efficient than from the original matrix K since D consists of only q diagonal blocks: . When it comes to solving linear systems in the transformed space that involves Q, however, one needs to employ the LU decomposition of Q or to find . Normally, finding the LU decomposition is more efficient and preferred. However, it does not make sense to find the LU decomposition of Q if the inverse of Q is readily available. In the following, we show that can be obtained explicitly.
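The remark about solving linear systems can be illustrated generically: once a matrix Q diagonalizing the scalar tridiagonal factor is known, a system with K reduces to q independent p-by-p solves. The sketch below assumes the symmetric case K = I ⊗ A + T ⊗ B with T = tridiag(1, 0, 1), so that Q is orthogonal and its inverse is simply its transpose; A and B are random stand-ins:

```python
import numpy as np

p, q = 3, 4
rng = np.random.default_rng(3)
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, p))
T = np.diag(np.ones(q - 1), 1) + np.diag(np.ones(q - 1), -1)
K = np.kron(np.eye(q), A) + np.kron(T, B)
b = rng.standard_normal(p * q)

lam, Q = np.linalg.eigh(T)               # T symmetric, so Q is orthogonal
y = np.kron(Q.T, np.eye(p)) @ b          # transform the right-hand side
z = np.concatenate([np.linalg.solve(A + lam[k] * B, y[k*p:(k+1)*p])
                    for k in range(q)])  # q independent p-by-p solves
x = np.kron(Q, np.eye(p)) @ z            # transform back
print(np.allclose(K @ x, b))             # True
```

In the nonorthogonal cases of Theorems 2 and 3, the same procedure applies with the explicit inverse of Q in place of the transpose.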
Let C be the matrix formed by : , whose explicit form is
where the symbol denotes . It is well known that
Let , a diagonal matrix. It can easily be seen that
Recall that ; and , which can be expressed as:
where we have used the following two properties of Kronecker products :
Note that is a diagonal matrix and is almost an identity matrix, except for the first and the last blocks. This shows that is almost the same as Q. Computationally, can be obtained directly from , which is explicitly known, since is simply a block version of . Note that C is symmetric, but not orthogonal.
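The two Kronecker-product properties invoked above are the mixed-product rule (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) and the inverse rule (A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹. A quick numerical check, with random matrices as stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
C, D = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))

# Mixed-product rule: (A (x) B)(C (x) D) = (AC) (x) (BD)
print(np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D)))  # True

# Inverse rule: (A (x) B)^{-1} = A^{-1} (x) B^{-1}
print(np.allclose(np.linalg.inv(np.kron(A, B)),
                  np.kron(np.linalg.inv(A), np.linalg.inv(B))))           # True
```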
Theorem 3. When and , the block tridiagonal matrix is similar to the block diagonal matrix ,
, where , .
Proof. Let ; ,
; and . We have
Using the identities and yields
where we have used the fact that
. Note that the matrix here is not orthogonal
either. It can be shown that exists. The transformation , therefore, block-diagonalizes K into D.
In the following, we show that is almost identical to and, therefore, can be obtained explicitly from without any difficulty. Again, let C be the matrix formed by : , , and , as was done in the proof of Theorem 2. We have in this case:
whose inverse is:
where the symbol denotes with . If C is partitioned as , with R consisting of the first rows and r being the last row of C, we clearly see that . In other words, is the
same as except the last column. Now let , a diagonal matrix. We have . Following the same derivation as we have done in Theorem 2, we conclude that:
since . Note that C in this case is neither symmetric nor orthogonal.
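Even though the transformation matrix here is not orthogonal, the decoupling of the spectrum can still be checked numerically. The sketch below assumes a corner-modified tridiagonal factor T, with a 2 replacing the first superdiagonal and last subdiagonal entries in the spirit of the Neumann-Neumann case; T is then nonsymmetric but still diagonalizable with real eigenvalues, and the eigenvalues of K remain the union of those of the blocks A + λ_k B:

```python
import numpy as np

p, q = 2, 5
rng = np.random.default_rng(2)
A = rng.standard_normal((p, p)); A = A + A.T
B = rng.standard_normal((p, p)); B = B + B.T

# Corner-modified tridiagonal factor: a '2' in place of the first superdiagonal
# and last subdiagonal entries, so T is nonsymmetric but still diagonalizable
T = np.diag(np.ones(q - 1), 1) + np.diag(np.ones(q - 1), -1)
T[0, 1] = 2.0
T[q - 1, q - 2] = 2.0

K = np.kron(np.eye(q), A) + np.kron(T, B)
lam = np.linalg.eigvals(T).real          # eigenvalues of T are real here
ev_blocks = np.sort(np.concatenate([np.linalg.eigvalsh(A + l * B) for l in lam]))
ev_full = np.sort(np.linalg.eigvals(K).real)
print(np.allclose(ev_full, ev_blocks))   # True
```

The real spectrum of T follows because a sign-symmetric tridiagonal matrix with positive off-diagonal products is similar to a symmetric one via a diagonal scaling, which is also why the similarity transformation exists despite Q not being orthogonal.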
3. Numerical Experiments
To demonstrate the validity and advantage of this block decomposition approach, we present numerical experiments for all three cases of the pair discussed in this paper. Our main task in the experiments is to find all the eigenvalues of the following matrix , with and :
The entries of A and B that are not shown are all zero. The matrix can be obtained from a finite element discretization of the heat or membrane equation over a rectangular domain, subject to Dirichlet boundary conditions along two opposite sides of the boundary . The matrices and can be obtained from a discretization of the same problem subject to certain Neumann boundary conditions. The values of p and q depend on the number of grid lines in the discretization domain.
We intentionally keep the dimensions of A and B small so that one can easily verify that by explicit multiplication. In other words, the matrices A and B do not commute and, therefore, the traditional fast Poisson solver fails to decompose A and B simultaneously. Using the block decomposition approach presented in the previous section, it is not difficult to see that is similar to a block diagonal matrix of the form:
for all three cases where,
with , .
In the following, we present the results from our experiments, which were performed using the Octave software. Specifically, the Octave built-in function is used to obtain the eigenvalues of a square matrix X. For comparison purposes, we first display the eigenvalues computed from the original matrices (Equation (7)) without applying the block decomposition for all three cases. Note that each is a 20 by 20 matrix. The numerical results are listed in Table 1 where all eigenvalues are listed in the order produced by Octave without reordering.
We then present in Table 2, Table 3, and Table 4 the eigenvalues obtained directly from the decomposed diagonal blocks D1 through D5 (Equation (8)) of , , and , respectively, where each is a 4 by 4 matrix. As can be seen from these tables, all eigenvalues are preserved after the block decomposition. For example, all the eigenvalues shown in Table 2 are identical to those of in Table 1, except for the ordering. This is because all three block decompositions are similarity transformations. The advantage of using the decomposed matrices to compute the eigenvalues is apparent: the diagonal blocks of the decomposed matrix are explicitly known and require only A, B, and , without forming the block tridiagonal matrix .
To close this section, it deserves mentioning that the computational complexity of finding all eigenvalues of a square matrix of size is in general. For the matrices presented in this paper, . Without decompositions, the computational complexity is . With the proposed
Table 1. Eigenvalues computed from the original matrices without decomposition.
Table 2. Eigenvalues computed directly from D1 through D5 of .
Table 3. Eigenvalues computed directly from D1 through D5 of .
Table 4. Eigenvalues computed directly from D1 through D5 of .
decomposition, the computational complexity reduces to only , a significant saving in computation. The advantage of the decomposition is obvious, not to mention the additional advantage that can be exploited from the coarse-grain parallelism offered by the block decomposition when the problem is to be solved using multiple processors.
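The saving can be made concrete with the sizes used in the experiments (a rough operation-count sketch; constant factors omitted):

```python
p, q = 4, 5                        # block size and block count from the experiments
full = (p * q) ** 3                # O(n^3) work on the full 20-by-20 eigenproblem
decoupled = q * p ** 3             # q independent 4-by-4 eigenproblems
print(full, decoupled, full // decoupled)   # 8000 320 25  (the ratio is q^2)
```

In general the ratio of the two counts is q², so the saving grows quadratically with the number of diagonal blocks.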
In this paper, we have presented a unified block decomposition scheme for three special cases of block tridiagonal matrices of the form , as shown in Equation (1). This class of block tridiagonal matrices arises frequently from finite difference approximations for solving certain partial differential equations, such as the Laplace, Poisson, or Helmholtz equations, using five- or nine-point schemes over a rectangular or cubic domain . They can also arise from finite element discretizations of the same equations  and from surface fitting with B-spline functions . The values of and typically depend on the boundary conditions of the physical problem: for Dirichlet-Dirichlet conditions, and for Dirichlet-Neumann conditions, and for Neumann-Neumann conditions.
The block decompositions presented are all based on similarity transformations with known eigenvectors of tridiagonal matrices of the same form as , with the submatrices A and B replaced by scalars and . We have also derived the explicit decomposed block diagonal matrices for all of the three cases ( , , and ):
where in which , , depend on the values of and , as can be seen in Section 2. The availability of the explicit decomposed matrices offers great computational and programming advantages. Numerical experiments have been conducted using the software Octave to demonstrate the validity and advantages of the approach. Although analogous to the classical fast Poisson solver, this approach does not require the matrices A and B to be symmetric or to commute. It also exploits coarse-grain parallelism and lends itself to parallel and distributed computation for solving both linear systems and eigenvalue problems.