QR iteration

Assumption: \(A \in \mathbb{R}^{n \times n}\) is symmetric, with no repeated eigenvalues.

QR iteration

\[ (Q, R) \Leftarrow \text{QR}(A)\\ A \Leftarrow RQ \]

\[ \begin{aligned} A_1 &= A\\ \text{Factor} \quad A_k &= Q_k R_k\\ \text{Multiply} \quad A_{k+1} &= R_k Q_k\\ R_k Q_k &= Q_{k+1} R_{k+1}\\ \end{aligned} \]
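A minimal NumPy sketch of this loop (not from the original notes; `qr_iteration` is an illustrative name, and the test matrix is assumed symmetric with distinct eigenvalues):

```python
import numpy as np

def qr_iteration(A, num_iters=500):
    """Plain (unshifted) QR iteration: factor, then multiply in reverse order."""
    Ak = A.copy()
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Ak)   # Factor:   A_k = Q_k R_k
        Ak = R @ Q                # Multiply: A_{k+1} = R_k Q_k
    return Ak                     # approaches a diagonal matrix of eigenvalues

# Example on a random symmetric matrix
rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
A = (S + S.T) / 2
print(np.sort(np.diag(qr_iteration(A))))   # approximate eigenvalues
print(np.linalg.eigvalsh(A))               # reference values (ascending)
```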

Commutativity

Lemma: Take \(A, B \in \mathbb{R}^{n \times n}\). Suppose that the eigenvectors of \(A\) span \(\mathbb{R}^n\) and the corresponding eigenvalues are distinct. Then \(AB = BA\) if and only if \(A\) and \(B\) have the same set of eigenvectors (with possibly different eigenvalues).

Proof:

First, suppose \(A\) and \(B\) share eigenvectors \(\mathbf{x}_1, \dots, \mathbf{x}_n\) with eigenvalues \(\lambda_1^{A}, \dots, \lambda_n^{A}\) and \(\lambda_1^{B}, \dots, \lambda_n^{B}\), respectively. Take \(y \in \mathbb{R}^n\); since the eigenvectors span \(\mathbb{R}^n\), we can write \(y = \sum_{i=1}^n a_i \mathbf{x}_i\).

\[ BAy = BA\sum_{i=1}^n a_i \mathbf{x}_i = \sum_{i=1}^n a_i BA \mathbf{x}_i = \sum_{i=1}^n a_i \lambda_i^A B \mathbf{x}_i = \sum_{i=1}^n a_i \lambda_i^A \lambda_i^B \mathbf{x}_i\\ ABy = AB\sum_{i=1}^n a_i \mathbf{x}_i = \sum_{i=1}^n a_i AB \mathbf{x}_i = \sum_{i=1}^n a_i \lambda_i^B A \mathbf{x}_i = \sum_{i=1}^n a_i \lambda_i^A \lambda_i^B \mathbf{x}_i\\ \] So, \(BAy = ABy\) for all \(y \in \mathbb{R}^n \Rightarrow BA = AB\).

Conversely, suppose \(AB = BA\), and take an eigenvector \(\mathbf{x}\) of \(A\) with \(A\mathbf{x} = \lambda \mathbf{x}\). Then \[ A(B\mathbf{x}) = B(A\mathbf{x}) = B(\lambda \mathbf{x}) = \lambda B\mathbf{x}. \] Case 1: \(B\mathbf{x} \neq 0\). Then \(B\mathbf{x}\) is an eigenvector of \(A\) with eigenvalue \(\lambda\), so \(B\mathbf{x} = c\mathbf{x}\) for some \(c \in \mathbb{R}\).

The last step uses the assumption that \(A\) has no repeated eigenvalues: \(B\mathbf{x}\) and \(\mathbf{x}\) are both eigenvectors of \(A\) with the same eigenvalue \(\lambda\), so they must be parallel, i.e. \(B\mathbf{x} = c\mathbf{x}\) for some \(c \in \mathbb{R}\).

Case 2: \(B\mathbf{x} = 0\). Then \(B\mathbf{x} = 0 \cdot \mathbf{x}\), so \(\mathbf{x}\) is an eigenvector of \(B\) with eigenvalue \(0\).

In either case, every eigenvector \(\mathbf{x}\) of \(A\) is also an eigenvector of \(B\); since the eigenvectors of \(A\) span \(\mathbb{R}^n\), \(A\) and \(B\) share an eigenvector basis.
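A quick numerical check of the lemma (a sketch, not part of the original notes): construct two matrices with a shared eigenvector basis but different eigenvalues and confirm they commute.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
X = rng.standard_normal((n, n))              # shared eigenvector basis (columns)
X_inv = np.linalg.inv(X)
A = X @ np.diag([3.0, 1.5, -2.0, 0.5]) @ X_inv
B = X @ np.diag([7.0, -1.0, 4.0, 2.5]) @ X_inv

print(np.allclose(A @ B, B @ A))             # True: same eigenvectors => AB = BA
```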

If QR iteration converges

\[ A_{\infty} = Q_{\infty} R_{\infty} = R_{\infty} Q_{\infty} = Q_{\infty}^T (Q_{\infty} R_{\infty}) Q_{\infty} = Q_{\infty}^T A_{\infty} Q_{\infty} \;\Rightarrow\; Q_{\infty} A_{\infty} = A_{\infty} Q_{\infty} \]

Viewpoint 1

The eigenvalues of \(A\) equal the diagonal elements of \(R_{\infty}\) (up to sign)

Claim: all eigenvalues of \(Q_{\infty}\) have magnitude 1 (it is orthogonal, so it preserves norms).

Since \(A_{\infty}\) and \(Q_{\infty}\) commute, the lemma says they share eigenvectors. So if \(A_{\infty} \mathbf{x} = \lambda \mathbf{x}\), then \(\mathbf{x}\) is also an eigenvector of \(Q_{\infty}\); its eigenvalue is real with magnitude \(1\), i.e. \(Q_{\infty} \mathbf{x} = \pm \mathbf{x}\). Hence \[ \lambda \mathbf{x} = A_{\infty} \mathbf{x} = R_{\infty} Q_{\infty} \mathbf{x} = \pm R_{\infty} \mathbf{x}, \] so \(\mathbf{x}\) is an eigenvector of \(R_{\infty}\) with eigenvalue \(\pm \lambda\).

The eigenvalues of a triangular matrix are its diagonal entries (the determinant of a triangular matrix is the product of its diagonal entries, so the characteristic polynomial of \(R_{\infty}\) is \(\prod_i (r_{ii} - \lambda)\)). Since each \(A_{k+1} = Q_k^T A_k Q_k\) is similar to \(A\), the eigenvalues of \(A_{\infty}\), and hence the eigenvalues of \(A\), equal the diagonal elements of \(R_{\infty}\) up to sign.
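As a sanity check (a sketch under the same assumptions), the diagonal of the converged \(R\) should match the eigenvalues of \(A\) up to sign:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((4, 4))
A = (S + S.T) / 2                  # symmetric test matrix

Ak = A.copy()
for _ in range(500):
    Q, R = np.linalg.qr(Ak)        # at convergence, R plays the role of R_infinity
    Ak = R @ Q

print(np.sort(np.abs(np.diag(R))))             # |diag(R_infinity)|
print(np.sort(np.abs(np.linalg.eigvalsh(A))))  # |eigenvalues of A|: should agree
```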

Viewpoint 2

\(A^k = QR\): the columns of \(Q\) are near-eigenvectors of \(A\)

Suppose the columns of \(A\) are given by \(\vec{a}_1, \dots, \vec{a}_n\), and consider the matrix power \(A^k\) for large \(k\). We can write:

\[ A^k = A^{k-1} \cdot A = \begin{bmatrix} | & | & & | \\ A^{k-1} \vec{a}_1 & A^{k-1} \vec{a}_2 & \cdots & A^{k-1} \vec{a}_n \\ | & | & & | \end{bmatrix} \]

By our derivation of power iteration, in the absence of degeneracies, the first column of \(A^k\) becomes more and more parallel to the eigenvector \(\vec{x}_1\) of \(A\) whose eigenvalue has the largest magnitude \(|\lambda_1|\) as \(k \rightarrow \infty\), since we took a vector \(\vec{a}_1\) and multiplied it by \(A\) many times.
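For reference, a minimal power-iteration sketch (an illustrative reconstruction, since the derivation referred to here lives in an earlier note):

```python
import numpy as np

def power_iteration(A, num_iters=1000):
    """Repeatedly apply A and renormalize; the iterate aligns with the
    eigenvector whose eigenvalue has the largest magnitude."""
    x = np.random.default_rng(3).standard_normal(A.shape[0])
    for _ in range(num_iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x
```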

Project the second column of \(A^k\) onto the orthogonal complement of the first column: repeated multiplication by \(A\) has already amplified the leading eigendirections in these columns, so what remains converges to the eigenvector of the second-largest eigenvalue. This is exactly what the QR factorization (Gram-Schmidt on the columns) does. Therefore, writing \(A^k = QR\), the columns of \(Q\) are approximations of the eigenvectors of \(A\), ordered from largest to smallest eigenvalue magnitude, while the diagonal elements of \(R\) carry the corresponding eigenvalue information.

\[ \begin{aligned} A^1 &= Q_1 R_1\\ A^2 &= Q_1 R_1 Q_1 R_1 = Q_1 Q_2 R_2 R_1\\ A^3 &= Q_1 Q_2 R_2 R_1 Q_1 R_1 = Q_1 Q_2 R_2 Q_2 R_2 R_1 = Q_1 Q_2 Q_3 R_3 R_2 R_1\\ \vdots\\ A^k &= Q_1 Q_2 \cdots Q_k R_k R_{k-1} \cdots R_1\\ A^k &= Q''_k R''_k \end{aligned} \]

Therefore, following the earlier discussion of power iteration, \(Q''_k = Q_1 Q_2 \cdots Q_k\) converges to a matrix whose columns are the eigenvectors of \(A\), while the upper-triangular factor \(R''_k = R_k R_{k-1} \cdots R_1\) carries the corresponding eigenvalue information.

\[ \begin{aligned} A_1 &= A\\ A_2 &= R_1 Q_1 = Q_1^T A Q_1 \quad (\text{since } R_k = Q_k^T A_k)\\ A_3 &= R_2 Q_2 = Q_2^T A_2 Q_2 = Q_2^T Q_1^T A Q_1 Q_2 \\ \vdots\\ A_{k+1} &= R_k Q_k = (Q_1 \cdots Q_k)^T A (Q_1 \cdots Q_k) \rightarrow \text{diag}(\lambda) \end{aligned} \]

Since the columns of \(Q_1 \cdots Q_k\) converge to the eigenvectors of \(A\), this is precisely a diagonalization: the iterates \(A_k\) converge to the diagonal matrix of eigenvalues of \(A\).
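A short check of this (a sketch; `Q_acc` is just an illustrative name): accumulate the \(Q_k\) factors and verify that they nearly diagonalize \(A\).

```python
import numpy as np

rng = np.random.default_rng(4)
S = rng.standard_normal((4, 4))
A = (S + S.T) / 2

Ak = A.copy()
Q_acc = np.eye(4)                  # accumulates Q_1 Q_2 ... Q_k
for _ in range(500):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q
    Q_acc = Q_acc @ Q

print(np.round(Q_acc.T @ A @ Q_acc, 6))    # nearly diagonal: eigenvalues of A
print(np.linalg.eigvalsh(A))               # reference values (ascending)
```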

In practice

  • Tridiagonalize before eigenvalue iteration
  • More advanced Krylov subspace methods
  • Use existing software when you can! (a small sketch follows below)
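For instance, a sketch using NumPy/SciPy (assumes `scipy` is available; a library symmetric eigensolver already performs the reduction and iteration internally):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(5)
S = rng.standard_normal((6, 6))
A = (S + S.T) / 2                      # symmetric test matrix

# Reduction to Hessenberg form; for a symmetric matrix this is tridiagonal,
# which makes each subsequent QR step much cheaper.
T, Q = hessenberg(A, calc_q=True)
print(np.round(T, 3))

# In practice, just call a library eigensolver (LAPACK under the hood).
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)
```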