Eigenvalue iteration, deflation

Two Basic Properties

Lemma: Every matrix \(A \in \mathbb{R}^{n \times n}\) has at least one (complex) eigenvalue.

Proof:

Assume \(A\not=0\). Take any \(\mathbf{x} \not= 0\); the \(n+1\) vectors \(\{\mathbf{x}, A\mathbf{x}, A^2 \mathbf{x}, \cdots, A^n\mathbf{x}\}\) in \(\mathbb{R}^n\) must be linearly dependent.

\(\Rightarrow \exists c_i\) not all zero s.t.

\[ \sum_{k=0}^n c_k A^k \mathbf{x} = 0\\ \]

Define the polynomial \(f(z) = c_0 + c_1 z + \cdots + c_n z^n\).

Fundamental theorem of algebra \(\Rightarrow \exists m\geq 1\) roots \(z_i \in \mathbb{C}\) and \(c\not=0\)

s.t. \(f(z) = c(z-z_1)(z-z_2)\cdots(z-z_m)\).

So,

\[ \mathbf{0} = c_0 \mathbf{x} + c_1 A \mathbf{x} + \cdots + c_n A^n \mathbf{x} = c (A-z_1 I) (A-z_2 I) \cdots (A-z_m I) \mathbf{x}, \quad \mathbf{x} \not= 0\\ \Rightarrow \exists i \text{ s.t. } (A-z_i I) \mathbf{v} = 0 \text{ for some } \mathbf{v} \not= 0 \text{ (take } \mathbf{v} = (A-z_{i+1} I)\cdots(A-z_m I)\mathbf{x}, \text{ the last nonzero partial product)}\\ \Leftrightarrow A-z_i I \text{ has a nontrivial null space}\\ \Rightarrow A\mathbf{v} = z_i \mathbf{v} \]

Lemma: Eigenvectors corresponding to distinct eigenvalues are linearly independent. (See proof in page 113: Proposition 6.2 in the textbook)

\(\rightarrow\) an \(n \times n\) matrix has at most \(n\) distinct eigenvalues.

Example of a matrix with only complex (non-real) eigenvalues: a nontrivial rotation matrix in \(\mathbb{R}^2\).
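As a quick numerical illustration (the specific 90° rotation is an arbitrary choice):

```python
import numpy as np

# A 90-degree rotation of the plane has characteristic polynomial
# z^2 + 1, so both of its eigenvalues are the non-real pair +/- i.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
eigvals = np.linalg.eigvals(R)   # complex pair, no real eigenvalue
```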

Diagonalizability

Nondefective: \(A\in \mathbb{R}^{n \times n}\) is nondefective or diagonalizable if its eigenvectors span \(\mathbb{R}^n\).

\[ A\mathbf{x}_i = \lambda_i \mathbf{x}_i\\ \begin{aligned} A \mathbf{X} &= A \begin{bmatrix} | & | & & | \\ \mathbf{x_1} & \mathbf{x_2} & \cdots & \mathbf{x_n} \\ | & | & & | \end{bmatrix}\\ &= \begin{bmatrix} A \mathbf{x_1} & A \mathbf{x_2} & \cdots & A \mathbf{x_n} \end{bmatrix}\\ &= \begin{bmatrix} \lambda_1 \mathbf{x_1} & \lambda_2 \mathbf{x_2} & \cdots & \lambda_n \mathbf{x_n} \end{bmatrix}\\ &= \mathbf{X} \text{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n)\\ &= \mathbf{X} D \end{aligned} \]

If \(A\) is nondefective, then \(\mathbf{X}\) is invertible and \(D = \mathbf{X}^{-1} A \mathbf{X}\).
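A small numerical illustration of \(D = \mathbf{X}^{-1} A \mathbf{X}\) (the example matrix is arbitrary; NumPy's `eig` returns the eigenvectors as the columns of `X`):

```python
import numpy as np

# Eigenvalues 2 and 3 are distinct, so this matrix is nondefective.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lams, X = np.linalg.eig(A)
D = np.linalg.inv(X) @ A @ X   # equals diag(lams) up to rounding
```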

Motivation for Spectral Theorem

Is there a class of matrices that is guaranteed to be nondefective?

Spoiler alert: Symmetric (and Hermitian) matrices.

Extending to \(\mathbb{C}^{n \times n}\)

Complex conjugate

The complex conjugate of \(z = a + bi\in \mathbb{C}\) is \(\bar{z} = a - bi\).

Conjugate transpose

The conjugate transpose of \(A \in \mathbb{C}^{m \times n}\) is \(A^{\mathbf{H}}:=\bar{A}^T\).

Hermitian

\(A \in \mathbb{C}^{n \times n}\) is Hermitian if \(A = A^{\mathbf{H}}\).

Properties

Lemma: All eigenvalues of a Hermitian matrix are real.

Proof:

\[ A\in \mathbb{C}^{n \times n}, A\mathbf{x} = \lambda \mathbf{x}, ||\mathbf{x}||_2^2 = 1\\ \begin{aligned} \lambda &= \lambda \cdot \mathbf{1} \\&= \lambda \cdot <\mathbf{x}, \mathbf{x}> \\&= <A\mathbf{x}, \mathbf{x}> \\&= (A\mathbf{x})^T \mathbf{\overline{x}} \\&= \mathbf{x}^T A^{T} \mathbf{\overline{x}} \\&= \mathbf{x}^T \overline{\overline{A}^T \mathbf{x}} \\&= \mathbf{x}^T \overline{A \mathbf{x}} \\&= \mathbf{x}^T \overline{\lambda \mathbf{x}} \\ &= \overline{\lambda} \mathbf{x}^T \overline{\mathbf{x}} \\ &= \overline{\lambda} \end{aligned} \\ \Rightarrow \lambda \in \mathbb{R} \]

Lemma: Eigenvectors corresponding to distinct eigenvalues of Hermitian matrices must be orthogonal.

Proof:

\[ A\mathbf{x} = \lambda \mathbf{x}, A\mathbf{y} = \mu \mathbf{y}, \lambda \not= \mu\\ \textrm{Prev} \Rightarrow \lambda, \mu \in \mathbb{R}\\ \begin{aligned} \lambda <\mathbf{x}, \mathbf{y}> &= <\lambda \mathbf{x}, \mathbf{y}> \\ &= <A\mathbf{x}, \mathbf{y}> \\ &= <\mathbf{x}, A \mathbf{y}> \\ &= <\mathbf{x}, \mu \mathbf{y}> \\ &= \mu <\mathbf{x}, \mathbf{y}> \end{aligned}\\ \Rightarrow (\lambda - \mu) <\mathbf{x}, \mathbf{y}> = 0\\ \Rightarrow <\mathbf{x}, \mathbf{y}> = 0 \]

Spectral Theorem

Suppose \(A \in \mathbb{C}^{n \times n}\) is Hermitian (if \(A \in \mathbb{R}^{n \times n}\), suppose \(A\) is symmetric). Then \(A\) has exactly \(n\) orthonormal eigenvectors \(\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n\) with (possibly repeated) eigenvalues \(\lambda_1, \lambda_2, \cdots, \lambda_n\).

Full set: since the eigenvectors are orthonormal, \(X\) is orthogonal (unitary in the complex case) and \(D = X^T A X\) (\(D = X^H A X\) for Hermitian \(A\)).

We can fully diagonalize any symmetric matrix; for a general (possibly asymmetric) matrix we are only guaranteed at least one eigenvector.

Question: How do we find eigenvalues/eigenvectors in practice?

The answer is likely different from the approach taught in a linear algebra class.

Find eigenvalues

Setup

\[ A \in \mathbb{R}^{n \times n}, \text{ symmetric}\\ \mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n \in \mathbb{R}^n \text{ eigenvectors}\\ |\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_n| \text{ eigenvalues}\\ \]

Usual Trick

\[ \mathbf{v} \in \mathbb{R}^n\\ \Downarrow \\ \mathbf{v} = c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_n \mathbf{x}_n\\ \]

Only works for nondefective matrices.

\[ A \mathbf{v} = c_1 A \mathbf{x}_1 + c_2 A \mathbf{x}_2 + \cdots + c_n A \mathbf{x}_n\\ = c_1 \lambda_1 \mathbf{x}_1 + c_2 \lambda_2 \mathbf{x}_2 + \cdots + c_n \lambda_n \mathbf{x}_n\\ A^2 \mathbf{v} = c_1 \lambda_1^2 \mathbf{x}_1 + c_2 \lambda_2^2 \mathbf{x}_2 + \cdots + c_n \lambda_n^2 \mathbf{x}_n\\ \vdots\\ A^k \mathbf{v} = c_1 \lambda_1^k \mathbf{x}_1 + c_2 \lambda_2^k \mathbf{x}_2 + \cdots + c_n \lambda_n^k \mathbf{x}_n\\ = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \approx \lambda_1^k c_1 \mathbf{x}_1 \]

For large \(k\), \(A^k \mathbf{v} \approx \lambda_1^k c_1 \mathbf{x}_1\) (assuming \(|\lambda_2| < |\lambda_1|\) and \(c_1 \not= 0\)).

We can initialize \(\mathbf{v}\) randomly, iterate \(A^k \mathbf{v}\), and obtain a vector parallel to \(\mathbf{x}_1\); normalizing then yields \(\mathbf{x}_1\).

Normalized Power Iteration

\[ \mathbf{w}_k = A \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]
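A minimal NumPy sketch of normalized power iteration (the symmetric test matrix, with dominant eigenvalue 3, is an illustrative assumption):

```python
import numpy as np

def power_iteration(A, num_iters=500, seed=0):
    """Normalized power iteration: v_k = A v_{k-1} / ||A v_{k-1}||.

    Returns an approximate dominant eigenpair (lambda_1, x_1)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])  # random start: c_1 != 0 almost surely
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient recovers the eigenvalue from the converged vector.
    lam = v @ A @ v
    return lam, v

# Example: symmetric matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_iteration(A)
```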

Eigenvalues of Inverse Matrix

\[ A\mathbf{v} = \lambda \mathbf{v}\Rightarrow A^{-1} \mathbf{v} = \frac{1}{\lambda} \mathbf{v}\\ |\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_n|\\ \Rightarrow |\frac{1}{\lambda_1}| \leq |\frac{1}{\lambda_2}| \leq \cdots \leq |\frac{1}{\lambda_n}|\\ \]

Inverse Iteration

\[ \mathbf{w}_k = A^{-1} \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]

Solving \(A\mathbf{w}_k = \mathbf{v}_{k-1}\) from scratch costs \(O(n^3)\) per iteration, but with a one-time LU factorization each iteration costs only \(O(n^2)\):

\[ \textrm{Solve } L \mathbf{y_k} = \mathbf{v}_{k-1}\\ \textrm{Solve } U \mathbf{w}_k = \mathbf{y}_k\\ \textrm{Normalize } \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]
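A sketch of inverse iteration with the one-time factorization, assuming SciPy is available for its `lu_factor`/`lu_solve` routines (the test matrix is illustrative; its smallest eigenvalue is 1):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, num_iters=200, seed=0):
    """Inverse iteration: converges to the eigenvector of the
    smallest-magnitude eigenvalue of A."""
    lu_piv = lu_factor(A)            # one-time O(n^3) factorization
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = lu_solve(lu_piv, v)      # solves A w = v via two O(n^2) triangular solves
        v = w / np.linalg.norm(w)
    return (v @ A @ v), v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = inverse_iteration(A)        # eigenvalues of A are 3 and 1
```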

Eigenvalues of Shifted Matrix

\[ A\mathbf{v} = \lambda \mathbf{v}\Rightarrow (A-\sigma I) \mathbf{v} = (\lambda - \sigma) \mathbf{v}\\ \]

Shifted Inverse Iteration

To find eigenvalue closest to \(\sigma\):

\[ \mathbf{v}_{k} = \frac{(A-\sigma I)^{-1} \mathbf{v}_{k-1}}{||(A-\sigma I)^{-1} \mathbf{v}_{k-1}||}\\ \]

Counterintuitive: the conditioning problems of solving with \(A - \sigma I\) when \(\sigma \approx \lambda\) cause limited harm in practice, since the amplified error lies mostly along the desired eigenvector direction.

Heuristic: Convergence Rate

\[ A^k \mathbf{v} = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \]

To speed up convergence, we can choose \(\sigma\) such that:

\[ |\frac{\lambda_2-\sigma}{\lambda_1-\sigma}| < |\frac{\lambda_2}{\lambda_1}|\\ \]

We can update \(\sigma\) at each iteration so that it approaches \(\lambda_1\).

Least-Squares Approximation

If \(\mathbf{v}_0\) is approximately an eigenvector:

\[ \min_{\lambda} ||A\mathbf{v}_0 - \lambda \mathbf{v}_0||_2^2 \\ \text{obj} = \mathbf{v}_0^T A^T A \mathbf{v}_0 - 2 \lambda \mathbf{v}_0^T A \mathbf{v}_0 + \lambda^2 \mathbf{v}_0^T \mathbf{v}_0\\ \nabla_{\lambda} \text{obj} = -2 \mathbf{v}_0^T A \mathbf{v}_0 + 2 \lambda \mathbf{v}_0^T \mathbf{v}_0 = 0\\ \lambda = \frac{\mathbf{v}_0^T A \mathbf{v}_0}{||\mathbf{v}_0||_2^2}\\ \]
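This least-squares estimate is the Rayleigh quotient. A minimal numeric check (the matrix and approximate eigenvector here are illustrative assumptions):

```python
import numpy as np

# Least-squares eigenvalue estimate: lambda = (v0^T A v0) / ||v0||^2.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v0 = np.array([1.0, 0.9])            # close to the eigenvector [1, 1] of eigenvalue 3
lam = (v0 @ A @ v0) / (v0 @ v0)      # approximately 3
```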

Rayleigh Quotient Iteration

\[ \mathbf{w}_k = (A-\sigma_k I)^{-1} \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \sigma_k = \frac{\mathbf{v}_k^T A \mathbf{v}_k}{\mathbf{v}_k^T \mathbf{v}_k}\\ \]
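A sketch of the iteration above (NumPy; the matrix and starting vector are illustrative assumptions):

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, num_iters=10):
    """Shifted inverse iteration where the shift is re-estimated from the
    current iterate's Rayleigh quotient each step."""
    n = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    sigma = v @ A @ v
    for _ in range(num_iters):
        try:
            # New shift => new matrix => a fresh O(n^3) solve each step.
            w = np.linalg.solve(A - sigma * np.eye(n), v)
        except np.linalg.LinAlgError:
            break                    # sigma hit an eigenvalue exactly
        v = w / np.linalg.norm(w)
        sigma = v @ A @ v
    return sigma, v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
sigma, v = rayleigh_quotient_iteration(A, np.array([1.0, 0.5]))
```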

Efficiency per iteration vs. number of iterations?

This converges faster than shifted inverse iteration, but the matrix changes at every iteration, so a single LU factorization cannot be reused; each iteration costs \(O(n^3)\).

Unlikely Failure Mode for Iteration

\[ \mathbf{v} = c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_n \mathbf{x}_n\\A^k \mathbf{v} = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \]

What happens when \(\mathbf{v}_0 \cdot \mathbf{x}_1 = 0\)?

In exact arithmetic the iteration converges to \(\mathbf{x}_2\) (eigenvalue \(\lambda_2\)); in floating point, rounding error typically reintroduces a small component along \(\mathbf{x}_1\), so this failure mode is unlikely in practice.

Strategy for Multiple Eigenvalues

  1. Compute \(\mathbf{x}_0\) via power iteration.
  2. Project \(\mathbf{x}_0\) out of \(\mathbf{v}_0\).
  3. Compute \(\mathbf{x}_1\) via power iteration.
  4. Project \(\text{span}(\mathbf{x}_0, \mathbf{x}_1)\) out of \(\mathbf{v}_0\) (i.e., project \(\mathbf{v}_0\) onto the orthogonal complement of \(\text{span}(\mathbf{x}_0, \mathbf{x}_1)\)).
  5. ...

Assumption: \(A\) is nondefective.

Another Strategy

\(P\) projects out \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\).

\(P\mathbf{v}\) projects \(\mathbf{v}\) onto the orthogonal complement of \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\) (the result is orthogonal to each of \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\)).

\[ AP\mathbf{v}_i = \begin{cases} A\cdot \mathbf{0} = \mathbf{0} & i \leq k\\ A \mathbf{v}_i = \lambda_i \mathbf{v}_i & i > k \end{cases} \Rightarrow AP \text{ has eigenvalues } \begin{cases} 0 & i \leq k\\ \lambda_i & i > k \end{cases}\\ \]

So to find the next eigenvalue, we can construct \(AP\) and run power iteration on \(AP\).
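A minimal sketch of this projection deflation (NumPy; the 2×2 symmetric matrix, with eigenvalues 3 and 1, is an illustrative assumption):

```python
import numpy as np

def power_iteration(A, num_iters=500, seed=0):
    """Normalized power iteration; returns the dominant eigenpair of A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return (v @ A @ v), v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam1, x1 = power_iteration(A)         # dominant pair: 3, [1, 1]/sqrt(2)
P = np.eye(2) - np.outer(x1, x1)      # projector onto x1's orthogonal complement
lam2, x2 = power_iteration(A @ P)     # next pair: 1, [1, -1]/sqrt(2)
```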

Avoiding Numerical Drift

Do power iteration on \(P^T A P\) where \(P\) projects out known eigenvectors.

Deflation: Modify \(A\) so that power iteration reveals another eigenvector you have not yet computed.

Similarity Transformations

Similar matrices: Two matrices \(A, B \in \mathbb{R}^{n \times n}\) are similar if there exists an invertible \(T\) with \(B = T^{-1} A T\).

Proposition: Similar matrices have the same eigenvalues.

Householder Asymmetric Deflation

\[ H\mathbf{x}_1 = \mathbf{e}_1 \quad (\text{for unit } \mathbf{x}_1)\\ \Rightarrow H A H^T \mathbf{e}_1 = H A \mathbf{x}_1 = H \lambda_1 \mathbf{x}_1 = \lambda_1 \mathbf{e}_1\\ H A H^T = \begin{bmatrix} \lambda_1 & \mathbf{b}^T \\ \mathbf{0} & B \end{bmatrix}\\ \] Similarity transform of \(A\Rightarrow\) same eigenvalues.

Next eigenvalue: Power iteration on \(B\).
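A small sketch of Householder deflation (NumPy; the 3×3 symmetric example and its known dominant eigenpair are illustrative assumptions — for a symmetric \(A\) the off-diagonal block \(\mathbf{b}^T\) also vanishes):

```python
import numpy as np

def householder(x):
    """Reflector H = I - 2 u u^T (symmetric, orthogonal) with H x = ||x|| e1."""
    e1 = np.zeros_like(x)
    e1[0] = np.linalg.norm(x)
    u = x - e1
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-14:               # x already parallel to e1
        return np.eye(len(x))
    u /= norm_u
    return np.eye(len(x)) - 2.0 * np.outer(u, u)

# Example with eigenvalues 5, 3, 1; the dominant unit eigenvector is e3.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
x1 = np.array([0.0, 0.0, 1.0])       # known eigenvector, lambda_1 = 5
H = householder(x1)
M = H @ A @ H.T                      # = [[lambda_1, b^T], [0, B]]
B = M[1:, 1:]                        # power iteration on B finds the next pair
```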

Author

Gehan Zheng

Published

2023-12-23

Updated

2023-12-24
