Eigenvalue iteration, deflation

Two Basic Properties

Lemma: Every matrix \(A \in \mathbb{R}^{n \times n}\) has at least one (complex) eigenvalue.

Proof:

Assume \(A\not=0\). Take any \(\mathbf{x} \not= 0\); the \(n+1\) vectors \(\{\mathbf{x}, A\mathbf{x}, A^2 \mathbf{x}, \cdots, A^n\mathbf{x}\}\) in \(\mathbb{R}^n\) must be linearly dependent.

\(\Rightarrow \exists c_i\) not all zero s.t.

\[ \sum_{k=0}^n c_k A^k \mathbf{x} = 0\\ \]

Define the polynomial \(f(z) = c_0 + c_1 z + \cdots + c_n z^n\).

Fundamental theorem of algebra \(\Rightarrow \exists m\geq 1\) roots \(z_i \in \mathbb{C}\) and \(c\not=0\)

s.t. \(f(z) = c(z-z_1)(z-z_2)\cdots(z-z_m)\).

So,

\[ \mathbf{0} = c_0 \mathbf{x} + c_1 A \mathbf{x} + \cdots + c_n A^n \mathbf{x} = c (A-z_1 I) (A-z_2 I) \cdots (A-z_m I) \mathbf{x}, \quad \mathbf{x} \not= 0\\ \Rightarrow \exists i \text{ s.t. } (A-z_i I) \mathbf{v} = 0 \text{ for some } \mathbf{v} \not= 0 \text{ (take } \mathbf{v} = (A-z_{i+1} I)\cdots(A-z_m I)\mathbf{x}, \text{ the last nonzero partial product)}\\ \Leftrightarrow A-z_i I \text{ has a nontrivial null space}\\ \Rightarrow A\mathbf{v} = z_i \mathbf{v} \]

Lemma: Eigenvectors corresponding to distinct eigenvalues are linearly independent. (See proof in page 113: Proposition 6.2 in the textbook)

\(\rightarrow\) an \(n \times n\) matrix has at most \(n\) distinct eigenvalues.

Example of a matrix with only complex (non-real) eigenvalues: a nontrivial rotation matrix in \(\mathbb{R}^2\).
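As a quick numerical illustration (the specific 90° rotation is an arbitrary choice):

```python
import numpy as np

# A 90-degree rotation of the plane has characteristic polynomial
# z^2 + 1, so both of its eigenvalues are the non-real pair +/- i.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
eigvals = np.linalg.eigvals(R)   # complex pair, no real eigenvalue
```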

Diagonalizability

Nondefective: \(A\in \mathbb{R}^{n \times n}\) is nondefective or diagonalizable if its eigenvectors span \(\mathbb{R}^n\).

\[ A\mathbf{x}_i = \lambda_i \mathbf{x}_i\\ \begin{aligned} A \mathbf{X} &= A \begin{bmatrix} | & | & & | \\ \mathbf{x_1} & \mathbf{x_2} & \cdots & \mathbf{x_n} \\ | & | & & | \end{bmatrix}\\ &= \begin{bmatrix} A \mathbf{x_1} & A \mathbf{x_2} & \cdots & A \mathbf{x_n} \end{bmatrix}\\ &= \begin{bmatrix} \lambda_1 \mathbf{x_1} & \lambda_2 \mathbf{x_2} & \cdots & \lambda_n \mathbf{x_n} \end{bmatrix}\\ &= \mathbf{X} \text{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n)\\ &= \mathbf{X} D \end{aligned} \]

If \(A\) is nondefective, then \(\mathbf{X}\) is invertible and \(D = \mathbf{X}^{-1} A \mathbf{X}\).
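A small numerical illustration of \(D = \mathbf{X}^{-1} A \mathbf{X}\) (the example matrix is arbitrary; NumPy's `eig` returns the eigenvectors as the columns of `X`):

```python
import numpy as np

# Eigenvalues 2 and 3 are distinct, so this matrix is nondefective.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lams, X = np.linalg.eig(A)
D = np.linalg.inv(X) @ A @ X   # equals diag(lams) up to rounding
```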

Motivation for Spectral Theorem

Is there a class of matrices that is guaranteed to be nondefective?

Spoiler alert: Symmetric (and Hermitian) matrices.

Extending to \(\mathbb{C}^{n \times n}\)

Complex conjugate

The complex conjugate of \(z = a + bi\in \mathbb{C}\) is \(\bar{z} = a - bi\).

Conjugate transpose

The conjugate transpose of \(A \in \mathbb{C}^{m \times n}\) is \(A^{\mathbf{H}}:=\bar{A}^T\).

Hermitian

\(A \in \mathbb{C}^{n \times n}\) is Hermitian if \(A = A^{\mathbf{H}}\).

Properties

Lemma: All eigenvalues of a Hermitian matrix are real.

Proof:

\[ A\in \mathbb{C}^{n \times n}, A\mathbf{x} = \lambda \mathbf{x}, ||\mathbf{x}||_2^2 = 1\\ \begin{aligned} \lambda &= \lambda \cdot \mathbf{1} \\&= \lambda \cdot <\mathbf{x}, \mathbf{x}> \\&= <A\mathbf{x}, \mathbf{x}> \\&= (A\mathbf{x})^T \mathbf{\overline{x}} \\&= \mathbf{x}^T A^{T} \mathbf{\overline{x}} \\&= \mathbf{x}^T \overline{\overline{A}^T \mathbf{x}} \\&= \mathbf{x}^T \overline{A \mathbf{x}} \\&= \mathbf{x}^T \overline{\lambda \mathbf{x}} \\ &= \overline{\lambda} \mathbf{x}^T \overline{\mathbf{x}} \\ &= \overline{\lambda} \end{aligned} \\ \Rightarrow \lambda \in \mathbb{R} \]

Lemma: Eigenvectors corresponding to distinct eigenvalues of Hermitian matrices must be orthogonal.

Proof:

\[ A\mathbf{x} = \lambda \mathbf{x}, A\mathbf{y} = \mu \mathbf{y}, \lambda \not= \mu\\ \textrm{Prev} \Rightarrow \lambda, \mu \in \mathbb{R}\\ \begin{aligned} \lambda <\mathbf{x}, \mathbf{y}> &= <\lambda \mathbf{x}, \mathbf{y}> \\ &= <A\mathbf{x}, \mathbf{y}> \\ &= <\mathbf{x}, A \mathbf{y}> \\ &= <\mathbf{x}, \mu \mathbf{y}> \\ &= \mu <\mathbf{x}, \mathbf{y}> \end{aligned}\\ \Rightarrow (\lambda - \mu) <\mathbf{x}, \mathbf{y}> = 0\\ \Rightarrow <\mathbf{x}, \mathbf{y}> = 0 \]

Spectral Theorem

Suppose \(A \in \mathbb{C}^{n \times n}\) is Hermitian (if \(A \in \mathbb{R}^{n \times n}\), suppose \(A\) is symmetric). Then \(A\) has exactly \(n\) orthonormal eigenvectors \(\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n\) with (possibly repeated) eigenvalues \(\lambda_1, \lambda_2, \cdots, \lambda_n\).

Full set: since the eigenvectors are orthonormal, \(X\) is orthogonal (unitary in the complex case) and \(D = X^T A X\) (\(D = X^H A X\) for Hermitian \(A\)).

We can fully diagonalize any symmetric matrix; for a general (possibly asymmetric) matrix we are only guaranteed at least one eigenvector.

Question: How do we find eigenvalues/eigenvectors in practice?

The answer is likely different from the approach taught in a linear algebra class.

Find eigenvalues

Setup

\[ A \in \mathbb{R}^{n \times n}, \text{ symmetric}\\ \mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n \in \mathbb{R}^n \text{ eigenvectors}\\ |\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_n| \text{ eigenvalues}\\ \]

Usual Trick

\[ \mathbf{v} \in \mathbb{R}^n\\ \Downarrow \\ \mathbf{v} = c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_n \mathbf{x}_n\\ \]

Only works for nondefective matrices.

\[ A \mathbf{v} = c_1 A \mathbf{x}_1 + c_2 A \mathbf{x}_2 + \cdots + c_n A \mathbf{x}_n\\ = c_1 \lambda_1 \mathbf{x}_1 + c_2 \lambda_2 \mathbf{x}_2 + \cdots + c_n \lambda_n \mathbf{x}_n\\ A^2 \mathbf{v} = c_1 \lambda_1^2 \mathbf{x}_1 + c_2 \lambda_2^2 \mathbf{x}_2 + \cdots + c_n \lambda_n^2 \mathbf{x}_n\\ \vdots\\ A^k \mathbf{v} = c_1 \lambda_1^k \mathbf{x}_1 + c_2 \lambda_2^k \mathbf{x}_2 + \cdots + c_n \lambda_n^k \mathbf{x}_n\\ = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \approx \lambda_1^k c_1 \mathbf{x}_1 \]

For large \(k\), \(A^k \mathbf{v} \approx \lambda_1^k c_1 \mathbf{x}_1\) (assuming \(|\lambda_2| < |\lambda_1|\) and \(c_1 \not= 0\)).

We can initialize \(\mathbf{v}\) randomly, iterate \(A^k \mathbf{v}\), and obtain a vector parallel to \(\mathbf{x}_1\); normalizing then yields \(\mathbf{x}_1\).

Normalized Power Iteration

\[ \mathbf{w}_k = A \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]
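A minimal NumPy sketch of normalized power iteration (the symmetric test matrix, with dominant eigenvalue 3, is an illustrative assumption):

```python
import numpy as np

def power_iteration(A, num_iters=500, seed=0):
    """Normalized power iteration: v_k = A v_{k-1} / ||A v_{k-1}||.

    Returns an approximate dominant eigenpair (lambda_1, x_1)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])  # random start: c_1 != 0 almost surely
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient recovers the eigenvalue from the converged vector.
    lam = v @ A @ v
    return lam, v

# Example: symmetric matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_iteration(A)
```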

Eigenvalues of Inverse Matrix

\[ A\mathbf{v} = \lambda \mathbf{v}\Rightarrow A^{-1} \mathbf{v} = \frac{1}{\lambda} \mathbf{v}\\ |\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_n|\\ \Rightarrow |\frac{1}{\lambda_1}| \leq |\frac{1}{\lambda_2}| \leq \cdots \leq |\frac{1}{\lambda_n}|\\ \]

Inverse Iteration

\[ \mathbf{w}_k = A^{-1} \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]

Solving \(A\mathbf{w}_k = \mathbf{v}_{k-1}\) from scratch costs \(O(n^3)\) per iteration, but with a one-time LU factorization each iteration costs only \(O(n^2)\):

\[ \textrm{Solve } L \mathbf{y_k} = \mathbf{v}_{k-1}\\ \textrm{Solve } U \mathbf{w}_k = \mathbf{y}_k\\ \textrm{Normalize } \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \]
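A sketch of inverse iteration with the one-time factorization, assuming SciPy is available for its `lu_factor`/`lu_solve` routines (the test matrix is illustrative; its smallest eigenvalue is 1):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, num_iters=200, seed=0):
    """Inverse iteration: converges to the eigenvector of the
    smallest-magnitude eigenvalue of A."""
    lu_piv = lu_factor(A)            # one-time O(n^3) factorization
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = lu_solve(lu_piv, v)      # solves A w = v via two O(n^2) triangular solves
        v = w / np.linalg.norm(w)
    return (v @ A @ v), v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = inverse_iteration(A)        # eigenvalues of A are 3 and 1
```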

Eigenvalues of Shifted Matrix

\[ A\mathbf{v} = \lambda \mathbf{v}\Rightarrow (A-\sigma I) \mathbf{v} = (\lambda - \sigma) \mathbf{v}\\ \]

Shifted Inverse Iteration

To find eigenvalue closest to \(\sigma\):

\[ \mathbf{v}_{k} = \frac{(A-\sigma I)^{-1} \mathbf{v}_{k-1}}{||(A-\sigma I)^{-1} \mathbf{v}_{k-1}||}\\ \]

Counterintuitive: the conditioning problems of solving with \(A - \sigma I\) when \(\sigma \approx \lambda\) cause limited harm in practice, since the amplified error lies mostly along the desired eigenvector direction.

Heuristic: Convergence Rate

\[ A^k \mathbf{v} = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \]

To speed up convergence, we can choose \(\sigma\) such that:

\[ |\frac{\lambda_2-\sigma}{\lambda_1-\sigma}| < |\frac{\lambda_2}{\lambda_1}|\\ \]

We can update \(\sigma\) at each iteration so that it approaches \(\lambda_1\).

Least-Squares Approximation

If \(\mathbf{v}_0\) is approximately an eigenvector:

\[ \min_{\lambda} ||A\mathbf{v}_0 - \lambda \mathbf{v}_0||_2^2 \\ \text{obj} = \mathbf{v}_0^T A^T A \mathbf{v}_0 - 2 \lambda \mathbf{v}_0^T A \mathbf{v}_0 + \lambda^2 \mathbf{v}_0^T \mathbf{v}_0\\ \nabla_{\lambda} \text{obj} = -2 \mathbf{v}_0^T A \mathbf{v}_0 + 2 \lambda \mathbf{v}_0^T \mathbf{v}_0 = 0\\ \lambda = \frac{\mathbf{v}_0^T A \mathbf{v}_0}{||\mathbf{v}_0||_2^2}\\ \]
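This least-squares estimate is the Rayleigh quotient. A minimal numeric check (the matrix and approximate eigenvector here are illustrative assumptions):

```python
import numpy as np

# Least-squares eigenvalue estimate: lambda = (v0^T A v0) / ||v0||^2.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v0 = np.array([1.0, 0.9])            # close to the eigenvector [1, 1] of eigenvalue 3
lam = (v0 @ A @ v0) / (v0 @ v0)      # approximately 3
```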

Rayleigh Quotient Iteration

\[ \mathbf{w}_k = (A-\sigma_k I)^{-1} \mathbf{v}_{k-1}\\ \mathbf{v}_k = \frac{\mathbf{w}_k}{||\mathbf{w}_k||}\\ \sigma_k = \frac{\mathbf{v}_k^T A \mathbf{v}_k}{\mathbf{v}_k^T \mathbf{v}_k}\\ \]
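A sketch of the iteration above (NumPy; the matrix and starting vector are illustrative assumptions):

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, num_iters=10):
    """Shifted inverse iteration where the shift is re-estimated from the
    current iterate's Rayleigh quotient each step."""
    n = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    sigma = v @ A @ v
    for _ in range(num_iters):
        try:
            # New shift => new matrix => a fresh O(n^3) solve each step.
            w = np.linalg.solve(A - sigma * np.eye(n), v)
        except np.linalg.LinAlgError:
            break                    # sigma hit an eigenvalue exactly
        v = w / np.linalg.norm(w)
        sigma = v @ A @ v
    return sigma, v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
sigma, v = rayleigh_quotient_iteration(A, np.array([1.0, 0.5]))
```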

Efficiency per iteration vs. number of iterations?

This converges faster than shifted inverse iteration, but the matrix changes at every iteration, so a single LU factorization cannot be reused; each iteration costs \(O(n^3)\).

Unlikely Failure Mode for Iteration

\[ \mathbf{v} = c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_n \mathbf{x}_n\\A^k \mathbf{v} = \lambda_1^k (c_1 \mathbf{x}_1 + c_2 (\frac{\lambda_2}{\lambda_1})^k \mathbf{x}_2 + \cdots + c_n (\frac{\lambda_n}{\lambda_1})^k \mathbf{x}_n)\\ \]

What happens when \(\mathbf{v}_0 \cdot \mathbf{x}_1 = 0\)?

In exact arithmetic the iteration converges to \(\mathbf{x}_2\) (eigenvalue \(\lambda_2\)); in floating point, rounding error typically reintroduces a small component along \(\mathbf{x}_1\), so this failure mode is unlikely in practice.

Strategy for Multiple Eigenvalues

  1. Compute \(\mathbf{x}_0\) via power iteration.
  2. Project \(\mathbf{x}_0\) out of \(\mathbf{v}_0\).
  3. Compute \(\mathbf{x}_1\) via power iteration.
  4. Project \(\text{span}(\mathbf{x}_0, \mathbf{x}_1)\) out of \(\mathbf{v}_0\) (i.e., project \(\mathbf{v}_0\) onto the orthogonal complement of \(\text{span}(\mathbf{x}_0, \mathbf{x}_1)\)).
  5. ...

Assumption: \(A\) is nondefective.

Another Strategy

\(P\) projects out \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\).

\(P\mathbf{v}\) projects \(\mathbf{v}\) onto the orthogonal complement of \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\) (the result is orthogonal to each of \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{k}\)).

\[ AP\mathbf{v}_i = \begin{cases} A\cdot \mathbf{0} = \mathbf{0} & i \leq k\\ A \mathbf{v}_i = \lambda_i \mathbf{v}_i & i > k \end{cases} \Rightarrow AP \text{ has eigenvalues } \begin{cases} 0 & i \leq k\\ \lambda_i & i > k \end{cases}\\ \]

So to find the next eigenvalue, we can construct \(AP\) and run power iteration on \(AP\).
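A minimal sketch of this projection deflation (NumPy; the 2×2 symmetric matrix, with eigenvalues 3 and 1, is an illustrative assumption):

```python
import numpy as np

def power_iteration(A, num_iters=500, seed=0):
    """Normalized power iteration; returns the dominant eigenpair of A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return (v @ A @ v), v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam1, x1 = power_iteration(A)         # dominant pair: 3, [1, 1]/sqrt(2)
P = np.eye(2) - np.outer(x1, x1)      # projector onto x1's orthogonal complement
lam2, x2 = power_iteration(A @ P)     # next pair: 1, [1, -1]/sqrt(2)
```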

Avoiding Numerical Drift

Do power iteration on \(P^T A P\) where \(P\) projects out known eigenvectors.

Deflation: Modify \(A\) so that power iteration reveals another eigenvector you have not yet computed.

Similarity Transformations

Similar matrices: Two matrices \(A, B \in \mathbb{R}^{n \times n}\) are similar if there exists an invertible \(T\) with \(B = T^{-1} A T\).

Proposition: Similar matrices have the same eigenvalues.

Householder Asymmetric Deflation

\[ H\mathbf{x}_1 = \mathbf{e}_1 \quad (\text{for unit } \mathbf{x}_1)\\ \Rightarrow H A H^T \mathbf{e}_1 = H A \mathbf{x}_1 = H \lambda_1 \mathbf{x}_1 = \lambda_1 \mathbf{e}_1\\ H A H^T = \begin{bmatrix} \lambda_1 & \mathbf{b}^T \\ \mathbf{0} & B \end{bmatrix}\\ \] Similarity transform of \(A\Rightarrow\) same eigenvalues.

Next eigenvalue: Power iteration on \(B\).
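A small sketch of Householder deflation (NumPy; the 3×3 symmetric example and its known dominant eigenpair are illustrative assumptions — for a symmetric \(A\) the off-diagonal block \(\mathbf{b}^T\) also vanishes):

```python
import numpy as np

def householder(x):
    """Reflector H = I - 2 u u^T (symmetric, orthogonal) with H x = ||x|| e1."""
    e1 = np.zeros_like(x)
    e1[0] = np.linalg.norm(x)
    u = x - e1
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-14:               # x already parallel to e1
        return np.eye(len(x))
    u /= norm_u
    return np.eye(len(x)) - 2.0 * np.outer(u, u)

# Example with eigenvalues 5, 3, 1; the dominant unit eigenvector is e3.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
x1 = np.array([0.0, 0.0, 1.0])       # known eigenvector, lambda_1 = 5
H = householder(x1)
M = H @ A @ H.T                      # = [[lambda_1, b^T], [0, B]]
B = M[1:, 1:]                        # power iteration on B finds the next pair
```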

Author

Gehan Zheng

Published

2023-12-23

Updated

2023-12-24
