Condition number for linear systems
Sensitivity and Conditioning
Gaussian elimination works in theory, but what about floating point precision?
How much can we trust \(x_0\) if \(0 < ||Ax_0 - b||_2 \ll 1\) (the backward error)?
Perturbation Analysis
How does \(x\) change if we solve \((A + \delta A)x = b + \delta b\)?
Two viewpoints:
- Because of finite floating-point precision, \(A\) and \(b\) are only approximate.
- If \(x_0\) isn't the exact solution, what is the backward error?
What is "Small"?
Vector norm
A function \(||\cdot||: \mathbb{R}^n \rightarrow [0, \infty)\) satisfying:
1. \(||x|| = 0 \text{ iff } x = 0\)
2. \(||cx|| = |c| \cdot ||x|| \ \forall c \in \mathbb{R}, x \in \mathbb{R}^n\)
3. \(||x+y|| \leq ||x|| + ||y|| \ \forall x, y \in \mathbb{R}^n\)
Most Common Norm
\[ ||x||_2 = \sqrt{x_1^2 + \cdots + x_n^2} \]
p-Norms
For \(p \geq 1\), \[ ||x||_p = (|x_1|^p + \cdots + |x_n|^p)^{\frac{1}{p}} \] Taxicab/Manhattan norm: \(||x||_1\)
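As a quick numerical sanity check, here is a minimal NumPy sketch of the three most common p-norms (the vector is an arbitrary example, not from the notes):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 1.0])

# p-norms via numpy.linalg.norm
print(np.linalg.norm(x, 1))       # taxicab/Manhattan norm: 3 + 4 + 0 + 1 = 8
print(np.linalg.norm(x, 2))       # Euclidean norm: sqrt(9 + 16 + 0 + 1) = sqrt(26)
print(np.linalg.norm(x, np.inf))  # max-norm: max |x_i| = 4
```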
In the example from the figure, different norms yield different distributions of the entries of \(x\): with the 1-norm the result tends to be a sparse vector with many zeros, whereas with the 2-norm the result is more likely to be dense.
How are Norms the Same?
Equivalent norms
Two norms \(||\cdot|| \text{ and } ||\cdot||'\) are equivalent if there exist constants \(c_{\text{low}}, c_{\text{high}} > 0\) such that \(c_{\text{low}}||x|| \leq ||x||' \leq c_{\text{high}}||x|| \text{ for all } x\in \mathbb{R}^n\).
Theorem: All norms on \(\mathbb{R}^n\) are equivalent.
Matrix Norms: "Unrolled" Construction
\[ A\in \mathbb{R}^{m\times n} \leftrightarrow a(:) \in \mathbb{R}^{mn}\\ ||A||_{\text{Fro}} = \sqrt{\sum_{ij}a_{ij}^2} \]
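A small sketch (assuming NumPy; the matrix is an arbitrary example) confirming that the Frobenius norm is just the vector 2-norm of the unrolled matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Frobenius norm equals the 2-norm of the "unrolled" matrix A(:)
fro = np.linalg.norm(A, 'fro')
unrolled = np.linalg.norm(A.ravel(), 2)
print(fro, unrolled)  # both sqrt(1 + 4 + 9 + 16) = sqrt(30)
```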
Matrix Norms: "Induced" Construction
\[ ||A||:= \max{\{||Ax||: ||x|| = 1\}} \] This can be written as an optimization problem: \[ \max \frac{1}{2}||Ax||_2^2 \text{ s.t. } \frac{1}{2}||x||_2^2 = \frac{1}{2} \] Using Lagrange multipliers: \[ \Lambda(x, \lambda) = \frac{1}{2}x^TA^TAx - \frac{\lambda}{2}(x^Tx - 1)\\ \nabla_x \Lambda(x, \lambda) = A^TAx - \lambda x = 0\\ \Rightarrow A^TAx = \lambda x\\ \text{At the maximizer, } \lambda = \text{ largest eigenvalue of } A^TA\\ \begin{aligned} ||A||_2 &= \sqrt{x^TA^TAx}\\ &= \sqrt{\lambda} \sqrt{x^Tx}\\ &= \sqrt{\lambda}\\ &= \sqrt{\text{max eigenvalue of } A^TA} \\ &= \text{largest singular value of } A \end{aligned} \]
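The same identity can be checked numerically. A minimal sketch (assuming NumPy; the matrix is an arbitrary example) comparing the eigenvalue formula, the largest singular value, and NumPy's built-in spectral norm:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Induced 2-norm three ways: largest eigenvalue of A^T A,
# largest singular value, and NumPy's built-in matrix 2-norm.
eigmax = np.max(np.linalg.eigvalsh(A.T @ A))            # max eigenvalue of A^T A
sigma_max = np.linalg.svd(A, compute_uv=False)[0]       # largest singular value
print(np.sqrt(eigmax), sigma_max, np.linalg.norm(A, 2))  # all three agree
```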
Other Induced Norms:
\[ ||A||_1 = \max_{j} \sum_{i}|a_{ij}|\\ ||A||_{\infty} = \max_{i} \sum_{j}|a_{ij}|\\ \]
\[ \begin{aligned} ||A||_\infty &= (\max_x ||Ax||_\infty \text{ s.t. } ||x||_\infty = 1) \\ &= (\max_{x, i} |(Ax)_i| \text{ s.t. } |x_j| \leq 1 \forall j)\\ &= \max_{x,i} |\sum_{j} A_{ij}x_j| \text{ s.t. } |x_j| \leq 1 \forall j\\ &\le \max_{x,i} \sum_{j} |A_{ij}||x_j| \text{ s.t. } |x_j| \leq 1 \forall j\\ &\le \max_{x,i} \sum_{j} |A_{ij}| \text{ s.t. } |x_j| \leq 1 \forall j\\ \end{aligned} \]
Take \(x_j = \text{sign}(A_{ij})\) for the maximizing row \(i\); then the inequalities hold with equality.
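A small NumPy sketch (arbitrary example matrix) verifying the column-sum and row-sum formulas against the library's induced norms:

```python
import numpy as np

A = np.array([[ 1.0, -2.0],
              [-3.0,  4.0]])

# ||A||_1 is the maximum absolute column sum, ||A||_inf the maximum absolute row sum
col_sums = np.abs(A).sum(axis=0)   # [4, 6]
row_sums = np.abs(A).sum(axis=1)   # [3, 7]
print(col_sums.max(), np.linalg.norm(A, 1))       # 6.0, 6.0
print(row_sums.max(), np.linalg.norm(A, np.inf))  # 7.0, 7.0
```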
All matrix norms are equivalent, since matrix norms are just vector norms under the identification
\[ \mathbb{R}^{m\times n} \cong \mathbb{R}^{mn}\\ \]
Lemma
Suppose \(||A||\) is induced.
\[ \forall v\neq 0, ||Av|| = ||v|| \cdot ||A\frac{v}{||v||}|| \leq ||A|| \cdot ||v|| \]
Since \(\frac{v}{||v||}\) is a unit vector, \(||A\frac{v}{||v||}|| \leq ||A||\) follows directly from the definition of the induced norm.
\[ \begin{aligned} ||AB|| &= \max_{||x|| = 1} ||A(Bx)|| \\ & \leq \max_{||x|| = 1} ||A|| \cdot ||Bx|| \\ & = ||A|| ||B|| \end{aligned} \]
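A quick numerical illustration of submultiplicativity, \(||AB|| \leq ||A|| \cdot ||B||\) (a sketch with random matrices; the sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Submultiplicativity of induced norms: ||AB|| <= ||A|| * ||B||
lhs = np.linalg.norm(A @ B, 2)
rhs = np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
print(lhs, rhs, lhs <= rhs + 1e-12)  # inequality holds
```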
Model Problem
\[ \begin{aligned} (A+\epsilon \delta A)\mathbf{x}(\epsilon) &= b + \epsilon \delta b\\ &\downarrow{\frac{d}{d\epsilon}} &\\ \delta A \mathbf{x}(\epsilon) + (A + \epsilon \delta A) \mathbf{x}'(\epsilon) &= \delta b\\ &\downarrow{\epsilon = 0} &\\ \delta A \mathbf{x}(0) + A \mathbf{x}'(0) &= \delta b\\ &\Rightarrow \mathbf{x}'(0) = A^{-1}\delta b - A^{-1}\delta A \mathbf{x}(0)\\ \end{aligned} \]
According to Taylor's formula: \[ x(\epsilon) = x(0) + \epsilon x'(0) + O(\epsilon^2) \]
Relative error: \[ \begin{aligned} \frac{||x(\epsilon) - x(0)|| }{||x(0)||} &= \frac{||\epsilon x'(0) + O(\epsilon^2)||}{||x(0)||} \\ &= \frac{||\epsilon (A^{-1}\delta b - A^{-1}\delta A \mathbf{x}(0)) + O(\epsilon^2)||}{||x(0)||} \\ & \leq \frac{|\epsilon|}{||x(0)||} (||A^{-1}\delta b|| + ||A^{-1}\delta A \mathbf{x}(0)||) + O(\epsilon^2) \\ & \leq \frac{|\epsilon|}{||x(0)||} (||A^{-1}|| \cdot ||\delta b|| + ||A^{-1}|| \cdot ||\delta A|| \cdot ||x(0)||) + O(\epsilon^2) \\ & \leq |\epsilon| \cdot ||A^{-1}|| \cdot ||A|| \cdot (\frac{||\delta b||}{||A|| \cdot ||x(0)||} + \frac{||\delta A||}{||A||}) + O(\epsilon^2) \\ \end{aligned} \] Since \(||A|| \cdot ||x(0)|| \geq ||Ax(0)|| = ||b||\),
\[ \begin{aligned} &\leq |\epsilon| \cdot ||A^{-1}|| \cdot ||A|| \cdot (\frac{||\delta b||}{||b||} + \frac{||\delta A||}{||A||}) + O(\epsilon^2)\\ & = |\epsilon| k D + O(\epsilon^2) \end{aligned} \]
\((\frac{||\delta b||}{||b||} + \frac{||\delta A||}{||A||})\) is the relative error in the input.
\(||A^{-1}|| \cdot ||A||\) is the condition number.
Note that in general \[ ||A^{-1}|| \neq \frac{1}{||A||} \]
Small \(k\) \(\Rightarrow\) \(x\) is stable to changes in \(b\) and \(A\).
Condition Number
The condition number of \(A\in \mathbb{R}^{n\times n}\) for a given matrix norm \(||\cdot||\) is \(cond(A) = ||A^{-1}|| \cdot ||A||\).
Relative change: \[ D :=\frac{||\delta b||}{||b||} + \frac{||\delta A||}{||A||}\\ \frac{||x(\epsilon) - x(0)|| }{||x(0)||} \leq |\epsilon| \cdot D \cdot k + O(\epsilon^2) \] The condition number is invariant to scaling (unlike the determinant!) and equals one for the identity.
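The bound can be observed numerically. A minimal sketch (assuming NumPy; the system and the perturbation directions are random examples) comparing the actual relative error with the first-order bound \(k \cdot D\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
x = np.linalg.solve(A, b)

# Random perturbation directions, scaled by a small epsilon
dA = rng.standard_normal((n, n))
db = rng.standard_normal(n)
eps = 1e-8

x_pert = np.linalg.solve(A + eps * dA, b + eps * db)

rel_err = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
k = np.linalg.cond(A, 2)  # ||A|| * ||A^{-1}|| in the 2-norm
D = (np.linalg.norm(eps * db) / np.linalg.norm(b)
     + np.linalg.norm(eps * dA, 2) / np.linalg.norm(A, 2))
print(rel_err, k * D)  # rel_err should not exceed k * D (up to higher-order terms)
```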
Condition Number of Induced Norms
Because \[ \max_{||x||_2 = 1} ||A^{-1}x||_2 = \frac{1}{\min_{||x||_2 = 1} ||Ax||_2} \] \[ cond(A) = (\max_{\mathbf{x}\neq 0} \frac{||A\mathbf{x}||}{||\mathbf{x}||} )(\min_{\mathbf{y}\neq 0} \frac{||A\mathbf{y}||}{||\mathbf{y}||} )^{-1} \]
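For the 2-norm this ratio is \(\sigma_{\max}/\sigma_{\min}\), which a short sketch (arbitrary diagonal example, assuming NumPy) confirms against numpy.linalg.cond:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 0.5]])

# For the induced 2-norm, cond(A) = sigma_max / sigma_min
s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1], np.linalg.cond(A, 2))  # both 6.0
```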
Chicken \(\Leftrightarrow\) Egg
Computing \(||A^{-1}||\) typically requires solving \(Ax = b\), but how do we know the reliability of \(\mathbf{x}\)?
What is the condition number of computing the condition number of \(A\)?
What is the condition number of computing what the condition number is of computing the condition number of \(A\)?
Instead
Bound the condition number.
- Below: Problem is at least this hard
- Above: Problem is at most this hard
Potential for Approximation
\[ ||A^{-1}x|| \leq ||A^{-1}|| \cdot ||x|| \\ \Downarrow \\ cond(A) = ||A|| \cdot ||A^{-1}|| \geq \frac{||A|| \cdot ||A^{-1}x||}{||x||} \]
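This gives a cheap lower bound: any single vector \(x\), together with one linear solve for \(A^{-1}x\), certifies that the condition number is at least this large. A sketch under these assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = rng.standard_normal((n, n))

# Lower bound on cond(A): for any x, ||A|| * ||A^{-1}x|| / ||x|| <= cond(A).
# A^{-1}x is obtained with a single solve rather than forming A^{-1}.
x = rng.standard_normal(n)
lower = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.solve(A, x)) / np.linalg.norm(x)
print(lower, np.linalg.cond(A, 2))  # lower bound <= true condition number
```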