ScaleInvariantAnalysis
This small package provides tools for numerical analysis under conditions where the underlying problems are scale-invariant. At present it is oriented toward the kinds of problems that arise in mathematical optimization. Under a scaling transformation $x \rightarrow s \odot x$ (where $\odot$ is the Hadamard product), a Hessian matrix $H$ transforms as $H \rightarrow H \oslash (s \otimes s)$ (where $\oslash$ is elementwise division and $\otimes$ is the outer product). Operations like norm(H) are therefore nonsensical: their value depends on an arbitrary choice of units. Nevertheless, we work on computers with finite precision, so operations like $Hx$ and $H^{-1} g$ are expected to incur some error. This package provides tools for calculating and estimating such errors in a scale-covariant manner.
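The transformation rule can be checked directly on a quadratic model. In the sketch below, the matrix, scale vector, and point are arbitrary illustrative choices; the identity being verified is that the quadratic form is unchanged when both the point and the Hessian are transformed:

```julia
# For a quadratic f(x) = x' * H * x / 2, substituting x = y ./ s shows that in
# the rescaled coordinates y = s .* x the Hessian is H ./ (s * s'), i.e. H ⊘ (s ⊗ s).
H = [2.0 1.0; 1.0 3.0]          # illustrative symmetric Hessian
s = [1.0, 1e-3]                 # illustrative rescaling of the second coordinate
x = [0.7, -0.2]
y = s .* x                      # the point in the rescaled coordinates
Ht = H ./ (s * s')              # the Hessian in the rescaled coordinates
@assert isapprox(x' * H * x, y' * Ht * y)   # the quadratic form is invariant
```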
Example
Suppose you have a diagonal Hessian matrix and you estimate its condition number using tools that are not scale-invariant:
```julia
julia> using LinearAlgebra

julia> H = [1 0; 0 1e-8];

julia> cond(H)
1.0e8
```

You might declare this matrix to be "poorly scaled." However, the operations H * x and H \ g both have coordinatewise relative errors of size eps(): there are no delicate cancellations, so operations involving H achieve full machine precision. This is at odds with the usual expectations for a matrix with a large condition number.
Under a coordinate transformation x → [x[1], x[2]/10^4], H becomes the identity matrix, which has condition number 1, and this better reflects our actual experience with operations involving H. This package provides a scale-invariant analog of the condition number:
```julia
julia> using ScaleInvariantAnalysis

julia> condscale(H)
1.0
```

(You may see some roundoff error in the last few digits.) This version of the condition number matches our actual experience using H. In contrast,
```julia
julia> condscale([1 0.9999; 0.9999 1])
19999.0
```

remains poorly conditioned under any rescaling of the matrix.
Index of available tools
ScaleInvariantAnalysis.condscale
ScaleInvariantAnalysis.divmag
ScaleInvariantAnalysis.dotabs
ScaleInvariantAnalysis.matrixscale
ScaleInvariantAnalysis.symscale
Reference documentation
ScaleInvariantAnalysis.condscale — Method
```julia
κ = condscale(A; exact=true)
```

Given a symmetric matrix A, return the condition number of the scaled matrix

```julia
A ./ (a .* a')
```

where a = symscale(A; exact).
This is a scale-invariant estimate of the condition number of A.
ScaleInvariantAnalysis.divmag — Method
```julia
a, mag = divmag(A, b; cond::Bool=false, kwargs...)
```

Given a symmetric matrix A and a vector b, for x = A \ b return a pair in which mag is a naive estimate of the magnitude of sum(abs.(x .* a)). a and mag are scale-covariant in circumstances where A \ b is contravariant. With cond=false (the default), the estimate is based only on the magnitudes of the numbers in A and b, and does not account for the conditioning of A or for cancellation in the solution process. Any kwargs are passed to symscale.
This can be used to form scale-invariant estimates of roundoff errors in computations involving A, b, and x.
ScaleInvariantAnalysis.dotabs — Method
```julia
dotabs(x, y)
```

Compute the sum of the absolute values of the elementwise products of x and y:

```
∑_i |x[i] * y[i]|
```

ScaleInvariantAnalysis.matrixscale — Method

```julia
a, b = matrixscale(A; exact=false)
```

Given a matrix A, return vectors a and b representing the "scale of each axis," so that |A[i,j]| ~ a[i] * b[j] for all i, j. a[i] and b[j] are nonnegative; a[i] is zero only if A[i, j] = 0 for all j, and b[j] is zero only if A[i, j] = 0 for all i.
With exact=true, a and b solve the optimization problem

```
min  ∑_{i,j : A[i,j] ≠ 0} (log(|A[i,j]| / (a[i] * b[j])))²
s.t. ∑_i nA[i] * log(a[i]) = ∑_j mA[j] * log(b[j])
```

where nA and mA are the number of nonzeros in each row and column, respectively. Up to multiplication by a scalar, these vectors are covariant under changes of scale but not under general linear transformations.
With exact=false, the pattern of nonzeros in A is approximated as u * v', where sum(u) * v[j] = mA[j] and sum(v) * u[i] = nA[i]. This results in an O(m*n) rather than O((m+n)^3) algorithm.
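To illustrate the model |A[i,j]| ~ a[i] * b[j], one can construct a matrix whose entry magnitudes are exactly rank-one. The scales a and b below are chosen by hand; any minimizer of the exact objective recovers them up to the scalar ambiguity a → c .* a, b → b ./ c that the constraint resolves:

```julia
# A matrix whose magnitudes factor exactly as a[i] * b[j]; the sign pattern is
# irrelevant because the objective involves only |A[i,j]|.
a = [1.0, 1e3]                       # hand-picked row scales
b = [1.0, 1e-6, 10.0]                # hand-picked column scales
A = (a .* b') .* [1 -1 1; 1 1 -1]    # magnitudes a .* b', arbitrary signs
@assert abs.(A) == a .* b'           # (a, b) attains objective value 0
```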
ScaleInvariantAnalysis.symscale — Method
```julia
a = symscale(A; exact=false)
```

Given a matrix A assumed to be symmetric, return a vector a representing the "scale of each axis," so that |A[i,j]| ~ a[i] * a[j] for all i, j. a[i] is nonnegative, and is zero only if A[i, j] = 0 for all j.
With exact=true, a minimizes the objective function

```
∑_{i,j : A[i,j] ≠ 0} (log(|A[i,j]| / (a[i] * a[j])))²
```

and is therefore covariant under changes of scale but not under general linear transformations.
With exact=false, the pattern of nonzeros in A is approximated as u * u', where sum(u) * u[i] = nz[i] is the number of nonzeros in row i. This results in an O(n^2) rather than O(n^3) algorithm.
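As a worked special case (not taken from the package itself): when A is diagonal, only the terms with i = j appear in the objective, so it is minimized exactly by a[i] = sqrt(|A[i,i]|), and the rescaled matrix A ./ (a .* a') has unit-magnitude diagonal:

```julia
using LinearAlgebra

A = [4.0 0.0; 0.0 1e-8]        # illustrative diagonal matrix
a = sqrt.(abs.(diag(A)))       # exact minimizer for a diagonal matrix: [2.0, 1e-4]
Ascaled = A ./ (a .* a')       # each term log(|A[i,i]| / a[i]^2)² is driven to 0
```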