ScaleInvariantAnalysis

This small package provides a number of tools for numerical analysis under conditions where the underlying problems are scale-invariant. At present it is oriented toward the types of problems that appear in mathematical optimization. Under a scaling transformation $x \rightarrow s \odot x$ (where $\odot$ is the Hadamard, i.e., elementwise, product), a Hessian matrix $H$ transforms as $H \rightarrow H \oslash (s \otimes s)$, where $\otimes$ is the outer product and $\oslash$ denotes elementwise division. Quantities like norm(H) are therefore nonsensical: their value depends on an arbitrary choice of scale for each coordinate. Nevertheless, we work on computers with finite precision, so operations like $H x$ and $H^{-1} g$ are expected to incur some error. This package provides tools for calculating and estimating such errors in a scale-covariant manner.

Example

Suppose you have a diagonal Hessian matrix and you estimate its condition number using tools that are not scale-invariant:

julia> using LinearAlgebra

julia> H = [1 0; 0 1e-8];

julia> cond(H)
1.0e8

You might declare this matrix to be "poorly scaled." However, the operations H * x and H \ g both have coordinatewise relative errors of size eps(): there are no delicate cancellations, so operations involving H achieve full machine precision. This is not what one usually expects when working with a matrix whose condition number is large.
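This claim can be checked directly with only the standard library (a quick sketch; the test vector x is an arbitrary choice):

```julia
using LinearAlgebra

H = [1 0; 0 1e-8]
x = [0.3, 0.7]
g = H * x          # matrix-vector product
xhat = H \ g       # linear solve

# Despite cond(H) == 1e8, every coordinate of x is recovered
# to machine precision: no cancellation occurs.
@assert maximum(abs.(xhat .- x) ./ abs.(x)) <= 4 * eps()
```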

Under the coordinate transformation x → [x[1], x[2]/10^4], H becomes the identity matrix, whose condition number is 1; this better reflects our actual experience with operations involving H. This package provides a scale-invariant analog of the condition number:

julia> using ScaleInvariantAnalysis

julia> condscale(H)
1.0

(You may have some roundoff error in the last few digits.) This version of the condition number matches our actual experience using H. In contrast,

julia> condscale([1 0.9999; 0.9999 1])
19999.0

remains poorly conditioned under all scale transformations of the matrix.
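One can check this invariance by hand: rescale the matrix by an arbitrary s, re-derive the per-axis scales, and observe that the normalized condition number does not change. The sketch below uses the square root of the diagonal as the per-axis scale, which is a natural choice for a symmetric matrix with positive diagonal (an illustration only, not necessarily how symscale computes it):

```julia
using LinearAlgebra, Random

A = [1 0.9999; 0.9999 1]
Random.seed!(1)
for _ in 1:5
    s = exp.(10 .* randn(2))   # an arbitrary, possibly extreme, change of scale
    Â = A ./ (s .* s')         # the rescaled matrix
    â = sqrt.(diag(Â))         # per-axis scales of Â (illustrative choice)
    # the normalized condition number is the same for every s
    @assert cond(Â ./ (â .* â')) ≈ 19999.0
end
```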

Index of available tools

Reference documentation

ScaleInvariantAnalysis.condscale — Method
κ = condscale(A; exact=true)

Given a symmetric matrix A, return the condition number of the scaled matrix

A ./ (a .* a')

where a = symscale(A; exact).

This is a scale-invariant estimate of the condition number of A.
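To make the construction concrete, here is a minimal sketch of both steps, assuming every row of A has at least one nonzero entry and that the normal equations below are nonsingular. symscale_sketch solves the exact=true log-least-squares problem described under symscale; the names are illustrative, and the package's actual implementation may differ:

```julia
using LinearAlgebra

# Solve min_t ∑_{i,j : A[i,j] ≠ 0} (log|A[i,j]| - t[i] - t[j])², a = exp.(t).
# Assumes no zero rows and a nonsingular normal-equations matrix.
function symscale_sketch(A::AbstractMatrix)
    n = size(A, 1)
    P = A .!= 0                                # nonzero pattern
    deg = vec(sum(P, dims=2))                  # nonzeros per row
    M = Diagonal(float.(deg)) + float.(P)      # normal equations for t
    r = [sum(log(abs(A[i, j])) for j in 1:n if P[i, j]) for i in 1:n]
    exp.(M \ r)
end

# Condition number of the rescaled matrix A ./ (a .* a').
function condscale_sketch(A)
    a = symscale_sketch(A)
    cond(A ./ (a .* a'))
end
```

On the examples above, this reproduces condscale's answers: the diagonal matrix [1 0; 0 1e-8] rescales to the identity (condition number 1), while [1 0.9999; 0.9999 1] stays at roughly 19999.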

ScaleInvariantAnalysis.divmag — Method
a, mag = divmag(A, b; cond::Bool=false, kwargs...)

Given a symmetric matrix A and a vector b, for x = A \ b return a pair (a, mag) where mag is a naive estimate of the magnitude of sum(abs.(x .* a)). Both a and mag are scale-covariant in circumstances where A \ b is contravariant. With cond=false, the estimate is based only on the magnitudes of the entries of A and b, and does not account for the conditioning of A or for cancellation in the solution process. Any kwargs are passed to symscale.

This can be used to form scale-invariant estimates of roundoff errors in computations involving A, b, and x.

ScaleInvariantAnalysis.matrixscale — Method
a, b = matrixscale(A; exact=false)

Given a matrix A, return vectors a and b representing the "scale of each axis," so that |A[i,j]| ~ a[i] * b[j] for all i, j. The entries a[i] and b[j] are nonnegative; a[i] is zero only if A[i,j] = 0 for all j, and b[j] is zero only if A[i,j] = 0 for all i.

With exact=true, a and b solve the optimization problem

min ∑_{i,j : A[i,j] ≠ 0} (log(|A[i,j]| / (a[i] * b[j])))²
s.t. ∑_i nA[i] * log(a[i]) = ∑_j mA[j] * log(b[j])

where nA and mA are the number of nonzeros in each row and column, respectively. Up to multiplication by a scalar, these vectors are covariant under changes of scale but not general linear transformations.

With exact=false, the pattern of nonzeros in A is approximated as u * v', where sum(u) * v[j] = mA[j] and sum(v) * u[i] = nA[i]. This results in an O(m*n) rather than O((m+n)^3) algorithm.
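The exact=false construction can be written out directly: the two constraints force sum(u) * sum(v) = N, where N is the total number of nonzeros, so (up to an irrelevant scalar) one can take u = nA / sqrt(N) and v = mA / sqrt(N). A sketch, where pattern_factors is a hypothetical helper rather than part of the package:

```julia
# Hypothetical helper illustrating the exact=false pattern approximation.
function pattern_factors(A)
    P = A .!= 0
    nA = vec(sum(P, dims=2))     # nonzeros in each row
    mA = vec(sum(P, dims=1))     # nonzeros in each column
    N = sum(nA)                  # total number of nonzeros
    u = nA ./ sqrt(N)
    v = mA ./ sqrt(N)
    u, v                         # sum(u) * v[j] == mA[j], sum(v) * u[i] == nA[i]
end
```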

ScaleInvariantAnalysis.symscale — Method
a = symscale(A; exact=false)

Given a matrix A assumed to be symmetric, return a vector a representing the "scale of each axis," so that |A[i,j]| ~ a[i] * a[j] for all i, j. a[i] is nonnegative, and is zero only if A[i, j] = 0 for all j.

With exact=true, a minimizes the objective function

∑_{i,j : A[i,j] ≠ 0} (log(|A[i,j]| / (a[i] * a[j])))²

and is therefore covariant under changes of scale but not general linear transformations.

With exact=false, the pattern of nonzeros in A is approximated as u * u', where sum(u) * u[i] = nz[i] is the number of nonzeros in row i. This results in an O(n^2) rather than O(n^3) algorithm.
