Sweep operator

Definition

  • We learnt Cholesky decomposition and QR decomposition approaches for solving linear regression.

  • The popular statistical software SAS uses sweep operator for linear regression and matrix inversion.

  • Assume $\mathbf{A}$ is symmetric and positive semidefinite.

  • Sweep on the $k$-th diagonal entry $a_{kk} \ne 0$ yields $\widehat A$ with entries $$ \begin{eqnarray*} \widehat a_{kk} &=& - \frac{1}{a_{kk}} \\ \widehat a_{ik} &=& \frac{a_{ik}}{a_{kk}} \\ \widehat a_{kj} &=& \frac{a_{kj}}{a_{kk}} \\ \widehat a_{ij} &=& a_{ij} - \frac{a_{ik} a_{kj}}{a_{kk}}, \quad i \ne k, j \ne k. \end{eqnarray*} $$ $n^2$ flops (taking into account of symmetry).

  • Inverse sweep sends $\mathbf{A}$ to $\check A$ with entries $$ \begin{eqnarray*} \check a_{kk} &=& - \frac{1}{a_{kk}} \\ \check a_{ik} &=& - \frac{a_{ik}}{a_{kk}} \\ \check a_{kj} &=& - \frac{a_{kj}}{a_{kk}} \\ \check a_{ij} &=& a_{ij} - \frac{a_{ik}a_{kj}}{a_{kk}}, \quad i \ne k, j \ne k. \end{eqnarray*} $$ $n^2$ flops (taking into account of symmetry).

  • $\check{\hat{\mathbf{A}}} = \mathbf{A}$.

  • Successively sweeping all diagonal entries of $\mathbf{A}$ yields $- \mathbf{A}^{-1}$.

  • Exercise: invert a $2 \times 2$ matrix, say $$ \mathbf{A} = \begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix}, $$ on paper using sweep operator.

  • Block form of sweep: Let the symmetric matrix $\mathbf{A}$ be partitioned as $$ \mathbf{A} = \begin{pmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{pmatrix}. $$ If possible, sweep on the diagonal entries of $\mathbf{A}_{11}$ yields

$$ \begin{pmatrix} - \mathbf{A}_{11}^{-1} & \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \\ \mathbf{A}_{21} \mathbf{A}_{11}^{-1} & \mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \end{pmatrix}. $$


Order dose not matter. The block $\mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12}$ is recognized as the Schur complement of $\mathbf{A}_{11}$.

  • Pd and determinant: $\mathbf{A}$ is pd if and only if each diagonal entry can be swept in succession and is positive until it is swept. When a diagonal entry of a pd matrix $\mathbf{A}$ is swept, it becomes negative and remains negative thereafter. Taking the product of diagonal entries just before each is swept yields the determinant of $\mathbf{A}$.

Applications

Linear regression (as done in SAS)

Sweep on the Gram matrix $$ \begin{pmatrix} \mathbf{X}^T \mathbf{X} & \mathbf{X}^T \mathbf{y} \\ \mathbf{y}^T \mathbf{X} & \mathbf{y}^T \mathbf{y} \end{pmatrix} $$
yields

$$ \begin{eqnarray*} \begin{pmatrix} - (\mathbf{X}^T \mathbf{X})^{-1} & (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \\ \mathbf{y}^T \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} & \mathbf{y}^T \mathbf{y} - \mathbf{y}^T \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \end{pmatrix} = \begin{pmatrix} - \sigma^{-2} \text{Cov}(\beta) & \beta \\ \beta^T & \|\mathbf{y} - \hat y\|_2^2 \end{pmatrix}. \end{eqnarray*} $$


In total $np^2 + p^3$ flops.

Invert a matrix in place

$n^3$ flops. Recall that inversion by Cholesky costs $(1/3)n^3 + (4/3) n^3 = (5/3) n^3$ flops and needs to allocate a matrix of same size.

Conditional multivariate normal density calculation

Stepwise regression

MANOVA

Implementation

In [1]:
# Pkg.add("SweepOperator")
using SweepOperator

srand(280)

X = randn(5, 3) # predictor matrix
y = randn(5)    # response vector

# form the augmented Gram matrix
G = [X y]' * [X y]
Out[1]:
4×4 Array{Float64,2}:
  9.589    -4.98208  -5.70052  -1.83933
 -4.98208   4.94476   3.81663   2.82462
 -5.70052   3.81663   5.45442   2.46008
 -1.83933   2.82462   2.46008   4.98343
In [2]:
sweep!(G, 1:3)
Out[2]:
4×4 Array{Float64,2}:
 -0.312747  -0.136595  -0.231278  0.379547
 -4.98208   -0.499383   0.206676  0.650887
 -5.70052    3.81663   -0.569668  0.39225 
 -1.83933    2.82462    2.46008   2.87807 
In [3]:
# least squares solution by QR
X \ y
Out[3]:
3-element Array{Float64,1}:
 0.379547
 0.650887
 0.39225 
In [4]:
# inverse sweep to restore original matrix
sweep!(G, 1:3, true)
Out[4]:
4×4 Array{Float64,2}:
  9.589    -4.98208  -5.70052  -1.83933
 -4.98208   4.94476   3.81663   2.82462
 -5.70052   3.81663   5.45442   2.46008
 -1.83933   2.82462   2.46008   4.98343

Further reading

In [5]:
versioninfo()
Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)