In [1]:
versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code

Iterative Methods for Solving Linear Equations

Introduction

So far we have considered direct methods for solving linear equations.

  • Direct methods (flops fixed a priori) vs iterative methods:
    • Direct methods (GE/LU, Cholesky, QR, SVD): good for dense, small to moderate-sized $\mathbf{A}$
    • Iterative methods (Jacobi, Gauss-Seidel, SOR, conjugate gradient, GMRES): good for large, sparse, or structured linear systems, parallel computing, and warm starts

PageRank problem

  • $\mathbf{A} \in \{0,1\}^{n \times n}$ the connectivity matrix of webpages with entries $$ \begin{eqnarray*} a_{ij} = \begin{cases} 1 & \text{if page $i$ links to page $j$} \\ 0 & \text{otherwise} \end{cases}. \end{eqnarray*} $$ $n \approx 10^9$ in May 2017.

  • $r_i = \sum_j a_{ij}$ is the out-degree of page $i$.

  • Larry Page imagines a random surfer wandering on the internet according to the following rules:

    • From a page $i$ with $r_i>0$
      • with probability $p$, (s)he randomly chooses a link on page $i$ (uniformly) and follows that link to the next page
      • with probability $1-p$, (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page
    • From a page $i$ with $r_i=0$ (a dangling page), (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page

This process defines a Markov chain on the space of $n$ pages. The stationary distribution of this Markov chain gives the rank (probability) of each page.

  • The stationary distribution is the leading left eigenvector of the transition matrix $\mathbf{P}$, corresponding to eigenvalue 1. Equivalently, it solves the linear system $$ (\mathbf{I} - \mathbf{P}^T) \mathbf{x} = \mathbf{0}. $$ (A small sketch follows this list.)

  • Gene Golub called this the largest matrix computation in the world.

  • GE/LU would take $2 \times (10^9)^3/3/10^{12} \approx 6.66 \times 10^{14}$ seconds $\approx 20$ million years on a teraflop supercomputer!

  • Iterative methods come to the rescue.
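
To make the setup concrete, here is a minimal sketch that builds the transition matrix of a tiny random graph and runs the power iteration for its stationary distribution. The graph, its size, and the iteration count are made-up demo values, not the web-scale problem above.

using LinearAlgebra, Random, SparseArrays

Random.seed!(280)
n, p = 20, 0.85                      # tiny demo graph; p as in the surfer model
A = sprand(Bool, n, n, 0.2)          # hypothetical connectivity matrix
r = vec(sum(A, dims = 2))            # out-degrees r_i

# transition matrix: P[i, j] = (1 - p) / n + p * a_ij / r_i; dangling rows uniform
P = zeros(n, n)
for i in 1:n
    if r[i] > 0
        P[i, :] .= (1 - p) / n
        P[i, findall(!iszero, A[i, :])] .+= p / r[i]
    else
        P[i, :] .= 1 / n
    end
end

# power iteration: the stationary distribution satisfies x = P' * x
function stationary(P; iters = 500)
    x = fill(1 / size(P, 1), size(P, 1))
    for _ in 1:iters
        x = P' * x                   # row-stochastic P preserves sum(x) == 1
    end
    return x
end

x = stationary(P)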

Jacobi method

Solve $\mathbf{A} \mathbf{x} = \mathbf{b}$.

  • Jacobi iteration: $$ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. $$

  • With the splitting $\mathbf{A} = \mathbf{L} + \mathbf{D} + \mathbf{U}$, where $\mathbf{L}$ is the strictly lower triangular part, $\mathbf{D}$ the diagonal, and $\mathbf{U}$ the strictly upper triangular part, the Jacobi iteration can be written as $$ \mathbf{D} \mathbf{x}^{(t+1)} = - (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{b}, $$ i.e., $$ \mathbf{x}^{(t+1)} = - \mathbf{D}^{-1} (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b} = - \mathbf{D}^{-1} \mathbf{A} \mathbf{x}^{(t)} + \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b}. $$

  • One round costs $2n^2$ flops for an unstructured $\mathbf{A}$, so Jacobi beats GE/LU only if it converges in $o(n)$ iterations. The savings are huge for sparse or structured $\mathbf{A}$; by structured, we mean that the matrix-vector multiplication $\mathbf{A} \mathbf{v}$ is fast ($O(n)$ or $O(n \log n)$). A minimal sketch follows.
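
Here is a minimal sketch of this update in Julia, assuming a square $\mathbf{A}$ with nonzero diagonal (the helper name jacobi_sketch is ours, not the IterativeSolvers.jl API used later):

using LinearAlgebra

function jacobi_sketch(A, b; maxiter = 100)
    d = diag(A)                       # the diagonal D stored as a vector
    x = zero(b)
    for t in 1:maxiter
        # x ← D⁻¹ (b - (L + U) x) = -D⁻¹ A x + x + D⁻¹ b
        x = (b - A * x + d .* x) ./ d
    end
    return x
end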

Gauss-Seidel method

  • Gauss-Seidel (GS) iteration: $$ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. $$

  • With the same splitting, $(\mathbf{D} + \mathbf{L}) \mathbf{x}^{(t+1)} = - \mathbf{U} \mathbf{x}^{(t)} + \mathbf{b}$, i.e., $$ \mathbf{x}^{(t+1)} = - (\mathbf{D} + \mathbf{L})^{-1} \mathbf{U} \mathbf{x}^{(t)} + (\mathbf{D} + \mathbf{L})^{-1} \mathbf{b}. $$

  • GS converges from any starting point $\mathbf{x}^{(0)}$ when $\mathbf{A}$ is symmetric and positive definite (pd).

  • The convergence rate of Gauss-Seidel is governed by the spectral radius of $(\mathbf{D} + \mathbf{L})^{-1} \mathbf{U}$. A sketch of the iteration follows.
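
A minimal sketch of the iteration, under the same assumptions and again with a made-up helper name; the only difference from Jacobi is that each coordinate update uses the freshest values:

function gauss_seidel_sketch(A, b; maxiter = 100)
    n = length(b)
    x = zeros(n)
    for t in 1:maxiter
        for i in 1:n
            σ = 0.0
            for j in 1:n
                j == i && continue
                σ += A[i, j] * x[j]   # x[j] for j < i is already the new value
            end
            x[i] = (b[i] - σ) / A[i, i]
        end
    end
    return x
end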

Successive over-relaxation (SOR)

  • SOR: $$ x_i^{(t+1)} = \omega \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)} \right) / a_{ii} + (1-\omega) x_i^{(t)}, $$ i.e., $$ \mathbf{x}^{(t+1)} = (\mathbf{D} + \omega \mathbf{L})^{-1} [(1-\omega) \mathbf{D} - \omega \mathbf{U}] \mathbf{x}^{(t)} + (\mathbf{D} + \omega \mathbf{L})^{-1} \omega \mathbf{b}. $$

  • Need to pick $\omega \in (0, 2)$ beforehand, with the goal of improving the convergence rate; $\omega > 1$ over-relaxes (hence the name), $\omega < 1$ under-relaxes, and $\omega = 1$ recovers Gauss-Seidel. A sketch of the change follows.
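
Relative to the Gauss-Seidel sketch above, SOR changes only the coordinate update. Under the same assumptions, the modified line reads

x[i] = ω * (b[i] - σ) / A[i, i] + (1 - ω) * x[i]   # ω = 1 recovers Gauss-Seidel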

Conjugate gradient method

  • Conjugate gradient and its variants are the top-notch iterative methods for solving huge, structured linear systems.

  • A UCLA invention! Magnus Hestenes and Eduard Stiefel in the 1950s.

  • For symmetric positive definite $\mathbf{A}$, solving $\mathbf{A} \mathbf{x} = \mathbf{b}$ is equivalent to minimizing the quadratic function $\frac{1}{2} \mathbf{x}^T \mathbf{A} \mathbf{x} - \mathbf{b}^T \mathbf{x}$. A minimal sketch of CG follows.
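
A minimal textbook sketch of CG for symmetric positive definite $\mathbf{A}$ (the helper name cg_sketch is ours; the numerical examples below use cg from IterativeSolvers.jl):

using LinearAlgebra

function cg_sketch(A, b; maxiter = length(b), tol = 1e-8)
    x = zero(b)
    r = b - A * x                     # residual = -(gradient of the quadratic)
    p = copy(r)                       # first search direction
    rsold = dot(r, r)
    for t in 1:maxiter
        Ap = A * p
        α = rsold / dot(p, Ap)        # exact line search along p
        x += α * p
        r -= α * Ap
        rsnew = dot(r, r)
        sqrt(rsnew) < tol && break
        p = r + (rsnew / rsold) * p   # new direction, A-conjugate to the old ones
        rsold = rsnew
    end
    return x
end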

Kershaw's results for a fusion problem.

Method                                     Number of Iterations
Gauss-Seidel                                            208,000
Block SOR methods                                           765
Incomplete Cholesky conjugate gradient                       25

MatrixDepot.jl

MatrixDepot.jl is an extensive collection of test matrices for Julia. After installation, we can check the available test matrices by

In [2]:
using MatrixDepot

mdinfo()
include group.jl for user defined matrix generators
verify download of index files...
used remote site is https://sparse.tamu.edu/?per_page=All
populating internal database...
Out[2]:

Currently loaded Matrices

builtin(#)
1 baart 13 fiedler 25 invhilb 37 parter 49 shaw
2 binomial 14 forsythe 26 invol 38 pascal 50 smallworld
3 blur 15 foxgood 27 kahan 39 pei 51 spikes
4 cauchy 16 frank 28 kms 40 phillips 52 toeplitz
5 chebspec 17 gilbert 29 lehmer 41 poisson 53 tridiag
6 chow 18 golub 30 lotkin 42 prolate 54 triw
7 circul 19 gravity 31 magic 43 randcorr 55 ursell
8 clement 20 grcar 32 minij 44 rando 56 vand
9 companion 21 hadamard 33 moler 45 randsvd 57 wathen
10 deriv2 22 hankel 34 neumann 46 rohess 58 wilkinson
11 dingdong 23 heat 35 oscillate 47 rosser 59 wing
12 erdrey 24 hilb 36 parallax 48 sampling
user(#)

Groups: all, builtin, eigen, graph, illcond, inverse, local, posdef, random, regprob, sparse, symmetric, user

Suite Sparse: 1 of 2833
MatrixMarket: 0 of 498
In [3]:
# List matrices that are positive definite and sparse:
mdlist(:symmetric & :posdef & :sparse)
Out[3]:
2-element Array{String,1}:
 "poisson"
 "wathen" 
In [4]:
# Get help on Poisson matrix
mdinfo("poisson")
Out[4]:

Poisson Matrix (poisson)

A block tridiagonal matrix from Poisson’s equation. This matrix is sparse, symmetric positive definite and has known eigenvalues. The result is of type SparseMatrixCSC.

Input options:

  • [type,] dim: the dimension of the matrix is dim^2.

Groups: ["inverse", "symmetric", "posdef", "eigen", "sparse"]

References:

G. H. Golub and C. F. Van Loan, Matrix Computations, second edition, Johns Hopkins University Press, Baltimore, Maryland, 1989 (Section 4.5.4).

In [5]:
# Generate a Poisson matrix of dimension n=10
A = matrixdepot("poisson", 10)
Out[5]:
100×100 SparseArrays.SparseMatrixCSC{Float64,Int64} with 460 stored entries:
  [1  ,   1]  =  4.0
  [2  ,   1]  =  -1.0
  [11 ,   1]  =  -1.0
  [1  ,   2]  =  -1.0
  [2  ,   2]  =  4.0
  [3  ,   2]  =  -1.0
  [12 ,   2]  =  -1.0
  [2  ,   3]  =  -1.0
  [3  ,   3]  =  4.0
  [4  ,   3]  =  -1.0
  [13 ,   3]  =  -1.0
  [3  ,   4]  =  -1.0
  ⋮
  [98 ,  97]  =  -1.0
  [88 ,  98]  =  -1.0
  [97 ,  98]  =  -1.0
  [98 ,  98]  =  4.0
  [99 ,  98]  =  -1.0
  [89 ,  99]  =  -1.0
  [98 ,  99]  =  -1.0
  [99 ,  99]  =  4.0
  [100,  99]  =  -1.0
  [90 , 100]  =  -1.0
  [99 , 100]  =  -1.0
  [100, 100]  =  4.0
In [6]:
using UnicodePlots
spy(A)
Out[6]:
[spy plot: Sparsity Pattern of the 100×100 Poisson matrix; nz = 460]
In [7]:
# Get help on Wathen matrix
mdinfo("wathen")
Out[7]:

Wathen Matrix (wathen)

The Wathen matrix is a sparse, symmetric positive definite, random matrix arising from the finite element method. It is the consistent mass matrix for a regular nx-by-ny grid of 8-node elements.

Input options:

  • [type,] nx, ny: the dimension of the matrix is equal to 3 * nx * ny + 2 * nx + 2 * ny + 1.
  • [type,] n: nx = ny = n.

Groups: ["symmetric", "posdef", "eigen", "random", "sparse"]

References:

A. J. Wathen, Realistic eigenvalue bounds for the Galerkin mass matrix, IMA J. Numer. Anal., 7 (1987), pp. 449-457.

In [8]:
# Generate a Wathen matrix of dimension n=5
A = matrixdepot("wathen", 5)
Out[8]:
96×96 SparseArrays.SparseMatrixCSC{Float64,Int64} with 1256 stored entries:
  [1 ,  1]  =  1.81938
  [2 ,  1]  =  -1.81938
  [3 ,  1]  =  0.606459
  [12,  1]  =  -1.81938
  [13,  1]  =  -2.42583
  [18,  1]  =  0.606459
  [19,  1]  =  -2.42583
  [20,  1]  =  0.909688
  [1 ,  2]  =  -1.81938
  [2 ,  2]  =  9.70334
  [3 ,  2]  =  -1.81938
  [12,  2]  =  6.06459
  ⋮
  [85, 95]  =  20.6997
  [94, 95]  =  -6.2099
  [95, 95]  =  33.1194
  [96, 95]  =  -6.2099
  [77, 96]  =  3.10495
  [78, 96]  =  -8.27986
  [79, 96]  =  2.06997
  [84, 96]  =  -8.27986
  [85, 96]  =  -6.2099
  [94, 96]  =  2.06997
  [95, 96]  =  -6.2099
  [96, 96]  =  6.2099
In [9]:
spy(A)
Out[9]:
[spy plot: Sparsity Pattern of the 96×96 Wathen matrix; nz = 1256]

Numerical examples

The IterativeSolvers.jl package implements the most commonly used iterative solvers.

Generate test matrix

In [10]:
using BenchmarkTools, IterativeSolvers, LinearAlgebra, MatrixDepot, Random

Random.seed!(280)

n = 100
# Poisson matrix of dimension n^2=10000, pd and sparse
A = matrixdepot("poisson", n)
@show typeof(A)
# dense matrix representation of A
Afull = convert(Matrix, A)
@show typeof(Afull)
# sparsity level
count(!iszero, A) / length(A)
typeof(A) = SparseArrays.SparseMatrixCSC{Float64,Int64}
typeof(Afull) = Array{Float64,2}
Out[10]:
0.000496
In [11]:
spy(A)
Out[11]:
[spy plot: Sparsity Pattern of the 10000×10000 Poisson matrix; nz = 49600]
In [12]:
# storage difference
Base.summarysize(A), Base.summarysize(Afull)
Out[12]:
(873768, 800000040)

Matrix-vector multiplication

In [13]:
# randomly generated vector of length n^2
b = randn(n^2)
# dense matrix-vector multiplication
@benchmark $Afull * $b
Out[13]:
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     35.722 ms (0.00% GC)
  median time:      43.154 ms (0.00% GC)
  mean time:        44.512 ms (0.00% GC)
  maximum time:     51.161 ms (0.00% GC)
  --------------
  samples:          113
  evals/sample:     1
In [14]:
# sparse matrix-vector multiplication
@benchmark $A * $b
Out[14]:
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     51.993 μs (0.00% GC)
  median time:      93.657 μs (0.00% GC)
  mean time:        105.492 μs (10.11% GC)
  maximum time:     4.137 ms (97.35% GC)
  --------------
  samples:          10000
  evals/sample:     1

Dense solve via Cholesky

In [15]:
# record the Cholesky solution
xchol = cholesky(Afull) \ b;
In [16]:
# dense solve via Cholesky
@benchmark cholesky($Afull) \ $b
Out[16]:
BenchmarkTools.Trial: 
  memory estimate:  763.02 MiB
  allocs estimate:  8
  --------------
  minimum time:     3.505 s (3.20% GC)
  median time:      3.508 s (1.77% GC)
  mean time:        3.508 s (1.77% GC)
  maximum time:     3.511 s (0.34% GC)
  --------------
  samples:          2
  evals/sample:     1

Jacobi solver

At first glance, the Jacobi solver doesn't seem to give the correct answer.

In [17]:
xjacobi = jacobi(A, b)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 533.315106586665
Out[17]:
533.315106586665

Reading the documentation, we find that the default value of maxiter is 10. A couple of trial runs show that 30,000 Jacobi iterations are enough to get an accurate solution.

In [18]:
xjacobi = jacobi(A, b, maxiter=30000)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 0.00012526024863889176
Out[18]:
0.00012526024863889176
In [19]:
@benchmark jacobi($A, $b, maxiter=30000)
Out[19]:
BenchmarkTools.Trial: 
  memory estimate:  234.72 KiB
  allocs estimate:  9
  --------------
  minimum time:     2.261 s (0.00% GC)
  median time:      2.348 s (0.00% GC)
  mean time:        2.329 s (0.00% GC)
  maximum time:     2.377 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

Gauss-Seidel

In [20]:
# Gauss-Seidel solution is fairly close to Cholesky solution after 15000 iters
xgs = gauss_seidel(A, b, maxiter=15000)
@show norm(xgs - xchol)
norm(xgs - xchol) = 0.0001245846635425542
Out[20]:
0.0001245846635425542
In [21]:
@benchmark gauss_seidel($A, $b, maxiter=15000)
Out[21]:
BenchmarkTools.Trial: 
  memory estimate:  156.55 KiB
  allocs estimate:  8
  --------------
  minimum time:     1.801 s (0.00% GC)
  median time:      1.853 s (0.00% GC)
  mean time:        1.849 s (0.00% GC)
  maximum time:     1.892 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

SOR

In [22]:
# symmetric SOR with ω=0.75
xsor = ssor(A, b, 0.75, maxiter=10000)
@show norm(xsor - xchol)
norm(xsor - xchol) = 0.0022945523203956853
Out[22]:
0.0022945523203956853
In [23]:
# note: this benchmarks sor, while the solution above used ssor
@benchmark sor($A, $b, 0.75, maxiter=10000)
Out[23]:
BenchmarkTools.Trial: 
  memory estimate:  547.27 KiB
  allocs estimate:  20010
  --------------
  minimum time:     1.433 s (0.00% GC)
  median time:      1.441 s (0.00% GC)
  mean time:        1.443 s (0.00% GC)
  maximum time:     1.455 s (0.00% GC)
  --------------
  samples:          4
  evals/sample:     1

Conjugate Gradient

In [24]:
# conjugate gradient
xcg = cg(A, b)
@show norm(xcg - xchol)
norm(xcg - xchol) = 1.1005386920768871e-5
Out[24]:
1.1005386920768871e-5
In [25]:
@benchmark cg($A, $b)
Out[25]:
BenchmarkTools.Trial: 
  memory estimate:  391.89 KiB
  allocs estimate:  23
  --------------
  minimum time:     19.614 ms (0.00% GC)
  median time:      23.401 ms (0.00% GC)
  mean time:        23.440 ms (0.18% GC)
  maximum time:     28.527 ms (0.00% GC)
  --------------
  samples:          214
  evals/sample:     1