In [1]:
versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code

Iterative Methods for Solving Linear Equations

Introduction

So far we have considered direct methods for solving linear equations.

  • Direct methods (flops fixed a priori) vs iterative methods:
    • Direct methods (GE/LU, Cholesky, QR, SVD): good for dense, small to moderate-sized $\mathbf{A}$
    • Iterative methods (Jacobi, Gauss-Seidel, SOR, conjugate gradient, GMRES): good for large, sparse, or structured linear systems, parallel computing, and warm starts

PageRank problem

  • $\mathbf{A} \in \{0,1\}^{n \times n}$ the connectivity matrix of webpages with entries $$ \begin{eqnarray*} a_{ij} = \begin{cases} 1 & \text{if page $i$ links to page $j$} \\ 0 & \text{otherwise} \end{cases}. \end{eqnarray*} $$ $n \approx 10^9$ in May 2017.

  • $r_i = \sum_j a_{ij}$ is the out-degree of page $i$.

  • Larry Page imagines a random surfer wandering on the internet according to the following rules:

    • From a page $i$ with $r_i>0$
      • with probability $p$, (s)he randomly chooses a link on page $i$ (uniformly) and follows that link to the next page
      • with probability $1-p$, (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page
    • From a page $i$ with $r_i=0$ (a dangling page), (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page

This process defines a Markov chain on the space of $n$ pages. The stationary distribution of this Markov chain gives the rank (probability) of each page.

  • The stationary distribution is the leading left eigenvector of the transition matrix $\mathbf{P}$, corresponding to eigenvalue 1. Equivalently, it solves the linear system $$ (\mathbf{I} - \mathbf{P}^T) \mathbf{x} = \mathbf{0}. $$ (A small sketch follows this list.)

  • Gene Golub called this the largest matrix computation in the world.

  • GE/LU would take $2 \times (10^9)^3/3/10^{12} \approx 6.66 \times 10^{14}$ seconds $\approx 20$ million years on a teraflop supercomputer!

  • Iterative methods come to the rescue.
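
To make the setup concrete, here is a minimal sketch that builds the transition matrix of a tiny random graph and runs the power iteration for its stationary distribution. The graph, its size, and the iteration count are made-up demo values, not the web-scale problem above.

using LinearAlgebra, Random, SparseArrays

Random.seed!(280)
n, p = 20, 0.85                      # tiny demo graph; p as in the surfer model
A = sprand(Bool, n, n, 0.2)          # hypothetical connectivity matrix
r = vec(sum(A, dims = 2))            # out-degrees r_i

# transition matrix: P[i, j] = (1 - p) / n + p * a_ij / r_i; dangling rows uniform
P = zeros(n, n)
for i in 1:n
    if r[i] > 0
        P[i, :] .= (1 - p) / n
        P[i, findall(!iszero, A[i, :])] .+= p / r[i]
    else
        P[i, :] .= 1 / n
    end
end

# power iteration: the stationary distribution satisfies x = P' * x
function stationary(P; iters = 500)
    x = fill(1 / size(P, 1), size(P, 1))
    for _ in 1:iters
        x = P' * x                   # row-stochastic P preserves sum(x) == 1
    end
    return x
end

x = stationary(P)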

Jacobi method

Solve $\mathbf{A} \mathbf{x} = \mathbf{b}$.

  • Jacobi iteration: $$ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. $$

  • With the splitting $\mathbf{A} = \mathbf{L} + \mathbf{D} + \mathbf{U}$, where $\mathbf{L}$ is the strictly lower triangular part, $\mathbf{D}$ the diagonal, and $\mathbf{U}$ the strictly upper triangular part, the Jacobi iteration can be written as $$ \mathbf{D} \mathbf{x}^{(t+1)} = - (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{b}, $$ i.e., $$ \mathbf{x}^{(t+1)} = - \mathbf{D}^{-1} (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b} = - \mathbf{D}^{-1} \mathbf{A} \mathbf{x}^{(t)} + \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b}. $$

  • One round costs $2n^2$ flops for an unstructured $\mathbf{A}$, so Jacobi beats GE/LU only if it converges in $o(n)$ iterations. The savings are huge for sparse or structured $\mathbf{A}$; by structured, we mean that the matrix-vector multiplication $\mathbf{A} \mathbf{v}$ is fast ($O(n)$ or $O(n \log n)$). A minimal sketch follows.
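
Here is a minimal sketch of this update in Julia, assuming a square $\mathbf{A}$ with nonzero diagonal (the helper name jacobi_sketch is ours, not the IterativeSolvers.jl API used later):

using LinearAlgebra

function jacobi_sketch(A, b; maxiter = 100)
    d = diag(A)                       # the diagonal D stored as a vector
    x = zero(b)
    for t in 1:maxiter
        # x ← D⁻¹ (b - (L + U) x) = -D⁻¹ A x + x + D⁻¹ b
        x = (b - A * x + d .* x) ./ d
    end
    return x
end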

Gauss-Seidel method

  • Gauss-Seidel (GS) iteration: $$ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. $$

  • With the same splitting, $(\mathbf{D} + \mathbf{L}) \mathbf{x}^{(t+1)} = - \mathbf{U} \mathbf{x}^{(t)} + \mathbf{b}$, i.e., $$ \mathbf{x}^{(t+1)} = - (\mathbf{D} + \mathbf{L})^{-1} \mathbf{U} \mathbf{x}^{(t)} + (\mathbf{D} + \mathbf{L})^{-1} \mathbf{b}. $$

  • GS converges from any starting point $\mathbf{x}^{(0)}$ when $\mathbf{A}$ is symmetric and positive definite (pd).

  • The convergence rate of Gauss-Seidel is governed by the spectral radius of $(\mathbf{D} + \mathbf{L})^{-1} \mathbf{U}$. A sketch of the iteration follows.
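
A minimal sketch of the iteration, under the same assumptions and again with a made-up helper name; the only difference from Jacobi is that each coordinate update uses the freshest values:

function gauss_seidel_sketch(A, b; maxiter = 100)
    n = length(b)
    x = zeros(n)
    for t in 1:maxiter
        for i in 1:n
            σ = 0.0
            for j in 1:n
                j == i && continue
                σ += A[i, j] * x[j]   # x[j] for j < i is already the new value
            end
            x[i] = (b[i] - σ) / A[i, i]
        end
    end
    return x
end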

Successive over-relaxation (SOR)

  • SOR: $$ x_i^{(t+1)} = \omega \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)} \right) / a_{ii} + (1-\omega) x_i^{(t)}, $$ i.e., $$ \mathbf{x}^{(t+1)} = (\mathbf{D} + \omega \mathbf{L})^{-1} [(1-\omega) \mathbf{D} - \omega \mathbf{U}] \mathbf{x}^{(t)} + (\mathbf{D} + \omega \mathbf{L})^{-1} \omega \mathbf{b}. $$

  • Need to pick $\omega \in (0, 2)$ beforehand, with the goal of improving the convergence rate; $\omega > 1$ over-relaxes (hence the name), $\omega < 1$ under-relaxes, and $\omega = 1$ recovers Gauss-Seidel. A sketch of the change follows.
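
Relative to the Gauss-Seidel sketch above, SOR changes only the coordinate update. Under the same assumptions, the modified line reads

x[i] = ω * (b[i] - σ) / A[i, i] + (1 - ω) * x[i]   # ω = 1 recovers Gauss-Seidel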

Conjugate gradient method

  • Conjugate gradient and its variants are the top-notch iterative methods for solving huge, structured linear systems.

  • A UCLA invention! Magnus Hestenes and Eduard Stiefel in the 1950s.

  • For symmetric positive definite $\mathbf{A}$, solving $\mathbf{A} \mathbf{x} = \mathbf{b}$ is equivalent to minimizing the quadratic function $\frac{1}{2} \mathbf{x}^T \mathbf{A} \mathbf{x} - \mathbf{b}^T \mathbf{x}$. A minimal sketch of CG follows.
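
A minimal textbook sketch of CG for symmetric positive definite $\mathbf{A}$ (the helper name cg_sketch is ours; the numerical examples below use cg from IterativeSolvers.jl):

using LinearAlgebra

function cg_sketch(A, b; maxiter = length(b), tol = 1e-8)
    x = zero(b)
    r = b - A * x                     # residual = -(gradient of the quadratic)
    p = copy(r)                       # first search direction
    rsold = dot(r, r)
    for t in 1:maxiter
        Ap = A * p
        α = rsold / dot(p, Ap)        # exact line search along p
        x += α * p
        r -= α * Ap
        rsnew = dot(r, r)
        sqrt(rsnew) < tol && break
        p = r + (rsnew / rsold) * p   # new direction, A-conjugate to the old ones
        rsold = rsnew
    end
    return x
end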

Kershaw's results for a fusion problem.

Method                                     Number of Iterations
Gauss-Seidel                                            208,000
Block SOR methods                                           765
Incomplete Cholesky conjugate gradient                       25

MatrixDepot.jl

MatrixDepot.jl is an extensive collection of test matrices for Julia. After installation, we can check the available test matrices by

In [2]:
using MatrixDepot

mdinfo()
include group.jl for user defined matrix generators
verify download of index files...
used remote site is https://sparse.tamu.edu/?per_page=All
populating internal database...
Out[2]:

Currently loaded Matrices

builtin(#)
1 baart 13 fiedler 25 invhilb 37 parter 49 shaw
2 binomial 14 forsythe 26 invol 38 pascal 50 smallworld
3 blur 15 foxgood 27 kahan 39 pei 51 spikes
4 cauchy 16 frank 28 kms 40 phillips 52 toeplitz
5 chebspec 17 gilbert 29 lehmer 41 poisson 53 tridiag
6 chow 18 golub 30 lotkin 42 prolate 54 triw
7 circul 19 gravity 31 magic 43 randcorr 55 ursell
8 clement 20 grcar 32 minij 44 rando 56 vand
9 companion 21 hadamard 33 moler 45 randsvd 57 wathen
10 deriv2 22 hankel 34 neumann 46 rohess 58 wilkinson
11 dingdong 23 heat 35 oscillate 47 rosser 59 wing
12 erdrey 24 hilb 36 parallax 48 sampling
user(#)

Groups: all, builtin, eigen, graph, illcond, inverse, local, posdef, random, regprob, sparse, symmetric, user

Suite Sparse: 1 of 2833
MatrixMarket: 0 of 498
In [3]:
# List matrices that are positive definite and sparse:
mdlist(:symmetric & :posdef & :sparse)
Out[3]:
2-element Array{String,1}:
 "poisson"
 "wathen" 
In [4]:
# Get help on Poisson matrix
mdinfo("poisson")
Out[4]:

Poisson Matrix (poisson)

A block tridiagonal matrix from Poisson’s equation. This matrix is sparse, symmetric positive definite and has known eigenvalues. The result is of type SparseMatrixCSC.

Input options:

  • [type,] dim: the dimension of the matrix is dim^2.

Groups: ["inverse", "symmetric", "posdef", "eigen", "sparse"]

References:

G. H. Golub and C. F. Van Loan, Matrix Computations, second edition, Johns Hopkins University Press, Baltimore, Maryland, 1989 (Section 4.5.4).

In [5]:
# Generate a Poisson matrix of dimension n=10
A = matrixdepot("poisson", 10)
Out[5]:
100×100 SparseArrays.SparseMatrixCSC{Float64,Int64} with 460 stored entries:
  [1  ,   1]  =  4.0
  [2  ,   1]  =  -1.0
  [11 ,   1]  =  -1.0
  [1  ,   2]  =  -1.0
  [2  ,   2]  =  4.0
  [3  ,   2]  =  -1.0
  [12 ,   2]  =  -1.0
  [2  ,   3]  =  -1.0
  [3  ,   3]  =  4.0
  [4  ,   3]  =  -1.0
  [13 ,   3]  =  -1.0
  [3  ,   4]  =  -1.0
  ⋮
  [98 ,  97]  =  -1.0
  [88 ,  98]  =  -1.0
  [97 ,  98]  =  -1.0
  [98 ,  98]  =  4.0
  [99 ,  98]  =  -1.0
  [89 ,  99]  =  -1.0
  [98 ,  99]  =  -1.0
  [99 ,  99]  =  4.0
  [100,  99]  =  -1.0
  [90 , 100]  =  -1.0
  [99 , 100]  =  -1.0
  [100, 100]  =  4.0
In [6]:
using UnicodePlots
spy(A)
Out[6]:
[spy plot: Sparsity Pattern of the 100×100 Poisson matrix; nz = 460]
In [7]:
# Get help on Wathen matrix
mdinfo("wathen")
Out[7]:

Wathen Matrix (wathen)

The Wathen matrix is a sparse, symmetric positive definite, random matrix arising from the finite element method. It is the consistent mass matrix for a regular nx-by-ny grid of 8-node elements.

Input options:

  • [type,] nx, ny: the dimension of the matrix is equal to 3 * nx * ny + 2 * nx + 2 * ny + 1.
  • [type,] n: nx = ny = n.

Groups: ["symmetric", "posdef", "eigen", "random", "sparse"]

References:

A. J. Wathen, Realistic eigenvalue bounds for the Galerkin mass matrix, IMA J. Numer. Anal., 7 (1987), pp. 449-457.

In [8]:
# Generate a Wathen matrix of dimension n=5
A = matrixdepot("wathen", 5)
Out[8]:
96×96 SparseArrays.SparseMatrixCSC{Float64,Int64} with 1256 stored entries:
  [1 ,  1]  =  1.81938
  [2 ,  1]  =  -1.81938
  [3 ,  1]  =  0.606459
  [12,  1]  =  -1.81938
  [13,  1]  =  -2.42583
  [18,  1]  =  0.606459
  [19,  1]  =  -2.42583
  [20,  1]  =  0.909688
  [1 ,  2]  =  -1.81938
  [2 ,  2]  =  9.70334
  [3 ,  2]  =  -1.81938
  [12,  2]  =  6.06459
  ⋮
  [85, 95]  =  20.6997
  [94, 95]  =  -6.2099
  [95, 95]  =  33.1194
  [96, 95]  =  -6.2099
  [77, 96]  =  3.10495
  [78, 96]  =  -8.27986
  [79, 96]  =  2.06997
  [84, 96]  =  -8.27986
  [85, 96]  =  -6.2099
  [94, 96]  =  2.06997
  [95, 96]  =  -6.2099
  [96, 96]  =  6.2099
In [9]:
spy(A)
Out[9]:
[spy plot: Sparsity Pattern of the 96×96 Wathen matrix; nz = 1256]

Numerical examples

The IterativeSolvers.jl package implements the most commonly used iterative solvers.

Generate test matrix

In [10]:
using BenchmarkTools, IterativeSolvers, LinearAlgebra, MatrixDepot, Random

Random.seed!(280)

n = 100
# Poisson matrix of dimension n^2=10000, pd and sparse
A = matrixdepot("poisson", n)
@show typeof(A)
# dense matrix representation of A
Afull = convert(Matrix, A)
@show typeof(Afull)
# sparsity level
count(!iszero, A) / length(A)
typeof(A) = SparseArrays.SparseMatrixCSC{Float64,Int64}
typeof(Afull) = Array{Float64,2}
Out[10]:
0.000496
In [11]:
spy(A)
Out[11]:
[spy plot: Sparsity Pattern of the 10000×10000 Poisson matrix; nz = 49600]
In [12]:
# storage difference
Base.summarysize(A), Base.summarysize(Afull)
Out[12]:
(873768, 800000040)

Matrix-vector multiplication

In [13]:
# randomly generated vector of length n^2
b = randn(n^2)
# dense matrix-vector multiplication
@benchmark $Afull * $b
Out[13]:
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     35.722 ms (0.00% GC)
  median time:      43.154 ms (0.00% GC)
  mean time:        44.512 ms (0.00% GC)
  maximum time:     51.161 ms (0.00% GC)
  --------------
  samples:          113
  evals/sample:     1
In [14]:
# sparse matrix-vector multiplication
@benchmark $A * $b
Out[14]:
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     51.993 μs (0.00% GC)
  median time:      93.657 μs (0.00% GC)
  mean time:        105.492 μs (10.11% GC)
  maximum time:     4.137 ms (97.35% GC)
  --------------
  samples:          10000
  evals/sample:     1

Dense solve via Cholesky

In [15]:
# record the Cholesky solution
xchol = cholesky(Afull) \ b;
In [16]:
# dense solve via Cholesky
@benchmark cholesky($Afull) \ $b
Out[16]:
BenchmarkTools.Trial: 
  memory estimate:  763.02 MiB
  allocs estimate:  8
  --------------
  minimum time:     3.505 s (3.20% GC)
  median time:      3.508 s (1.77% GC)
  mean time:        3.508 s (1.77% GC)
  maximum time:     3.511 s (0.34% GC)
  --------------
  samples:          2
  evals/sample:     1

Jacobi solver

At first glance, the Jacobi solver doesn't seem to give the correct answer.

In [17]:
xjacobi = jacobi(A, b)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 533.315106586665
Out[17]:
533.315106586665

Reading the documentation, we find that the default value of maxiter is 10. A couple of trial runs show that 30,000 Jacobi iterations are enough to get an accurate solution.

In [18]:
xjacobi = jacobi(A, b, maxiter=30000)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 0.00012526024863889176
Out[18]:
0.00012526024863889176
In [19]:
@benchmark jacobi($A, $b, maxiter=30000)
Out[19]:
BenchmarkTools.Trial: 
  memory estimate:  234.72 KiB
  allocs estimate:  9
  --------------
  minimum time:     2.261 s (0.00% GC)
  median time:      2.348 s (0.00% GC)
  mean time:        2.329 s (0.00% GC)
  maximum time:     2.377 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

Gauss-Seidel

In [20]:
# Gauss-Seidel solution is fairly close to Cholesky solution after 15000 iters
xgs = gauss_seidel(A, b, maxiter=15000)
@show norm(xgs - xchol)
norm(xgs - xchol) = 0.0001245846635425542
Out[20]:
0.0001245846635425542
In [21]:
@benchmark gauss_seidel($A, $b, maxiter=15000)
Out[21]:
BenchmarkTools.Trial: 
  memory estimate:  156.55 KiB
  allocs estimate:  8
  --------------
  minimum time:     1.801 s (0.00% GC)
  median time:      1.853 s (0.00% GC)
  mean time:        1.849 s (0.00% GC)
  maximum time:     1.892 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

SOR

In [22]:
# symmetric SOR with ω=0.75
xsor = ssor(A, b, 0.75, maxiter=10000)
@show norm(xsor - xchol)
norm(xsor - xchol) = 0.0022945523203956853
Out[22]:
0.0022945523203956853
In [23]:
# note: this benchmarks sor, while the solution above used ssor
@benchmark sor($A, $b, 0.75, maxiter=10000)
Out[23]:
BenchmarkTools.Trial: 
  memory estimate:  547.27 KiB
  allocs estimate:  20010
  --------------
  minimum time:     1.433 s (0.00% GC)
  median time:      1.441 s (0.00% GC)
  mean time:        1.443 s (0.00% GC)
  maximum time:     1.455 s (0.00% GC)
  --------------
  samples:          4
  evals/sample:     1

Conjugate Gradient

In [24]:
# conjugate gradient
xcg = cg(A, b)
@show norm(xcg - xchol)
norm(xcg - xchol) = 1.1005386920768871e-5
Out[24]:
1.1005386920768871e-5
In [25]:
@benchmark cg($A, $b)
Out[25]:
BenchmarkTools.Trial: 
  memory estimate:  391.89 KiB
  allocs estimate:  23
  --------------
  minimum time:     19.614 ms (0.00% GC)
  median time:      23.401 ms (0.00% GC)
  mean time:        23.440 ms (0.18% GC)
  maximum time:     28.527 ms (0.00% GC)
  --------------
  samples:          214
  evals/sample:     1