In [1]:
versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code

Triangular systems

We consider computer algorithms for solving linear equations $\mathbf{A} \mathbf{x} = \mathbf{b}$, a ubiquitous task in statistics.

Idea: turning original problem into an easy one, e.g., triangular system.

Lower triangular system

To solve $\mathbf{A} \mathbf{x} = \mathbf{b}$, where $\mathbf{A} \in \mathbb{R}^{n \times n}$ is lower triangular

$$ \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}. $$
  • Forward substitution: $$ \begin{eqnarray*} x_1 &=& b_1 / a_{11} \\ x_2 &=& (b_2 - a_{21} x_1) / a_{22} \\ x_3 &=& (b_3 - a_{31} x_1 - a_{32} x_2) / a_{33} \\ &\vdots& \\ x_n &=& (b_n - a_{n1} x_1 - a_{n2} x_2 - \cdots - a_{n,n-1} x_{n-1}) / a_{nn}. \end{eqnarray*} $$

  • $1 + 3 + 5 + \cdots + (2n-1) = n^2$ flops.

  • $\mathbf{A}$ can be accessed by row ($ij$ loop) or column ($ji$ loop).

Upper triangular system

To solve $\mathbf{A} \mathbf{x} = \mathbf{b}$, where $\mathbf{A} \in \mathbb{R}^{n \times n}$ is upper triangular
$$ \begin{pmatrix} a_{11} & \cdots & a_{1,n-1} & a_{1n} \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & a_{n-1,n-1} & a_{n-1,n} \\ 0 & 0 & 0 & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_{n-1} \\ b_n \end{pmatrix}. $$

  • Back substitution $$ \begin{eqnarray*} x_n &=& b_n / a_{nn} \\ x_{n-1} &=& (b_{n-1} - a_{n-1,n} x_n) / a_{n-1,n-1} \\ x_{n-2} &=& (b_{n-2} - a_{n-2,n-1} x_{n-1} - a_{n-2,n} x_n) / a_{n-2,n-2} \\ &\vdots& \\ x_1 &=& (b_1 - a_{12} x_2 - a_{13} x_3 - \cdots - a_{1,n} x_{n}) / a_{11}. \end{eqnarray*} $$

  • $n^2$ flops.

  • $\mathbf{A}$ can be accessed by row ($ij$ loop) or column ($ji$ loop).

Implementation

  • BLAS level 2 function: ?trsv (triangular solve with one right hand side).

  • BLAS level 3 function: ?trsm (matrix triangular solve, i.e., multiple right hand sides).

  • Julia

    • The left divide \ operator in Julia is used for solving linear equations or least squares problem.
    • If A is a triangular matrix, the command A \ b uses forward or backward substitution
    • Or we can call the BLAS wrapper functions directly: trsv!, trsv, trsm!, trsm
In [2]:
using LinearAlgebra, Random

Random.seed!(123) # seed
n = 5
A = randn(n, n)
b = randn(n)
# a random matrix
A
Out[2]:
5×5 Array{Float64,2}:
  1.19027   -0.664713   -0.339366   0.368002  -0.979539
  2.04818    0.980968   -0.843878  -0.281133   0.260402
  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
  0.459416   0.273815    0.327215  -0.71741   -0.880897
 -0.396679  -0.194229    0.592403  -0.77507    0.277726
In [3]:
Al = LowerTriangular(A) # does not create extra matrix
Out[3]:
5×5 LowerTriangular{Float64,Array{Float64,2}}:
  1.19027     ⋅           ⋅          ⋅        ⋅      
  2.04818    0.980968     ⋅          ⋅        ⋅      
  1.14265   -0.0754831  -0.888936    ⋅        ⋅      
  0.459416   0.273815    0.327215  -0.71741   ⋅      
 -0.396679  -0.194229    0.592403  -0.77507  0.277726
In [4]:
fieldnames(typeof(Al))
Out[4]:
(:data,)
In [5]:
Al.data
Out[5]:
5×5 Array{Float64,2}:
  1.19027   -0.664713   -0.339366   0.368002  -0.979539
  2.04818    0.980968   -0.843878  -0.281133   0.260402
  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
  0.459416   0.273815    0.327215  -0.71741   -0.880897
 -0.396679  -0.194229    0.592403  -0.77507    0.277726
In [6]:
# same data
pointer(Al.data), pointer(A)
Out[6]:
(Ptr{Float64} @0x000000010f12c170, Ptr{Float64} @0x000000010f12c170)
In [7]:
Al \ b # dispatched to BLAS function for triangular solve
Out[7]:
5-element Array{Float64,1}:
  1.28031359532452  
 -4.485407033333146 
  5.326125412123139 
  0.4468195081389211
 -3.091688163812573 
In [8]:
# or use BLAS wrapper directly
BLAS.trsv('L', 'N', 'N', A, b)
Out[8]:
5-element Array{Float64,1}:
  1.28031359532452 
 -4.485407033333146
  5.326125412123139
  0.446819508138921
 -3.091688163812573
In [9]:
?BLAS.trsv
Out[9]:
trsv(ul, tA, dA, A, b)

Return the solution to A*x = b or one of the other two variants determined by tA and ul. dA determines if the diagonal values are read or are assumed to be all ones.

Some algebraic facts of triangular system

  • Eigenvalues of a triangular matrix $\mathbf{A}$ are diagonal entries $\lambda_i = a_{ii}$.

  • Determinant $\det(\mathbf{A}) = \prod_i a_{ii}$.

  • The product of two upper (lower) triangular matrices is upper (lower) triangular.

  • The inverse of an upper (lower) triangular matrix is upper (lower) triangular.

  • The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular.

  • The inverse of a unit upper (lower) triangular matrix is unit upper (lower) triangular.

Julia types for triangular matrices

LowerTriangular, UnitLowerTriangular, UpperTriangular, UnitUpperTriangular.

In [10]:
using BenchmarkTools, LinearAlgebra, Random

Random.seed!(123) # seed
A = randn(1000, 1000)

# if we don't tell Julia it's triangular: O(n^3) complexity
# tril(A) returns a full triangular matrix, same as Matlab
@benchmark eigvals(tril($A))
Out[10]:
BenchmarkTools.Trial: 
  memory estimate:  15.55 MiB
  allocs estimate:  16
  --------------
  minimum time:     42.712 ms (0.00% GC)
  median time:      44.874 ms (2.45% GC)
  mean time:        45.532 ms (2.76% GC)
  maximum time:     89.663 ms (43.22% GC)
  --------------
  samples:          110
  evals/sample:     1
In [11]:
# if we tell Julia it's triangular: O(n) complexity
@benchmark eigvals(LowerTriangular($A))
Out[11]:
BenchmarkTools.Trial: 
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     1.676 μs (0.00% GC)
  median time:      2.282 μs (0.00% GC)
  mean time:        3.285 μs (28.33% GC)
  maximum time:     4.328 ms (99.91% GC)
  --------------
  samples:          10000
  evals/sample:     10
In [12]:
# if we don't tell Julia it's triangular: O(n^3) complexity
# tril(A) returns a full triangular matrix, same as Matlab
@benchmark det(tril($A))
Out[12]:
BenchmarkTools.Trial: 
  memory estimate:  7.64 MiB
  allocs estimate:  4
  --------------
  minimum time:     2.024 ms (0.00% GC)
  median time:      2.134 ms (0.00% GC)
  mean time:        2.428 ms (19.43% GC)
  maximum time:     45.379 ms (96.47% GC)
  --------------
  samples:          2051
  evals/sample:     1
In [13]:
# if we tell Julia it's triangular: O(n) complexity
@benchmark det(LowerTriangular($A))
Out[13]:
BenchmarkTools.Trial: 
  memory estimate:  7.95 KiB
  allocs estimate:  2
  --------------
  minimum time:     1.847 μs (0.00% GC)
  median time:      2.398 μs (0.00% GC)
  mean time:        3.347 μs (28.12% GC)
  maximum time:     4.376 ms (99.89% GC)
  --------------
  samples:          10000
  evals/sample:     10