ADMM
In this section, we solve the same constrained lasso problem using the alternating direction method of multipliers (ADMM) algorithm. ADMM algorithm is advantageous since it can scale to larger size problems and is not restricted to linear constraints. See Gaines and Zhou (2016) for details.
In order to use the algorithm, user needs to supply a projection function onto the constraint set. Some simple examples are discussed below.
sum-to-zero constraint
We demonstrate using a sum-to-zero constraint example
For this constraint, the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ would be
First, let's define a true parameter β such that sum(β) = 0.
using ConstrainedLasso, Base.Test
n, p = 50, 100 β = zeros(p); β[1:round(Int, p / 4)] = 0 β[(round(Int, p / 4) + 1):round(Int, p / 2)] = 1 β[(round(Int, p / 2) + 1):round(Int, 3p / 4)] = 0 β[(round(Int, 3p / 4) + 1):p] = -1 β
100-element Array{Float64,1}:
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  ⋮  
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
Next we generate data based on the true parameter β.
srand(41) X = randn(n, p)
50×100 Array{Float64,2}:
  1.21212    -0.153889    0.141533  …  -0.458125    0.0951976  -2.14019   
  0.345895    1.30676     1.60944      -0.409901    0.323719    0.989333  
 -1.27859    -1.18894     0.512064      1.80509     1.62606    -1.44251   
  0.230616    2.54741    -0.523533      2.73358     1.07999     0.432834  
 -1.17103    -0.39082     0.441921     -0.179239   -0.158189   -0.640611  
  1.67135     0.0829011   0.964089  …  -0.720038    1.99359    -0.671572  
 -0.614717    2.16204    -0.0602       -0.324456   -0.616887    1.11243   
 -0.810535    0.974719   -0.045405      0.881578    1.29611     0.696869  
 -1.10879    -1.32489    -1.18272       0.579381   -0.971269   -0.687591  
 -0.219752   -0.447897   -0.974186     -0.880804   -0.480702   -1.36887   
  0.0952544  -0.126203   -0.273737  …  -0.264421    0.565684   -0.798719  
  1.4126      0.295896   -0.213161     -1.46343    -1.27144    -0.0589753 
 -0.418407   -0.479389    0.324243      1.96976     0.867659   -1.2999    
  ⋮                                 ⋱                                     
  0.504861   -1.03911    -0.357771      0.815027    0.919037    1.07463   
 -0.820358   -0.955319    0.097768      0.553219    1.56424     0.10535   
  1.39684     1.93183     0.706641  …  -0.0222014   0.987281   -0.0646814 
 -1.55206     0.446778    1.48206      -1.42384    -1.04209     0.0460478 
  0.928527    0.933087   -0.641975     -1.16347    -0.313851   -1.20434   
  0.380879   -0.144713    1.54374      -0.605637    0.408246    0.632131  
 -1.30233    -2.31664     1.51324       0.765034   -0.515553    0.984551  
  1.36747     1.34059    -0.114778  …   0.846682   -0.565511   -0.539113  
 -2.82496    -0.0447351   0.426242     -0.353497   -0.14583    -0.00304009
 -0.847741    1.49306     1.15522       0.637659    1.70818     0.641035  
 -0.22286    -0.43932    -0.373259      0.788337    0.223785   -0.343495  
  1.32145     0.104516   -0.993017     -0.272744   -0.133748    0.968627
y = X * β + randn(n)
50-element Array{Float64,1}:
  -9.90585 
  -5.40562 
   5.24289 
  -6.29951 
  -4.9586  
  -6.1342  
  -7.90981 
   2.51009 
  -5.79548 
   1.61355 
  -0.722766
  10.4522  
   4.03935 
   ⋮       
   0.397781
  -2.6661  
   5.36896 
  -3.56537 
  -2.402   
   0.11478 
  -5.39248 
   4.38391 
   0.706801
 -10.1066  
  -1.12558 
  14.2473
Now we estimate coefficients at fixed tuning parameter value using ADMM alogrithm.
ρ = 2.0 β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> x - mean(x))
β̂admm
100-element Array{Float64,1}:
  0.0     
  0.174344
  0.0     
 -0.421288
  0.0     
  0.0     
  0.0     
  0.0     
  0.324233
 -0.15384 
  0.0     
  0.0     
  0.0     
  ⋮       
 -0.397027
 -0.32079 
  0.0     
  0.0     
 -0.868508
 -0.992272
 -0.571755
  0.0     
 -1.16568 
  0.0     
  0.0     
  0.0
Now let's compare the estimated coefficients with those obtained using quadratic programming.
ρ = 2.0 beq = [0.0] Aeq = ones(1, p) β̂, = lsq_constrsparsereg(X, y, ρ; Aeq = Aeq, beq = beq)
hcat(β̂admm, β̂)
100×2 Array{Float64,2}:
  0.0        1.51451e-8 
  0.174344   0.178717   
  0.0       -1.02944e-9 
 -0.421288  -0.414043   
  0.0       -3.33221e-10
  0.0       -4.14188e-10
  0.0        3.28018e-11
  0.0        8.38037e-11
  0.324233   0.335375   
 -0.15384   -0.157908   
  0.0       -2.44739e-9 
  0.0       -8.79003e-10
  0.0       -8.17787e-10
  ⋮                     
 -0.397027  -0.391635   
 -0.32079   -0.33352    
  0.0       -5.51459e-9 
  0.0        1.09637e-9 
 -0.868508  -0.867639   
 -0.992272  -0.999583   
 -0.571755  -0.577743   
  0.0       -8.93601e-10
 -1.16568   -1.16862    
  0.0        2.29367e-9 
  0.0       -1.52035e-9 
  0.0        2.71896e-9
Non-negativity constraint
Here we look at the non-negativity constraint. For the non-negativity constraint
the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ is
Now let's generate X and y.
n, p = 50, 100 β = zeros(p) β[1:10] = 1:10 srand(41) X = randn(n, p)
50×100 Array{Float64,2}:
  1.21212    -0.153889    0.141533  …  -0.458125    0.0951976  -2.14019   
  0.345895    1.30676     1.60944      -0.409901    0.323719    0.989333  
 -1.27859    -1.18894     0.512064      1.80509     1.62606    -1.44251   
  0.230616    2.54741    -0.523533      2.73358     1.07999     0.432834  
 -1.17103    -0.39082     0.441921     -0.179239   -0.158189   -0.640611  
  1.67135     0.0829011   0.964089  …  -0.720038    1.99359    -0.671572  
 -0.614717    2.16204    -0.0602       -0.324456   -0.616887    1.11243   
 -0.810535    0.974719   -0.045405      0.881578    1.29611     0.696869  
 -1.10879    -1.32489    -1.18272       0.579381   -0.971269   -0.687591  
 -0.219752   -0.447897   -0.974186     -0.880804   -0.480702   -1.36887   
  0.0952544  -0.126203   -0.273737  …  -0.264421    0.565684   -0.798719  
  1.4126      0.295896   -0.213161     -1.46343    -1.27144    -0.0589753 
 -0.418407   -0.479389    0.324243      1.96976     0.867659   -1.2999    
  ⋮                                 ⋱                                     
  0.504861   -1.03911    -0.357771      0.815027    0.919037    1.07463   
 -0.820358   -0.955319    0.097768      0.553219    1.56424     0.10535   
  1.39684     1.93183     0.706641  …  -0.0222014   0.987281   -0.0646814 
 -1.55206     0.446778    1.48206      -1.42384    -1.04209     0.0460478 
  0.928527    0.933087   -0.641975     -1.16347    -0.313851   -1.20434   
  0.380879   -0.144713    1.54374      -0.605637    0.408246    0.632131  
 -1.30233    -2.31664     1.51324       0.765034   -0.515553    0.984551  
  1.36747     1.34059    -0.114778  …   0.846682   -0.565511   -0.539113  
 -2.82496    -0.0447351   0.426242     -0.353497   -0.14583    -0.00304009
 -0.847741    1.49306     1.15522       0.637659    1.70818     0.641035  
 -0.22286    -0.43932    -0.373259      0.788337    0.223785   -0.343495  
  1.32145     0.104516   -0.993017     -0.272744   -0.133748    0.968627
y = X * β + randn(n)
50-element Array{Float64,1}:
  12.6173  
  40.3776  
   2.2169  
  27.4631  
  38.592   
   7.82023 
  22.7367  
   7.88475 
  -7.47037 
   0.621035
  -4.91899 
 -14.9363  
   8.26901 
   ⋮       
   7.83882 
  -9.30699 
 -29.7205  
  15.2482  
 -19.1784  
  14.9865  
   2.32728 
  -9.11988 
 -15.3472  
  22.9679  
  -0.997964
  42.6068
This time, we feed a sequence of tuning parameter values into the lsq_constrsparsereg_admm function. 
ρ = linspace(1.0, 70.0, 10) β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> clamp.(x, 0, Inf))
β̂admm
100×10 Array{Float64,2}:
  0.653843    0.490395    0.277812    …  0.0      0.0      0.0      0.0    
  2.21572     2.07051     1.96482        1.41607  1.26061  1.10519  0.94974
  2.77145     2.52934     2.36727        1.77875  1.64644  1.51417  1.38186
  4.13883     3.90272     3.70409        2.99234  2.82268  2.65297  2.4833 
  4.84074     4.56198     4.29935        3.31819  3.06715  2.81613  2.5651 
  5.8787      5.69235     5.49622     …  4.46134  4.18054  3.89969  3.61888
  6.7637      6.55971     6.34682        5.68324  5.54216  5.40096  5.25987
  8.51065     8.03277     7.8351         6.87417  6.61797  6.36169  6.10549
  8.7451      8.33258     8.02948        6.94937  6.68847  6.42749  6.16659
  9.77876     9.76788     9.6326         9.12581  9.02118  8.91651  8.81187
  1.65481e-6  2.65614e-5  5.75666e-5  …  0.0      0.0      0.0      0.0    
  0.0         0.0         0.0            0.0      0.0      0.0      0.0    
  0.0         0.0         0.0            0.0      0.0      0.0      0.0    
  ⋮                                   ⋱                                    
  0.170999    1.06392e-5  0.0            0.0      0.0      0.0      0.0    
  0.0         0.0         0.0            0.0      0.0      0.0      0.0    
  0.0         0.0         0.0         …  0.0      0.0      0.0      0.0    
 -5.9382e-7   0.0         0.0            0.0      0.0      0.0      0.0    
  0.0         2.57367e-5  3.98409e-5     0.0      0.0      0.0      0.0    
  0.0         0.0         0.0            0.0      0.0      0.0      0.0    
  0.0         1.69146e-5  0.0            0.0      0.0      0.0      0.0    
  0.0         0.0         0.0         …  0.0      0.0      0.0      0.0    
  0.0         0.0         3.45786e-5     0.0      0.0      0.0      0.0    
  0.0141037   0.0         0.0            0.0      0.0      0.0      0.0    
  0.0         2.39758e-5  3.62464e-5     0.0      0.0      0.0      0.0    
  0.0503145   0.0553623   0.0339917      0.0      0.0      0.0      0.0
Again we compare with the estimates from quadratic programming.
bineq = zeros(p) Aineq = - eye(p) β̂, = lsq_constrsparsereg(X, y, ρ; Aineq = Aineq, bineq = bineq)
β̂
100×10 Array{Float64,2}:
  0.652004      0.489105      0.277863     …   2.00626e-9    8.58265e-9 
  2.2179        2.07061       1.9648           1.105         0.949582   
  2.77457       2.52826       2.36721          1.51391       1.38162    
  4.13788       3.9009        3.70426          2.65333       2.48359    
  4.84588       4.56136       4.29938          2.81596       2.56492    
  5.87904       5.69159       5.49619      …   3.89993       3.61905    
  6.75881       6.55723       6.34731          5.40177       5.26053    
  8.51202       8.03199       7.83528          6.36216       6.10583    
  8.74756       8.32936       8.02991          6.42804       6.16702    
  9.77692       9.76675       9.63278          8.91676       8.81211    
 -2.48584e-10  -2.61442e-10  -2.8615e-10   …  -9.64613e-11  -7.2806e-10 
 -1.30999e-10  -3.24647e-10  -3.6769e-10      -1.54375e-10  -8.53117e-10
 -1.68134e-10   3.14662e-10   6.39206e-10      6.38157e-10   4.18343e-9 
  ⋮                                        ⋱                            
  0.172384      5.34512e-9    2.48272e-9       1.24417e-9    7.18092e-9 
  8.77676e-9    2.44647e-9    2.84651e-9       2.01956e-9    1.09825e-8 
  6.75785e-9    2.44624e-10   4.79165e-10  …   2.79034e-10   1.6273e-9  
 -7.91733e-11   4.07951e-10   6.78723e-10      8.17503e-10   5.9523e-9  
 -1.49319e-10  -1.68014e-10  -6.97805e-11      2.31588e-10   1.52119e-9 
  4.53248e-10   5.12964e-10   1.57423e-9       3.93055e-9    3.06738e-8 
 -1.56445e-10   1.21443e-9    1.45145e-9       1.89669e-9    1.22688e-8 
  1.43742e-9    1.20569e-9    1.0833e-9    …   9.42889e-10   6.01223e-9 
  1.90626e-10  -1.41852e-10   3.12113e-11      1.8946e-10    1.17163e-9 
  0.0162573     9.37889e-10   8.29365e-10      7.05427e-10   4.91352e-9 
 -1.94284e-10   1.46358e-10   4.19363e-10      6.81323e-10   4.51758e-9 
  0.0522645     0.055839      0.0336149        3.45354e-9    1.57129e-8
As expected, estimated coefficient values are quite close.
Follow the link to access the .ipynb file of this page.