ADMM
In this section, we solve the same constrained lasso problem using the alternating direction method of multipliers (ADMM) algorithm. ADMM is advantageous because it scales to larger problems and is not restricted to linear constraints. See Gaines and Zhou (2016) for details.
To use the algorithm, the user needs to supply a projection function onto the constraint set. Some simple examples are discussed below.
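Any such function can be passed to lsq_constrsparsereg_admm via the proj keyword, as shown below. As an added illustration (not part of the original example), projection onto a hypothetical box constraint $l \le \beta_j \le u$ is just elementwise clamping:

l, u = 0.0, 1.0                  # hypothetical box bounds, for illustration only
proj_box = x -> clamp.(x, l, u)  # projects each coordinate onto [l, u]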
Sum-to-zero constraint
We demonstrate using a sum-to-zero constraint example

$$\sum_{j=1}^p \beta_j = 0.$$

For this constraint, the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ would be

$$P(\boldsymbol{x}) = \boldsymbol{x} - \bar{x} \mathbf{1}_p, \quad \bar{x} = \frac{1}{p} \sum_{i=1}^p x_i,$$

which centers $\boldsymbol{x}$ at its mean.
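As a quick added sanity check, the projected vector indeed sums to zero:

x = randn(5)
sum(x - mean(x))   # ≈ 0, up to floating-point roundoff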
First, let's define a true parameter β such that sum(β) = 0.
using ConstrainedLasso, Base.Test
n, p = 50, 100
β = zeros(p)
β[1:round(Int, p / 4)] = 0
β[(round(Int, p / 4) + 1):round(Int, p / 2)] = 1
β[(round(Int, p / 2) + 1):round(Int, 3p / 4)] = 0
β[(round(Int, 3p / 4) + 1):p] = -1
β
100-element Array{Float64,1}:
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
⋮
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
-1.0
Next we generate data based on the true parameter β.
srand(41)
X = randn(n, p)
50×100 Array{Float64,2}:
1.21212 -0.153889 0.141533 … -0.458125 0.0951976 -2.14019
0.345895 1.30676 1.60944 -0.409901 0.323719 0.989333
-1.27859 -1.18894 0.512064 1.80509 1.62606 -1.44251
0.230616 2.54741 -0.523533 2.73358 1.07999 0.432834
-1.17103 -0.39082 0.441921 -0.179239 -0.158189 -0.640611
1.67135 0.0829011 0.964089 … -0.720038 1.99359 -0.671572
-0.614717 2.16204 -0.0602 -0.324456 -0.616887 1.11243
-0.810535 0.974719 -0.045405 0.881578 1.29611 0.696869
-1.10879 -1.32489 -1.18272 0.579381 -0.971269 -0.687591
-0.219752 -0.447897 -0.974186 -0.880804 -0.480702 -1.36887
0.0952544 -0.126203 -0.273737 … -0.264421 0.565684 -0.798719
1.4126 0.295896 -0.213161 -1.46343 -1.27144 -0.0589753
-0.418407 -0.479389 0.324243 1.96976 0.867659 -1.2999
⋮ ⋱
0.504861 -1.03911 -0.357771 0.815027 0.919037 1.07463
-0.820358 -0.955319 0.097768 0.553219 1.56424 0.10535
1.39684 1.93183 0.706641 … -0.0222014 0.987281 -0.0646814
-1.55206 0.446778 1.48206 -1.42384 -1.04209 0.0460478
0.928527 0.933087 -0.641975 -1.16347 -0.313851 -1.20434
0.380879 -0.144713 1.54374 -0.605637 0.408246 0.632131
-1.30233 -2.31664 1.51324 0.765034 -0.515553 0.984551
1.36747 1.34059 -0.114778 … 0.846682 -0.565511 -0.539113
-2.82496 -0.0447351 0.426242 -0.353497 -0.14583 -0.00304009
-0.847741 1.49306 1.15522 0.637659 1.70818 0.641035
-0.22286 -0.43932 -0.373259 0.788337 0.223785 -0.343495
1.32145 0.104516 -0.993017 -0.272744 -0.133748 0.968627
y = X * β + randn(n)
50-element Array{Float64,1}:
-9.90585
-5.40562
5.24289
-6.29951
-4.9586
-6.1342
-7.90981
2.51009
-5.79548
1.61355
-0.722766
10.4522
4.03935
⋮
0.397781
-2.6661
5.36896
-3.56537
-2.402
0.11478
-5.39248
4.38391
0.706801
-10.1066
-1.12558
14.2473
Now we estimate the coefficients at a fixed tuning parameter value using the ADMM algorithm.
ρ = 2.0
β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> x - mean(x))
β̂admm
100-element Array{Float64,1}:
0.0
0.174344
0.0
-0.421288
0.0
0.0
0.0
0.0
0.324233
-0.15384
0.0
0.0
0.0
⋮
-0.397027
-0.32079
0.0
0.0
-0.868508
-0.992272
-0.571755
0.0
-1.16568
0.0
0.0
0.0
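Because ADMM enforces the constraint only up to its convergence tolerance, it is worth checking (an added step, not in the original example) how close the estimate is to feasibility:

sum(β̂admm)   # should be approximately zero under the sum-to-zero constraint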
Now let's compare the estimated coefficients with those obtained using quadratic programming.
ρ = 2.0
beq = [0.0]
Aeq = ones(1, p)
β̂, = lsq_constrsparsereg(X, y, ρ; Aeq = Aeq, beq = beq)
hcat(β̂admm, β̂)
100×2 Array{Float64,2}:
0.0 1.51451e-8
0.174344 0.178717
0.0 -1.02944e-9
-0.421288 -0.414043
0.0 -3.33221e-10
0.0 -4.14188e-10
0.0 3.28018e-11
0.0 8.38037e-11
0.324233 0.335375
-0.15384 -0.157908
0.0 -2.44739e-9
0.0 -8.79003e-10
0.0 -8.17787e-10
⋮
-0.397027 -0.391635
-0.32079 -0.33352
0.0 -5.51459e-9
0.0 1.09637e-9
-0.868508 -0.867639
-0.992272 -0.999583
-0.571755 -0.577743
0.0 -8.93601e-10
-1.16568 -1.16862
0.0 2.29367e-9
0.0 -1.52035e-9
0.0 2.71896e-9
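To quantify the agreement (an added check, not in the original output), one can compute the largest elementwise discrepancy between the two solutions:

maximum(abs.(β̂admm - β̂))   # largest absolute difference between ADMM and QP estimates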
Non-negativity constraint
Here we look at the non-negativity constraint

$$\beta_j \geq 0, \quad j = 1, \ldots, p.$$

For this constraint, the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ is

$$P(\boldsymbol{x}) = (x_1^+, \cdots, x_p^+)^T, \quad x_i^+ = \max(x_i, 0),$$

which sets each negative component to zero.
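As before, a quick added check confirms that the projection lands in the constraint set:

x = randn(5)
all(clamp.(x, 0, Inf) .>= 0)   # true: every projected entry is non-negative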
Now let's generate X and y.
n, p = 50, 100
β = zeros(p)
β[1:10] = 1:10
srand(41)
X = randn(n, p)
50×100 Array{Float64,2}:
1.21212 -0.153889 0.141533 … -0.458125 0.0951976 -2.14019
0.345895 1.30676 1.60944 -0.409901 0.323719 0.989333
-1.27859 -1.18894 0.512064 1.80509 1.62606 -1.44251
0.230616 2.54741 -0.523533 2.73358 1.07999 0.432834
-1.17103 -0.39082 0.441921 -0.179239 -0.158189 -0.640611
1.67135 0.0829011 0.964089 … -0.720038 1.99359 -0.671572
-0.614717 2.16204 -0.0602 -0.324456 -0.616887 1.11243
-0.810535 0.974719 -0.045405 0.881578 1.29611 0.696869
-1.10879 -1.32489 -1.18272 0.579381 -0.971269 -0.687591
-0.219752 -0.447897 -0.974186 -0.880804 -0.480702 -1.36887
0.0952544 -0.126203 -0.273737 … -0.264421 0.565684 -0.798719
1.4126 0.295896 -0.213161 -1.46343 -1.27144 -0.0589753
-0.418407 -0.479389 0.324243 1.96976 0.867659 -1.2999
⋮ ⋱
0.504861 -1.03911 -0.357771 0.815027 0.919037 1.07463
-0.820358 -0.955319 0.097768 0.553219 1.56424 0.10535
1.39684 1.93183 0.706641 … -0.0222014 0.987281 -0.0646814
-1.55206 0.446778 1.48206 -1.42384 -1.04209 0.0460478
0.928527 0.933087 -0.641975 -1.16347 -0.313851 -1.20434
0.380879 -0.144713 1.54374 -0.605637 0.408246 0.632131
-1.30233 -2.31664 1.51324 0.765034 -0.515553 0.984551
1.36747 1.34059 -0.114778 … 0.846682 -0.565511 -0.539113
-2.82496 -0.0447351 0.426242 -0.353497 -0.14583 -0.00304009
-0.847741 1.49306 1.15522 0.637659 1.70818 0.641035
-0.22286 -0.43932 -0.373259 0.788337 0.223785 -0.343495
1.32145 0.104516 -0.993017 -0.272744 -0.133748 0.968627
y = X * β + randn(n)
50-element Array{Float64,1}:
12.6173
40.3776
2.2169
27.4631
38.592
7.82023
22.7367
7.88475
-7.47037
0.621035
-4.91899
-14.9363
8.26901
⋮
7.83882
-9.30699
-29.7205
15.2482
-19.1784
14.9865
2.32728
-9.11988
-15.3472
22.9679
-0.997964
42.6068
This time, we feed a sequence of tuning parameter values into the lsq_constrsparsereg_admm function.
ρ = linspace(1.0, 70.0, 10)
β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> clamp.(x, 0, Inf))
β̂admm
100×10 Array{Float64,2}:
0.653843 0.490395 0.277812 … 0.0 0.0 0.0 0.0
2.21572 2.07051 1.96482 1.41607 1.26061 1.10519 0.94974
2.77145 2.52934 2.36727 1.77875 1.64644 1.51417 1.38186
4.13883 3.90272 3.70409 2.99234 2.82268 2.65297 2.4833
4.84074 4.56198 4.29935 3.31819 3.06715 2.81613 2.5651
5.8787 5.69235 5.49622 … 4.46134 4.18054 3.89969 3.61888
6.7637 6.55971 6.34682 5.68324 5.54216 5.40096 5.25987
8.51065 8.03277 7.8351 6.87417 6.61797 6.36169 6.10549
8.7451 8.33258 8.02948 6.94937 6.68847 6.42749 6.16659
9.77876 9.76788 9.6326 9.12581 9.02118 8.91651 8.81187
1.65481e-6 2.65614e-5 5.75666e-5 … 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0
⋮ ⋱
0.170999 1.06392e-5 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 … 0.0 0.0 0.0 0.0
-5.9382e-7 0.0 0.0 0.0 0.0 0.0 0.0
0.0 2.57367e-5 3.98409e-5 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.69146e-5 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 … 0.0 0.0 0.0 0.0
0.0 0.0 3.45786e-5 0.0 0.0 0.0 0.0
0.0141037 0.0 0.0 0.0 0.0 0.0 0.0
0.0 2.39758e-5 3.62464e-5 0.0 0.0 0.0 0.0
0.0503145 0.0553623 0.0339917 0.0 0.0 0.0 0.0
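Although the original page does not plot the fits, the solution paths across the ρ grid could be visualized along these lines (a sketch assuming the Plots package is installed):

using Plots
# each row of β̂admm traces one coefficient across the 10 values of ρ
plot(collect(ρ), β̂admm', legend = false, xlabel = "rho", ylabel = "coefficient estimate")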
Again we compare with the estimates from quadratic programming.
bineq = zeros(p)
Aineq = -eye(p)
β̂, = lsq_constrsparsereg(X, y, ρ; Aineq = Aineq, bineq = bineq)
β̂
100×10 Array{Float64,2}:
0.652004 0.489105 0.277863 … 2.00626e-9 8.58265e-9
2.2179 2.07061 1.9648 1.105 0.949582
2.77457 2.52826 2.36721 1.51391 1.38162
4.13788 3.9009 3.70426 2.65333 2.48359
4.84588 4.56136 4.29938 2.81596 2.56492
5.87904 5.69159 5.49619 … 3.89993 3.61905
6.75881 6.55723 6.34731 5.40177 5.26053
8.51202 8.03199 7.83528 6.36216 6.10583
8.74756 8.32936 8.02991 6.42804 6.16702
9.77692 9.76675 9.63278 8.91676 8.81211
-2.48584e-10 -2.61442e-10 -2.8615e-10 … -9.64613e-11 -7.2806e-10
-1.30999e-10 -3.24647e-10 -3.6769e-10 -1.54375e-10 -8.53117e-10
-1.68134e-10 3.14662e-10 6.39206e-10 6.38157e-10 4.18343e-9
⋮ ⋱
0.172384 5.34512e-9 2.48272e-9 1.24417e-9 7.18092e-9
8.77676e-9 2.44647e-9 2.84651e-9 2.01956e-9 1.09825e-8
6.75785e-9 2.44624e-10 4.79165e-10 … 2.79034e-10 1.6273e-9
-7.91733e-11 4.07951e-10 6.78723e-10 8.17503e-10 5.9523e-9
-1.49319e-10 -1.68014e-10 -6.97805e-11 2.31588e-10 1.52119e-9
4.53248e-10 5.12964e-10 1.57423e-9 3.93055e-9 3.06738e-8
-1.56445e-10 1.21443e-9 1.45145e-9 1.89669e-9 1.22688e-8
1.43742e-9 1.20569e-9 1.0833e-9 … 9.42889e-10 6.01223e-9
1.90626e-10 -1.41852e-10 3.12113e-11 1.8946e-10 1.17163e-9
0.0162573 9.37889e-10 8.29365e-10 7.05427e-10 4.91352e-9
-1.94284e-10 1.46358e-10 4.19363e-10 6.81323e-10 4.51758e-9
0.0522645 0.055839 0.0336149 3.45354e-9 1.57129e-8
As expected, the estimated coefficient values from the two methods are quite close.