ADMM
In this section, we solve the same constrained lasso problem using the alternating direction method of multipliers (ADMM) algorithm. The ADMM algorithm is advantageous because it scales to larger problems and is not restricted to linear constraints. See Gaines and Zhou (2016) for details.
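Schematically, the problem being solved is

$$\text{minimize} \quad \frac{1}{2} \|\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}\|_2^2 + \rho \|\boldsymbol{\beta}\|_1 \quad \text{subject to} \quad \boldsymbol{\beta} \in \mathcal{C},$$

where $\mathcal{C}$ is the constraint set. One standard ADMM splitting (the exact updates used by the package may differ in detail; see Gaines and Zhou (2016)) introduces a copy $\boldsymbol{z}$ of $\boldsymbol{\beta}$ and alternates

$$
\begin{aligned}
\boldsymbol{\beta}^{(t+1)} &= \arg\min_{\boldsymbol{\beta}} \; \frac{1}{2} \|\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}\|_2^2 + \rho \|\boldsymbol{\beta}\|_1 + \frac{\tau}{2} \|\boldsymbol{\beta} - \boldsymbol{z}^{(t)} + \boldsymbol{u}^{(t)}\|_2^2, \\
\boldsymbol{z}^{(t+1)} &= P_{\mathcal{C}}\big(\boldsymbol{\beta}^{(t+1)} + \boldsymbol{u}^{(t)}\big), \\
\boldsymbol{u}^{(t+1)} &= \boldsymbol{u}^{(t)} + \boldsymbol{\beta}^{(t+1)} - \boldsymbol{z}^{(t+1)},
\end{aligned}
$$

where $P_{\mathcal{C}}$ is the projection onto $\mathcal{C}$, $\tau > 0$ is the ADMM step size, and $\boldsymbol{u}$ is the scaled dual variable. The $\boldsymbol{z}$-update is where the user-supplied projection enters.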
To use the algorithm, the user needs to supply a projection function onto the constraint set. Some simple examples are discussed below.
Sum-to-zero constraint
We demonstrate using a sum-to-zero constraint example, $\sum_{j=1}^p \beta_j = 0$. For this constraint, the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ would be

$$P(\boldsymbol{x}) = \boldsymbol{x} - \bar{x} \boldsymbol{1}, \quad \text{where } \bar{x} = \frac{1}{p} \sum_{i=1}^p x_i,$$

that is, centering $\boldsymbol{x}$ at its mean.
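As a quick sanity check (not part of the original example), we can confirm that this projection, the same `x -> x - mean(x)` passed to the solver below, maps any vector into the constraint set:

```julia
proj(x) = x - mean(x)   # projection onto {x : sum(x) = 0}

x = randn(5)
sum(proj(x))            # ≈ 0.0 up to floating-point roundoff
```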
First, let's define a true parameter `β` such that `sum(β) = 0`.
```julia
using ConstrainedLasso, Base.Test

n, p = 50, 100
β = zeros(p)
β[1:round(Int, p / 4)] = 0
β[(round(Int, p / 4) + 1):round(Int, p / 2)] = 1
β[(round(Int, p / 2) + 1):round(Int, 3p / 4)] = 0
β[(round(Int, 3p / 4) + 1):p] = -1
β
```
```
100-element Array{Float64,1}:
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  ⋮
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
```
Next, we generate data based on the true parameter `β`.
```julia
srand(41)
X = randn(n, p)
```
```
50×100 Array{Float64,2}:
  1.21212    -0.153889    0.141533   …  -0.458125    0.0951976  -2.14019
  0.345895    1.30676     1.60944       -0.409901    0.323719    0.989333
 -1.27859    -1.18894     0.512064       1.80509     1.62606    -1.44251
  0.230616    2.54741    -0.523533       2.73358     1.07999     0.432834
 -1.17103    -0.39082     0.441921      -0.179239   -0.158189   -0.640611
  1.67135     0.0829011   0.964089   …  -0.720038    1.99359    -0.671572
 -0.614717    2.16204    -0.0602        -0.324456   -0.616887    1.11243
 -0.810535    0.974719   -0.045405       0.881578    1.29611     0.696869
 -1.10879    -1.32489    -1.18272        0.579381   -0.971269   -0.687591
 -0.219752   -0.447897   -0.974186      -0.880804   -0.480702   -1.36887
  0.0952544  -0.126203   -0.273737   …  -0.264421    0.565684   -0.798719
  1.4126      0.295896   -0.213161      -1.46343    -1.27144    -0.0589753
 -0.418407   -0.479389    0.324243       1.96976     0.867659   -1.2999
  ⋮                                  ⋱
  0.504861   -1.03911    -0.357771       0.815027    0.919037    1.07463
 -0.820358   -0.955319    0.097768       0.553219    1.56424     0.10535
  1.39684     1.93183     0.706641   …  -0.0222014   0.987281   -0.0646814
 -1.55206     0.446778    1.48206       -1.42384    -1.04209     0.0460478
  0.928527    0.933087   -0.641975      -1.16347    -0.313851   -1.20434
  0.380879   -0.144713    1.54374       -0.605637    0.408246    0.632131
 -1.30233    -2.31664     1.51324        0.765034   -0.515553    0.984551
  1.36747     1.34059    -0.114778   …   0.846682   -0.565511   -0.539113
 -2.82496    -0.0447351   0.426242      -0.353497   -0.14583    -0.00304009
 -0.847741    1.49306     1.15522        0.637659    1.70818     0.641035
 -0.22286    -0.43932    -0.373259       0.788337    0.223785   -0.343495
  1.32145     0.104516   -0.993017      -0.272744   -0.133748    0.968627
```
```julia
y = X * β + randn(n)
```
```
50-element Array{Float64,1}:
  -9.90585
  -5.40562
   5.24289
  -6.29951
  -4.9586
  -6.1342
  -7.90981
   2.51009
  -5.79548
   1.61355
  -0.722766
  10.4522
   4.03935
   ⋮
   0.397781
  -2.6661
   5.36896
  -3.56537
  -2.402
   0.11478
  -5.39248
   4.38391
   0.706801
 -10.1066
  -1.12558
  14.2473
```
Now we estimate the coefficients at a fixed tuning parameter value using the ADMM algorithm.
```julia
ρ = 2.0
β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> x - mean(x))
```
```julia
β̂admm
```

```
100-element Array{Float64,1}:
  0.0
  0.174344
  0.0
 -0.421288
  0.0
  0.0
  0.0
  0.0
  0.324233
 -0.15384
  0.0
  0.0
  0.0
  ⋮
 -0.397027
 -0.32079
  0.0
  0.0
 -0.868508
 -0.992272
 -0.571755
  0.0
 -1.16568
  0.0
  0.0
  0.0
```
Now let's compare the estimated coefficients with those obtained using quadratic programming.
```julia
ρ = 2.0
beq = [0.0]
Aeq = ones(1, p)
β̂, = lsq_constrsparsereg(X, y, ρ; Aeq = Aeq, beq = beq)
```
```julia
hcat(β̂admm, β̂)
```

```
100×2 Array{Float64,2}:
  0.0        1.51451e-8
  0.174344   0.178717
  0.0       -1.02944e-9
 -0.421288  -0.414043
  0.0       -3.33221e-10
  0.0       -4.14188e-10
  0.0        3.28018e-11
  0.0        8.38037e-11
  0.324233   0.335375
 -0.15384   -0.157908
  0.0       -2.44739e-9
  0.0       -8.79003e-10
  0.0       -8.17787e-10
  ⋮
 -0.397027  -0.391635
 -0.32079   -0.33352
  0.0       -5.51459e-9
  0.0        1.09637e-9
 -0.868508  -0.867639
 -0.992272  -0.999583
 -0.571755  -0.577743
  0.0       -8.93601e-10
 -1.16568   -1.16862
  0.0        2.29367e-9
  0.0       -1.52035e-9
  0.0        2.71896e-9
```
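To quantify the agreement (an extra check, not in the original text), we can compute the largest entrywise discrepancy between the two solutions; based on the output above it should be on the order of 0.01 or smaller:

```julia
# largest absolute entrywise difference between the ADMM and QP estimates
maximum(abs.(β̂admm - β̂))
```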
Non-negativity constraint
Here we look at the non-negativity constraint, $\beta_j \geq 0$ for $j = 1, \ldots, p$. For this constraint, the appropriate projection function for $\boldsymbol{x} = (x_1, \cdots, x_p)^T$ is

$$P(\boldsymbol{x}) = (\max(x_1, 0), \cdots, \max(x_p, 0))^T,$$

which zeros out the negative entries.
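Again, purely as an illustration (this check is not in the original text), the projection passed to the solver below, `x -> clamp.(x, 0, Inf)`, implements exactly this elementwise maximum:

```julia
proj(x) = clamp.(x, 0, Inf)   # elementwise max(xᵢ, 0)

x = randn(5)
all(proj(x) .>= 0)            # true: every entry is non-negative
```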
Now let's generate `X` and `y`.
```julia
n, p = 50, 100
β = zeros(p)
β[1:10] = 1:10
srand(41)
X = randn(n, p)
```
```
50×100 Array{Float64,2}:
  1.21212    -0.153889    0.141533   …  -0.458125    0.0951976  -2.14019
  0.345895    1.30676     1.60944       -0.409901    0.323719    0.989333
 -1.27859    -1.18894     0.512064       1.80509     1.62606    -1.44251
  0.230616    2.54741    -0.523533       2.73358     1.07999     0.432834
 -1.17103    -0.39082     0.441921      -0.179239   -0.158189   -0.640611
  1.67135     0.0829011   0.964089   …  -0.720038    1.99359    -0.671572
 -0.614717    2.16204    -0.0602        -0.324456   -0.616887    1.11243
 -0.810535    0.974719   -0.045405       0.881578    1.29611     0.696869
 -1.10879    -1.32489    -1.18272        0.579381   -0.971269   -0.687591
 -0.219752   -0.447897   -0.974186      -0.880804   -0.480702   -1.36887
  0.0952544  -0.126203   -0.273737   …  -0.264421    0.565684   -0.798719
  1.4126      0.295896   -0.213161      -1.46343    -1.27144    -0.0589753
 -0.418407   -0.479389    0.324243       1.96976     0.867659   -1.2999
  ⋮                                  ⋱
  0.504861   -1.03911    -0.357771       0.815027    0.919037    1.07463
 -0.820358   -0.955319    0.097768       0.553219    1.56424     0.10535
  1.39684     1.93183     0.706641   …  -0.0222014   0.987281   -0.0646814
 -1.55206     0.446778    1.48206       -1.42384    -1.04209     0.0460478
  0.928527    0.933087   -0.641975      -1.16347    -0.313851   -1.20434
  0.380879   -0.144713    1.54374       -0.605637    0.408246    0.632131
 -1.30233    -2.31664     1.51324        0.765034   -0.515553    0.984551
  1.36747     1.34059    -0.114778   …   0.846682   -0.565511   -0.539113
 -2.82496    -0.0447351   0.426242      -0.353497   -0.14583    -0.00304009
 -0.847741    1.49306     1.15522        0.637659    1.70818     0.641035
 -0.22286    -0.43932    -0.373259       0.788337    0.223785   -0.343495
  1.32145     0.104516   -0.993017      -0.272744   -0.133748    0.968627
```
```julia
y = X * β + randn(n)
```
```
50-element Array{Float64,1}:
  12.6173
  40.3776
   2.2169
  27.4631
  38.592
   7.82023
  22.7367
   7.88475
  -7.47037
   0.621035
  -4.91899
 -14.9363
   8.26901
   ⋮
   7.83882
  -9.30699
 -29.7205
  15.2482
 -19.1784
  14.9865
   2.32728
  -9.11988
 -15.3472
  22.9679
  -0.997964
  42.6068
```
This time, we feed a sequence of tuning parameter values into the `lsq_constrsparsereg_admm` function.
```julia
ρ = linspace(1.0, 70.0, 10)
β̂admm = lsq_constrsparsereg_admm(X, y, ρ; proj = x -> clamp.(x, 0, Inf))
```
```julia
β̂admm
```

```
100×10 Array{Float64,2}:
 0.653843    0.490395    0.277812    …  0.0      0.0      0.0      0.0
 2.21572     2.07051     1.96482        1.41607  1.26061  1.10519  0.94974
 2.77145     2.52934     2.36727        1.77875  1.64644  1.51417  1.38186
 4.13883     3.90272     3.70409        2.99234  2.82268  2.65297  2.4833
 4.84074     4.56198     4.29935        3.31819  3.06715  2.81613  2.5651
 5.8787      5.69235     5.49622     …  4.46134  4.18054  3.89969  3.61888
 6.7637      6.55971     6.34682        5.68324  5.54216  5.40096  5.25987
 8.51065     8.03277     7.8351         6.87417  6.61797  6.36169  6.10549
 8.7451      8.33258     8.02948        6.94937  6.68847  6.42749  6.16659
 9.77876     9.76788     9.6326         9.12581  9.02118  8.91651  8.81187
 1.65481e-6  2.65614e-5  5.75666e-5  …  0.0      0.0      0.0      0.0
 0.0         0.0         0.0            0.0      0.0      0.0      0.0
 0.0         0.0         0.0            0.0      0.0      0.0      0.0
 ⋮                                   ⋱
 0.170999    1.06392e-5  0.0            0.0      0.0      0.0      0.0
 0.0         0.0         0.0            0.0      0.0      0.0      0.0
 0.0         0.0         0.0         …  0.0      0.0      0.0      0.0
 -5.9382e-7  0.0         0.0            0.0      0.0      0.0      0.0
 0.0         2.57367e-5  3.98409e-5     0.0      0.0      0.0      0.0
 0.0         0.0         0.0            0.0      0.0      0.0      0.0
 0.0         1.69146e-5  0.0            0.0      0.0      0.0      0.0
 0.0         0.0         0.0         …  0.0      0.0      0.0      0.0
 0.0         0.0         3.45786e-5     0.0      0.0      0.0      0.0
 0.0141037   0.0         0.0            0.0      0.0      0.0      0.0
 0.0         2.39758e-5  3.62464e-5     0.0      0.0      0.0      0.0
 0.0503145   0.0553623   0.0339917      0.0      0.0      0.0      0.0
```
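Each column of `β̂admm` is the estimate at one value of `ρ`, so the solution path can be visualized. Here is a minimal sketch, assuming the Plots.jl package is available (it is not loaded elsewhere on this page):

```julia
using Plots

# plot each coefficient's path against the ρ grid;
# β̂admm' is 10×100, so each of the 100 columns becomes one series
plot(collect(ρ), β̂admm', legend = false,
     xlabel = "rho", ylabel = "coefficient estimate")
```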
Again we compare with the estimates from quadratic programming.
```julia
bineq = zeros(p)
Aineq = -eye(p)
β̂, = lsq_constrsparsereg(X, y, ρ; Aineq = Aineq, bineq = bineq)
```
```julia
β̂
```

```
100×10 Array{Float64,2}:
  0.652004      0.489105      0.277863     …   2.00626e-9    8.58265e-9
  2.2179        2.07061       1.9648           1.105         0.949582
  2.77457       2.52826       2.36721          1.51391       1.38162
  4.13788       3.9009        3.70426          2.65333       2.48359
  4.84588       4.56136       4.29938          2.81596       2.56492
  5.87904       5.69159       5.49619      …   3.89993       3.61905
  6.75881       6.55723       6.34731          5.40177       5.26053
  8.51202       8.03199       7.83528          6.36216       6.10583
  8.74756       8.32936       8.02991          6.42804       6.16702
  9.77692       9.76675       9.63278          8.91676       8.81211
 -2.48584e-10  -2.61442e-10  -2.8615e-10   …  -9.64613e-11  -7.2806e-10
 -1.30999e-10  -3.24647e-10  -3.6769e-10      -1.54375e-10  -8.53117e-10
 -1.68134e-10   3.14662e-10   6.39206e-10      6.38157e-10   4.18343e-9
  ⋮                                        ⋱
  0.172384      5.34512e-9    2.48272e-9       1.24417e-9    7.18092e-9
  8.77676e-9    2.44647e-9    2.84651e-9       2.01956e-9    1.09825e-8
  6.75785e-9    2.44624e-10   4.79165e-10  …   2.79034e-10   1.6273e-9
 -7.91733e-11   4.07951e-10   6.78723e-10      8.17503e-10   5.9523e-9
 -1.49319e-10  -1.68014e-10  -6.97805e-11      2.31588e-10   1.52119e-9
  4.53248e-10   5.12964e-10   1.57423e-9       3.93055e-9    3.06738e-8
 -1.56445e-10   1.21443e-9    1.45145e-9       1.89669e-9    1.22688e-8
  1.43742e-9    1.20569e-9    1.0833e-9    …   9.42889e-10   6.01223e-9
  1.90626e-10  -1.41852e-10   3.12113e-11      1.8946e-10    1.17163e-9
  0.0162573     9.37889e-10   8.29365e-10      7.05427e-10   4.91352e-9
 -1.94284e-10   1.46358e-10   4.19363e-10      6.81323e-10   4.51758e-9
  0.0522645     0.055839      0.0336149        3.45354e-9    1.57129e-8
```
As expected, the estimated coefficient values are quite close; note that the ADMM solution is exactly sparse, while quadratic programming returns near-zero values on the order of 1e-9.
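Since `Base.Test` was loaded at the top of this page, the closeness can also be asserted programmatically; the tolerance here is illustrative, not prescribed by the package:

```julia
# Frobenius-norm relative difference between the ADMM and QP solution paths
@test vecnorm(β̂admm - β̂) / vecnorm(β̂) < 0.01
```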