Lasso and Square-Root Lasso by QP and SOCP

Dr. Hua Zhou, Mar 15, 2015

This code snippet demonstrates solving lasso and square-root lasso via Quadratic Programming (QP) and Second Order Cone Programming (SOCP) using the Convex.jl package in Julia.

Display system information.

In [10]:
versioninfo()
Julia Version 0.3.6
Commit 0c24dca (2015-02-17 22:12 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Simulate a data set

In [11]:
using Gadfly

# random seed
srand(2015790)
# n = # obs, p = # predictors excluding intercept
n = 100; p = 10
# Design matrix
X = [ones(n, 1) randn(n, p)]
# True regression coefficients (first 5 are non-zero)
β = [1.0, randn(5, 1), zeros(p - 5, 1)]
# Responses
y = X * β + randn(n, 1)

# plot the true β
plot(x=1:p, y=β[2:end], Geom.point, Guide.title("True β"), 
    Guide.xlabel("j"), Guide.ylabel("βj"))
Out[11]:
j -12.5 -10.0 -7.5 -5.0 -2.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 -10 0 10 20 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 -5.0 -4.8 -4.6 -4.4 -4.2 -4.0 -3.8 -3.6 -3.4 -3.2 -3.0 -2.8 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0 6.2 6.4 6.6 6.8 7.0 -5 0 5 10 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 βj True β

Solution path of lasso

Load the Convex.jl package.

In [12]:
using Convex

# Use Gurobi solver
using Gurobi
solver = GurobiSolver(OutputFlag=0)
set_default_solver(solver)

## Use Mosek solver
#using Mosek
#solver = MosekSolver(LOG=1)
#set_default_solver(solver)

## Use SCS solver
#using SCS
#solver = SCSSolver(verbose=1)
#set_default_solver(solver)
Out[12]:
GurobiSolver({(:OutputFlag,0)})
Set up a grid of \(\lambda\) and solve the lasso problem

at each value of \(\lambda\).

In [13]:
λlist = 0:10:200
βpath = zeros(p + 1, length(λlist))
# βhat are optimization variables
βhat = Variable(size(X, 2))
@time for i = 1:length(λlist)
    λ = λlist[i]
    problem = minimize(0.5 * sum_squares(y - X * βhat) + λ * norm(βhat[2:end], 1))
    solve!(problem)
    βpath[:, i] = βhat.value
end
elapsed time: 0.297374913 seconds (16303756 bytes allocated)

Display solution path.

In [14]:
using DataFrames

lassopath = DataFrame(Predictor = rep(1:p, length(λlist)), 
    Coeff = reshape(βpath[2:end, :], (p * length(λlist))), 
    λ = rep(λlist, each=p))

plot(lassopath, x=:λ, y=:Coeff, color=:Predictor, Geom.line,
    Guide.xlabel("λ"), Guide.ylabel("βhat(λ)"), Guide.title("Lasso"),
    Scale.color_discrete_hue)
Out[14]:
λ -250 -200 -150 -100 -50 0 50 100 150 200 250 300 350 400 450 -200 -190 -180 -170 -160 -150 -140 -130 -120 -110 -100 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300 310 320 330 340 350 360 370 380 390 400 -200 0 200 400 -200 -180 -160 -140 -120 -100 -80 -60 -40 -20 0 20 40 60 80 100 120 140 160 180 200 220 240 260 280 300 320 340 360 380 400 1 2 3 4 5 6 7 8 9 10 Predictor -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 -7.0 -6.8 -6.6 -6.4 -6.2 -6.0 -5.8 -5.6 -5.4 -5.2 -5.0 -4.8 -4.6 -4.4 -4.2 -4.0 -3.8 -3.6 -3.4 -3.2 -3.0 -2.8 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0 6.2 6.4 6.6 6.8 7.0 7.2 7.4 7.6 7.8 8.0 -10 -5 0 5 10 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 βhat(λ) Lasso

Solution path of square-root lasso

Set up a grid of \(\lambda\) and solve the lasso problem

at each value of \(\lambda\).

In [15]:
λlist = 0:0.5:10
βpath = zeros(p + 1, length(λlist))
# βhat are optimization variables
βhat = Variable(size(X, 2))
@time for i = 1:length(λlist)
    λ = λlist[i]
    problem = minimize(norm(y - X * βhat) + λ * norm(βhat[2:end], 1))
    solve!(problem)
    βpath[:, i] = βhat.value
end
elapsed time: 0.169515327 seconds (13903876 bytes allocated)

Display solution path.

In [16]:
sqrtlassopath = DataFrame(Predictor = rep(1:p, length(λlist)), 
    Coeff = reshape(βpath[2:end, :], (p * length(λlist))), 
    λ = rep(λlist, each=p))
plot(sqrtlassopath, x=:λ, y=:Coeff, color=:Predictor, Geom.line,
    Guide.xlabel("λ"), Guide.ylabel("βhat(λ)"), Guide.title("Sqrt-Root Lasso"),
    Scale.color_discrete_hue)
Out[16]:
λ -15 -10 -5 0 5 10 15 20 25 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 -10 0 10 20 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 Predictor -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 -7.0 -6.8 -6.6 -6.4 -6.2 -6.0 -5.8 -5.6 -5.4 -5.2 -5.0 -4.8 -4.6 -4.4 -4.2 -4.0 -3.8 -3.6 -3.4 -3.2 -3.0 -2.8 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0 6.2 6.4 6.6 6.8 7.0 7.2 7.4 7.6 7.8 8.0 -10 -5 0 5 10 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 βhat(λ) Sqrt-Root Lasso
In [17]: