Biostat M280: Lasso and Square-Root Lasso by QP and SOCP¶

Dr. Hua Zhou, Mar 7, 2016¶

This code snippet demonstrates solving lasso and square-root lasso via Quadratic Programming (QP) and Second Order Cone Programming (SOCP) using the Convex.jl package in Julia.

Display system information.

versioninfo()

Julia Version 0.4.3

Simulate a data set¶

using Gadfly

# random seed
srand(2015790)
# n = # obs, p = # predictors excluding intercept
n = 100; p = 10
# Design matrix
X = [ones(n, 1) randn(n, p)]
# True regression coefficients (first 5 are non-zero)
β = [1.0; randn(5, 1); zeros(p - 5, 1)]
# Responses
y = X * β + randn(n, 1)

# plot the true β
plot(x=1:p, y=β[2:end], Geom.point, Guide.title("True β"), 
    Guide.xlabel("j"), Guide.ylabel("βj"))

Solution path of lasso¶

Load the Convex.jl package.

using Convex

## Use Gurobi solver
#using Gurobi
#solver = GurobiSolver(OutputFlag=0)
#set_default_solver(solver)

# Use Mosek solver
using Mosek
solver = MosekSolver(LOG=0)
set_default_solver(solver)

## Use SCS solver
#using SCS
#solver = SCSSolver(verbose=1)
#set_default_solver(solver)

Mosek.MosekMathProgSolverInterface.MosekSolver(Any[(:LOG,0)])

Set up a grid of $\lambda$ and solve the lasso problem \begin{eqnarray*} &\text{minimize}& \frac 12 |y - \beta_0 1 - X \beta|_2^2 + \lambda |\beta|_1 \end{eqnarray*} at each value of $\lambda$.

λlist = 0:10:200
βpath = zeros(p + 1, length(λlist))
# βhat are optimization variables
βhat = Variable(size(X, 2))
@time for i = 1:length(λlist)
    λ = λlist[i]
    problem = minimize(0.5 * sumsquares(y - X * βhat) + λ * norm(βhat[2:end], 1))
    solve!(problem)
    βpath[:, i] = βhat.value
end

  0.242228 seconds (69.66 k allocations: 12.229 MB)

Display solution path.

using DataFrames

lassopath = DataFrame(Predictor = rep(1:p, length(λlist)), 
    Coeff = reshape(βpath[2:end, :], (p * length(λlist))), 
    λ = rep(λlist, each=p))

plot(lassopath, x=:λ, y=:Coeff, color=:Predictor, Geom.line,
    Guide.xlabel("λ"), Guide.ylabel("βhat(λ)"), Guide.title("Lasso"),
    Scale.color_discrete_hue)

Solution path of square-root lasso¶

Set up a grid of $\lambda$ and solve the lasso problem \begin{eqnarray*} &\text{minimize}& |y - \beta_0 1 - X \beta|_2 + \lambda |\beta|_1 \end{eqnarray*} at each value of $\lambda$.

λlist = 0:0.5:10
βpath = zeros(p + 1, length(λlist))
# βhat are optimization variables
βhat = Variable(size(X, 2))
@time for i = 1:length(λlist)
    λ = λlist[i]
    problem = minimize(norm(y - X * βhat) + λ * norm(βhat[2:end], 1))
    solve!(problem)
    βpath[:, i] = βhat.value
end

  0.252352 seconds (136.19 k allocations: 13.315 MB, 2.89% gc time)

Display solution path.

sqrtlassopath = DataFrame(Predictor = rep(1:p, length(λlist)), 
    Coeff = reshape(βpath[2:end, :], (p * length(λlist))), 
    λ = rep(λlist, each=p))
plot(sqrtlassopath, x=:λ, y=:Coeff, color=:Predictor, Geom.line,
    Guide.xlabel("λ"), Guide.ylabel("βhat(λ)"), Guide.title("Sqrt-Root Lasso"),
    Scale.color_discrete_hue)