Multinomial-logit regression and sparse regression
A demo of multinomial-logit regression and sparse regression
Contents

  Generate multinomial random vectors from covariates
  Fit multinomial logit regression
  Fit multinomial logit sparse regression - lasso/group/nuclear penalty

Generate multinomial random vectors from covariates
clear;
% reset random seed
s = RandStream('mt19937ar','Seed',1);
RandStream.setGlobalStream(s);
% sample size
n = 200;
% # covariates
p = 15;
% # bins
d = 5;
% design matrix
X = randn(n,p);
% true regression coefficients: predictors 1, 3, and 5 have effects
B = zeros(p,d-1);
nzidx = [1 3 5];
B(nzidx,:) = ones(length(nzidx),d-1);
% category probabilities under the baseline-category logit model
% (category d is the reference)
prob = zeros(n,d);
prob(:,d) = ones(n,1);
prob(:,1:d-1) = exp(X*B);
prob = bsxfun(@times, prob, 1./sum(prob,2));
% draw multinomial counts with random batch sizes
batchsize = 25+unidrnd(25,n,1);
Y = mnrnd(batchsize,prob);
% drop observations whose total count is zero
zerorows = sum(Y,2);
Y = Y(zerorows~=0, :);
X = X(zerorows~=0, :);
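As a quick sanity check (not part of the original demo), one can verify that each row of prob is a valid probability vector and that the observed category frequencies track the true probabilities:

% sanity check (illustrative, not in the original demo)
% rows of prob should sum to one up to floating-point error
assert(max(abs(sum(prob,2)-1)) < 1e-12);
% empirical category frequencies (top row) vs. average true
% probabilities (bottom row); they should be close
empirical = sum(Y,1)./sum(Y(:));
display([empirical; mean(prob,1)]);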
Fit multinomial logit regression
tic;
[B_hat, stats] = mnlogitreg(X,Y);
toc;
display(B_hat);
display(stats.se);
display(stats);
% Wald test of predictor significance
display('Wald test p-values:');
display(stats.wald_pvalue);
figure;
plot(stats.logL_iter);
xlabel('iteration');
ylabel('log-likelihood');
Elapsed time is 0.064823 seconds.
B_hat =
0.9062 0.9845 0.9823 1.0066
0.0491 0.0174 0.0223 -0.0064
0.9543 0.9696 0.9683 0.9576
-0.0121 -0.0053 0.0275 0.0129
0.9871 1.0022 0.9483 0.9692
0.0010 -0.0019 -0.0619 0.0244
-0.0288 -0.0399 -0.0092 -0.0297
-0.0591 0.0135 -0.0068 0.0009
-0.0469 -0.0452 -0.0606 -0.0716
0.0122 -0.0213 -0.0476 -0.0115
0.0758 -0.0163 0.0003 0.0828
-0.0039 -0.0010 -0.0128 -0.0097
-0.0780 -0.0469 -0.0415 -0.0474
0.0318 -0.0152 -0.0228 0.0117
-0.0009 -0.0230 -0.0645 -0.0423
stats.se =

0.0430 0.0434 0.0434 0.0436
0.0454 0.0456 0.0455 0.0455
0.0420 0.0421 0.0421 0.0423
0.0415 0.0416 0.0414 0.0417
0.0441 0.0442 0.0438 0.0441
0.0381 0.0380 0.0380 0.0380
0.0380 0.0382 0.0380 0.0382
0.0387 0.0385 0.0385 0.0388
0.0378 0.0379 0.0378 0.0380
0.0397 0.0400 0.0401 0.0399
0.0433 0.0433 0.0430 0.0432
0.0427 0.0427 0.0426 0.0427
0.0411 0.0412 0.0412 0.0413
0.0387 0.0387 0.0387 0.0388
0.0398 0.0398 0.0398 0.0400
stats =
struct with fields:
BIC: 3.5804e+03
AIC: 3.3825e+03
dof: 60
iterations: 5
logL: -1.6313e+03
logL_iter: [1×5 double]
prob: [200×5 double]
yhat: [200×5 double]
gradient: [60×1 double]
observed_information: [60×60 double]
H: [60×60 double]
se: [15×4 double]
wald_stat: [1×15 double]
wald_pvalue: [1×15 double]
Wald test p-values:
Columns 1 through 7
0 0.7611 0 0.8724 0 0.2120 0.8340
Columns 8 through 14
0.3342 0.3851 0.6591 0.0494 0.9974 0.4458 0.6268
Column 15
0.4288
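The reported criteria can be reproduced from logL and dof, and the Wald p-values recover the active predictors. A small check along these lines (not part of the original demo; it assumes the usual AIC/BIC definitions):

% consistency checks (illustrative, not in the original demo)
n_obs = size(X,1);
% AIC = -2*logL + 2*dof should match stats.AIC
display([-2*stats.logL + 2*stats.dof, stats.AIC]);
% BIC = -2*logL + log(n)*dof should match stats.BIC
display([-2*stats.logL + log(n_obs)*stats.dof, stats.BIC]);
% predictors flagged at the 5% level: the true signals 1, 3, and 5,
% plus predictor 11 as a borderline false positive (p = 0.0494)
display(find(stats.wald_pvalue < 0.05));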
Fit multinomial logit sparse regression - lasso/group/nuclear penalty
penalty = {'sweep','group','nuclear'};  % elementwise (lasso), group, and nuclear norm penalties
ngridpt = 10;
dist = 'mnlogit';
for i = 1:length(penalty)
    pen = penalty{i};
    % find the smallest lambda that zeros out all regression coefficients
    [~, stats] = mglm_sparsereg(X,Y,inf,'penalty',pen,'dist',dist);
    maxlambda = stats.maxlambda;
    % log-spaced grid from maxlambda down to 1/n
    lambdas = exp(linspace(log(maxlambda),-log(size(X,1)),ngridpt));
    BICs = zeros(1,ngridpt);
    LogLs = zeros(1,ngridpt);
    Dofs = zeros(1,ngridpt);
    tic;
    for j = 1:ngridpt
        % warm start from the solution at the previous grid point
        if j == 1
            B0 = zeros(p,d-1);
        else
            B0 = B_hat;
        end
        [B_hat, stats] = mglm_sparsereg(X,Y,lambdas(j),'penalty',pen, ...
            'dist',dist,'B0',B0);
        BICs(j) = stats.BIC;
        LogLs(j) = stats.logL;
        Dofs(j) = stats.dof;
    end
    toc;
    % true signal versus estimated signal at the BIC-optimal lambda
    [bestbic,bestidx] = min(BICs);
    B_best = mglm_sparsereg(X,Y,lambdas(bestidx),'penalty',pen,'dist',dist);
    % display scaled error of the regularized estimate
    display(norm(B_best-B,2)/sqrt(numel(B)));
    figure;
    subplot(1,3,1);
    semilogx(lambdas,BICs);
    ylabel('BIC');
    xlabel('\lambda');
    xlim([min(lambdas) max(lambdas)]);
    subplot(1,3,2);
    imshow(mat2gray(-B)); title('True B');
    subplot(1,3,3);
    imshow(mat2gray(-B_best)); title([pen ' estimate']);
end
Elapsed time is 0.158750 seconds.
0.0751   (sweep)
Elapsed time is 0.091532 seconds.
0.0638   (group)
Elapsed time is 0.231174 seconds.
0.1726   (nuclear)
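Fitted probabilities can be recovered from a selected estimate by mirroring the simulation step above. A minimal sketch (not part of the original demo), using the last B_best from the loop (the nuclear-penalty fit) and assuming mglm_sparsereg uses the same baseline-category parameterization that generated the data, with category d as the reference:

% fitted category probabilities from the selected sparse estimate
prob_hat = [exp(X*B_best) ones(size(X,1),1)];
prob_hat = bsxfun(@times, prob_hat, 1./sum(prob_hat,2));
% rough check against the true probabilities used in the simulation
% (no rows were dropped in this run, as batch sizes are at least 26,
% so prob and X still align)
display(norm(prob_hat - prob, 'fro')/sqrt(numel(prob)));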