In this document, we provide a short tutorial on how to use the \(\texttt{MSRL}\) R package. If you encounter any errors or strange behavior, please report the issue at https://github.com/ajmolstad/MSRL. We start by downloading this package from GitHub.
install.packages("devtools")
library(devtools)
devtools::install_github("ajmolstad/MSRL")
library(MSRL)
The multivariate square-root lasso estimator is defined as \[\begin{equation}\label{eq:MSRL} \arg\min_{\beta \in \mathbb{R}^{p \times q}} \left\{ \frac{1}{\sqrt{n}}\|\mathbf{Y} - \mathbf{X}\beta\|_* + \lambda \sum_{j,k}w_{j,k}|\beta_{j,k}|\right\}, \end{equation}\] where \(\|\cdot\|_*\) is the nuclear norm (the sum of the singular values of its matrix argument) and the \(w_{j,k}\) are weights to be defined by the user. Note that in this definition, \(\mathbf{Y}\) and \(\mathbf{X}\) have been centered by their columnwise means (to remove the intercept); the \(Y\) and \(X\) input to the function should not be centered, as this centering is handled internally. The tuning parameter \(\lambda > 0\) is also user-specified: we provide functions below to help the user determine an appropriate value for \(\lambda\).
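To make the criterion concrete, the objective in \eqref{eq:MSRL} can be evaluated directly in base R. This is an illustrative sketch only (the name \(\texttt{msrl.objective}\) is ours, not part of the package); MSRL minimizes this criterion with specialized algorithms rather than by direct evaluation.

```r
# Evaluate the multivariate square-root lasso objective at a given
# coefficient matrix, with unit weights by default.
msrl.objective <- function(Y, X, beta, lambda, W = NULL) {
  n <- nrow(Y)
  if (is.null(W)) W <- matrix(1, nrow(beta), ncol(beta))  # unit weights
  Yc <- scale(Y, center = TRUE, scale = FALSE)   # center responses
  Xc <- scale(X, center = TRUE, scale = FALSE)   # center predictors
  resid <- Yc - Xc %*% beta
  nuc <- sum(svd(resid)$d)                       # nuclear norm of residuals
  nuc / sqrt(n) + lambda * sum(W * abs(beta))
}
```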
To demonstrate how to use the package, we generate some data from Model 1 of the article “Insights and algorithms for the multivariate square-root lasso”.
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Catalina 10.15
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Matrix_1.2-18
##
## loaded via a namespace (and not attached):
## [1] compiler_3.6.1 magrittr_1.5 tools_3.6.1 htmltools_0.4.0
## [5] yaml_2.2.1 Rcpp_1.0.4 stringi_1.4.6 rmarkdown_2.1
## [9] grid_3.6.1 knitr_1.28 stringr_1.4.0 xfun_0.12
## [13] digest_0.6.25 rlang_0.4.5 lattice_0.20-40 evaluate_0.14
# --------------------------------
# Preliminaries
# --------------------------------
set.seed(1)
p <- 100
n <- 70
q <- 20
ntest <- 100
# --------------------------------
# Generate predictors
# --------------------------------
SigmaX <- matrix(0, nrow=p, ncol=p)
for (k in 1:p) {
  for (j in 1:p) {
    SigmaX[j,k] <- .5^abs(j-k)
  }
}
diag(SigmaX) <- 1
eo <- eigen(SigmaX)
SigmaXsqrt <- eo$vec%*%diag(eo$val^.5)%*%t(eo$vec)
X <- matrix(rnorm(n*p), nrow=n)%*%SigmaXsqrt
Xtest <- matrix(rnorm(ntest*p), nrow=ntest)%*%SigmaXsqrt
# -----------------------------------
# Generate regression coefficients
# -----------------------------------
beta <- matrix(0, nrow=p, ncol=q)
beta[sample(1:(p*q), floor(p*q*.04), replace=TRUE)] <- 1
# -------------------------------------
# Generate responses
# -------------------------------------
Sigma <- matrix(0, nrow=q, ncol=q)
for (k in 1:q) {
  for (j in 1:q) {
    Sigma[j,k] <- .9^abs(j-k)
  }
}
diag(Sigma) <- 1
eo <- eigen(Sigma)
Sigmasqrt <- eo$vec%*%diag(eo$val^.5)%*%t(eo$vec)
Y <- X%*%beta + matrix(rnorm(n*q), nrow=n)%*%Sigmasqrt
Ytest <- Xtest%*%beta + matrix(rnorm(ntest*q), nrow=ntest)%*%Sigmasqrt
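The eigendecomposition-based matrix square root used above can be sanity-checked: by construction it satisfies \(\texttt{SigmaXsqrt \%*\% SigmaXsqrt = SigmaX}\) up to numerical error. A quick check on a small AR(1) matrix (and note that the \(\texttt{outer}\) idiom below could replace the double loops above):

```r
# Verify the eigendecomposition square root on a toy AR(1) covariance.
S <- 0.5^abs(outer(1:4, 1:4, "-"))           # small analogue of SigmaX
eo <- eigen(S)
Ssqrt <- eo$vec %*% diag(eo$val^.5) %*% t(eo$vec)
max(abs(Ssqrt %*% Ssqrt - S))                # numerically zero
```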
First, we show how to fit the model without cross-validation by setting the argument \(\texttt{nfolds = NULL}\). To track progress, we set \(\texttt{quiet = FALSE}\). Since \(n > q\), we leave \(\texttt{ADMM = FALSE}\) – the function will switch to ADMM automatically if the residual matrix has fewer than \(q\) nonzero singular values.
mod1 <- MSRL.cv(X = X, Y = Y, nlambda = 10, standardize = FALSE, ADMM = FALSE,
nfolds = NULL, weighted = FALSE, delta = .25, tol = 1e-8, quiet = FALSE,
inner.quiet = TRUE)
## 1 : non-zero = 0
## # ------------------------------
## 2 : non-zero = 3
## # ------------------------------
## 3 : non-zero = 12
## # ------------------------------
## 4 : non-zero = 30
## # ------------------------------
## 5 : non-zero = 57
## # ------------------------------
## 6 : non-zero = 77
## # ------------------------------
## 7 : non-zero = 102
## # ------------------------------
## 8 : non-zero = 128
## # ------------------------------
## 9 : non-zero = 180
## # ------------------------------
## 10 : non-zero = 269
## # ------------------------------
If we set \(\texttt{inner.quiet = FALSE}\) and \(\texttt{ADMM = FALSE}\), the function also prints the quantities \(r_1\) and \(r_2\) defined below after every 10 iterations. These measure how nearly the first-order optimality conditions are satisfied: \[ r_1 = \frac{\sum_{j,k} 1\left(\hat{\beta}_{j,k} = 0 \cap |[X'UV']_{j,k}| \geq \sqrt{n} w_{j,k}\lambda\right) }{\sum_{j,k} 1(\hat{\beta}_{j,k} = 0)},\] and \[ r_2 = \max_{(j,k):\hat{\beta}_{j,k} \neq 0} \left|\frac{1}{\sqrt{n}}[X'UV']_{j,k} - w_{j,k}\lambda\, {\rm sign}(\hat{\beta}_{j,k})\right|,\] where \((U,D,V) = {\rm svd}(Y - X\hat{\beta})\). This argument may be useful if convergence is slow with the default settings. Here, we decrease the convergence tolerance to display (slightly) more output for the sake of example.
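These diagnostics can also be computed directly from any candidate solution. A base-R sketch with unit weights (the name \(\texttt{msrl.check}\) is ours; it takes the already-centered \(Y\) and \(X\) of the definition):

```r
# Compute the optimality diagnostics r1 and r2 for a candidate solution.
msrl.check <- function(Y, X, beta.hat, lambda, W = NULL) {
  n <- nrow(Y)
  if (is.null(W)) W <- matrix(1, nrow(beta.hat), ncol(beta.hat))
  sv <- svd(Y - X %*% beta.hat)
  G <- crossprod(X, sv$u %*% t(sv$v))    # X'UV'
  zero <- (beta.hat == 0)
  r1 <- sum(zero & (abs(G) >= sqrt(n) * W * lambda)) / sum(zero)
  r2 <- if (any(!zero)) {
    max(abs(G[!zero] / sqrt(n) - W[!zero] * lambda * sign(beta.hat[!zero])))
  } else 0
  c(r1 = r1, r2 = r2)
}
```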
mod2 <- MSRL.cv(X = X, Y = Y, nlambda = 10, standardize = FALSE, ADMM = FALSE,
nfolds = NULL, weighted = FALSE, delta = .1, tol = 1e-16, quiet = FALSE,
inner.quiet = FALSE)
## 1 : non-zero = 0
## # ------------------------------
## 2 : non-zero = 8
## # ------------------------------
## 3 : non-zero = 34
## # ------------------------------
## 50 : r1 = 0 r2 = 6.732827e-08
## 4 : non-zero = 77
## # ------------------------------
## 50 : r1 = 0 r2 = 1.064281e-06
## 5 : non-zero = 114
## # ------------------------------
## 50 : r1 = 0 r2 = 1.56599e-06
## 6 : non-zero = 209
## # ------------------------------
## 50 : r1 = 0 r2 = 0.0004127915
## 100 : r1 = 0 r2 = 8.464972e-07
## 7 : non-zero = 370
## # ------------------------------
## 50 : r1 = 0.001429593 r2 = 0.02073136
## 100 : r1 = 0 r2 = 0.01230124
## 150 : r1 = 0 r2 = 0.009410648
## 200 : r1 = 0.001420455 r2 = 0.01254607
## r1 = 798.4852 ; s1 = 0.006017927
## r1 = 0.3863704 ; s1 = 0.009022319
## r1 = 0.000152167 ; s1 = 2.429012e-05
## r1 = 1.657454e-06 ; s1 = 1.614007e-07
## r1 = 2.093852e-08 ; s1 = 1.583912e-08
## r1 = 4.729788e-10 ; s1 = 1.164857e-09
## r1 = 8.881343e-11 ; s1 = 1.970503e-10
## r1 = 9.224022e-12 ; s1 = 3.277956e-11
## r1 = 2.100555e-12 ; s1 = 6.184429e-12
## r1 = 3.737979e-13 ; s1 = 1.171012e-12
## r1 = 7.142011e-14 ; s1 = 2.222106e-13
## r1 = 1.367548e-14 ; s1 = 4.250241e-14
## r1 = 2.608089e-15 ; s1 = 8.138589e-15
## r1 = 5.000902e-16 ; s1 = 1.561459e-15
## r1 = 9.588794e-17 ; s1 = 2.998844e-16
## 8 : non-zero = 580
## # ------------------------------
## r1 = 54.1834 ; s1 = 0.0001303355
## r1 = 0.09534517 ; s1 = 0.001160648
## r1 = 0.0001121315 ; s1 = 8.57493e-05
## r1 = 2.930627e-06 ; s1 = 5.499484e-06
## r1 = 1.895234e-07 ; s1 = 8.139445e-07
## r1 = 5.360256e-08 ; s1 = 2.208821e-07
## r1 = 1.461634e-08 ; s1 = 5.825869e-08
## r1 = 4.09876e-09 ; s1 = 1.729134e-08
## r1 = 1.340517e-09 ; s1 = 5.664181e-09
## r1 = 4.697501e-10 ; s1 = 2.000564e-09
## r1 = 1.766325e-10 ; s1 = 7.560415e-10
## r1 = 7.068666e-11 ; s1 = 3.043555e-10
## r1 = 2.991455e-11 ; s1 = 1.298209e-10
## r1 = 1.329718e-11 ; s1 = 5.827964e-11
## r1 = 6.165632e-12 ; s1 = 2.732636e-11
## r1 = 2.962304e-12 ; s1 = 1.327982e-11
## r1 = 1.465563e-12 ; s1 = 6.641444e-12
## r1 = 7.424567e-13 ; s1 = 3.397244e-12
## r1 = 3.832886e-13 ; s1 = 1.768419e-12
## r1 = 2.008192e-13 ; s1 = 9.329927e-13
## r1 = 1.064314e-13 ; s1 = 4.973087e-13
## r1 = 5.690742e-14 ; s1 = 2.671539e-13
## r1 = 3.063376e-14 ; s1 = 1.443664e-13
## r1 = 1.657541e-14 ; s1 = 7.836367e-14
## r1 = 9.003751e-15 ; s1 = 4.268082e-14
## r1 = 4.905333e-15 ; s1 = 2.330558e-14
## r1 = 2.678481e-15 ; s1 = 1.275035e-14
## r1 = 1.465036e-15 ; s1 = 6.985751e-15
## r1 = 8.023612e-16 ; s1 = 3.831565e-15
## r1 = 4.398649e-16 ; s1 = 2.103257e-15
## r1 = 2.413207e-16 ; s1 = 1.155244e-15
## r1 = 1.324706e-16 ; s1 = 6.348195e-16
## r1 = 7.275044e-17 ; s1 = 3.489566e-16
## r1 = 3.996677e-17 ; s1 = 1.918668e-16
## r1 = 2.196219e-17 ; s1 = 1.055129e-16
## 9 : non-zero = 801
## # ------------------------------
## r1 = 70.72923 ; s1 = 9.598519e-05
## r1 = 0.1571464 ; s1 = 0.001163222
## r1 = 0.0006824377 ; s1 = 0.0003651847
## r1 = 2.741636e-05 ; s1 = 2.19857e-05
## r1 = 3.145861e-06 ; s1 = 4.814763e-06
## r1 = 1.092078e-06 ; s1 = 1.879424e-06
## r1 = 3.409684e-07 ; s1 = 8.108539e-07
## r1 = 2.090087e-07 ; s1 = 4.1953e-07
## r1 = 1.086547e-07 ; s1 = 2.31803e-07
## r1 = 4.99698e-08 ; s1 = 1.319138e-07
## r1 = 3.039705e-08 ; s1 = 8.317053e-08
## r1 = 1.926443e-08 ; s1 = 5.478864e-08
## r1 = 1.299987e-08 ; s1 = 3.730606e-08
## r1 = 8.72908e-09 ; s1 = 2.582624e-08
## r1 = 5.983543e-09 ; s1 = 1.817681e-08
## r1 = 4.171513e-09 ; s1 = 1.284208e-08
## r1 = 2.936489e-09 ; s1 = 9.175142e-09
## r1 = 2.086983e-09 ; s1 = 6.610864e-09
## r1 = 1.494216e-09 ; s1 = 4.795023e-09
## r1 = 1.076175e-09 ; s1 = 3.497142e-09
## r1 = 2.47672e-09 ; s1 = 3.068803e-09
## r1 = 7.13567e-10 ; s1 = 1.896803e-09
## r1 = 3.945219e-10 ; s1 = 1.217341e-09
## r1 = 2.616056e-10 ; s1 = 8.300781e-10
## r1 = 1.945272e-10 ; s1 = 5.698525e-10
## r1 = 1.256188e-10 ; s1 = 3.830034e-10
## r1 = 8.339627e-11 ; s1 = 2.632835e-10
## r1 = 5.614004e-11 ; s1 = 1.807131e-10
## r1 = 3.873262e-11 ; s1 = 1.243441e-10
## r1 = 2.68873e-11 ; s1 = 8.732318e-11
## r1 = 1.882611e-11 ; s1 = 6.166207e-11
## r1 = 1.327254e-11 ; s1 = 4.375042e-11
## r1 = 9.403153e-12 ; s1 = 3.115533e-11
## r1 = 6.689089e-12 ; s1 = 2.225156e-11
## r1 = 4.774304e-12 ; s1 = 1.59319e-11
## r1 = 3.417202e-12 ; s1 = 1.143125e-11
## r1 = 2.451715e-12 ; s1 = 8.216925e-12
## r1 = 1.762633e-12 ; s1 = 5.915707e-12
## r1 = 1.26948e-12 ; s1 = 4.264776e-12
## r1 = 9.157177e-13 ; s1 = 3.078243e-12
## r1 = 6.614318e-13 ; s1 = 2.224144e-12
## r1 = 4.783271e-13 ; s1 = 1.608505e-12
## r1 = 3.462748e-13 ; s1 = 1.164222e-12
## r1 = 2.509119e-13 ; s1 = 8.43265e-13
## r1 = 1.819626e-13 ; s1 = 6.11187e-13
## r1 = 1.320581e-13 ; s1 = 4.432384e-13
## r1 = 9.590425e-14 ; s1 = 3.216101e-13
## r1 = 6.969025e-14 ; s1 = 2.334696e-13
## r1 = 5.066909e-14 ; s1 = 1.695591e-13
## r1 = 3.685782e-14 ; s1 = 1.231931e-13
## r1 = 2.682336e-14 ; s1 = 8.953925e-14
## r1 = 1.952887e-14 ; s1 = 6.510138e-14
## r1 = 1.422353e-14 ; s1 = 4.73485e-14
## r1 = 1.036314e-14 ; s1 = 3.444717e-14
## r1 = 7.552968e-15 ; s1 = 2.506826e-14
## r1 = 5.506508e-15 ; s1 = 1.824781e-14
## r1 = 4.015675e-15 ; s1 = 1.328641e-14
## r1 = 2.929249e-15 ; s1 = 9.676306e-15
## r1 = 2.137286e-15 ; s1 = 7.048741e-15
## r1 = 1.559807e-15 ; s1 = 5.135821e-15
## r1 = 1.138611e-15 ; s1 = 3.742826e-15
## r1 = 8.31325e-16 ; s1 = 2.728215e-15
## r1 = 6.070889e-16 ; s1 = 1.989034e-15
## r1 = 4.434201e-16 ; s1 = 1.4504e-15
## r1 = 3.239335e-16 ; s1 = 1.057822e-15
## r1 = 2.366847e-16 ; s1 = 7.716397e-16
## r1 = 1.729637e-16 ; s1 = 5.629768e-16
## r1 = 1.264173e-16 ; s1 = 4.108087e-16
## r1 = 9.241064e-17 ; s1 = 2.99817e-16
## r1 = 6.756142e-17 ; s1 = 2.188483e-16
## r1 = 4.940081e-17 ; s1 = 1.597711e-16
## r1 = 3.612644e-17 ; s1 = 1.166568e-16
## 10 : non-zero = 1001
## # ------------------------------
Checking the output, we have:
str(mod2)
## List of 13
## $ beta :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. ..@ i : int [1:3194] 508 540 660 746 786 839 1256 1654 0 229 ...
## .. ..@ p : int [1:11] 0 0 8 42 119 233 442 812 1392 2193 ...
## .. ..@ Dim : int [1:2] 2000 10
## .. ..@ Dimnames:List of 2
## .. .. ..$ : NULL
## .. .. ..$ : NULL
## .. ..@ x : num [1:3194] 0.2392 0.0802 0.5712 0.033 0.1546 ...
## .. ..@ factors : list()
## $ sparsity.mat: num [1:10] 0 8 34 77 114 ...
## $ err.pred : NULL
## $ err.wpred : NULL
## $ err.spec : NULL
## $ err.nuc : NULL
## $ Y.mean : num [1:20] 0.333 0.009 0.04 0.186 -0.222 ...
## $ X.mean : num [1:100] 0.1468 0.0199 0.0464 -0.027 0.0928 ...
## $ Y.sd : num [1:20] 2.5 3.29 2.72 2.8 1.7 ...
## $ X.sd : num [1:100] 0.938 0.831 1.099 1.012 0.951 ...
## $ lambda.vec : num [1:10] 0.803 0.622 0.482 0.373 0.289 ...
## $ lam.min : NULL
## $ standardize : logi FALSE
## - attr(*, "class")= chr "MSRL"
Note that \(\beta\) for all candidate tuning parameters is stored as a sparse matrix – to extract coefficients, use the \(\texttt{MSRL.coef}\) function detailed below.
Now we set \(\texttt{nfolds = 5}\) to perform 5-fold cross-validation. Note that this fits the full solution path \(\texttt{nfolds} + 1\) times, since the model is first fit to the entire dataset.
mod3 <- MSRL.cv(X = X, Y = Y, nlambda = 50, standardize = FALSE, ADMM = FALSE,
nfolds = 5, weighted = FALSE, delta = .1, tol = 1e-8, quiet = FALSE,
inner.quiet = TRUE)
## 1 : non-zero = 0
## # ------------------------------
## 2 : non-zero = 1
## # ------------------------------
## 3 : non-zero = 1
## # ------------------------------
## 4 : non-zero = 2
## # ------------------------------
## 5 : non-zero = 5
## # ------------------------------
## 6 : non-zero = 7
## # ------------------------------
## 7 : non-zero = 9
## # ------------------------------
## 8 : non-zero = 14
## # ------------------------------
## 9 : non-zero = 21
## # ------------------------------
## 10 : non-zero = 24
## # ------------------------------
## 11 : non-zero = 31
## # ------------------------------
## 12 : non-zero = 35
## # ------------------------------
## 13 : non-zero = 46
## # ------------------------------
## 14 : non-zero = 57
## # ------------------------------
## 15 : non-zero = 60
## # ------------------------------
## 16 : non-zero = 71
## # ------------------------------
## 17 : non-zero = 75
## # ------------------------------
## 18 : non-zero = 86
## # ------------------------------
## 19 : non-zero = 92
## # ------------------------------
## 20 : non-zero = 101
## # ------------------------------
## 21 : non-zero = 103
## # ------------------------------
## 22 : non-zero = 108
## # ------------------------------
## 23 : non-zero = 117
## # ------------------------------
## 24 : non-zero = 130
## # ------------------------------
## 25 : non-zero = 144
## # ------------------------------
## 26 : non-zero = 158
## # ------------------------------
## 27 : non-zero = 176
## # ------------------------------
## 28 : non-zero = 203
## # ------------------------------
## 29 : non-zero = 228
## # ------------------------------
## 30 : non-zero = 254
## # ------------------------------
## 31 : non-zero = 285
## # ------------------------------
## 32 : non-zero = 307
## # ------------------------------
## 33 : non-zero = 348
## # ------------------------------
## 34 : non-zero = 386
## # ------------------------------
## 35 : non-zero = 419
## # ------------------------------
## 36 : non-zero = 457
## # ------------------------------
## 37 : non-zero = 490
## # ------------------------------
## 38 : non-zero = 539
## # ------------------------------
## 39 : non-zero = 578
## # ------------------------------
## 40 : non-zero = 622
## # ------------------------------
## 41 : non-zero = 659
## # ------------------------------
## 42 : non-zero = 697
## # ------------------------------
## 43 : non-zero = 738
## # ------------------------------
## 44 : non-zero = 769
## # ------------------------------
## 45 : non-zero = 819
## # ------------------------------
## 46 : non-zero = 862
## # ------------------------------
## 47 : non-zero = 893
## # ------------------------------
## 48 : non-zero = 912
## # ------------------------------
## 49 : non-zero = 949
## # ------------------------------
## 50 : non-zero = 1001
## # ------------------------------
## # ------------------------------
## Through CV fold 1
## # ------------------------------
## # ------------------------------
## Through CV fold 2
## # ------------------------------
## # ------------------------------
## Through CV fold 3
## # ------------------------------
## # ------------------------------
## Through CV fold 4
## # ------------------------------
## # ------------------------------
## Through CV fold 5
## # ------------------------------
In the output, there are four different metrics used to measure performance in cross-validation. Let \(Y_{-k}\) and \(X_{-k}\) be the left-out-fold responses and predictors, centered (and standardized, if applicable) based on the left-in-fold data, and let \(\hat{\beta}_{k}\) be the estimate computed with the \(k\)th fold held out. The metrics are:
\(\texttt{err.pred}\): squared Frobenius norm prediction error \[\|Y_{-k} - X_{-k} \hat{\beta}_{k}\|_F^2,\]
\(\texttt{err.wpred}\): weighted squared Frobenius norm prediction error \[ {\rm tr}\left\{(Y_{-k} - X_{-k}\hat{\beta}_{k})W_{-k}(Y_{-k} - X_{-k}\hat{\beta}_{k})'\right\},\] where \(W_{-k}\) is a diagonal matrix with the left-out-fold inverse variances along its diagonal,
\(\texttt{err.spec}\): spectral norm prediction error \[\|Y_{-k} - X_{-k} \hat{\beta}_{k}\|,\] and
\(\texttt{err.nuc}\): nuclear norm prediction error \[\|Y_{-k} - X_{-k} \hat{\beta}_{k}\|_*.\]
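All four metrics can be written directly in terms of the left-out residual matrix. A sketch for reference (\(\texttt{cv.metrics}\) is our name, not a package function; \(\texttt{w}\) stands in for the left-out-fold inverse variances):

```r
# Compute the four CV metrics from a residual matrix R = Y - X %*% beta.hat.
cv.metrics <- function(R, w) {
  d <- svd(R)$d
  c(err.pred  = sum(R^2),         # squared Frobenius norm
    err.wpred = sum(t(R^2) * w),  # tr(R W R'): column j of R^2 scaled by w[j]
    err.spec  = max(d),           # spectral norm (largest singular value)
    err.nuc   = sum(d))           # nuclear norm (sum of singular values)
}
```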
plot(x = log2(mod3$lambda.vec), y = rowMeans(mod3$err.pred), pch=20,
  xlab=expression(paste(log[2], "(", lambda,")", sep="")), ylab="CV prediction error")
abline(v = log2(mod3$lambda.vec)[which.min(rowMeans(mod3$err.pred))], lty=2)
The other options work as follows:
\(\texttt{standardize}\) is a TRUE/FALSE argument, set to FALSE by default. When FALSE, the \(X\) used in model fitting is centered: \[ \mathbf{X} = X - 1_n \bar{X}',\] where \(\bar{X} = n^{-1}\sum_{i=1}^n x_i\). When TRUE, the \(X\) used for model fitting is standardized: \[ \mathbf{X} = (X - 1_n \bar{X}')/(1_n \bar{\sigma}_X'),\] where \(\bar{\sigma}_X\) is the \(p\)-dimensional vector of standard deviations of the \(p\) predictors and the division is elementwise. All necessary adjustments are made inside the cross-validation function, as well as in the prediction and coefficient functions.
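The two preprocessing options can be sketched in base R on a toy matrix. The package performs this internally; whether it uses the \(n\) or \(n-1\) denominator for the standard deviation is an internal detail, so the use of \(\texttt{sd()}\) here is an assumption.

```r
set.seed(1)
Xtoy <- matrix(rnorm(30), nrow = 10)               # stands in for the input X
Xbar <- colMeans(Xtoy)                             # columnwise means
sigX <- apply(Xtoy, 2, sd)                         # columnwise standard deviations
X.centered     <- sweep(Xtoy, 2, Xbar, "-")        # used when standardize = FALSE
X.standardized <- sweep(X.centered, 2, sigX, "/")  # used when standardize = TRUE
```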
\(\texttt{weighted}\) is a TRUE/FALSE argument. We do not recommend standardizing the response variables (and do not provide an option to do so in our software), so we include this option to adjust the level of penalization applied to \(\beta\). When \(\texttt{weighted = FALSE}\), all \(w_{j,k} = 1\); when \(\texttt{weighted = TRUE}\), \[w_{j,k} = \frac{1}{\bar{\sigma}_{y_k}},\] where \(\bar{\sigma}_{y_k}\) is the marginal standard deviation of the \(k\)th response variable.
\(\texttt{delta}\) is a parameter \(< 1\) which specifies the range of candidate tuning parameters we consider. That is, we automatically select \(\lambda_{\rm max}\) (based on the first-order conditions) and consider a range of candidate tuning parameters from \(\lambda_{\rm min}\) to \(\lambda_{\rm max}\), where \[\lambda_{\rm min} = \delta \lambda_{\rm max}.\]
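The candidate grid appears to be log-spaced between \(\lambda_{\rm max}\) and \(\delta\lambda_{\rm max}\) (consistent with the \(\texttt{lambda.vec}\) printed earlier, whose consecutive ratios are constant); a sketch under that assumption, with \(\texttt{lambda.grid}\) being our name:

```r
# Build nlambda log-spaced candidates from lambda.max down to delta*lambda.max.
lambda.grid <- function(lambda.max, delta, nlambda) {
  10^seq(log10(lambda.max), log10(delta * lambda.max), length.out = nlambda)
}
```

For example, `lambda.grid(0.803, 0.1, 10)` approximately reproduces the leading entries of `mod2$lambda.vec` shown above.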
\(\texttt{tol}\) is the convergence tolerance. A smaller tolerance leads to a more exact solution. If the default does not seem to be performing well, we recommend fitting the model with \(\texttt{inner.quiet = FALSE}\) to track progress and adjusting \(\texttt{tol}\) as necessary.
We can obtain predicted values for a new set of predictors. As the default tuning parameter after cross-validation, we use the tuning parameter which minimizes the squared Frobenius norm prediction error.
fitted.preds <- MSRL.predict(Xnew = Xtest, fit = mod3, lambda = mod3$lam.min)
str(fitted.preds)
## List of 3
## $ pred : num [1:100, 1:20] -3.014 -2.128 -1.169 1.041 -0.407 ...
## $ beta0: num [1:20, 1] 0.091 0.0359 0.0886 0.1198 0.047 ...
## $ beta : num [1:100, 1:20] 0.988 0.896 0 0 0 ...
fitted.preds$pred[1:10, 1:10]
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -3.0137877 1.8058944 1.755830175 -1.98837199 1.207187679 -1.0048029
## [2,] -2.1277130 -1.4631429 0.004568042 0.92994383 -0.179341950 -0.3815704
## [3,] -1.1693917 -0.3081421 -0.661339168 -4.51414107 0.221771693 0.8272461
## [4,] 1.0406111 -2.0629376 0.834141553 -0.60879974 -0.833934254 0.7382318
## [5,] -0.4069587 -2.3205459 1.270310841 -2.12087358 -1.755488445 -0.7964036
## [6,] 0.7670643 2.6634630 2.861309696 0.01334273 -0.797302573 -0.5240886
## [7,] -1.6825092 0.8444608 -2.615304096 0.87866431 -0.004125322 -0.4242062
## [8,] 0.9363543 -0.4578171 0.424498647 1.49070341 3.140755666 -1.1591833
## [9,] 1.3429199 -1.5397985 -1.522559213 3.75064295 -0.655863616 -0.4740649
## [10,] 4.2140644 3.0118970 -0.351342763 1.88478608 -1.461525063 -0.2169584
## [,7] [,8] [,9] [,10]
## [1,] -0.9711783 0.3668708 0.57993528 1.1613516
## [2,] 1.7224811 -1.0826128 0.41380999 -4.0971757
## [3,] 3.6674576 0.2050251 0.56757060 2.4274046
## [4,] -1.6369173 0.7570164 1.47836244 -1.3416867
## [5,] -4.7310748 2.0722562 0.82291724 1.0599464
## [6,] 0.8668991 1.4232106 -0.60452395 4.2292524
## [7,] 2.7316803 -2.3515084 -1.07040231 0.4775065
## [8,] -1.7450004 -0.3560972 -1.60900453 -1.5817726
## [9,] 0.3931482 -0.4656568 0.01045258 -5.4429495
## [10,] 1.4798617 -1.7918932 -0.38287036 -0.0291959
plot(fitted.preds$pred[,1], Ytest[,1], pch=20, ylab="True response",
xlab="Predicted response", main="Response variable #1")
abline(0,1, lty=2)
Finally, we can also extract both the regression coefficient matrix and intercept. We display the coefficient matrix below with nonzero entries highlighted in grey.
fitted.coefs <- MSRL.coef(fit = mod3, lambda = mod3$lambda.vec[12])
str(fitted.coefs)
## List of 2
## $ beta0: num [1:20, 1] 0.3119 0.009 0.0507 0.178 -0.0887 ...
## $ beta : num [1:100, 1:20] 0.144 0 0 0 0 ...
temp <- bquote("Estimate of "~beta~"with"~lambda~ "="~.(round(mod3$lambda.vec[12], 4)))
image(t(abs(fitted.coefs$beta)), main=temp, xlab="Responses", ylab="Predictors", col = grey(100:0/100))
image(t(abs(beta)), main=expression(paste("True ", beta)), xlab="Responses", ylab="Predictors", col = grey(100:0/100))