Some FAQs during office hours are collected here:
- Q3(a): What do you mean by adding an intercept?
Add a column of ones to the covariate matrix $X$.
- Q3(a): Shall we make year a categorical variable?
Treat it as a quantitative variable.
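A minimal sketch of the two Q3(a) answers, using a made-up data frame `dat` with hypothetical columns `year` and `x1` (not the homework data): year stays numeric and a column of ones is prepended.

```r
# Toy data (hypothetical; not the homework data set)
dat <- data.frame(year = c(2001, 2002, 2003, 2004),
                  x1   = c(1.2, 0.7, 3.1, 2.4))

# Year is kept as a numeric (quantitative) column
X <- as.matrix(dat[, c("year", "x1")])

# Adding an intercept = prepending a column of ones
X1 <- cbind(intercept = 1, X)

# For contrast: a categorical year would expand into dummy columns instead
# (one intercept plus one dummy per year beyond the first)
ncol(model.matrix(~ factor(year), data = dat))
```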
- Q3(b): Do you want us to use each method (QR, Cholesky, sweep) to find the regression coefficients, their s.e., and $\hat \sigma^2$, or do you want us to determine which method is most efficient?
I want you to try all three methods for linear regression. No need to do an efficiency comparison.
- What the heck is that `pivot` in the output of the `qr` and `chol` functions in R? How do I use it?
The function `qr` by default calls a modified version of the LINPACK routine `dqrdc` (named `dqrdc2` in the R sources) and produces $XP = QR$, where $P$ is a permutation matrix. The documentation (https://svn.r-project.org/R/trunk/src/appl/dqrdc.f) states that `jpvt(k)` contains the index of the column of the original matrix that has been interchanged into the k-th column, if pivoting was requested. In other words, `X[, pivot]` equals $XP$, so the k-th column of $P$ is the `pivot[k]`-th column of the identity matrix. Based on this, for a matrix $M$ of compatible sizes, the following shall be true:
  - `M[, pivot] == M %*% P`
  - `M[, order(pivot)] == M %*% t(P)`
  - `M[pivot, ] == t(P) %*% M`
  - `M[order(pivot), ] == P %*% M`
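A small sketch of how the pivot can be checked and used, on made-up data. The convention assumed here is that $P$ is the permutation matrix with `X[, pivot]` equal to $XP$, built as `diag(p)[, pivot]`; the names `X`, `Z`, `M`, `b_piv` are illustrative only. The last part solves the normal equations with a pivoted Cholesky factor and then undoes the permutation of the coefficients.

```r
set.seed(1)                              # arbitrary seed, toy data only
X <- matrix(rnorm(20), nrow = 5, ncol = 4)
X[, 2] <- 2 * X[, 1]                     # exactly collinear column, which gives qr()
                                         # a reason to pivot (it may move it to the end)

qrx   <- qr(X)                           # default LINPACK-based path
pivot <- qrx$pivot
P     <- diag(ncol(X))[, pivot]          # k-th column of P is the pivot[k]-th unit vector

# X P = Q R: the factorization is of the column-permuted matrix
stopifnot(isTRUE(all.equal(X %*% P, qr.Q(qrx) %*% qr.R(qrx))))

# Indexing versus multiplying by P, for any matrix M of compatible size
M <- matrix(seq_len(16), nrow = 4, ncol = 4)
stopifnot(
  all(M[, pivot]        == M %*% P),
  all(M[, order(pivot)] == M %*% t(P)),
  all(M[pivot, ]        == t(P) %*% M),
  all(M[order(pivot), ] == P %*% M)
)

# Same convention for chol(): with pivot = TRUE, t(P) A P = t(R) R
Z   <- matrix(rnorm(40), nrow = 10, ncol = 4)   # full-rank toy design
y   <- rnorm(10)
A   <- crossprod(Z)
R   <- chol(A, pivot = TRUE)
piv <- attr(R, "pivot")
stopifnot(isTRUE(all.equal(A[piv, piv], crossprod(R), check.attributes = FALSE)))

# Solve the normal equations in pivoted order, then put coefficients back
b_piv  <- backsolve(R, forwardsolve(t(R), crossprod(Z, y)[piv]))
b      <- numeric(ncol(Z))
b[piv] <- b_piv                          # undo the permutation
```

The key step in practice is the last line: whatever is solved against the pivoted factor comes out in pivoted order, and `b[piv] <- b_piv` restores the original column order.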
- Does the trick of applying Cholesky decomposition to the augmented matrix reduce flops?
I don’t think so. It’s just an easy way to organize computation when the response vector $y$ is available.
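As a sketch of what "organizing the computation" can look like, assuming the augmented matrix means the Gram matrix of $[X, y]$: its upper Cholesky factor contains the Cholesky factor of $X^T X$, the solution of the triangular system for $X^T y$, and the square root of the residual sum of squares. The data and the names `G`, `U`, `z`, `d` below are made up for illustration.

```r
set.seed(2)                                # arbitrary seed, toy data only
n <- 50
X <- cbind(1, rnorm(n), rnorm(n))          # intercept column plus two made-up covariates
y <- drop(X %*% c(1, 2, -1)) + rnorm(n)
p <- ncol(X)

G <- crossprod(cbind(X, y))                # (p+1) x (p+1) Gram matrix of [X, y]
U <- chol(G)                               # upper triangular, t(U) %*% U = G

R <- U[1:p, 1:p]                           # Cholesky factor of X'X
z <- U[1:p, p + 1]                         # satisfies t(R) z = X'y
d <- U[p + 1, p + 1]                       # d^2 = residual sum of squares

beta   <- backsolve(R, z)                  # regression coefficients
sigma2 <- d^2 / (n - p)                    # estimate of the error variance
Rinv   <- backsolve(R, diag(p))            # inverse of R
se     <- sqrt(sigma2 * rowSums(Rinv^2))   # diag((X'X)^{-1}) = row sums of Rinv^2
```

One way to sanity-check the result is to compare `beta`, `se`, and `sigma2` against `summary(lm(y ~ X - 1))`.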