Some frequently asked questions from office hours are collected here:

  • Q3(a): What do you mean by adding an intercept?
    Add a column of ones to the covariate matrix $X$.
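
    As a minimal sketch in R, the column of ones can be prepended with cbind. The matrix X below is simulated purely for illustration and is not the homework data; the names n, X, and X1 are hypothetical.

    ```r
    set.seed(1)
    n <- 5
    X <- matrix(rnorm(n * 2), n, 2)  # hypothetical covariates, for illustration only
    X1 <- cbind(1, X)                # the scalar 1 is recycled into a column of ones
    dim(X1)                          # one more column than X
    ```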

  • Q3(a): Shall we make year a categorical variable?
    No, treat it as a quantitative variable.

  • Q3(b): Do you want us to use each method (QR, Cholesky, sweep) to find the regression coefficients, their standard errors (s.e.), and $\hat \sigma^2$, or do you want us to determine which method is most efficient?
    I want you to try all three methods for linear regression. There is no need to compare their efficiency.

  • What the heck is that pivot in the output of the qr and chol functions in R? How do I use it?
    The function qr by default calls dqrdc2, R's modified version of the LINPACK routine dqrdc, and produces $XP = QR$, where $P$ is a permutation matrix. See the documentation at https://svn.r-project.org/R/trunk/src/appl/dqrdc.f: if pivoting was requested, jpvt(k) contains the index of the column of the original matrix that has been interchanged into the k-th column. Based on this, for a matrix $M$ of compatible size, the following identities hold:
    • M[, pivot] == M %*% P
    • M[, order(pivot)] == M %*% t(P)
    • M[pivot, ] == t(P) %*% M
    • M[order(pivot), ] == P %*% M
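
    The relation between pivot and $P$ can be checked numerically. The sketch below assumes that $P$ is built as the identity matrix with its columns reordered by pivot, so that the k-th column of $XP$ is column pivot[k] of $X$, which matches the jpvt description. The data are simulated, and X is made rank deficient on purpose because the default routine only moves columns it judges nearly dependent.

    ```r
    set.seed(1)
    X <- matrix(rnorm(20), 5, 4)
    X[, 3] <- X[, 1] + X[, 2]      # rank deficient, so qr() has a reason to pivot
    qrx <- qr(X)                   # default (LINPACK-based) routine
    pivot <- qrx$pivot
    P <- diag(ncol(X))[, pivot]    # assumed construction: X %*% P equals X[, pivot]

    # X P = Q R, up to rounding error
    all.equal(X %*% P, qr.Q(qrx) %*% qr.R(qrx))

    # Indexing by pivot versus multiplying by P, for an arbitrary square M
    M <- matrix(rnorm(16), 4, 4)
    all.equal(M[, pivot], M %*% P)
    all.equal(M[, order(pivot)], M %*% t(P))
    all.equal(M[pivot, ], t(P) %*% M)
    all.equal(M[order(pivot), ], P %*% M)
    ```

    For chol with pivot = TRUE, the permutation is stored in attr(R, "pivot") and can be used in the same way.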

  • Does the trick of applying Cholesky decomposition to the augmented matrix reduce flops?
    I don’t think so. It’s just an easy way to organize computation when the response vector $y$ is available.
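
    To illustrate how the augmented matrix organizes the computation, here is a minimal sketch on simulated data. All names (X, y, A, R11, z, d) are hypothetical, and lm is used only as a reference check. Writing the upper triangular Cholesky factor of $(X, y)^T (X, y)$ in blocks, the top-left block R11 is the Cholesky factor of $X^T X$, the last column z satisfies $R_{11}^T z = X^T y$, and the squared bottom-right entry d is the residual sum of squares.

    ```r
    set.seed(1)
    n <- 50
    p <- 3
    X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # simulated design with intercept
    y <- drop(X %*% c(1, 2, -1) + rnorm(n))              # simulated response

    A <- crossprod(cbind(X, y))   # (p + 1) x (p + 1) augmented Gram matrix
    R <- chol(A)                  # upper triangular, t(R) %*% R equals A
    R11 <- R[1:p, 1:p]            # Cholesky factor of t(X) %*% X
    z <- R[1:p, p + 1]            # solves t(R11) %*% z = t(X) %*% y
    d <- R[p + 1, p + 1]          # d^2 is the residual sum of squares

    beta_hat <- backsolve(R11, z)                  # solves R11 %*% beta = z
    sigma2_hat <- d^2 / (n - p)
    se <- sqrt(sigma2_hat * diag(chol2inv(R11)))   # chol2inv(R11) is the inverse of t(X) %*% X

    # Reference fit for comparison
    fit <- lm(y ~ X - 1)
    all.equal(beta_hat, unname(coef(fit)))
    all.equal(sigma2_hat, summary(fit)$sigma^2)
    ```

    The factorization yields the coefficients, $\hat \sigma^2$, and the standard errors in one pass, which is the organizational convenience mentioned above.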