Hua Zhou | 周华 | FAQs on HW9

Can I use loop?
It’s a hard question to answer. A better question is can I avoid loop here? It’s ok to loop over different combination of levels. For Dirilet-multinomial MLE, ridge, and lasso, we have to loop over simulation replicates within a simulation setting.
The run time is excessively long. What can I do?
- Make sure to temper the complexity of your study with your ability to address an issue clearly and completely.
- Optimize your code as much as possible (vectorize code)
- Check out the parallel package in R for multithreading and general parallel computing. Demo code: http://www.stat.ncsu.edu/people/zhou/courses/st810/notes/vcsim.r
How to generate correlated predictors in $X$?
We can use either AR(1) model or equi-correlation model.
How to set random seeds?
I don’t know of a fixed rule for setting random seeds in simulation study. I routinely use a separate seed for each combination of levels. So I can easily reproduce results for a specific setting without saving the whole mess of data.
How to choose number of simulation replicates $S$ for each setting? Do I need to justify my choice?
We are statisticians; we need to choose $S$ such that the last digit you report in your table is significant. Read Dr Marie Davidian’s slides, especially page 22 -33.
For a fixed combination of $i$ (# populations) and $d$ (# categories), do I generate different or same true population parameters in each simulation replicate?
I would generate different truth in each replicates, because I don’t won’t my conclusion be based on a specific true parameter values.

FAQs on HW9 01 Dec 2014