-
Can I use loop?
It’s a hard question to answer. A better question is can I avoid loop here? It’s ok to loop over different combination of levels. For Dirilet-multinomial MLE, ridge, and lasso, we have to loop over simulation replicates within a simulation setting. - The run time is excessively long. What can I do?
- Make sure to temper the complexity of your study with your ability to address an issue clearly and completely.
- Optimize your code as much as possible (vectorize code)
- Check out the
parallel
package inR
for multithreading and general parallel computing. Demo code: http://www.stat.ncsu.edu/people/zhou/courses/st810/notes/vcsim.r
-
How to generate correlated predictors in $X$?
We can use either AR(1) model or equi-correlation model. -
How to set random seeds?
I don’t know of a fixed rule for setting random seeds in simulation study. I routinely use a separate seed for each combination of levels. So I can easily reproduce results for a specific setting without saving the whole mess of data. -
How to choose number of simulation replicates $S$ for each setting? Do I need to justify my choice?
We are statisticians; we need to choose $S$ such that the last digit you report in your table is significant. Read Dr Marie Davidian’s slides, especially page 22 -33. - For a fixed combination of $i$ (# populations) and $d$ (# categories), do I generate different or same true population parameters in each simulation replicate?
I would generate different truth in each replicates, because I don’t won’t my conclusion be based on a specific true parameter values.