  • Can I use loops?
    It’s a hard question to answer. A better question is: can I avoid a loop here? It’s fine to loop over the different combinations of levels. For the Dirichlet-multinomial MLE, ridge, and lasso, we have to loop over simulation replicates within a simulation setting.
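
    As a rough illustration of avoiding a loop, the sketch below computes the same Monte Carlo sample means twice, once with an explicit loop and once in vectorized form. The task (sample means of standard normals) and the values of `n` and `S` are placeholders chosen for illustration, not part of the course examples.

```r
n <- 50    # observations per replicate (assumed)
S <- 1000  # number of simulation replicates (assumed)

# Loop version: one replicate at a time.
set.seed(1)
means_loop <- numeric(S)
for (s in seq_len(S)) means_loop[s] <- mean(rnorm(n))

# Vectorized version: draw all replicates at once, one column per replicate.
set.seed(1)
means_vec <- colMeans(matrix(rnorm(n * S), nrow = n, ncol = S))

# Compare the two sets of estimates up to numerical tolerance.
all.equal(means_loop, means_vec)
```

    The vectorized form only works when every replicate has the same shape and the estimator can be written as a matrix operation; iterative fits such as an MLE generally still need a loop over replicates.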

  • The run time is excessively long. What can I do?
    • Make sure to temper the complexity of your study with your ability to address an issue clearly and completely.
    • Optimize your code as much as possible, for example by vectorizing it.
    • Check out the parallel package in R for multicore and general parallel computing. Demo code: http://www.stat.ncsu.edu/people/zhou/courses/st810/notes/vcsim.r
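
    A minimal sketch of spreading replicates over workers with the parallel package follows. It is not the linked demo code; the function `one_replicate`, the number of workers, and the seed are hypothetical choices for illustration.

```r
library(parallel)

# Hypothetical per-replicate task: the sample mean of 100 standard normals.
one_replicate <- function(s) {
  mean(rnorm(100))
}

n_rep <- 200
n_workers <- 2  # assumed; in practice something like detectCores() - 1

cl <- makeCluster(n_workers)
# Give each worker its own reproducible random number stream.
clusterSetRNGStream(cl, iseed = 2024)
est <- parSapply(cl, seq_len(n_rep), one_replicate)
stopCluster(cl)

length(est)
```

    A socket cluster like this works on every platform; on Unix-alikes, `mclapply` is a lighter-weight alternative that forks the current session.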
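  • How do I generate correlated predictors in $X$?
    We can use either an AR(1) model, where the correlation between predictors $j$ and $k$ is $\rho^{|j-k|}$, or an equicorrelation model, where every pair of predictors has the same correlation $\rho$.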
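
    One possible way to do this, assuming multivariate normal predictors with unit variances, is to build the correlation matrix and multiply independent normals by its Cholesky factor. The helper names `ar1_cov`, `equi_cov`, and `gen_X` are made up for this sketch.

```r
# AR(1) correlation: Sigma[j, k] = rho^|j - k|.
ar1_cov <- function(p, rho) {
  rho^abs(outer(seq_len(p), seq_len(p), "-"))
}

# Equicorrelation: Sigma[j, k] = rho for j != k, 1 on the diagonal.
# Positive definite only when -1 / (p - 1) < rho < 1.
equi_cov <- function(p, rho) {
  Sigma <- matrix(rho, p, p)
  diag(Sigma) <- 1
  Sigma
}

# Draw n rows from N(0, Sigma). chol() returns upper-triangular R with
# t(R) %*% R = Sigma, so rows of Z %*% R have covariance Sigma.
gen_X <- function(n, Sigma) {
  p <- nrow(Sigma)
  Z <- matrix(rnorm(n * p), n, p)
  Z %*% chol(Sigma)
}

set.seed(1)
X <- gen_X(500, ar1_cov(5, 0.7))
round(cor(X), 2)  # sample correlations should be close to 0.7^|j - k|
```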

  • How do I set random seeds?
    I don’t know of a fixed rule for setting random seeds in a simulation study. I routinely use a separate seed for each combination of levels, so that I can easily reproduce the results for a specific setting without saving the whole mess of data.
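
    A sketch of the one-seed-per-setting idea is shown here. The design factors (`n`, `rho`), the base seed, and the placeholder replicate are all assumptions for illustration; only the seeding pattern matters.

```r
# Hypothetical design: every combination of sample size and correlation.
settings <- expand.grid(n = c(50, 100), rho = c(0, 0.5))
base_seed <- 810  # arbitrary

run_setting <- function(k, n_rep = 10) {
  set.seed(base_seed + k)  # one seed per combination of levels
  n <- settings$n[k]
  # Placeholder replicate: sample mean of n standard normals.
  replicate(n_rep, mean(rnorm(n)))
}

# Loop over settings; replicates are generated within each setting.
results <- lapply(seq_len(nrow(settings)), run_setting)

# Re-running one setting on its own reproduces its results exactly.
identical(run_setting(3), results[[3]])
```

    Because each setting reseeds the generator, any single row of `settings` can be rerun in isolation, in any order, without regenerating or storing the others.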

  • How do I choose the number of simulation replicates $S$ for each setting? Do I need to justify my choice?
    We are statisticians; we need to choose $S$ such that the last digit we report in our tables is significant. Read Dr. Marie Davidian’s slides, especially pages 22-33.
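
    As a worked illustration (the standard Monte Carlo error argument, not a summary of the slides), suppose the reported quantity is a proportion $p$, such as a coverage probability or rejection rate, estimated from $S$ independent replicates. Its Monte Carlo standard error is

$$
\operatorname{se}(\hat p) = \sqrt{\frac{p(1-p)}{S}} \le \frac{1}{2\sqrt{S}},
$$

    so requiring $\operatorname{se}(\hat p) \le \epsilon$ for every $p$ gives

$$
S \ge \frac{1}{4\epsilon^2}.
$$

    For example, the assumed target $\epsilon = 0.005$ leads to $S \ge 1/(4 \times 0.005^2) = 10{,}000$. For a reported mean, the analogous quantity is $\hat\sigma/\sqrt{S}$, where $\hat\sigma$ is the standard deviation of the estimate across replicates.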

  • For a fixed combination of $i$ (# populations) and $d$ (# categories), do I generate different or the same true population parameters in each simulation replicate?
    I would generate a different truth in each replicate, because I don’t want my conclusions to be based on one specific set of true parameter values.