Checklist for your resume/CV

  • Git/GitHub (give your GitHub handle)
  • HPC (if you did HW2 Q4)
  • Shiny app
  • SQL databases
  • Apache Hadoop + Spark
  • Cloud Computing (GCP, AWS?, Azure?)
  • Docker
  • Deep learning with Keras+TensorFlow+GPU (if you did HW5???)

  • Make your GitHub repo biostat-m280-2018-winter public (after final week) and show your work to back up your resume.

  • Use these skills in your daily work: use Git/GitHub for everything (homework, research projects), give presentations using R Markdown and Shiny, write blog posts/tutorials on GitHub, …

  • Stop your own GCP instances and release unused static IPs to avoid charges.

  • I’ll shut down the YARN cluster and the teaching server soon. Make sure you back up your stuff.

What I didn’t cover

  • Scraping data from the web (Google, Twitter, Facebook).

  • Machine/statistical learning methods. Get familiar with the methods in Elements of Statistical Learning and with software such as scikit-learn.

  • Algorithms.

  • Public health applications.

  • Be open to other languages. Python is a more general-purpose programming language and is widely adopted in data science. Scala is popular for implementing distributed programs. Julia is attractive for high-performance scientific computing.

Today