Structure of the syllabus

The syllabus is divided into four courses.

  • The first course is a crash course in the Data Science workflow. It consists of four sub-sections: a brief explanation of R data structures and syntax; exploratory data analysis using base R and base graphics; exploratory data analysis using data.table, lattice and ggplot2; and lastly a single example of Machine Learning using the h2o package.
  • The second course is a more in-depth explanation of different scenarios in the Data Science workflow: programming in R, data IO, remote computing, data transformation, and more Machine Learning examples with a focus on interpretability. It closes with developing your own R package and, lastly, developing data products.
  • The third course covers advanced R language features, advanced queries against data, automation of Machine Learning, writing C code for your R package, and making that C code run on multiple threads with OpenMP.
  • The fourth course focuses less on R and more on the infrastructure for running R-based projects. It covers common Unix productivity tools, productionizing your project, deployment, administration and maintenance, and setting up continuous integration for R projects.