Speeding up your R & Python models: Rcpp and Cython

When you write high-quality data analysis software in R or Python for other people to use, moving the performance-critical parts to a compiled language is the way to deliver the best possible performance. This course gives you a working introduction to best practices in C++ programming, data structures, and algorithms so that you can achieve this goal. You will see how easy it is to implement R functions in C++ using the intuitive Rcpp interface. The course is not limited to R; we will also show the benefits of linking C++ libraries into Python projects via Cython.

The course covers the following topics:

  • What Rcpp and Cython are, and why to use C++ for data science
  • C++ introduction: scalar data types, controlling program flow
  • Accessing R vectors through Rcpp
  • Lists and R functions
  • C++ Standard Library: fundamental data structures and algorithms
  • Introduction to Cython: linking C++ libraries and accessing NumPy objects
  • OpenMP: multithreaded C++ made simple

After this course, you will understand:

  • How to find the bottlenecks in your code
  • How to use C++ to rewrite such bottlenecks
  • How to integrate C++ with R and Python
  • How to choose the most performant data structure for a task, comparing native R and Python structures with their C++ counterparts