Prof. Dr. Marek Gagolewski

Dr. Marek Gagolewski is a Professor at the Polish Academy of Sciences and Warsaw University of Technology, where he researches and teaches Data Science, Big Data, and Machine Learning. He teaches introductory and advanced courses in R, Python, and C++, and supervises PhD and MSc students in Computer and Data Science. He is the author of best-selling books on Python and R programming and of many R packages, including the widely used stringi package.
Marek holds a PhD in Computer Science from the Polish Academy of Sciences, specializing in data aggregation, fusion, and mining, as well as computational statistics and uncertainty modeling.
 

http://www.gagolewski.com

 

NUMPY, SCIPY, PANDAS, SCIKIT-LEARN

Learning outcome

NumPy and SciPy turned Python from a general-purpose programming language into a powerful matrix-oriented one. Pandas brought data frames, one of the core concepts in modern data analysis, to Python. Building on these data structures, scikit-learn provides solid implementations of widely used machine learning algorithms in a single, standardized library. Today, Python is a language of choice for many data scientists.

PREPROCESSING WITH PANDAS

  • Reading data
  • Selecting columns and rows
  • Filtering
  • Vectorized string operations
  • Missing values
  • Handling time
  • Time series
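
To give a flavour of the topics listed above, here is a minimal sketch rather than actual course material. The inline CSV text and its column names (name, city, joined, score) are made up for illustration and stand in for a real data file.

```python
import io

import pandas as pd

# Hypothetical data, invented for this sketch; a real course would read a file.
csv_text = """name,city,joined,score
alice,Warsaw,2021-01-05,3.5
bob,Berlin,2021-02-10,
carol,Warsaw,2021-02-20,4.0
"""

# Reading data: parse the "joined" column as datetimes while loading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["joined"])

# Selecting columns and filtering rows in one step with .loc
warsaw = df.loc[df["city"] == "Warsaw", ["name", "score"]]

# Vectorized string operation: applied to the whole column at once
df["name"] = df["name"].str.capitalize()

# Missing values: count them, then fill with the column mean
n_missing = int(df["score"].isna().sum())
df["score"] = df["score"].fillna(df["score"].mean())

# Time series: index by date and compute the mean score per calendar month
monthly = df.set_index("joined")["score"].resample("MS").mean()

print(warsaw)
print(monthly)
```

Filling missing scores with the mean is only one possible strategy; dropping the rows or interpolating may suit other data better.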

NUMPY, SCIPY

  • Arrays
  • Indexing, Slicing, and Iterating
  • Reshaping
  • Shallow vs deep copy
  • Broadcasting
  • Indexing (advanced)
  • Matrices
  • Matrix decompositions
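
Several of the topics above can be illustrated in a few lines. The following is a small sketch with made-up values, not actual course material; it uses NumPy only, although SciPy offers further decompositions in scipy.linalg.

```python
import numpy as np

# Arrays and reshaping: six integers arranged as a 2x3 matrix
a = np.arange(6).reshape(2, 3)

# Shallow vs deep copy: basic slicing returns a view that shares memory
# with the original, while .copy() allocates independent data
view = a[:, :2]
copy = a[:, :2].copy()
view[0, 0] = 100  # also changes a[0, 0]; copy is unaffected

# Broadcasting: a (3,) vector of column means is subtracted from every row
col_means = a.mean(axis=0)
centered = a - col_means

# Advanced indexing: a boolean mask selects elements and returns a copy
big = a[a > 3]

# Matrix decomposition: thin singular value decomposition
u, s, vt = np.linalg.svd(centered, full_matrices=False)
reconstructed = (u * s) @ vt  # multiplying the factors recovers the input

print(a)
print(big)
print(np.allclose(reconstructed, centered))
```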

SCIKIT-LEARN

  • Feature extraction
  • Classification
  • Regression
  • Clustering
  • Dimension reduction
  • Model selection
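
The following sketch shows how several of these topics fit together in one scikit-learn workflow. It is an illustration only: the choice of the bundled iris dataset, logistic regression as the classifier, and the parameter grid are assumptions made for this example, not taken from the course.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, used here purely for illustration
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Every estimator exposes the same fit/predict/transform interface,
# so preprocessing, dimension reduction, and classification chain together
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Model selection: cross-validated grid search over pipeline parameters
param_grid = {
    "pca__n_components": [2, 3],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# Accuracy of the best model on the held-out test set
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Wrapping the steps in a Pipeline means the scaler and PCA are refit on each cross-validation fold, which avoids leaking information from the validation data into preprocessing.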

Pre-requisite

  • understanding of Python and machine learning technologies