“I want to be a (better) data scientist, but working alone through online courses doesn’t cut it.”

Data Science Retreat brings together top data scientists and mentees who want to grow exceptionally fast.

  • Full-time program for 3 months in Berlin, Germany
  • 86% of participants got the job they wanted out of DSR
  • Next batch for Data Scientists starts Sept 16
  • Next batch for Big-data Engineers starts Sept 16

Our Mentors

DSR is the only program worldwide whose mentors are at the Chief Data Scientist and CTO level. They are invested in your progress, and will train you to have the right mindset, solve business questions with technology, and advise leadership.

Pere Ferrera

Pere is co-founder and CTO of Datasalt. He’s a core committer on two Hadoop-based open-source projects, Splout SQL and Pangool. Splout provides a SQL view over Hadoop's Big Data with sub-second latencies and high throughput. Pangool is an improved low-level Java API for Hadoop based on the Tuple MapReduce paradigm (ICDM 2012). Pere is an early adopter of Hadoop and has worked on Big Data projects since 2008. He’s also the organizer of Big Data Beers Berlin.

Adam Drake

Adam Drake is Chief Data Officer at one of the world's most successful online travel companies. He has been in technology roles for over 15 years in a variety of industries, including online marketing, financial services, healthcare, and oil and gas. His background is in Applied Mathematics, and his interests include online learning systems, high-frequency/low-latency data processing systems, recommender systems, distributed systems, and functional programming (especially in Haskell).

Mikio Braun, PhD

Mikio is a data science researcher and blogger. He previously was co-founder of streamdrill, a company focussing on real-time data analysis. He is part of the Berlin Big Data Competence Center, which aims to bring together machine learning and scalable technologies to create the next generation of Big Data infrastructure. He is also the author of jblas, a fast matrix library for Java which is used by PayPal, Breeze, and Apache Spark.

Jose Quesada, PhD

Jose Quesada is the founder and director of DSR. Jose helps others to decide better, do better, or be better through data. Like everyone else, he doesn’t know what data science really is, but suspects it has to do with predicting the future before it catches you empty-handed. He has a PhD in machine learning and worked at top research labs (U. of Colorado, Boulder, Carnegie Mellon, Max Planck Institute). Previously he worked as a data science consultant, specializing in customer lifetime value, and as the head data scientist for GetYourGuide.

Trent McConaghy, PhD

Trent is co-founder & CTO of ascribe, which uses modern crypto, ML, and big data to tackle challenges in digital property ownership. His two startups applied ML in the enterprise semiconductor space: ADA was acquired in 2004 and Solido is going strong. He has an engineering PhD in applied ML from KU Leuven, Belgium. His interests include large-scale regression, automating creativity, anything labeled "impossible", and thousand-fold improvements. He was raised on a pig farm in Canada.

Andreas Granström

Andreas is a Principal Data Scientist at Skyscanner. He previously worked for several different companies in the mobile- and online-advertising space, where his work had a strong focus on using algorithms and data to maximise revenue. Andreas has been an avid programmer for almost 20 years and he is especially passionate about stream-based event processing and functional programming. His academic background is in theoretical Computer Science, focusing on how to automatically prove properties of algorithms and software in order to create provably correct systems.

Christoph Bauer

Christoph is the Co-founder & CTO of Oberbaum Concept, a company focused on Big Data consulting and development. Christoph is an early adopter of Hadoop and has worked on Big Data projects since 2009. He’s also one of the organizers of the Big Data Beers Berlin Meetup. Christoph uses, and encourages the use of, Spark and sparkML.

Arunkumar Srinivasan

Arunkumar Srinivasan is finishing a PhD in Bioinformatics from the Max Planck Institute. He started using R in late 2011 and is coauthor of the data.table R package, which offers fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns and a fast file reader (fread). Arun has a passion for developing tools and algorithms facilitating big-data analyses.

David Anderson

David is the Head of Big Data Engineering at DSR. He began his career as a senior research scientist at Carnegie Mellon University, Mitsubishi Electric Research Labs, and Sun Labs. His research career focused on tangible user interfaces and real-world applications of machine learning. Since 2005, David has been leading the development of data-intensive applications for companies across Europe, most recently as CTO at RetentionGrid.

Jacquelyn Shelton

Jackie's interests were nurtured in the machine learning group at the MPI in Tuebingen, where she worked on kernel methods. She has since ventured to the probabilistic side using Bayesian modelling, and now sometimes even combines the two. Her primary applications are neuroscience and image processing. She is currently at the Technical University of Berlin and is putting the finishing touches on her PhD thesis about large-scale approximate inference in probabilistic models.

Marek Gagolewski, PhD

Marek has been a true R hacker and enthusiast since the Paleozoic era of R_1.4.0. He is the author of a best-selling Polish book on R programming and many R packages, including the famous stringi package. Computer programmer since the age of 6 (C64 BASIC, C/C++, assembler, PHP, Java, VHDL, bash, Julia, Maxima, Lisp, Fortran and many others). Marek has a PhD in computer science and specializes in data aggregation, fusion and mining, computational statistics, and uncertainty modeling. Currently an assistant professor and a tutor and mentor at the Warsaw University of Technology, Poland.

Daniel Nouri

Daniel is an expert software engineer, Python programmer, and machine learning specialist. When he's not developing high-performing, end-to-end pattern recognition and predictive analytics systems for his clients, Daniel's learning new tricks to train deep neural networks more efficiently. Through his company Natural Vision, he's been successfully applying deep learning to problems in bioacoustics, computer vision, and text mining.

Data Science Retreat

Data Science Retreat offers two tracks, one for data scientists, and another for data engineers. Both programs are full-time for 3 months at our retreat center in Berlin, and offer access to our exceptional network of mentors. Our classes are very much hands-on, and are taught by senior data scientists and data engineers with many years of practical experience. The highlight of the retreat experience is the personal portfolio project, presented on Demo Day at the end of the program.

Data Scientist

The Data Scientist track is for data scientists with some experience in machine learning. The instruction focuses on advanced topics in machine learning, and requires programming ability, but not as much as that of a professional engineer. Some aspects of the Big-data Engineer track are included, such as Hadoop and Spark.

  • Learning to find good questions
  • Translating something vague into something actionable
  • Getting buy-in: presenting a problem and likely solution to stakeholders
  • Advanced data structures and algorithms
  • Advanced R
  • Python for data analysis (scikit-learn)
  • Basic machine learning tools (random forest, SVM, etc)
  • Beyond the basics (symbolic regression, Gaussian process models, ML tricks)
  • Optimization
  • Hive, Spark, SparkSQL
  • Machine learning at scale (MLlib)
  • Portfolio project

Tuition: 8000 € to be paid within 2 weeks of acceptance
Where: Berlin, Germany
Class size: 5-10 students
Next batch: September 16 – December 16

Big-data Engineer

The Big-data Engineer track is for engineers with experience building software products. The instruction focuses on building robust, scalable, data-intensive systems. A solid introduction to machine learning is included, focusing more on implementation and putting algorithms into production.

This is arguably the longest, deepest, and most detailed Spark-centric course in existence today.  

  • Data exploration and visualization
  • Introduction to machine learning (Python)
  • Functional programming (Scala)
  • Data-intensive distributed systems (Kafka, message queues)
  • Real-world devops (Docker, Ansible)
  • Scalable databases
  • Deep dive on major machine learning methods, learning how to implement them from scratch (random forests, SVM, etc)
  • Distributed machine learning (Spark MLlib)
  • Stream processing (Spark streaming, Flink)
  • Tuning Spark
  • Portfolio project

Tuition: 8000 € to be paid within 2 weeks of acceptance
Where: Berlin, Germany
Class size: 5-10 students
Next batch: September 16 – December 16

Worried about accommodation or a visa? Think it’ll take too long? Don’t worry. People from all over the world have come to DSR, and it works out just fine. Germany is very welcoming towards foreigners with a good skillset. Berlin is far cheaper (x3) than any other major city. And our community manager will make sure you don’t waste your time; she can arrange short-term accommodation for you. Everything works in English. There are no hidden tricks, like needing a bank account to rent an apartment while the bank tells you it needs an address to open the account :)

100% of participants got multiple interviews out of DSR.

60% of participants had to choose from multiple job offers.

86% of participants got the job they wanted out of DSR.

Why you should do Data Science Retreat

Go faster

There’s plenty of good material online to learn machine learning and data science on your own. We now live in an autodidact’s paradise. The question is, how can you get there faster than everyone else?

Break the “barrier of excellence” that one reaches when learning alone

No matter how many MOOCs you do, there’s a barrier that very few people ever get past. Jump over it.

Build a serious data product

Products are the new CVs. What interviewers really want to see is “What have you done when nobody told you what to do?”

Be surrounded by other people seeking excellence

We accept about 10 people out of the 200 who apply for each batch. They are extremely motivated and have skillsets complementary to yours. Do you want to spend time in the same room with them?

Hiring Partners