If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis.
Prof. Hal Varian, UC Berkeley, Chief Economist at Google, interviewed by Freakonomics
Co-founder & CTO of Datasalt. He’s a core committer on two Hadoop-based open-source projects, Splout SQL and Pangool. Splout provides a SQL view over Hadoop's Big Data with sub-second latencies and high throughput. Pangool is an improved low-level Java API for Hadoop based on the Tuple MapReduce paradigm (ICDM 2012). Pere is an early adopter of Hadoop, working on Big Data projects since 2008. He’s also the organizer of Big Data Beers Berlin.
Mikio is a data science researcher and blogger. He previously was co-founder of streamdrill, a company focussing on real-time data analysis. He is part of the Berlin Big Data Competence Center, which aims to bring together machine learning and scalable technologies to create the next generation of Big Data infrastructure. He is also the author of jblas, a fast matrix library for Java which is used by PayPal, Breeze, and Apache Spark.
Adam Drake is Chief Data Officer at one of the world's most successful online travel companies. He has been in technology roles for over 15 years in a variety of industries, including online marketing, financial services, healthcare, and oil and gas. His background is in Applied Mathematics, and his interests include online learning systems, high-frequency/low-latency data processing systems, recommender systems, distributed systems, and functional programming (especially in Haskell).
Jose Quesada is the founder and director of DSR. Jose helps others to decide better, do better, or be better through data. Like everyone else, he doesn’t know what data science really is, but suspects it has to do with predicting the future before it catches you empty-handed. He has a PhD in machine learning and worked at top labs (U. of Colorado, Boulder, Carnegie Mellon). Previously he worked as a data science consultant, specializing in customer lifetime value, and as the head data scientist for GetYourGuide.
Trent is co-founder & CTO of ascribe, which uses modern crypto, ML, and big data to tackle challenges in digital property ownership. His previous two startups applied ML in the enterprise semiconductor space: ADA was acquired in 2004 and Solido is going strong. He got his start doing neural networks research at the Canadian Department of National Defence in the mid 90s. He has an engineering PhD in applied ML from KU Leuven, Belgium. His interests include large scale regression, automating creativity, anything labeled "impossible", and thousand-fold improvements. He was raised on a pig farm in Canada.
Arunkumar Srinivasan holds a Bachelor's degree in Electronics Engineering from India and a Master's in Bioinformatics from Germany. He is currently finishing his PhD in Bioinformatics at the Max Planck Institute. He has been using R since late 2011 and is one of the main contributors to the data.table package. He routinely works with data sizes on the order of several GBs, and has a passion for developing tools and algorithms facilitating big-data analyses.
Co-founder & CTO of Oberbaum Concept, a company focused on Big Data consulting and development. Christoph is an early adopter of Hadoop, working on Big Data projects since 2009. He’s also one of the organizers of the Big Data Beers Berlin Meetup.
Andreas is currently working as Principal Data Scientist at Skyscanner. Before his current position, Andreas worked for several different companies in the mobile- and online-advertising space. His work has had a strong focus on building and implementing algorithms and using data to maximise revenue. Andreas has been an avid programmer for almost 20 years and he is especially passionate about stream-based event processing and functional programming. His academic background is in theoretical Computer Science, focusing on functional programming and how to automatically prove properties of algorithms and software, in order to create provably correct systems.
Marek is a true R hacker and enthusiast since the Paleozoic era of R_1.4.0. He is the author of a best-selling Polish book on R programming and of many R packages, including the famous stringi package. Computer programmer since the age of 6 (C64 BASIC, C/C++, assembler, PHP, Java, VHDL, bash, Julia, Maxima, Lisp, Fortran and many others). He has a PhD in computer science and specializes in data aggregation, fusion and mining, computational statistics, and uncertainty modeling. Currently an assistant professor and a glamorous tutor and mentor at the Warsaw University of Technology, Poland.
Daniel is an expert software engineer, Python programmer, and machine learning specialist. When he's not developing high-performing, end-to-end pattern recognition and predictive analytics systems for his clients, Daniel's learning new tricks to train deep neural networks more efficiently. Through his company Natural Vision, he's been successfully applying deep learning to problems in bioacoustics, computer vision, and text mining.
Jackie's interests were nurtured in the machine learning group at the MPI in Tuebingen, where she worked on kernel methods. She has since ventured to the probabilistic side using Bayesian modelling, and now sometimes even combines the two. Her primary applications are neuroscience and image processing. She is currently at The Institute of Technology, Berlin and is putting the finishing touches on her PhD thesis about large-scale approximate inference in probabilistic models.
Yes, you will need a solid background in linear algebra and probability theory to create new algorithms. But this is very different from what you need to simply apply algorithms known to work for a class of problems. Vision + good judgement + intuition + hacking skills + natural analytic skills + craftsmanship + curiosity + Google skills can often be more useful and less expensive than advanced math knowledge. Most people are very comfortable with probabilities and linear algebra by the end of the program.
Absolutely yes. Being comfortable with at least one programming language is a prerequisite, but you need not have put a system in production, written tests, or used version control. Those skills make up software engineering and craftsmanship, and you will pick them up. We use both R and Python. Most people are comfortable with both by the end of the program.
You will present biweekly with tight timing, getting feedback from your peers and one instructor. Video recordings will be reviewed by a technical communication expert, who will provide two individual sessions with laser-focused feedback. No matter how accurate your algorithm's predictions are, they will not matter if you cannot convince the decision makers in a tight time window. This all goes into making a memorable Portfolio Project presentation that makes companies take note.
We expect you to have basic programming experience and familiarity with databases. Exercises will be in R or Python, so you need a basic understanding of at least one of these languages.