This post should give you an insider view of how it feels to be in the market for a deep learning engineer job. I have interviewed thousands of people in machine learning in the last five years; for deep learning, only a few dozen in the last year. I’ve also been paying attention to the market: who goes where, salaries, and so on. It’s enough for me to form an impression.

First things first: take anything I say here with a grain of salt. The market is really chaotic and you may find spots where none of my tips apply.

Let’s start.

Where do you want to work?

Valley.

That’s not a question. If you want to have a career in the DL industry, you have to be either in the US or in China. Maybe in Canada.

Warning before reading the next paragraph: I’m writing this as a total outsider to the Valley. This is just my opinion.

The Valley seems to have a big head start. In machine learning, and more so in deep learning, industry pushes the boundaries of what is possible as much as academia does. And the industry in the Valley has been trying for 10-15 years longer than anywhere else. This means there’s a high concentration of talent, data, algorithms, etc. in a very small place. If you go to a meetup, the level is very high. People are putting in the work to get better (they read the paper before coming to a reading group and ask pointed questions). There are more meetups, and more chances of meeting a mentor or a study partner who is as motivated as you are.

Then, machine learning seems to show ‘winner takes all’ dynamics. If you are in the market to buy a self-driving car system, you want the number one. Number two, even if it’s only .5% less accurate, is not as attractive. People would die, so second best gets no clients. That helps concentration of effort, talent, and opportunity (whether this is good or bad for society is a different story!).

Last: culture. This is a biggie, and hard to replicate. People and companies in the Valley try things that are laughable to more conservative companies. They keep pushing those weird ideas past the point where anybody else would. And some do pan out. This is all it takes. It’s also damn attractive for top talent. After a while, it’s the talent that forms an attractor. You want to be there because so-and-so is there.

If you are outside the Valley and want a career, you are in a very precarious situation. Every day you remain where you are will make it harder for you to appear world-class to hiring managers. And immigration into the US is, rightfully, very hard. Not all of the US is good career-wise. The Valley is the undisputed king. I understand Boston is a distant second. Then NY. Both are really good places too, and in their specialty they are kings (biotech and finance respectively), probably beating the Valley.

And that’s pretty much it.

The market is pure craziness

If you are at the top of the field, expect ‘sports star’ salaries. Perks can have the shape of a Tesla Roadster. It’s an arms race. Startups are being “acquihired” at 10M per person (compare that to the hmm… ‘normal’ 1M per engineer in non-DL companies!).

Big N (Amazon, Google, Apple, Facebook, Netflix) are ramping up salaries and their recruiters are hyper-aggressive. It’s totally brutal out there. Competition for talent is like nothing I’ve seen before. Small startups who need these people offer significant equity or meaningful work that changes the world.

The code you write affects lives.

The market is getting crowded and noisy

Let’s go back to 2014 when ‘data science’ (DS) was the new shiny thing. Newspapers were writing that it was the sexiest job of the 21st century.

What happened to data science?

Everyone and their dog turned into a data scientist.

Do you have any relationship whatsoever to data or science? Congrats, you are one. All it takes is to change your title on LinkedIn.

A friend told me that for the last DS position they published in their company (a low profile Valley company you have never heard of) they got 1000 applications. They cannot process this without 100s of hours from the hiring manager. Not practical. The field is toast.

How did we get there?

  • Little to no barriers to entry. Nobody can say you are not a data scientist if you claim to be one. Until you get the job and tank big time. Lots of wasted effort everywhere. But hey, it was so easy to change your LinkedIn title.
  • Lots of MOOCs, books, papers, posts… you gotta catch them all. It looks good on LinkedIn. What did you say? You barely remember what you watched? You have zero real experience? You cannot for the life of you deliver value to the business that hired you? Bah! Details!
  • Recruiters and hiring managers can’t tell the difference. Interviews are erratic at best. Nobody really knows what they need, and even less if you have it.

This is a perfect recipe for lots of noise, little signal, and a search process that looks interminable.

Well, it’s starting to happen for Deep Learning too.

  • MOOCs? Check!
  • Github examples where you can fill the blanks and feel like you accomplished something? Check!
  • Paid-for certifications, where the questions have four choices, and you have three chances to answer? Check!

Fortunately, Deep Learning is harder. It’s easier for the untrained eye to see the results, and there’s a higher expectation for your engineering skill. It could be we will never reach the shit signal/noise ratio of DS.

Sending your resume to a black hole is a bad idea

This is obvious, but I’m amazed at how many people do precisely this and expect results. ‘Blackholing’ fails in all fields, not only Deep Learning. Instead, find a person who already works at the company of your dreams, try to meet face to face, and impress the hell out of them. Then ask them if they could give your resume to the right person.

Attend meetups. Go wherever they are. Don’t spend your days pressing the ‘apply’ button on websites or LinkedIn. You are one more on a pile of resumes.

Better yet: make something noteworthy, and make noise online about it. This is also called ‘having a good portfolio project.’ The more I talk to companies interviewing today, the more apparent it is: A portfolio project is decisive when making hiring judgments.

Jeremy Howard recommends it.

Andrew Ng recommends it.

Why? It’s far better at discriminating talent than any other proxy.

It takes a year full-time to get good enough

I keep asking my Twitter followers if they know someone who, by just doing MOOCs, got to a pro level and got a job in the industry.

Crickets.

I made it my mission to find people who went from ‘solo MOOC learner to pro’. I’ve found only one (I interviewed her in our DLR podcast). She was a senior engineer already in big data, the kind that gives talks at conferences and had a track record. It took her a full year of self-study. And she could pass a serious ‘data engineer’ interview in her sleep before she started.

She told me she expects to see more success stories like hers in the future. The MOOCs and blog posts are just too good. The more people have the baseline willpower it takes, the more we will hear hero stories like hers.

CS questions (algorithms and data structures) are one more thing you need to be good at

You are not going to like this. If you want a ‘normal’ engineering job at a big N company, you have 2-3 months of studying ahead of you just to get in the game. That is after you have done a full CS degree. Don’t believe me? Go to reddit.com/r/cscareerquestions. You will find stories of people studying for six months and not passing the interview.

The bar is high for being a ‘normal’ engineer. Maybe they will expect less of you, given that you had to study deep learning (far harder) for a full year?

Nope. You are still an engineer! You need to write production code.

This one could be somewhat good news for you.

You know this kid who stayed in academia for five years getting a Ph.D.? He’s a serious competitor of yours, right? Well. The good news is that he doesn’t know jack shit about writing production code. And because he’s so smart, he thinks that preparing for the algorithms and data structures questions is beneath him. He will get obliterated in a fair interview. Unfortunately, not all interviews are fair, and he may pass, only to have his team members teach him how to be an engineer in the first six months of the job. At a severe productivity cost for the team.

You will bomb interviews, even if you studied hard

Listen, the people interviewing right now often come from academia. They have strong opinions about what you should know. Those opinions may be biased by the thing they spent five years working on (their Ph.D.). They will be horrified that you don’t remember some piece of arcana that was crucial in their lab, but a non-issue everywhere else. They will want to see you writing equations on a whiteboard. They will want you to laugh at some inside joke (have you ever been ‘Schmidhubered’?). They may raise an eyebrow that you don’t have any published papers.

You may find a different persona interviewing you. Maybe he’s a mega-engineer, who was the right hand of Jeff Dean for a term. He will crucify you on a different set of topics.

We are all biased as humans. We all want to do our best, most objective selection. But it’s not easy.

The result is that even with a full year of preparation, you will still bomb interviews hard.

Do not despair.

Interviewing is an art, and nowadays you have to work hard at it.

Fortunately, courses like the Deep Learning Retreat exist, and so people who have the prerequisite skills can further develop those skills and get a job in Deep Learning.