2. Disclaimer
This is not a description of the interview process at OLX
All similarities are purely coincidental :-)
3. Plan
● Job search
● Data science interview
○ Initial call with recruiter
○ Screening: theory
○ Screening: coding
○ Home assignment
○ Case study interview
○ System design interview
○ Behavioural interview
● After the interview
4. Job search
● Network: get on LinkedIn, connect with people
● Apply: find jobs (LinkedIn, Glassdoor) and apply
● Build: do pet projects, share them on the internet
● Interview: note what’s asked, learn it better
● Repeat
Heavily inspired by http://brohrer.github.io/get_data_science_job.html
5. Network
● Go to LinkedIn
● Connect with everyone you know
● Connect with people you don’t know
○ People from your university, bootcamp
○ People who work in your city
○ People who work in your field
● Engage
○ Ask how they landed the job
○ Ask if they have open positions
○ Share content
6. Apply
● LinkedIn
● Glassdoor
● BerlinStartupJobs
● Stack Overflow Jobs
● StepStone
● XING
● Anywhere else you can find
Apply even if you don’t have some of the required skills
7. Apply
Who We Are Looking For
● A strong understanding of the state of the art in machine learning methods and in-depth Data Science
techniques for practical application.
● Significant experience in the tech industry.
● Excellent educational background; plus if holding an advanced degree in Data Science, Computer
Science, Mathematics, NLP, Computational Linguistics, Physics, or similar quantitative field.
● Advanced engineering abilities to deliver flexible and scalable end-to-end machine learning solutions.
● Great coding skills in Python and knowledge of data libraries such as sklearn, NLTK, and pandas.
Knowledge of Deep Learning frameworks like Tensorflow or Keras is a plus.
● Exceptional written and verbal communication skills, with an ability to listen and show empathy.
● A self-organized individual, with excellent focus and prioritization of workload using business data and
metrics.
● A role model who enjoys mentoring team members.
Randomly selected job listing from LinkedIn
8. Apply
This is a perfect candidate who doesn’t exist!
Randomly selected job listing from LinkedIn
11. Build
Note the skills — and build pet projects to learn these skills
Get an idea ⇒ Train a model ⇒ Share code on GitHub ⇒ Write a blog post
12. Interview
● Don’t worry!
○ Remember, it’s not an exam
○ You are interviewing the company too, not only the other way around
● Do as many interviews as you can
● After each interview:
○ Write down what was asked and your answers
○ Do a retrospective: am I satisfied with my answers? What could I do better next time?
● Rejections are normal — don’t take them personally
○ There are a million reasons why they may have decided not to continue
○ Hired another candidate, sudden hiring freeze, company went bankrupt
○ Just keep interviewing
14. Data science interview
● Initial call with recruiter
● Screening: theory
● Screening: coding
● Home assignment
● Case study interview
● System design interview
● Behavioural interview
(not necessarily all of them and not necessarily in this order)
15. Initial call with recruiter
● General introduction
○ What the company is doing
○ What the position is about
● Tell me about yourself
● Recruiters aren’t technical, but they can ask:
○ “Have you used X?”
○ “How many years of experience do you have with Y?”
○ “Do you know Z?”
● What are your salary expectations?
● Do you have any questions?
16. Initial call with recruiter
Tell me about yourself
● Prepare a small introduction (a few sentences)
● Know it by heart
20. Initial call with recruiter
What are your salary expectations?
“I want X EUR / year”
“Let’s talk about it later”
● When you start in a new field and don’t know the market
● When you relocate to a new city and don’t know the market
● When you already have a job
In any case, you should have a number in mind
22. Screening: theory
● Theoretical Data Science and Machine Learning questions
● Examples
○ What is linear regression?
○ What is overfitting?
○ What is the ROC curve? When to use it?
○ What is random forest?
○ What’s the difference between random forest and gradient boosting?
● Remember: this is a screening interview
○ No need for an in-depth answer
○ A simple explanation is enough
23. Screening: theory
What if you don’t know something?
● Say so. The interviewer will give hints or move on
● Don’t invent things!
29. SQL
You have this schema.
Write queries to calculate:
● The number of active ads (status=active)
● The number of events per ad
○ broken down by event type
● The number of events per campaign
○ broken down by event type
● CTR (click-through rate) for each ad
○ CTR = number of clicks / number of impressions
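The schema itself is not reproduced in this text, so the sketch below assumes a hypothetical one: a table ads(ad_id, campaign_id, status) and a table events(event_id, ad_id, event_type) with event types 'impression' and 'click'. Under that assumption, the four queries could look roughly like this (run through Python's built-in sqlite3 so the example is self-contained). CTR is computed with the usual definition, clicks divided by impressions.

```python
import sqlite3

# Hypothetical schema and toy data; the real schema from the slide is not available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (ad_id INTEGER PRIMARY KEY, campaign_id INTEGER, status TEXT);
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, ad_id INTEGER, event_type TEXT);
    INSERT INTO ads VALUES (1, 10, 'active'), (2, 10, 'active'), (3, 20, 'inactive');
    INSERT INTO events (ad_id, event_type) VALUES
        (1, 'impression'), (1, 'impression'), (1, 'impression'), (1, 'impression'),
        (1, 'click'),
        (2, 'impression'), (2, 'impression'), (2, 'click'),
        (3, 'impression');
""")

# 1. Number of active ads
active_ads = conn.execute(
    "SELECT COUNT(*) FROM ads WHERE status = 'active'"
).fetchone()[0]

# 2. Number of events per ad, broken down by event type
events_per_ad = conn.execute("""
    SELECT ad_id, event_type, COUNT(*) AS n
    FROM events
    GROUP BY ad_id, event_type
    ORDER BY ad_id, event_type
""").fetchall()

# 3. Number of events per campaign, broken down by event type
events_per_campaign = conn.execute("""
    SELECT a.campaign_id, e.event_type, COUNT(*) AS n
    FROM events e
    JOIN ads a ON a.ad_id = e.ad_id
    GROUP BY a.campaign_id, e.event_type
    ORDER BY a.campaign_id, e.event_type
""").fetchall()

# 4. CTR per ad = clicks / impressions (NULL when an ad has no impressions)
ctr_per_ad = conn.execute("""
    SELECT ad_id,
           1.0 * SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END)
               / NULLIF(SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END), 0) AS ctr
    FROM events
    GROUP BY ad_id
    ORDER BY ad_id
""").fetchall()

print(active_ads, events_per_ad, events_per_campaign, ctr_per_ad)
```

In an interview it is worth saying out loud how you handle ads with zero impressions; here NULLIF turns the division by zero into NULL.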
30. Coding (Python)
Coding tasks to check the basics of Python. For example:
● You have a bunch of IDs in the form “identifier-SITE”
● Count how many times each SITE appears
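One possible solution, as a minimal sketch: it assumes SITE is everything after the last hyphen (so identifiers may contain hyphens themselves). The function name count_sites and the sample IDs are made up for illustration.

```python
from collections import Counter


def count_sites(ids):
    """Count how often each SITE appears in IDs of the form 'identifier-SITE'."""
    # rsplit with maxsplit=1 splits on the LAST hyphen only,
    # so 'x-y-siteA' still yields the site 'siteA'.
    return Counter(item.rsplit("-", 1)[1] for item in ids)


# Hypothetical sample data
ids = ["a1-siteA", "b2-siteB", "c3-siteA", "x-y-siteA"]
counts = count_sites(ids)
print(counts.most_common())
```

A plain dict with counts.get(site, 0) + 1 works just as well; Counter is simply the shortest way to write it.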
32. Coding (Algorithms)
Note:
● Not many companies test data scientists for CS basics and algorithms
● Check with your recruiter whether you need to prepare for them
Preparation:
● LeetCode
● Write notes — to review later
33. Coding (Algorithms)
When practicing, keep notes: this way it’s easier to revise your solutions later
https://github.com/alexeygrigorev/leetcode-solutions/blob/master/solutions.md
37. Home assignments
Task to do at home
● A company gives a public (or private) dataset and a task
● You need to write some code to solve this task at home
38. Home assignments
Examples:
● Finding the most suitable answer (chatbot company)
● Predicting the price of a car (car reseller)
● Classifying an apartment listing as low, middle or high interest (classifieds)
40. Home assignment: example
● List as many use cases for the dataset as possible
● Pick one of the cases and describe how a model could improve our business
● Implement the model in R or Python: train and test a model, report relevant
performance metrics
● Explain each choice you made, compare it with alternatives
● Describe how you would improve the model if you had more time
41. Home assignments
Warning:
● Home assignments are very time consuming
● Sometimes, you spend time, develop the perfect solution… and get rejected
Watch out:
● Automated emails right after you apply (talk to a human first!)
● No clear answer regarding some things (it’s hard to read the reviewer’s mind)
● Difficult tasks that “could be solved in just 2 hours”
42. Home assignments
Pros:
● Another project in your portfolio
● A new library / model / etc
Cons:
● Time-consuming
● Often rejected with no sensible feedback
43. Home assignments: how to decline
Thank you for sending the assignment.
Unfortunately, I don’t have enough free time to finish this task in a reasonable timeframe.
However, if I had time, I’d (in a few sentences describe how you’d solve it).
I understand that it’s important for you to see the code of candidates before making the decision
to move forward. This is why I selected a few of my pet projects that look similar to your test case:
Project A and Project B. You can see them in my GitHub profile (link).
I would be happy if we still could continue our conversation because (reasons why you like this
company).
Handle with care: usually companies don’t agree to skip this step
44. Home assignment: defence session
● Prepare a presentation
● Be ready to explain how to continue and improve
My presentation during the interview at OLX
47. Case study
● No code: a discussion on how to approach a certain problem
● Usually happens on-site, rarely as a home assignment
● Checks how a candidate approaches data science projects
48. Case study
A typical data science project involves:
● Problem understanding and formulation
● Data collection and data transformation
● Model training and offline evaluation
● Model deployment and online evaluation
You should cover all these steps in the case study
(Check CRISP-DM!)
https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining
49. Case study
Example:
● Imagine we wanted to build a model for predicting car prices / customer churn. What would you do?
Sometimes more vague:
● We want to improve user engagement on our platform. How would you approach that?
50. Case study: hint
Suggest starting with a simple baseline and iterating
“How would you build a model for predicting car prices?”
● Start with the mean price per make/model
● Roll it out to production
● Then improve: use linear regression or tree-based models
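To make the baseline step concrete, here is a minimal sketch of a "mean price per make/model" predictor. The record fields (make, model, price), the function names and the toy data are assumptions for illustration; falling back to the overall mean for unseen make/model pairs is one reasonable choice, not the only one.

```python
from collections import defaultdict
from statistics import mean


def fit_baseline(cars):
    """Return (mean price per (make, model), overall mean price)."""
    prices = defaultdict(list)
    for car in cars:
        prices[(car["make"], car["model"])].append(car["price"])
    per_group = {key: mean(values) for key, values in prices.items()}
    overall = mean(car["price"] for car in cars)
    return per_group, overall


def predict(baseline, make, model):
    """Predict the group mean, or the overall mean for an unseen make/model."""
    per_group, overall = baseline
    return per_group.get((make, model), overall)


# Hypothetical toy data
cars = [
    {"make": "vw", "model": "golf", "price": 10000.0},
    {"make": "vw", "model": "golf", "price": 12000.0},
    {"make": "bmw", "model": "x5", "price": 30000.0},
]
baseline = fit_baseline(cars)
print(predict(baseline, "vw", "golf"))    # mean of the two Golf prices
print(predict(baseline, "audi", "a4"))    # unseen pair: overall mean
```

A baseline like this also gives you a number to beat: any later regression or tree-based model should be compared against it on the same evaluation metric.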
52. Case study: preparation
Learning from data science competitions
For each competition, answer:
● What is the problem?
● What kind of data did the organizers prepare?
● What are the features? What is the target?
● What is the evaluation metric?
● What did the winners use in their solution?
54. System design interview
Focus on engineering:
● Building pipelines
● Building the system
Examples:
● Design an imageboard (Instagram)
● Design a code snippet sharing system (Pastebin)
● Design a system for an online library
55. ML system design
● “Traditional” system design — for “traditional” software engineers
● ML system design — for ML engineers
Examples:
● Duplicate detection
● Spam detection
● A “smart” news feed
● Search autocomplete
● Serving deep learning models
56. “Design a system for dealing with duplicates”
[Diagram; labels: listing (“Such description, so much text”), ML, MP, automatic moderation system, duplicate detection system (hashes), Accept / Reject, moderation queue, moderation panel, moderators, s3, ES]
https://tech.olx.com/detecting-image-duplicates-at-olx-scale-7f59e4b6aef4
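The diagram on this slide mentions hashes. As a deliberately simplified sketch of that idea (not the design from the linked article), the snippet below flags exact duplicates by hashing the raw bytes of an image with SHA-256 and remembering which listing first used each hash. The class name, method name and listing IDs are hypothetical. Exact hashing only catches byte-identical files; resized or re-encoded images would need something like perceptual hashes or embeddings.

```python
import hashlib


class DuplicateDetector:
    """Exact-duplicate detection: maps a content hash to the first listing seen."""

    def __init__(self):
        self._first_seen = {}  # hex digest -> listing id that first used it

    def check(self, listing_id, image_bytes):
        """Return the id of the original listing, or None if the image is new."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        original = self._first_seen.setdefault(digest, listing_id)
        return None if original == listing_id else original


detector = DuplicateDetector()
first = detector.check("ad-1", b"image bytes A")
second = detector.check("ad-2", b"image bytes A")   # same bytes as ad-1
third = detector.check("ad-3", b"image bytes B")
print(first, second, third)
```

In a design discussion, the in-memory dict would be replaced by a shared store, and the trade-off between exact and near-duplicate matching is a good topic to raise with the interviewer.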
57. System design
● Collect as many requirements as possible from the interviewer
● Break down the system into components
● Explain trade-offs when selecting different options
○ e.g. relational vs NoSQL databases
● How will it scale if load increases 10 times? 100 times?
58. System design
How to prepare:
● Do this at work (pet projects are often not enough…)
● Look up “system design interview X”
○ “System design interview Twitter”
○ “System design interview Instagram”
○ …
● Go to conferences
● Read blogs
● Read the company’s tech blog
59. System design
● Not for junior positions!
● Not always for mid-level positions either
● Quite often for senior positions
62. Case studies vs System design
https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining
Case studies
System Design
64. Behavioural interview
● A way to see if there’s “cultural fit”
● Often done by the hiring manager or the recruiter
65. Behavioural interview
“Tell me about a time when you…”
● disagreed with your manager or colleague (or teammate for a student project)
● mentored somebody
● needed to go beyond your direct responsibilities
● were stuck, and nobody could help you
66. Behavioural interview
How to prepare:
● Do research — are the company’s values publicly available?
● For each value, think of 2-3 situations when you demonstrated it
Follow the STAR principle:
● Situation
● Task
● Action
● Result
67. Behavioural interview
STAR
● Situation: I had a problem with lib X, but nobody knew it well enough to help me
● Task: I needed to fix this problem because it was blocking my progress
● Action: I found an online course about it and took it
● Result: I fixed the problem and showed others how to do it
70. After the interview
You get an email from the recruiter:
● “Unfortunately, the hiring team decided not to move forward …”
● “Congratulations! You’ve passed all the stages of our interview process”
71. What if you failed?
● Failing an interview is also a good experience
● Remember to do a retrospective!
○ You learned the questions
○ Now you know where to focus
● Build a project around skills you lack. Share it on the internet
72. Offer
● Congrats!
● Don’t accept the offer immediately
● Tell all the other recruiters about the offer
● Finish all the other interviews you started
73. Salary negotiation
● Ideal situation: multiple offers
● Tell the companies about each other
A few notes:
● If you have only one offer, it’s very difficult to negotiate
● If you said you want X and you got X, you can’t do anything about it
74. Salary negotiation
Some reading:
● Career Advice and Salary Negotiations: Move Early and Move Often
● Ten Rules for Negotiating a Job Offer
76. Summary
● Looking for a job: Network ⇒ Apply ⇒ Build ⇒ Interview ⇒ Repeat
● Apply even if you think you're not qualified
● Do a retrospective after the interview
● Don’t take rejections personally
● Desired salary: you don’t have to state it upfront, but have a number in mind
● Practice theory and coding (SQL and Python)
● Home assignments are time-consuming, but they’re also a good opportunity to learn
● Learn from data science competitions
● Prepare stories in STAR format for behavioural questions
● Don’t accept the offer immediately, try to get more than one offer
78. Machine Learning Bookcamp
● Learn Machine Learning by doing projects
● Want it for free? Let me know
● (only for SPICED students on this call)
● http://bit.ly/mlbookcamp
● 40% off with code: “grigorevpc”