Fast-growing startups usually face a common set of challenges when employing machine learning. Data scientists are expected to work on new products and develop new models as well as iterate on existing ones. Once in production, models should be continuously monitored and regularly maintained as the infrastructure evolves. Before long, data scientists end up spending most of their time maintaining and firefighting existing models instead of creating new ones.
At GetYourGuide, we faced these challenges and decided to think about machine learning development holistically, which led us to our machine learning platform. Our platform uses MLflow to track our machine learning lifecycle and ease the development experience. To integrate our models into our production environment, we also need to deal with additional requirements such as API specification, SLOs, and monitoring. To empower our data scientists, we have built a templating system that takes care of the heavy lifting of going to production, leveraging software engineering tools and ML-specific ones like BentoML.
In this talk we will present:
– Our previous approaches for deploying models and their tradeoffs
– Our data science and platform principles
– The main functionalities of our platform
– A live demo to create a new service
– Our learnings in the process
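The templating system mentioned in the abstract is not shown in this transcript. As a rough illustration of the idea only, the sketch below renders a service configuration skeleton from a few project-specific values using Python's standard `string.Template`. The template fields (`service`, `model_uri`, `slo`, `monitoring`), the function `render_service_config`, and the example values are all hypothetical and are not taken from the GetYourGuide platform.

```python
from string import Template

# Hypothetical skeleton: the real templates used by the platform are not shown in the talk.
SERVICE_TEMPLATE = Template('''\
service: $service_name
model_uri: $model_uri
slo:
  p99_latency_ms: $p99_latency_ms
monitoring:
  dashboard: ${service_name}-dashboard
''')


def render_service_config(service_name: str, model_uri: str, p99_latency_ms: int = 200) -> str:
    """Fill the skeleton with project-specific values."""
    # Reject names that would not be safe to reuse in derived resource names.
    if not service_name.replace("-", "").isalnum():
        raise ValueError(f"invalid service name: {service_name!r}")
    return SERVICE_TEMPLATE.substitute(
        service_name=service_name,
        model_uri=model_uri,
        p99_latency_ms=p99_latency_ms,
    )


# Illustrative values only.
config = render_service_config("iris-classifier", "models:/iris/1")
print(config)
```

The point of such a template is that the data scientist supplies only the values that differ between projects, while the parts that should be uniform (SLO fields, monitoring wiring) come from one reviewed skeleton.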
5. We’ve built the world’s largest marketplace for travel activities…
Millions of travelers use GetYourGuide every year
We offer more than 40,000 activities worldwide
We facilitate the transaction, connecting customers to suppliers around the world
10. Other Data Products
Image caption: Amsterdam: Canal Cruise
We also use ML for:
• Demand forecasting
• Inventory labeling
⇒ 20+ ML projects distributed across 2 teams, plus models delivered to other teams to maintain
11. Data Product Principles
Quality, Testing & Monitoring
• We follow clean code principles
• PoCs are temporary
• We build solid, resilient deployment processes
• We know our models’ health at every point in time
Engineering
• We integrate into the engineering ecosystem and leverage open-source
• Data workflows are efficient and cost-effective
• We take reproducibility and modularity seriously
Stakeholder Engagement
• We promote the Data Product mindset
• We deeply integrate with data stakeholders
Workflow
• We actively manage the unknowns in our planning
• Data analytics dynamically informs our project plans
Strategy
• Customer and business value over fancy solutions
• Exploration is one of our goals
Model
• We value small iterations on existing models
• We value explainability over pure accuracy
• Performance is proven online
Our principles explained
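The principle of knowing a model's health at every point in time can be made concrete with a small check on live predictions. The sketch below is an assumption about what such a check might look like, not the monitoring used at GetYourGuide: it keeps a sliding window of recent predictions and flags the model when their mean moves too far from a baseline. The class name `PredictionMonitor`, the thresholds, and the example values are hypothetical.

```python
from collections import deque
from statistics import mean


class PredictionMonitor:
    """Tracks recent predictions and compares their mean with a baseline."""

    def __init__(self, baseline_mean: float, tolerance: float, window_size: int = 100):
        self.baseline_mean = baseline_mean
        self.tolerance = tolerance
        # A deque with maxlen drops the oldest value once the window is full.
        self._window = deque(maxlen=window_size)

    def record(self, prediction: float) -> None:
        self._window.append(prediction)

    def is_healthy(self) -> bool:
        if not self._window:
            return True  # nothing observed yet, so nothing to flag
        return abs(mean(self._window) - self.baseline_mean) <= self.tolerance


# Illustrative values only: predictions near the baseline, then a shift.
monitor = PredictionMonitor(baseline_mean=0.5, tolerance=0.1, window_size=50)
for _ in range(50):
    monitor.record(0.52)
healthy_before = monitor.is_healthy()
for _ in range(50):
    monitor.record(0.9)
healthy_after = monitor.is_healthy()
print(healthy_before, healthy_after)
```

A mean comparison is the simplest possible signal. A production setup would more likely export such values as metrics and alert on them through the existing engineering monitoring stack.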
13. How We Started
Pros
● Widely used by ML practitioners
● Good to start new projects & prototype
● Great visualization
Cons
● No proper version control
● No code reuse
● No automatic testing
14. A Major Improvement
Pros
● Tests included in library
● Version control with code review
● Maintainable projects
Cons
● No CI/CD
● No model tracking
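To illustrate what "model tracking" adds on top of a versioned library, here is a minimal sketch that records the parameters and metrics of each training run in a JSON Lines file and looks up the best run afterwards. It is a standard-library stand-in for a tracking tool such as MLflow and does not use or imitate MLflow's API. The function names `log_run` and `best_run`, the file layout, and all numbers are hypothetical.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


def log_run(log_path: Path, params: dict, metrics: dict) -> str:
    """Append one training run (parameters and metrics) to a JSON Lines file."""
    run_id = uuid.uuid4().hex
    record = {"run_id": run_id, "timestamp": time.time(), "params": params, "metrics": metrics}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return run_id


def best_run(log_path: Path, metric: str) -> dict:
    """Return the logged run with the highest value for the given metric."""
    with log_path.open(encoding="utf-8") as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return max(runs, key=lambda run: run["metrics"][metric])


# Made-up parameters and metrics, for illustration only.
log_path = Path(tempfile.mkdtemp()) / "runs.jsonl"
log_run(log_path, {"max_depth": 3}, {"accuracy": 0.91})
log_run(log_path, {"max_depth": 5}, {"accuracy": 0.94})
best = best_run(log_path, "accuracy")
print(best["params"])
```

Without a record like this, the link between a deployed model and the parameters that produced it lives only in someone's memory or notebook history.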
17. ML Platform Principles
• Maximize data scientists’ model ownership
• Reproducible machine learning
• Reuse most of our existing infrastructure
• Build incrementally
• Use open-source and open standards
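As one small, concrete reading of the reproducibility principle above, the sketch below shows two habits that make a training run repeatable: drawing randomness from an explicitly seeded generator, and fingerprinting the configuration so two runs can be compared. This is an assumption about what reproducibility can mean in practice, not a description of the platform. The helpers `config_fingerprint` and `train_test_split` and the config keys are hypothetical.

```python
import hashlib
import json
import random


def config_fingerprint(config: dict) -> str:
    """Short, stable hash of a configuration, independent of key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def train_test_split(rows: list, test_fraction: float, seed: int) -> tuple:
    """Shuffle with a local, seeded generator so the split is repeatable."""
    rng = random.Random(seed)  # local RNG: does not touch global random state
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Illustrative configuration and data.
config = {"seed": 42, "test_fraction": 0.2}
rows = list(range(10))
first = train_test_split(rows, config["test_fraction"], config["seed"])
second = train_test_split(rows, config["test_fraction"], config["seed"])
print(first == second, config_fingerprint(config))
```

Storing the fingerprint next to the trained model makes it cheap to check whether two models were trained from the same settings.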
24. From MLflow to BentoML Example

# iris_classifier.py
import pandas as pd
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact


@artifacts([SklearnModelArtifact("model")])
class IrisClassifier(BentoService):
    """A minimum prediction service"""

    # Some BentoML 0.x releases may also require batch=True for DataframeInput
    @api(input=DataframeInput())
    def predict(self, df: pd.DataFrame):
        """An inference API named `predict`"""
        return self.artifacts.model.predict(df)

# mlflow2bentoml.py
import mlflow.sklearn
from iris_classifier import IrisClassifier

# model_uri must point to a logged MLflow model; its value is not given on the slide
# Load mlflow model
mlflow_model = mlflow.sklearn.load_model(model_uri)
# Create an iris classifier service instance
iris_classifier_service = IrisClassifier()
# Pack the newly trained model artifact
iris_classifier_service.pack("model", mlflow_model)
# Save the prediction service for model serving
saved_path = iris_classifier_service.save()
27. Learnings
● The integration with the existing architecture fosters proactive collaboration
● The tool space is very new, making exploration vital but time-consuming
● A design review is necessary to align everyone
28. Conclusion
Image caption: Amsterdam: Moco Museum
● As we grew, we needed to refine our data science process
● Software engineering good practices + a special twist for ML
● Our platform helps data scientists to
○ Build faster
○ Deploy safer
○ Document automatically