
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serving Platform powering Machine Learning @MakeMyTrip by Piyush Kumar

387 views


MakeMyTrip, India's #1 online travel platform with more than 70% of its traffic coming from mobile apps, embarked on a journey to revolutionize its customer experience by building a scalable, personalized, machine-learning-based platform that powers onboarding, in-funnel and post-funnel engagement flows such as ranking, dynamic pricing, persuasions, cross-sell and propensity models.

For a company like MakeMyTrip, the next wave of consumer growth is driven and powered by data products for personalization and context-aware mobile experiences. A better data architecture for ingesting user activity streams (events), processing them, and exposing data APIs provides the foundation for real-time feature generation for machine learning models.

Topics include:
* Why a common feature store: removing the dataset fragmentation caused by a usecase-by-usecase approach
* Productionizing ML via standardization: MetaConfigs & FeatureCatalog | Reducing Data-Tech Debt
* Developing a real-time serving store over Spark Streaming, Kafka, RocksDB, Akka HTTP Data APIs
* Lifecycle of feature generation | Online (Near Real-Time) & Historical (Batch) Compute
* Consistent Feature Engineering & Model Deployment for DSA: Data Science Automation

Technology we leverage: Kafka, Spark (Streaming, SQL), Scala, Python, AWS (S3, EMR, Glue and other services), DRUID, Hive, Presto, Cassandra, RocksDB, Redis, Akka HTTP

Published in: Technology


  1. MetaConfig driven FeatureStore@MakeMyTrip
     ~/Piyush, Head Data Platform Engineering
     Namasté
  2. About MakeMyTrip
  3. Deliverables of this presentation:
     - Why common feature store?
     - Productionizing ML via standardization
     - Machine Learning Life Cycle
     - Prediction Serving + Challenges
     - FeatureStore Components
     - Architecture
     - Tools
     - Next Steps
     - References
  4. Motivation
     Developing a Unified Personalization platform for improving the customer experience of millions of Indian travellers
     Business Goal: Through Hyper Personalization
     ● Raise Engagement
     ● Drive Conversions + Boost Revenue
     ● Migrating Business Rule Engines to ML Models (across different LOBs @MakeMyTrip)
     Tech Goal:
     ● Machine Learning models are only as good as the data they are trained on, so they need good Data Management.
     ● ML systems are trained on a set of features. A feature is an input to a model: it can be a column in a dataset, a complex computed metric, or the output of another model.
     ● Feature Store is a central, common repository for highly curated features that are described through well-structured configuration. It enables us to scale machine learning workflows @MakeMyTrip.
  5. Before Feature Store : state of data platform
     ● Siloed Data Sets + Serving APIs created per use-case / project, leading to complex data pipelines | Machine Learning, if not implemented in the right manner, creates high tech debt
       ○ Personalization : Cosmos
       ○ Customer Segmentation : HYDRA
       ○ Hotel Ranking / Sequencing + Intendo
       ○ DP : Dynamic / Differential Pricing : Hotel & Flights
       ○ Anomaly Detection, Destination trends, Demand Anomalies
     ● Real-time features require Data Engineering support for Data Scientists
     ● Lack of standardization & discovery : the same feature definition is duplicated across different data pipelines and computed multiple times, so a change to the definition means fixing every pipeline.
     ● Features used in training and serving were inconsistent
  6. Productionizing ML via Standardization
     ● MetaConfigs & Feature Catalog : Documentation
     ● Reusability of features across projects / teams
     ● Standardized access of features between Training & Serving | Data Governance + Data Quality
     ● More Self-serve : Reduces Data Scientist Time on DE Tasks
     ● Reduce Time to get to Production for ML Projects
     ● Reduce Data Tech-Debt & Improved Feature Quality
     [Diagram labels: Feature Store : Online + Historical; Data Store 1, Data Store 2, Data Store N (Raw Data); Data Sets 1, Data Sets N (Structured Data); Feature Engineering; MODEL : TRAINING + DEPLOY]
  7. Machine Learning Life Cycle
     [Figure: ML LifeCycle. Image source : UCB RISE LABs]
     Addition : FEATURE PIPELINES
  8. Prediction serving
     - ASK : 10-30 ms / < 30 ms
     - Challenges : DNN : Complex models
     - Hardware : GPUs / TPUs
     - SageMaker provides an abstraction / middle layer between applications and complex models through docker containers
     - Online : SageMaker Endpoints
     - Batch : Scoring : Pre-materialize predictions into a low latency store (like redis cluster / BoulderDB)
     - Problems :
       - Requires substantial computation and space
       - Example : doing the scoring for all customers
       - Costly update -> rescore everything
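The batch option on this slide, pre-materializing predictions into a low latency store, can be sketched as below. This is a minimal illustration under stated assumptions, not MakeMyTrip's implementation: a plain dict stands in for the redis cluster / BoulderDB, and `toy_model`, `materialize_predictions` and `serve` are hypothetical names.

```python
from typing import Callable, Dict, List

Features = List[float]


def toy_model(features: Features) -> float:
    """Hypothetical stand-in for an expensive model: the mean of its inputs."""
    return sum(features) / len(features)


def materialize_predictions(
    model: Callable[[Features], float],
    customers: Dict[str, Features],
) -> Dict[str, float]:
    """Score every customer offline and return an id -> score table.

    The returned dict plays the role of the low latency store (assumption).
    """
    return {customer_id: model(f) for customer_id, f in customers.items()}


def serve(store: Dict[str, float], customer_id: str, default: float = 0.0) -> float:
    """Online path: a key lookup only, with no model call."""
    return store.get(customer_id, default)


# Illustrative data, not taken from the talk.
customers = {"u1": [1.0, 3.0], "u2": [2.0, 2.0, 5.0]}
store = materialize_predictions(toy_model, customers)
```

The online path is a cheap lookup, but any change to the model means rerunning `materialize_predictions` over every customer, which is the "rescore everything" cost listed under Problems.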
  9. FeatureStore Glossary
     Feature : a measurable property of a phenomenon under observation, defined in FSConfig
     FSConfig : used for storing config / DSL + code to compute features, feature version information, feature analysis data and feature documentation
     FSCompute : Computation Engine developed over SPARK; supports most of the Spark APIs for Historical and Online (Streaming) compute
     FeatureStore : serves as a repository of features that can be used for training and evaluation of machine learning models
     FeatureGroup : internal to the system; groups the compute jobs of related features that share the same entity, input data sources and filter conditions, thereby optimizing the compute process
     FSScheduler : internal service that creates a feature DAG (with Dependency Resolution) and triggers its execution while handling retries and back pressure
     FS-DSA : Data Science Automation for Model Training + Deployment, integrated with Feature Store | Enables versioned and reproducible experiments
     FSBrokerAPI : Online Serving RESTful API endpoint for consumer applications
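The dependency resolution that the glossary attributes to FSScheduler can be illustrated with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the feature names and their dependencies are hypothetical, and retries and back pressure are deliberately left out.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return feature names ordered so each feature comes after its inputs.

    `dependencies` maps a feature to the set of features it reads from.
    graphlib raises CycleError if the definitions contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical feature DAG: a rank feature derived from two count features.
deps = {
    "listing_conversion_rank": {"listing_conversions", "listing_views"},
    "listing_conversions": set(),
    "listing_views": set(),
}
order = execution_order(deps)
```

A scheduler would then trigger compute jobs in this order, or in parallel for features whose inputs are already complete.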
  10. FeatureStore Components & Data Flow
      [Architecture diagram. Stages: DATA CAPTURE; COMPUTE + FSConfig; SERVING + STORAGE]
      [Diagram labels: User Funnel Activity Streams (Client-Side, Server-Side); Transactional Data; Booking Master; Master Datastore (Product Master, User Master, Device Master); New Data Streams; FSConfig : Feature Catalog; Feature Definitions; ML Automation; BT-Compute : BATCH Feature Compute Jobs; RT-Compute : Feature Compute; Job Scheduler; SERVING API; Offline Models; Online Models; Batch BULK API (DataFrame); Feature Storage : BoulderDB, REDIS; Sagemaker : TRAIN (Training + HPO), Deploy (Docker / Batch Transform)]
  11. FSConfig : Feature Definitions & Metadata
      Feature Name : <Entity>::<Feature_shortname>::<Data Time Interval>::<Refresh Frequency>::<Version>
      Entity : <UserID>_<profileType>
      Short Name : listing_conversion_rank
      Versioning : v2 + Process : RT/BT
      FeatureGroup : (System Generated ID) 8fda73d1_2eee_4cfc_a20f_e9afb178fbc3
        Entity : ["uuid", "profile_type"]
        Features [Array]
        Time Window (Refresh / Data - Time duration) : (ISO Time Interval) P1D
      Data Source [Array] : [user_master, txn_search]
        Data Store : GLUE/S3
        Database Name : blueshift
        Table Name : [user_master, txn_search]
      Data Sink : Serving [Array]
        Data Store : GLUE Catalog/S3/Redis/BoulderDb
        Database Name : rocksDB_<WAL Dir Path>
        Table Name : rocksDB_<columnFamily>
      Compute Logic
        DSL + Spark SQL : metric_expr, group_by_expr, filter_expr, window_function, window_function_alias
        Code (Python/Scala/Java) : GIT/Gerrit URI
        Model (sagemaker) / Embedding
      Environment : Production
      Workspace : Dev/Staging/Production
      Namespace : <Project Name>
      Apache LIVY + Databricks JOBs API Config
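The five-part feature name convention on this slide lends itself to a small parser. The sketch below is an assumption about how such names could be handled, not code from FSConfig: the `FeatureName` class and its field names are hypothetical, and the interval strings are kept as opaque text instead of being validated as ISO durations.

```python
from dataclasses import dataclass

SEPARATOR = "::"


@dataclass(frozen=True)
class FeatureName:
    """<Entity>::<Feature_shortname>::<Data Time Interval>::<Refresh Frequency>::<Version>"""

    entity: str
    short_name: str
    data_interval: str
    refresh_frequency: str
    version: str

    @classmethod
    def parse(cls, raw: str) -> "FeatureName":
        """Split a full feature name into its five components."""
        parts = raw.split(SEPARATOR)
        if len(parts) != 5 or not all(parts):
            raise ValueError(f"expected 5 non-empty '::'-separated parts: {raw!r}")
        return cls(*parts)

    def __str__(self) -> str:
        return SEPARATOR.join(
            [self.entity, self.short_name, self.data_interval,
             self.refresh_frequency, self.version]
        )


# Example name in the same style as the schema shown on the FS Store slide.
name = FeatureName.parse("uuid_profileType::listing_conv_rank::P30D::P15M::v1")
```

Keeping the version as part of the name means two versions of the same feature can coexist in the store without colliding.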
  12. FS Store | online + historical
      Output Schema (internal to the system)
      ● Historical Feature Data schema on S3 Parquet
        |-- entity: string (nullable = false)
        |-- uuid_profileType::listing_conv_rank::P30D::P15M::v1: long (nullable = false)
        |-- uuid_profileType::listing_view_rank::P30D::P15M::v1: long (nullable = false)
        |-- uuid_profileType::cnt_distinct_bk_bankid::P30D::P15M::v1: map (nullable = false)
        |    |-- key: string
        |    |-- value: integer (valueContainsNull = true)
        .. .. All features in that feature group
      ● Online Serving Data Schema on REDIS + BoulderDB
        ○ Serving at Feature Group level
          Key -> <Entity_id>#<Feature_group_id>/<Feature_split>
          Value -> Hashes
            key -> Feature_name
            Value -> Feature_value
            TimeStamp -> Compute_Processed_Time
        ○ Serving at Feature Level
          Key -> <Entity_id>#<Feature_name>
          Value -> Hashes
            key -> Feature_name
            Value -> Feature_value
      SERVING Config
      - lambda (batch_feature_name linkage for RT features)
      - Support for linear QUERY DAGs
      - MVEL based post-processing on any feature per service/model if needed
      Feature backfill (back_fill_required, back_fill_duration)
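The two online key layouts on this slide can be sketched as plain string builders. Everything below is illustrative: `group_key`, `feature_key` and `InMemoryHashStore` are hypothetical names, the store is a dict-of-dicts that only mimics hash semantics (it is not a Redis or BoulderDB client), and the entity id, group id and values are made up.

```python
from typing import Dict


def group_key(entity_id: str, feature_group_id: str, feature_split: str) -> str:
    """Feature Group level key: <Entity_id>#<Feature_group_id>/<Feature_split>."""
    return f"{entity_id}#{feature_group_id}/{feature_split}"


def feature_key(entity_id: str, feature_name: str) -> str:
    """Feature level key: <Entity_id>#<Feature_name>."""
    return f"{entity_id}#{feature_name}"


class InMemoryHashStore:
    """Dict-of-dicts stand-in for a hash-valued key-value store (assumption)."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def hset(self, key: str, mapping: Dict[str, str]) -> None:
        """Merge field -> value pairs into the hash stored at `key`."""
        self._data.setdefault(key, {}).update(mapping)

    def hgetall(self, key: str) -> Dict[str, str]:
        """Return a copy of the hash at `key`, or an empty dict if absent."""
        return dict(self._data.get(key, {}))


store = InMemoryHashStore()
key = group_key("user42_personal", "fg1", "0")
store.hset(key, {
    "listing_conv_rank": "7",
    "listing_view_rank": "3",
    "TimeStamp": "2019-08-17T10:00:00Z",  # stands in for Compute_Processed_Time
})
```

Serving at Feature Group level returns every feature of the group in one read, while the Feature level layout trades more keys for finer-grained reads.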
  13. FS-BrokerAPI : Online Feature Serving Framework
      [Diagram layers: REQUEST HANDLER; Orchestration Layer; Data Access Layer]
      [Diagram labels: <uri>/v1/getFeatures (POST Request); AKKA (Actors); Request Validations; Feature Definition; Request Handler; Orchestration + Broker; Business Logics + MVEL; Extractors; Transport; REDIS; Boulder DB; FeaturesbyName; FeaturesbyModel; FeaturesbyService]
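The slide names three lookup modes (FeaturesbyName, FeaturesbyModel, FeaturesbyService) but does not show the request format, so the sketch below is purely hypothetical: the request fields (`entity_id`, `features`, `model`, `service`), the two registries and the dict-backed store are all assumptions made to illustrate validation followed by name resolution and lookup.

```python
from typing import Dict, List, Optional

# Hypothetical registries mapping a model or service to the features it needs.
FEATURES_BY_MODEL: Dict[str, List[str]] = {
    "hotel_ranker": ["listing_conv_rank", "listing_view_rank"],
}
FEATURES_BY_SERVICE: Dict[str, List[str]] = {
    "hotel_listing": ["listing_view_rank"],
}


def resolve_feature_names(request: Dict) -> List[str]:
    """Turn a by-name, by-model or by-service request into feature names."""
    if "features" in request:
        return list(request["features"])
    if "model" in request:
        return list(FEATURES_BY_MODEL[request["model"]])
    if "service" in request:
        return list(FEATURES_BY_SERVICE[request["service"]])
    raise ValueError("request needs one of: features, model, service")


def get_features(request: Dict, store: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Validate the request, resolve names, then read <Entity_id>#<Feature_name> keys."""
    if not request.get("entity_id"):
        raise ValueError("entity_id is required")
    entity_id = request["entity_id"]
    names = resolve_feature_names(request)
    return {name: store.get(f"{entity_id}#{name}") for name in names}


# Dict stand-in for the online store; values are illustrative.
store = {"u1#listing_conv_rank": "7", "u1#listing_view_rank": "3"}
by_model = get_features({"entity_id": "u1", "model": "hotel_ranker"}, store)
by_name = get_features({"entity_id": "u1", "features": ["listing_conv_rank"]}, store)
```

Resolving by model or service keeps callers from hard-coding feature lists, so a model's inputs can change in one registry entry.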
  14. BoulderDB : Online Serving Store
      - Built on top of RocksDB (embedded data store developed by Facebook) : reduces the distance to data on the serving layer.
      - Post-processing steps added to the compute layer:
        - After processing data through SPARK (distributed), the BT-Compute layer writes SST files from the various executors into shared object storage : S3
        - The Spark dataframe is split into non-overlapping ranges : each split is sorted by KEY, then ingested into one SST file per partition / executor
      - Cluster coordinator : Consul
      - Atomic switching of DB snapshots
      - Data is sharded (helps with proximity by Namespace) and replicated (RF=2)
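The split step described on this slide, non-overlapping ranges with each split sorted by key, can be illustrated in plain Python. This is a single-machine sketch under stated assumptions: `split_into_sorted_ranges` is a hypothetical helper, the records are made up, and writing or ingesting actual SST files is out of scope. In Spark, the analogous step would likely be range partitioning followed by a sort within each partition.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]


def split_into_sorted_ranges(records: Dict[str, str], num_splits: int) -> List[List[Record]]:
    """Sort records by key and cut them into at most `num_splits` contiguous chunks.

    Because the chunks are slices of one sorted sequence, every key in a
    chunk is smaller than every key in the next chunk (no overlap), and
    each chunk is itself sorted, as bulk-loading sorted files requires.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    ordered = sorted(records.items())
    if not ordered:
        return []
    chunk = -(-len(ordered) // num_splits)  # ceiling division
    return [ordered[i:i + chunk] for i in range(0, len(ordered), chunk)]


# Illustrative key-value records keyed in the <Entity_id>#<Feature_name> style.
records = {"u5#f": "5", "u1#f": "1", "u3#f": "3", "u2#f": "2", "u4#f": "4"}
splits = split_into_sorted_ranges(records, 2)
```

Each resulting chunk would correspond to one file per partition / executor, and non-overlapping key ranges are what let such files be ingested without merging against each other.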
  15. Tools
  16. Next Steps
      - Feature Stats Visualization / Analytics & Monitoring // Feature Catalog
      - Seamless integration with Experimentation Framework
      - Per User Databases on top of feature-store for Personalization
      - Notebook integration : better Data Science tools for Data Scientists, with Python libraries
      - Perf Tools : Query Optimization & Analysis
  17. References
      - https://www.logicalclocks.com/feature-store/
      - https://eng.uber.com/scaling-michelangelo/
      - Airbnb : Zipline
      - HopsML + Hopsworks
      - Go-JEK : FEAST
      - The Design of Systems for Real-time Prediction Serving | DataEngConf SF '18
      - https://medium.com/makemytrip-engineering
  18. Piyush Kumar
      E : piyush.kumar@makemytrip.com
      W : www.makemytrip.com
      T : https://twitter.com/piykumar
      Thank you !!
