Real-Time Machine Learning with Pulsar Functions - Pulsar Summit NA 2021

In this talk I will present a technique for deploying machine learning models to provide real-time predictions using Apache Pulsar Functions. To provide a prediction in real time, the model usually receives a single data point from the caller and is expected to return an accurate prediction within a few milliseconds.

Throughout this talk, I will demonstrate the steps required to deploy a fully trained ML model that predicts the delivery time for a food delivery service based upon real-time traffic information, the customer's location, and the restaurant that will be fulfilling the order.


  1. Real-Time Machine Learning With Pulsar Functions. David Kjerrumgaard. © 2020 SPLUNK INC.
  2. Forward-Looking Statements
      During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make.
      In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
      Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.
  3. About Me
      • Author of “Pulsar in Action”
      • Apache Pulsar Committer
      • Principal Software Engineer and member of the Messaging-as-a-Service team at Splunk
      • Formerly the Director of Solution Architecture at Streamlio
  4. The Machine Learning Model Development Process
  5. The Machine Learning Model Lifecycle
      • Problem Identification: What is the business problem we are solving?
      • Data Acquisition: Data Extraction & Exploratory Analysis
      • Development: Feature Engineering, Model Building & Validation
      • Training: The ML Model is trained against realistic data
      • Deployment: The ML Model is deployed to production
  6. Model Deployment: Batch vs. Real-Time
      • Once a model is developed, it can be deployed in either batch (offline) mode or real-time (online) mode.
      • In batch mode, predictions are generated on a recurring schedule (e.g., hourly, daily) and stored in a database, where they are available to developers or end users.
      • Real-time machine learning refers to generating predictions on the fly as the data arrives, because up-to-date information is needed to make an accurate prediction.
      • Consider the estimated time to delivery that is generated whenever a user orders food through a food delivery service such as UberEATS. It would be impossible to generate a batch of these estimates beforehand.
  7. Problem Identification
  8. Problem Identification: What business problem are we solving? Provide an accurate estimate.
      • When a customer places an order with our food delivery business, we need to provide an accurate estimate of when the food will be delivered to the customer, and continuously revise this estimate.
      • To provide an accurate delivery time estimate, we need to accurately predict three things: when the food will be ready, how long it will take the driver to pick it up, and how long it will take the driver to deliver it.
  9. Problem Identification: What makes this problem so difficult?
      • Multiple variables and real-time conditions impact the accuracy of the prediction.
      • Every transaction involves three parties that must be paired together in order to complete the transaction.
      • We cannot precompute all possible combinations of customers, drivers, restaurants, etc.; the number of combinations grows combinatorially.
      • Even if we could, we could not account for current conditions such as traffic.
  10. Data Acquisition
  11. Data Acquisition: Data Exploration – Identify the available data sources
  12. Data Acquisition: Exploratory Analysis
      • We can calculate some relevant information about the meal preparation time from the existing data sources, such as:
      • The average meal preparation time for a given restaurant over a given timeframe, e.g., last hour, last 24 hours, last week.
      • The average meal preparation time for a given restaurant by a given day of the week or year, etc.
      SELECT RestaurantId, AVG(ReadyTime - PlacedTime) FROM FoodOrders WHERE PlacedTime > (CURRENT_TIME - 60000) GROUP BY RestaurantId;
      SELECT RestaurantId, AVG(ReadyTime - PlacedTime) FROM FoodOrders WHERE DAYNAME(OrderDate) = ? GROUP BY RestaurantId;
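The windowed average above can be tried out locally. The following is a minimal sketch using Python's built-in `sqlite3` module; the in-memory table, the millisecond timestamps, the one-hour window, and the function name `avg_prep_time_by_restaurant` are assumptions made for illustration and are not taken from the slides.

```python
import sqlite3

def avg_prep_time_by_restaurant(conn, now_ms, window_ms):
    """Average (ReadyTime - PlacedTime) per restaurant for orders placed inside the window."""
    rows = conn.execute(
        "SELECT RestaurantId, AVG(ReadyTime - PlacedTime) "
        "FROM FoodOrders WHERE PlacedTime > ? GROUP BY RestaurantId",
        (now_ms - window_ms,),
    ).fetchall()
    return dict(rows)

# Hypothetical sample data; all timestamps are assumed to be epoch milliseconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FoodOrders (RestaurantId TEXT, PlacedTime INTEGER, ReadyTime INTEGER)"
)
conn.executemany(
    "INSERT INTO FoodOrders VALUES (?, ?, ?)",
    [
        ("r1", 9_000_000, 9_600_000),   # 600,000 ms prep, inside the window
        ("r1", 9_500_000, 10_300_000),  # 800,000 ms prep, inside the window
        ("r1", 1_000_000, 5_000_000),   # placed before the window starts, ignored
        ("r2", 9_900_000, 10_200_000),  # 300,000 ms prep, inside the window
    ],
)

recent = avg_prep_time_by_restaurant(conn, now_ms=10_000_000, window_ms=3_600_000)
```

Passing the window start as a bound parameter keeps the same statement reusable for the last hour, the last 24 hours, or the last week.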
  13. Data Acquisition: Exploratory Analysis
      • We can calculate similar information about the amount of time it takes to pick up an order:
      • The average meal wait time for a given restaurant over a given timeframe, e.g., last hour, last 24 hours, last week.
      • The average meal wait time for a given restaurant by a given day of the week or year, etc.
      SELECT RestaurantId, AVG(PickupTime - ReadyTime) FROM FoodOrders WHERE PlacedTime > (CURRENT_TIME - 60000) GROUP BY RestaurantId;
      SELECT RestaurantId, AVG(PickupTime - ReadyTime) FROM FoodOrders WHERE DAYNAME(OrderDate) = ? GROUP BY RestaurantId;
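The day-of-week variant can be sketched the same way. `DAYNAME` is a MySQL function, so this illustration swaps in SQLite's `strftime('%w', ...)`, which returns `'0'` for Sunday through `'6'` for Saturday. The table layout, the ISO date strings, and the function name `avg_wait_by_restaurant` are assumptions for the example.

```python
import sqlite3

def avg_wait_by_restaurant(conn, weekday):
    """Average (PickupTime - ReadyTime) per restaurant for one day of the week.

    weekday follows SQLite's strftime('%w'): '0' is Sunday ... '6' is Saturday.
    """
    rows = conn.execute(
        "SELECT RestaurantId, AVG(PickupTime - ReadyTime) FROM FoodOrders "
        "WHERE strftime('%w', OrderDate) = ? GROUP BY RestaurantId",
        (weekday,),
    ).fetchall()
    return dict(rows)

# Hypothetical sample data; times are in seconds, dates are ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FoodOrders "
    "(RestaurantId TEXT, OrderDate TEXT, ReadyTime INTEGER, PickupTime INTEGER)"
)
conn.executemany(
    "INSERT INTO FoodOrders VALUES (?, ?, ?, ?)",
    [
        ("r1", "2021-06-14", 1000, 1300),  # Monday, 300 s wait
        ("r1", "2021-06-14", 2000, 2500),  # Monday, 500 s wait
        ("r1", "2021-06-15", 3000, 3900),  # Tuesday, excluded from the Monday average
        ("r2", "2021-06-14", 1000, 1100),  # Monday, 100 s wait
    ],
)

monday_wait = avg_wait_by_restaurant(conn, "1")
```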
  14. Data Acquisition: Exploratory Analysis
      • Since some items take longer to prepare than others, we can calculate the prep time by menu item to capture this:
      • The average meal preparation time for a given restaurant by menu item over a given timeframe, e.g., last hour, last 24 hours, last week.
      SELECT RestaurantId, MenuItemId, AVG(ReadyTime - PlacedTime) FROM FoodOrders WHERE PlacedTime > (CURRENT_TIME - 60000) GROUP BY RestaurantId, MenuItemId;
  15. Data Acquisition: Data Exploration – Identify missing data sources
      • Since we cannot know either the delivery address or the selected driver’s location in advance, we cannot use historical data to estimate the amount of time it takes to deliver an order.
      • Depending on a third-party mapping API to calculate these values dynamically at runtime is limited by cost and latency, so we can use an approximation instead.
      • Uber’s open-source Hexagonal Hierarchical Spatial Index, known as H3, allows us to break down a given region into hexagons.
      • We can then compute the average transit time between any two hexagon regions and use that value instead.
      https://eng.uber.com/h3/
  16. Data Acquisition: Data Exploration – Identify missing data sources
      • The primary function of the H3 library is to map latitude and longitude pairs to a unique 64-bit H3 index that identifies a grid cell.
      • Using a hexagon as the cell shape is critical for H3, since a hexagon has only one distance between its center point and its neighbors’ center points, compared to two distances for squares or three distances for triangles. This property greatly simplifies performing analysis and smoothing over gradients.
      https://eng.uber.com/h3/
  17. Data Acquisition: Data Exploration – Identify missing data sources
      • Each restaurant location is a fixed point with a known Hexagon ID, which we can use to calculate the average transit time between it and all the surrounding hexagons within a given radius (50 miles).
      • When an order comes in, we just need to map the delivery address to a Hexagon ID. Then we can do a simple lookup to get the average transit time between the two.
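The lookup described above can be sketched as a table keyed by a pair of cell IDs. To stay self-contained, the example does not call the H3 library: `cell_id` is a hypothetical stand-in that snaps a point to a coarse square grid, which is exactly the shape H3 avoids, so treat it only as a placeholder for a real latitude/longitude to H3 index mapping. The class and method names are also assumptions.

```python
from collections import defaultdict

def cell_id(lat, lng, cell_deg=0.01):
    """Stand-in for an H3 lookup: snaps a point to a coarse square grid cell (NOT a hexagon)."""
    return (int(lat // cell_deg), int(lng // cell_deg))

class TransitTimes:
    """Running average of observed transit times between pairs of cells."""

    def __init__(self):
        # (origin_cell, dest_cell) -> [sum_of_seconds, observation_count]
        self._totals = defaultdict(lambda: [0.0, 0])

    def record(self, origin, dest, seconds):
        entry = self._totals[(origin, dest)]
        entry[0] += seconds
        entry[1] += 1

    def average(self, origin, dest):
        """Return the average transit time in seconds, or None if the pair was never observed."""
        total, count = self._totals.get((origin, dest), (0.0, 0))
        return total / count if count else None

times = TransitTimes()
restaurant = cell_id(40.7128, -74.0060)
customer = cell_id(40.7306, -73.9866)
times.record(restaurant, customer, 600)
times.record(restaurant, customer, 900)

# A new delivery address that falls in the same cell reuses the stored average.
estimate = times.average(restaurant, cell_id(40.7301, -73.9869))
```

Because the restaurant cell is fixed, the expensive part (averaging historical trips) can run ahead of time, and the per-order work reduces to one cell mapping and one dictionary lookup.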
  18. Model Development
  19. Model Development: 10,000 Foot Level
      • Data Scientists perform this task using a variety of programming languages and model types. However, the two main outputs of this process are the model and its list of required inputs, known as a “Feature Vector”.
  20. Model Development: Development Toolkits and Model Type Selection
      • Data Scientists perform this task using a variety of programming languages and model types, and based upon their experience they will choose the best tool for the job.
      • Model Deployers have no control over this choice, so our approach must accommodate as many toolkits and model types as possible.
  21. Model Development: Feature Engineering
      • From an ML Model deployment perspective, we will be responsible for providing values for all the features in the Feature Vector.
      • For our delivery time estimation model, the following features are required in the feature vector:
      • Avg meal prep time by restaurant, by menu item (past 1 hour, past 24 hours, by hour of day, day of week)
      • Avg wait time between food ready and pickup by restaurant (past 1 hr/24 hr, HoD, DoW)
      • Avg transit time between pickup location and delivery location (past 1 hr/24 hr, HoD, DoW, etc.), using H3 grid cells to approximate the start and end points.
      • Delivery Driver location
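One way to make the deployer's responsibility concrete is to give the feature vector an explicit type, so that a missing value fails loudly before the model is evaluated. This is a sketch only: the field names are hypothetical, and it covers just a subset of the features listed above.

```python
from dataclasses import dataclass, asdict, fields

@dataclass(frozen=True)
class DeliveryFeatures:
    """Hypothetical subset of the delivery-time feature vector (all values in seconds)."""
    avg_prep_time_1h: float
    avg_prep_time_24h: float
    avg_wait_time_1h: float
    avg_wait_time_24h: float
    avg_transit_time_1h: float
    avg_transit_time_24h: float

    @classmethod
    def from_dict(cls, values):
        """Build the vector from looked-up values, rejecting incomplete input."""
        missing = [f.name for f in fields(cls) if f.name not in values]
        if missing:
            raise ValueError(f"missing feature values: {missing}")
        return cls(**{f.name: float(values[f.name]) for f in fields(cls)})

    def as_vector(self):
        """Return the values in declaration order, as a model would expect them."""
        return list(asdict(self).values())

features = DeliveryFeatures.from_dict({
    "avg_prep_time_1h": 600, "avg_prep_time_24h": 660,
    "avg_wait_time_1h": 120, "avg_wait_time_24h": 150,
    "avg_transit_time_1h": 900, "avg_transit_time_24h": 840,
})
```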
  22. Model Training
  23. Model Training: 10,000 Foot Level
      • Intuitively, we know that some of the feature values will have a greater impact on the prediction than others. This is captured by assigning a weight to each feature.
      • The goal of the model training process is to calculate the weight to assign to each of the features in order to provide more accurate predictions.
      • The output of the training process is a combination of a feature vector and a weight vector.
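For a linear model, combining the feature vector with the weight vector amounts to a weighted sum plus an intercept. The sketch below illustrates that arithmetic; the feature names, weights, and intercept are made-up values, not output from the talk's trained model.

```python
def predict(intercept, weights, features):
    """Linear prediction: intercept + sum of (weight * feature value) over every weighted feature."""
    missing = set(weights) - set(features)
    if missing:
        raise KeyError(f"no value supplied for features: {sorted(missing)}")
    return intercept + sum(w * features[name] for name, w in weights.items())

# Hypothetical weights; a larger weight means the feature moves the estimate more.
weights = {"avg_prep_time": 1.0, "avg_wait_time": 0.5, "avg_transit_time": 1.25}
features = {"avg_prep_time": 600.0, "avg_wait_time": 120.0, "avg_transit_time": 800.0}

estimate_seconds = predict(60.0, weights, features)  # 60 + 600 + 60 + 1000
```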
  24. Model Training: The Process
      • Select a subset of historical data to use for training and run it through the model to generate the predictions.
      • Compare the predictions to the actual results, in this case the estimated delivery time vs. the actual delivery time.
      • Adjust the features and/or weights and re-evaluate until the results are accurate.
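The predict, compare, and adjust loop can be illustrated with a toy example that scores candidate weight sets by mean absolute error against historical outcomes. This is a deliberately simplified sketch: real training would fit the weights with an optimizer rather than pick from a hand-written list, and the feature names and data are invented.

```python
def predict(weights, features):
    """Weighted sum of the feature values (no intercept, to keep the example small)."""
    return sum(weights[name] * features[name] for name in weights)

def mean_absolute_error(weights, history):
    """Average absolute gap between predicted and actual delivery times."""
    errors = [abs(predict(weights, feats) - actual) for feats, actual in history]
    return sum(errors) / len(errors)

def pick_best(candidates, history):
    """Re-evaluate every candidate weight set and keep the most accurate one."""
    return min(candidates, key=lambda w: mean_absolute_error(w, history))

# Hypothetical history: (feature values, actual delivery time in seconds).
history = [
    ({"prep": 600.0, "transit": 400.0}, 1000.0),
    ({"prep": 300.0, "transit": 500.0}, 800.0),
]
candidates = [
    {"prep": 1.0, "transit": 1.0},
    {"prep": 1.0, "transit": 0.5},
]

best = pick_best(candidates, history)
```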
  25. Model Training: Generating a Suite of Models
      • In our scenario, it was best to generate multiple models with different weights due to the irregular patterns of the business, with high demand spikes around traditional mealtimes, e.g., breakfast, lunch, and dinner.
      • Similarly, the impact of traffic congestion patterns can be accounted for via different weights associated with the transit time features, based on the time of day and the geographical region, e.g., New York, Chicago, and L.A.
      • The result is a collection of models, each trained on data from a specific time period and location, e.g., NY-Weekday-8am-10am, LA-Weekend-5pm-7pm, etc.
      • From an ML Model deployment perspective, we will be responsible for using the proper model for the given date and time.
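Selecting the proper model then comes down to deriving a key from the region and the order timestamp. The sketch below follows the naming pattern of the examples above (such as NY-Weekday-8am-10am); the list of time windows and the `default` fallback are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical time windows as (start_hour, end_hour) in 24-hour local time.
WINDOWS = [(8, 10), (11, 14), (17, 19)]

def fmt_hour(hour):
    """Format a 24-hour value as 8am, 12pm, 5pm, and so on."""
    suffix = "am" if hour < 12 else "pm"
    return f"{hour % 12 or 12}{suffix}"

def model_key(region, when, windows=WINDOWS):
    """Build the name of the model trained for this region, day type, and time window."""
    day_type = "Weekend" if when.weekday() >= 5 else "Weekday"
    for start, end in windows:
        if start <= when.hour < end:
            return f"{region}-{day_type}-{fmt_hour(start)}-{fmt_hour(end)}"
    return f"{region}-{day_type}-default"  # assumed fallback outside every window

monday_morning = model_key("NY", datetime(2021, 6, 14, 9, 30))
saturday_evening = model_key("LA", datetime(2021, 6, 19, 18, 0))
```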
  26. Model Deployment With Pulsar Functions
  27. Model Deployment: 10,000 Foot View
  28. Model Deployment: Why Pulsar Functions are a good fit
  29. Model Deployment: Generic Deployed ML Model Execution Pattern
      • Step 1: Retrieve the appropriate trained ML Model
      • Step 2: Initialize the appropriate execution environment with the ML Model
      • Step 3: Retrieve data for all of the Features defined in the Model’s Feature Vector
      • Step 4: Provide the data to the ML Model in order to produce the prediction
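The four steps can be sketched end to end with in-memory stand-ins. Nothing here uses the Pulsar Functions SDK: plain dictionaries play the role of the state store and the feature store, a JSON document stands in for the exported model, and the class name and `process(event)` signature are hypothetical.

```python
import json

class ModelFunction:
    """Illustrative skeleton of the four-step execution pattern."""

    def __init__(self, state_store, feature_store, model_key="ML-Model"):
        self.state_store = state_store      # stand-in for the function's state store
        self.feature_store = feature_store  # stand-in for a low-latency feature store
        self.model_key = model_key
        self._model = None

    def _load_model(self):
        if self._model is None:
            raw = self.state_store[self.model_key]  # Step 1: retrieve the trained model
            self._model = json.loads(raw)           # Step 2: initialize the execution environment
        return self._model

    def process(self, event):
        model = self._load_model()
        # Step 3: retrieve a value for every feature the model declares.
        features = {
            name: self.feature_store[(event["restaurant_id"], name)]
            for name in model["weights"]
        }
        # Step 4: evaluate the model against the populated feature vector.
        return model["intercept"] + sum(
            weight * features[name] for name, weight in model["weights"].items()
        )

state_store = {"ML-Model": json.dumps(
    {"intercept": 60.0, "weights": {"avg_prep_time": 1.0, "avg_transit_time": 1.5}}
)}
feature_store = {("r1", "avg_prep_time"): 600.0, ("r1", "avg_transit_time"): 400.0}

function = ModelFunction(state_store, feature_store)
prediction = function.process({"restaurant_id": "r1"})
```

Steps 1 and 2 run once and are cached, while Steps 3 and 4 run for every event, which is what keeps the per-event cost low.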
  30. Model Deployment
      • Trained models are sent to the Pulsar Function via the StateStore.
      • The execution engine runs inside the Function itself and is initialized with the ML Model.
      • The feature collection occurs when a triggering event arrives, allowing us to perform this time-consuming task independently and in parallel.
  31. Model Deployment, Step 1: Retrieve the appropriate ML Model
      • Trained models are exported to a transferable format such as PMML.
      • These exported models are “pushed” to the StateStore when the data science team decides they are ready for production.
      • Publishing the model to the StateStore is done via a REST call, so it can be executed as part of a CI/CD pipeline or scheduled to run periodically to change the model based on the time, which is what we need in our case.
  32. Model Deployment, Step 1: Retrieve the appropriate ML Model
      • For the delivery time estimation model, a linear regression model was used.
      • After the Data Scientist trained the model using an R-based notebook, they exported the model to the PMML format.
      • PMML is an XML representation of the model, which includes the feature vector, the weights, and the algorithm used.
  33. Model Deployment, Step 1: The Linear Regression Model in PMML Format
  34. Model Deployment, Step 1: Retrieve the appropriate ML Model
      • This PMML file was uploaded to the StateStore using a command like the following:
      ./bin/pulsar-admin functions putstate --name MyMLEnabledFunction --namespace MyNamespace --tenant MyTenant --state '{"key":"ML-Model", "byteValue": <contents of delivery-time-estimation.pmml>}'
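When the upload is scripted, for example from a CI/CD job, the state payload has to be assembled programmatically. The sketch below builds a JSON document with the same `key` and `byteValue` fields as the command above. It assumes that `byteValue` carries the file contents as a base64 string; that encoding is an assumption to verify against your Pulsar version, and the function name is hypothetical.

```python
import base64
import json

def build_state_payload(key, model_bytes):
    """Build a JSON state document; byteValue is base64-encoded (assumed encoding)."""
    return json.dumps({
        "key": key,
        "byteValue": base64.b64encode(model_bytes).decode("ascii"),
    })

# Tiny placeholder bytes standing in for the contents of the exported PMML file.
payload = build_state_payload("ML-Model", b"<PMML/>")
decoded = json.loads(payload)
```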
  35. Model Deployment, Step 2: Initialize the Execution Environment with the Trained Model
      • We know the key to use to retrieve the PMML file, and we choose the Java PMML evaluator library to execute the model:
      • Get the PMML from the StateStore
      • Parse the PMML
      • Use the correct model evaluator
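To show what parsing the PMML yields, the sketch below reads the intercept and coefficients out of a heavily abbreviated regression document using Python's standard XML parser. It is a hand-rolled stand-in for a PMML evaluator library, not the Java evaluator used in the talk, and the sample document omits sections that a complete PMML file would contain (header, data dictionary, mining schema). The feature names and values are invented.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical PMML document for a linear regression model.
PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="60.0">
      <NumericPredictor name="avg_prep_time" coefficient="1.0"/>
      <NumericPredictor name="avg_transit_time" coefficient="1.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

def local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def load_linear_model(pmml_text):
    """Return (intercept, {feature name: coefficient}) from the first RegressionTable."""
    root = ET.fromstring(pmml_text.strip())
    table = next(el for el in root.iter() if local(el.tag) == "RegressionTable")
    intercept = float(table.get("intercept", 0.0))
    weights = {
        el.get("name"): float(el.get("coefficient"))
        for el in table
        if local(el.tag) == "NumericPredictor"
    }
    return intercept, weights

def evaluate(intercept, weights, features):
    """Apply the parsed linear model to a dictionary of feature values."""
    return intercept + sum(w * features[name] for name, w in weights.items())

intercept, weights = load_linear_model(PMML)
estimate = evaluate(intercept, weights, {"avg_prep_time": 600.0, "avg_transit_time": 400.0})
```

A dedicated evaluator library remains the better choice in practice, since it handles the many other model types that PMML can describe.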
  36. Model Deployment, Step 3: Retrieve the Data
      • Feature retrieval can be performed by a set of Pulsar Functions executing concurrently in order to decrease the overall latency.
      • Once all features are retrieved, a populated feature vector is published to the output topic for the ML-enabled Function to consume.
      • Feature data should be stored in low-latency data storage in order to meet real-time requirements. Feature stores are preferred.
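The latency benefit of concurrent retrieval can be illustrated inside a single process. This sketch uses a thread pool as a stand-in for separate Pulsar Functions, and a dictionary as a stand-in for the feature store; the function and feature names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_features(lookups, keys):
    """Run every feature lookup concurrently and return a populated feature vector.

    lookups maps a feature name to a callable that takes the event keys and returns a value.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(lookups))) as pool:
        futures = {name: pool.submit(fn, keys) for name, fn in lookups.items()}
        # result() blocks until that lookup finishes and re-raises any lookup error.
        return {name: future.result() for name, future in futures.items()}

# Hypothetical feature store keyed by (restaurant id, feature name).
store = {("r1", "avg_prep_time"): 600.0, ("r1", "avg_wait_time"): 120.0}
lookups = {
    "avg_prep_time": lambda keys: store[(keys["restaurant_id"], "avg_prep_time")],
    "avg_wait_time": lambda keys: store[(keys["restaurant_id"], "avg_wait_time")],
}

vector = fetch_features(lookups, {"restaurant_id": "r1"})
```

With concurrent lookups, the total wait approaches the slowest single lookup rather than the sum of all of them.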
  37. Model Deployment, Step 3: Feature Stores
      • In a recent engineering blog, Uber describes how they rely on Feature Stores backed by Apache Cassandra.
      • One of the keys to maximizing the performance of the feature data retrieval process is to avoid querying the transactional database at runtime by pre-computing these features and storing them in a feature store instead.
  38. Model Deployment, Step 3: Retrieve the Data
      • Feature Store config
      • SQL to retrieve the feature data we need
      • Execute the query to retrieve the data
      • Write the data to the output topic
  39. Model Deployment, Step 4: Execute the Model against the Feature Vector
      • Create the feature vector.
      • Add the required values to the feature vector. These values were retrieved from the feature store using the keys provided in the incoming message, e.g., restaurant id, etc.
  40. Model Deployment, Step 4: Execute the Model against the Feature Vector
      • Evaluate the model using the provided feature vector as input to get an estimated delivery time.
  41. Summary
      • I presented a technology-agnostic design pattern for deploying ML models inside of Pulsar Functions that can be used to deploy models regardless of the underlying algorithm or development language in four basic steps:
      • Load the ML Model definition from the state store
      • Initialize the Model execution engine with the model definition
      • Retrieve the input data for the model
      • Execute the model against the input data to produce a prediction
      • Pulsar Functions are a great tool for deploying ML Models for online operation because they allow you to execute the model near the source of the data and on a per-event basis.
  42. Summary cont.
      • Pulsar Functions enable you to dynamically swap out the trained model that is deployed without any code changes or downtime. This is critical when you are required to rotate the ML model periodically or need to support multiple versions of an ML model.
      • Pulsar Functions enable you to leverage existing data access clients, which allow you to retrieve the data required to populate the model’s feature vector from a variety of data sources.
      • Pulsar Functions enable you to leverage existing ML execution frameworks, which allows you to support the broadest range of model types. This ensures that you don’t impose any limitations on your Data Science team’s tooling.
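Swapping the model without redeploying can be sketched as a small wrapper that re-parses the model only when the stored definition changes. The dictionary stands in for the state store, and checking the stored value on every call is a design assumption made for this example rather than something the talk prescribes.

```python
class ReloadingModel:
    """Caches the parsed model and rebuilds it whenever the stored definition changes."""

    def __init__(self, state_store, key, parse):
        self.state_store = state_store  # stand-in for the function's state store
        self.key = key
        self.parse = parse              # callable that turns the raw definition into a model
        self._raw = None
        self._model = None
        self.parse_count = 0            # exposed only to show when re-parsing happens

    def current(self):
        raw = self.state_store.get(self.key)
        if raw is None:
            raise KeyError(f"no model stored under {self.key!r}")
        if raw != self._raw:
            self._model = self.parse(raw)
            self._raw = raw
            self.parse_count += 1
        return self._model

# A bare number stands in for a full model definition to keep the example small.
store = {"ML-Model": "2.0"}
model = ReloadingModel(store, "ML-Model", float)

first = model.current()
again = model.current()     # unchanged definition, so no re-parse
store["ML-Model"] = "3.5"   # the data science team publishes a new model
swapped = model.current()
```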
  43. Summary cont.
      • In order to decrease the overall latency of the prediction process, it is recommended to store your feature data in low-latency data stores such as in-memory data grids or key-value databases such as Apache Cassandra.
      • The Data Engineering team should be responsible for scheduling the periodic calculation of any feature data, such as average meal prep time, using a high-throughput processing engine and storing it in a special database known as a Feature Store. This enables low-latency lookups at runtime.
  44. Questions
  45. Thank You
