How Splunk Is Using Pulsar IO

At Splunk, we have made the decision to deprecate the home-brewed platform that powers the connector framework of DSP (Data Stream Processor) in favor of a framework powered by Pulsar IO.
In this talk, I will go over our evaluation and decision process for choosing the Pulsar IO framework. I will also discuss how Splunk's DSP product is leveraging Pulsar IO, in particular the batch sources that were recently added to it. I will conclude by discussing the various improvements we at Splunk have contributed to the Pulsar Functions/IO framework to increase scalability and stability, and in my final remarks I will touch on how we intend to leverage Pulsar IO/Functions further at Splunk in the future.

  1. How Splunk is using Pulsar IO
     Jerry Peng, Principal Software Engineer @ Splunk | Committer and PMC member @ Apache {Pulsar, Heron, Storm}
     Pulsar Summit 2021
  2. How Splunk is using Pulsar IO: Agenda
     1. Overview of Pulsar IO
     2. Pulsar @ Splunk
     3. Pulsar IO @ Splunk
     4. Improvements to Pulsar IO / Functions
     5. Future of Pulsar IO @ Splunk
  3. Overview of Pulsar IO: What is Pulsar IO?
     A connector framework to ingress and egress data to and from Pulsar.
     ● Source - Ingress data into Pulsar from an external system.
     ● Sink - Egress data from Pulsar to an external system.
  4. Overview of Pulsar IO: Why Pulsar IO?
     ● An integrated solution that answers questions like:
       ○ What is the best way to move data into and out of Pulsar?
       ○ Where should I run my application that publishes data to or consumes data from Pulsar?
       ○ How should I run my application that publishes data to or consumes data from Pulsar?
  5. Overview of Pulsar IO: Design Goals
     ● Easy to use
       ○ Users can ingress or egress data from and to external systems without having to write any code
       ○ Built-in connectors
     ● Managed runtime
       ○ A user does not need to worry about where and how to run a connector
       ○ Execution, scheduling, scaling, and fault tolerance are taken care of by the runtime
     ● Flexible runtime
       ○ Run instances as threads, processes, K8s pods, etc.
  6. Overview of Pulsar IO: API

     public interface Source<T> extends AutoCloseable {
         /**
          * Open connector with configuration.
          *
          * @param config initialization config
          * @param sourceContext environment where the source connector is running
          * @throws Exception IO type exceptions when opening a connector
          */
         void open(final Map<String, Object> config, SourceContext sourceContext) throws Exception;

         /**
          * Reads the next message from source.
          * If source does not have any new messages, this call should block.
          * @return next message from source. The return result should never be null
          * @throws Exception
          */
         Record<T> read() throws Exception;
     }

     public interface Sink<T> extends AutoCloseable {
         /**
          * Open connector with configuration.
          *
          * @param config initialization config
          * @param sinkContext environment where the sink connector is running
          * @throws Exception IO type exceptions when opening a connector
          */
         void open(final Map<String, Object> config, SinkContext sinkContext) throws Exception;

         /**
          * Write a message to Sink.
          *
          * @param record record to write to sink
          * @throws Exception
          */
         void write(Record<T> record) throws Exception;
     }
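For illustration, here is a minimal sketch of a custom source written against this API. The CounterSource class, its one-second sleep, and the absence of schemas and error handling are assumptions made for the example, not part of Pulsar itself; it blocks briefly, then emits an incrementing counter wrapped in a Record.

     // Hypothetical CounterSource: emits an incrementing counter roughly once per second.
     import java.util.Map;
     import org.apache.pulsar.functions.api.Record;
     import org.apache.pulsar.io.core.Source;
     import org.apache.pulsar.io.core.SourceContext;

     public class CounterSource implements Source<Long> {
         private long counter;

         @Override
         public void open(Map<String, Object> config, SourceContext sourceContext) {
             // Connector-specific settings would be read from config here; none are needed for this sketch
             counter = 0L;
         }

         @Override
         public Record<Long> read() throws Exception {
             Thread.sleep(1000);          // block until data is available (simulated here)
             final long value = counter++;
             return new Record<Long>() {  // wrap the value in a Record for the framework to publish
                 @Override
                 public Long getValue() {
                     return value;
                 }
             };
         }

         @Override
         public void close() {
             // release any connections or other resources here
         }
     }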
  7. Overview of Pulsar IO: Architecture and Execution
     ● Pulsar IO is powered by the Pulsar Functions framework
     ● Inherits all the features and benefits of the Pulsar Functions framework
  8. Overview of Pulsar IO: Benefits Summarized
     ● Deployment flexibility - Run a custom source/sink or a built-in one
     ● Execution flexibility - Sources and sinks can run as part of an existing cluster, as a standalone process, on Kubernetes, etc.
     ● Parallelism - To increase the throughput of a source or sink, multiple instances can be run by adding a simple configuration (see the sketch after this slide)
     ● Load balancing - If sources and sinks are run in "cluster" mode
     ● Fault tolerance, monitoring, and metrics - If sources and sinks are run in "cluster" mode, the worker service that is part of the Pulsar Functions framework automatically monitors deployed sources and sinks. When nodes fail, sources and sinks are redeployed to operational nodes. Metrics are also collected automatically.
     ● Dynamic updates - Each connector's parallelism, source code, ingress and egress topics, and many other configurations can be changed on the fly
     ● Stateful - Has access to the State API
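As a sketch of what that "simple configuration" can look like when using the Java admin client: the tenant, namespace, connector class, archive path, output topic, and admin URL below are illustrative assumptions, not Splunk's actual settings.

     // Hedged sketch: deploying a source with parallelism via the Pulsar Java admin client.
     import org.apache.pulsar.client.admin.PulsarAdmin;
     import org.apache.pulsar.common.io.SourceConfig;

     public class DeployCounterSource {
         public static void main(String[] args) throws Exception {
             try (PulsarAdmin admin = PulsarAdmin.builder()
                     .serviceHttpUrl("http://localhost:8080")      // assumed admin endpoint
                     .build()) {

                 SourceConfig config = SourceConfig.builder()
                         .tenant("public")
                         .namespace("default")
                         .name("counter-source")
                         .className("com.example.CounterSource")   // hypothetical connector class
                         .topicName("persistent://public/default/counter-topic")
                         .parallelism(4)                            // run 4 instances of this source
                         .build();

                 // Package the connector as a NAR/JAR and point the admin client at it
                 admin.sources().createSource(config, "/path/to/counter-source.nar");
             }
         }
     }

Resubmitting a modified config (for example, a different parallelism) through the corresponding update call on admin.sources() is how the "dynamic updates" benefit surfaces through the admin API.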
  9. Overview of Pulsar IO: References
     ● https://pulsar.apache.org/docs/en/io-overview/
     ● https://www.splunk.com/en_us/blog/it/introducing-pulsar-io.html
  10. Pulsar @ Splunk: Overview
     ● Pulsar is used for both streaming and queueing use cases
     ● Deployed as a multi-tenant SaaS offering internally
     ● Data nervous system at Splunk
       ○ Moving / routing data
       ○ Connecting apps / services together
  11. Pulsar @ Splunk: DSP Architecture (diagram)
  12. Pulsar IO @ Splunk: What is Pulsar IO used for?
     ● Powers the Unified Connector Framework (UCF) of Splunk's Data Stream Processor platform
     ● Responsible for data ingress and egress for the DSP platform
  13. Pulsar IO @ Splunk: Why was Pulsar IO chosen?
     ● Inherent issues with the existing homegrown / legacy platform
       ○ Complex architecture
       ○ Scalability issues
       ○ Performance
       ○ Infra cost
       ○ Maintainability
     ● Leverage open source
       ○ Already using Pulsar, so why not leverage more of its functionality?
       ○ Cost and risks of maintaining homegrown / proprietary platforms and protocols
       ○ Leverage existing OSS connectors
       ○ Engage with the community
  14. Pulsar IO @ Splunk: Architecture (diagram)
  15. Pulsar IO @ Splunk: Use Case
     ● Large-scale data collection
       ○ Collecting large amounts of static data and ingesting it into DSP
     ● Requirements
       ○ Ingest petabytes of data per day
       ○ Tens of thousands of connectors
       ○ Low startup time
       ○ Cost effective
  16. Pulsar IO Batch Source: Overview
     ● Designed for ingesting large amounts of static data into Pulsar
       ○ For example, ingesting data stored in AWS S3 or GCS
     ● Two phases
       ○ Discovery - discover data to collect and ingest into Pulsar. The discovery phase outputs a stream of tasks for the collect phase to execute.
         ■ The discovery phase is triggered by user-defined logic, usually periodically
         ■ The discovery phase only runs on one instance, i.e. instance-0
         ■ This is done to reduce redundant API calls to external systems that enforce rate limits and charge per call
       ○ Collect - collect the data by executing the tasks generated by the discovery phase
         ■ Runs in parallel across instances
  17. Pulsar IO Batch Source: Architecture (diagram)
     Trigger (for example, every 5 minutes) → Discover (in instance-0) → tasks → Task queue (intermediate Pulsar topic) → Collect (in instance-0, instance-1, ... instance-n) → events → Event queue (the configured output Pulsar topic)
  18. Pulsar IO Batch Source: API

     public interface BatchSource<T> extends AutoCloseable {
         /**
          * Open connector with configuration.
          *
          * @param config config that's supplied for source
          * @param context environment where the source connector is running
          * @throws Exception IO type exceptions when opening a connector
          */
         void open(final Map<String, Object> config, SourceContext context) throws Exception;

         /**
          * Discovery phase of a connector. This phase will only be run on one instance, i.e. instance 0, of the connector.
          * Implementations use the taskEater consumer to output serialized representation of tasks as they are discovered.
          *
          * @param taskEater function to notify the framework about the new task received.
          * @throws Exception during discover
          */
         void discover(Consumer<byte[]> taskEater) throws Exception;

         /**
          * Called when a new task appears for this connector instance.
          *
          * @param task the serialized representation of the task
          */
         void prepare(byte[] task) throws Exception;

         /**
          * Read data and return a record.
          * Return null if no more records are present for this task.
          * @return a record
          */
         Record<T> readNext() throws Exception;
     }

     public interface BatchSourceTriggerer {
         /**
          * Initializes the Triggerer with the given config. Note that the triggerer doesn't start running
          * until start is called.
          *
          * @param config config needed for triggerer to run
          * @param sourceContext The source context associated with the source.
          *                      The parameter passed to this trigger function is an optional description of the event that caused the trigger
          * @throws Exception throws any exceptions when initializing
          */
         void init(Map<String, Object> config, SourceContext sourceContext) throws Exception;

         /**
          * Triggerer should actually start looking out for trigger conditions.
          *
          * @param trigger The function to be called when it's time to trigger the discover.
          *                This function can be passed any metadata about this particular trigger event as its argument.
          *                This method should return immediately. It is expected that implementations will use their own
          *                mechanisms to schedule the triggers.
          */
         void start(Consumer<String> trigger);

         /**
          * Triggerer should stop triggering.
          */
         void stop();
     }
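To make the two phases concrete, here is a minimal sketch of a BatchSource. The DirectoryBatchSource class and its "directory" config key are hypothetical (a local stand-in for an S3/GCS listing); discover() runs only on instance-0 and emits one serialized task per file, while prepare() and readNext() run on whichever instance the task is routed to. Checkpointing and error handling are omitted.

     // Hypothetical DirectoryBatchSource: one task per file, one record per line.
     import java.io.BufferedReader;
     import java.nio.charset.StandardCharsets;
     import java.nio.file.DirectoryStream;
     import java.nio.file.Files;
     import java.nio.file.Path;
     import java.nio.file.Paths;
     import java.util.Map;
     import java.util.function.Consumer;
     import org.apache.pulsar.functions.api.Record;
     import org.apache.pulsar.io.core.BatchSource;
     import org.apache.pulsar.io.core.SourceContext;

     public class DirectoryBatchSource implements BatchSource<String> {
         private Path directory;
         private BufferedReader currentFile;

         @Override
         public void open(Map<String, Object> config, SourceContext context) {
             // "directory" is an assumed connector config key for this sketch
             directory = Paths.get((String) config.get("directory"));
         }

         @Override
         public void discover(Consumer<byte[]> taskEater) throws Exception {
             // Runs only on instance-0: each discovered file path becomes a serialized task
             try (DirectoryStream<Path> files = Files.newDirectoryStream(directory)) {
                 for (Path file : files) {
                     taskEater.accept(file.toString().getBytes(StandardCharsets.UTF_8));
                 }
             }
         }

         @Override
         public void prepare(byte[] task) throws Exception {
             // Runs on whichever instance the task was routed to
             currentFile = Files.newBufferedReader(Paths.get(new String(task, StandardCharsets.UTF_8)));
         }

         @Override
         public Record<String> readNext() throws Exception {
             String line = currentFile.readLine();
             if (line == null) {
                 currentFile.close();
                 return null;   // signals that this task is complete
             }
             return new Record<String>() {
                 @Override
                 public String getValue() {
                     return line;
                 }
             };
         }

         @Override
         public void close() throws Exception {
             if (currentFile != null) {
                 currentFile.close();
             }
         }
     }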
  19. Pulsar IO @ Splunk: Deployment @ Splunk
     ● Function Workers are deployed separately from Brokers
     ● Function / connector instances run as threads within the Worker JVM process
       ○ Connectors are viewed as vetted / safe code
     ● Leveraging the built-in connectors functionality of Pulsar IO
     ● The state store for Pulsar Functions, i.e. the table service, is deployed as an independent cluster as well
  20. Giving back to OSS @ Splunk: Overview for Pulsar IO / Functions
     ● Pulsar IO Batch Source
     ● Re-designed core pieces of the Pulsar Functions architecture to improve scalability and stability
     ● Performance testing
       ○ Running 100,000 instances (10,000 sources, 10 instances each)
       ○ Petabytes-per-day ingest rates
     ● Numerous bug fixes
  21. Improvements to Pulsar IO @ Splunk: Metadata Topic Compaction
     ● Pulsar Functions uses an internal topic called the "metadata" topic to hold a log of the functions/sources/sinks submitted to run. Previously, this topic grew unbounded over time. At Splunk, we re-designed and re-implemented the metadata registration workflow in Pulsar Functions to support topic compaction so that old metadata can be safely truncated.
     ● https://github.com/apache/pulsar/pull/7255
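For context, the change relies on Pulsar's topic compaction, which retains only the latest message per key so that older entries can be dropped. Below is a hedged sketch of triggering compaction on a keyed topic through the Java admin client; the admin URL and the metadata topic name are assumptions for illustration, not the exact mechanism used inside Pulsar Functions.

     // Hedged sketch: manually triggering compaction on a keyed topic.
     import org.apache.pulsar.client.admin.PulsarAdmin;

     public class CompactMetadataTopicExample {
         public static void main(String[] args) throws Exception {
             try (PulsarAdmin admin = PulsarAdmin.builder()
                     .serviceHttpUrl("http://localhost:8080")   // assumed admin endpoint
                     .build()) {

                 // After compaction, readers of the compacted view see only the latest message per key,
                 // which is what allows old function/source/sink metadata entries to be dropped.
                 admin.topics().triggerCompaction("persistent://public/functions/metadata");  // assumed topic name
             }
         }
     }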
  22. Improvements to Pulsar IO @ Splunk: Improving Scheduling Performance and Stability
     ● Problem
       ○ When the leader worker isn't processing assignment messages fast enough, the background routine that checks for unassigned function instances triggers the scheduler to schedule and write more assignments to the assignment topic. This is essentially a feedback loop that can cause many unnecessary assignment updates to be published to the assignment topic.
     ● Modification
       ○ When a worker becomes the leader, it stops tailing the assignment topic. Since the leader runs the scheduling process, it already knows which instances are assigned to which worker, so it is unnecessary for it to tail the assignment topic.
     ● https://github.com/apache/pulsar/pull/7237
  23. Using Exclusive Producer (diagram)
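The slide refers to Pulsar's exclusive producer access mode, in which only one producer at a time can hold write access to a topic, so a single writer (such as a leader worker) can safely own writes to an internal topic. Below is a hedged sketch with the Java client; the service URL and topic name are assumptions for illustration.

     // Hedged sketch: creating an exclusive producer with the Pulsar Java client.
     import java.nio.charset.StandardCharsets;
     import org.apache.pulsar.client.api.Producer;
     import org.apache.pulsar.client.api.ProducerAccessMode;
     import org.apache.pulsar.client.api.PulsarClient;
     import org.apache.pulsar.client.api.Schema;

     public class ExclusiveProducerExample {
         public static void main(String[] args) throws Exception {
             try (PulsarClient client = PulsarClient.builder()
                     .serviceUrl("pulsar://localhost:6650")     // assumed broker URL
                     .build()) {

                 // With Exclusive access mode, producer creation fails if another producer already
                 // holds the topic, so a single writer (e.g. a leader) can safely own the topic.
                 Producer<byte[]> producer = client.newProducer(Schema.BYTES)
                         .topic("persistent://public/functions/assignments")  // assumed topic name
                         .accessMode(ProducerAccessMode.Exclusive)
                         .create();

                 producer.send("leader-only write".getBytes(StandardCharsets.UTF_8));
                 producer.close();
             }
         }
     }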
  24. Improvements to Pulsar IO @ Splunk: Automatic Re-balancing of Instances
     ● Problem
       ○ Previously, there was no mechanism for re-balancing instances scheduled on Function Workers
       ○ The scheduling of instances to workers may become skewed over time
     ● Modification
       ○ Add an interface for rebalance strategies
       ○ Allow users to trigger a rebalance on demand
       ○ Implement the ability to automatically rebalance periodically
     ● https://github.com/apache/pulsar/pull/7237
  25. Future of Pulsar IO @ Splunk
     ● Autoscaling
       ○ Workers and instances
         ■ Autoscaling workers on K8s using K8s HPAs is almost done
     ● Resource-aware scheduling
       ○ Scheduling that takes resource usage into account
     ● Continue to work on getting topic compaction to work fully with the internal topics used by Pulsar Functions
     ● Integrating client memory limits into Pulsar Functions
       ○ Adding per-producer and per-consumer memory limits
         ■ More isolation even when running instances as threads
     ● More fixes and optimizations!
  26. Connectors as a Service
     ● Providing a platform, based on Pulsar IO, for all connectors at Splunk to run on
  27. Connectors as a Service (continued, diagram)
  28. Connectors as a Service (continued, diagram)
  29. Thank you! We are hiring! jerryp@splunk.com
