At Dropbox we currently handle approximately 10,000,000 messages per second at peak across our handful of Kafka clusters; the largest of them has hit throughputs of 7,000,000 messages per second (~30 Gbps) on only 20 nodes. We’ll walk you through the steps we took to get where we are, the designs that worked for us, and those that didn’t. We’ll talk about the tooling we had to build and what we want to see exist.
We’ll dive deeper into configuration and provide a blueprint you can follow. We’ll talk about the trials and tribulations of using Kafka: the ways we’ve set our clusters on fire, lost data, turned our hair gray, and heroically saved the day for our users. Finally, we’ll spend time on some of the work we’re doing to handle consumer coordination across our many different systems and to integrate Kafka into a well-established corporate infrastructure (i.e., making Kafka “play nice” with everybody).
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
1. Deploying Kafka at Dropbox
Alternatively: how to handle 10,000,000 QPS in one cluster (but don't)
2. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
3. Your Speakers
• Mark Smith <zorkian@dropbox.com>
formerly of Google, Bump, StumbleUpon, etc.
likes small airplanes and not getting paged
• Sean Fellows <fellows@dropbox.com>
formerly of Google
likes corgis and distributed systems
4. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
5. Dropbox
• Over 500 million signups
• Exabyte scale storage system
• Multiple hardware locations + AWS
6. Log Events
• Wide distribution (1,000 categories)
• Several do >1M QPS each + long tail
• About 200TB/day (raw; back-of-envelope math below)
• Payloads range from empty to 15MB JSON blobs
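A quick back-of-envelope check on those numbers, using the peak rate from the abstract (treating peak as sustained is a simplification, so the real average payload is at least this large):

    # Rough math from the numbers above: ~200 TB/day raw at ~10M messages/sec peak.
    SECONDS_PER_DAY = 86_400
    messages_per_sec = 10_000_000            # peak rate, across clusters
    raw_bytes_per_day = 200 * 10**12         # ~200 TB/day, raw

    avg_bytes = raw_bytes_per_day / (messages_per_sec * SECONDS_PER_DAY)
    print(f"~{avg_bytes:.0f} bytes/message on average")
    # => a couple hundred bytes on average, even though individual payloads hit 15 MB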
7. Current System
• Existing system based on Scribe + HDFS
• Aggregate to single destination for analytics
• Powers Hive and standard map-reduce type analytics
Want: real-time stream processing! (consumer sketch below)
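What we wanted looks roughly like this: tail a topic as events arrive instead of waiting for the next batch job. A minimal sketch with the open-source kafka-python client; the topic, group, and broker names are made up.

    from kafka import KafkaConsumer

    def handle(payload):
        # stand-in for real stream-processing logic
        print(len(payload))

    # Tail events as they arrive rather than waiting for the batch pipeline.
    consumer = KafkaConsumer(
        "analytics_events",                       # illustrative topic name
        group_id="realtime-demo",
        bootstrap_servers=["broker1:9092", "broker2:9092"],
        auto_offset_reset="latest",
    )

    for record in consumer:
        handle(record.value)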
8. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
9. Initial Design
• One big cluster
• 20 brokers: 96GB RAM, 16x2TB disk, JBOD config
• ZK ensemble run separately (5 members)
• Kafka 0.8.2 from GitHub
• LinkedIn configuration recommendations (illustrative settings below)
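Sketched as the per-broker settings that layout implies (illustrative values built in Python; not our exact production configuration):

    # Illustrative per-broker settings for the initial layout: 20 brokers,
    # 16 x 2TB disks in JBOD, and a separate 5-member ZooKeeper ensemble.
    broker_id = 1    # unique per broker, 1..20

    initial_config = {
        "broker.id": broker_id,
        # one log directory per physical disk (JBOD)
        "log.dirs": ",".join(f"/data/disk{i}/kafka" for i in range(1, 17)),
        # the 5-member ZooKeeper ensemble, run separately from the brokers
        "zookeeper.connect": ",".join(f"zk{i}:2181" for i in range(1, 6)) + "/kafka",
    }

    # server.properties-style rendering
    print("\n".join(f"{k}={v}" for k, v in initial_config.items()))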
10. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
11. Unexpected Catastrophes
• Disks failing or reaching 100% full
• Repair is manual, won't expire unless caught up
• Crash looping, controller load
• Simultaneous restarts
• Even when graceful, recovery is sometimes very bad (even in 0.9!)
• Rebalancing is dangerous
• Saturates disks; partitions fall out of ISR, go offline, etc. (check sketch below)
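One way to watch the ISR fallout while a rebalance runs: poll the stock kafka-topics.sh tool (it ships with the broker) for under-replicated partitions. A sketch that assumes the tool is on PATH; the ZooKeeper connect string is illustrative.

    import subprocess

    def under_replicated(zookeeper="zk1:2181/kafka"):
        """List partitions whose ISR has shrunk below the full replica set."""
        out = subprocess.run(
            ["kafka-topics.sh", "--describe", "--under-replicated-partitions",
             "--zookeeper", zookeeper],
            capture_output=True, text=True, check=True,
        ).stdout
        return [line for line in out.splitlines() if line.strip()]

    if __name__ == "__main__":
        # A nonzero count during a rebalance is the warning sign described above.
        print(f"{len(under_replicated())} under-replicated partitions")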
12. System Errors
• Controller issues
• Sometimes goes AWOL with e.g. big rebalances
• Can have multiple controllers (during serial operations)
• Cascading OOMs
• Too many connections
13. Lack of Tooling
• Usually left to the reader
• Few best practices
• But we love Kafka Manager
• More to come later!
14. Newer Clients
• State of Go/Python clients
• Bad behavior at scale
• Laserbeam, retries, backoff (backoff sketch below)
• Too many connections == OOM
• Good clients take time
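The client behavior we ended up wanting, in sketch form: one shared producer per process so broker connection counts stay bounded, and retries with capped exponential backoff plus jitter instead of laserbeaming the cluster. This uses the open-source kafka-python client; the broker address, topic, and limits are illustrative.

    import random
    import time

    from kafka import KafkaProducer
    from kafka.errors import KafkaError

    # One producer per process. Every producer holds TCP connections to the
    # brokers it talks to, so "one per request" is how you end up with
    # too many connections == OOM on the broker side.
    producer = KafkaProducer(bootstrap_servers=["broker1:9092"], acks=1)

    def send_with_backoff(topic, payload, max_attempts=5):
        """Retry with capped exponential backoff + jitter instead of a tight loop."""
        for attempt in range(max_attempts):
            try:
                return producer.send(topic, payload).get(timeout=10)
            except KafkaError:
                if attempt == max_attempts - 1:
                    raise
                time.sleep(min(30, 2 ** attempt) * random.uniform(0.5, 1.5))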
15. Bad Configs
• Many, many tunables -- lots of rope
• Unclean leader election (settings sketch below)
• Preferred leader automation
• Disk threads (thanks Gwen!)
• Little modern documentation on running at scale
• Todd Palino helped us out early, though, so thank you!
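The tunables called out above, sketched as settings to check (rendered server.properties-style from Python). These reflect the usual at-scale advice and the direction these bullets point, not a drop-in copy of our production config.

    # The tunables above, with commonly recommended at-scale values.
    safer_settings = {
        # Don't let an out-of-sync replica take leadership: trades availability
        # for not silently dropping acknowledged messages.
        "unclean.leader.election.enable": "false",
        # Automatic preferred-leader election can pile leadership churn onto an
        # already-struggling cluster; many operators trigger it manually instead.
        "auto.leader.rebalance.enable": "false",
        # More log-recovery threads per data dir ("disk threads") makes restarts
        # on big JBOD/RAID brokers far less painful.
        "num.recovery.threads.per.data.dir": 8,
    }

    print("\n".join(f"{k}={v}" for k, v in safer_settings.items()))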
16. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
17. Hardware
• Hardware RAID 10
• ~25TB usable/box (spinning rust)
• During broker replacement
• 200ms p99 commit latency down to 10ms!
• Failure tolerance, full disk protection
• Canary cluster
18. Monitoring
• MPS vs QPS (metadata reqs!)
• Bad Stuff graph
• Disk utilization/latency
• Heap usage
• Number of controllers (polling sketch below)
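A sketch of the checks behind those graphs. The MBean names are standard Kafka JMX metrics; fetch_gauge is a stand-in for however your pipeline scrapes JMX (jmxtrans, Jolokia, a collector agent), not a real library call, and the hostnames are made up.

    BROKERS = [f"broker{i}" for i in range(1, 21)]   # illustrative hostnames

    # Standard Kafka JMX gauges behind the graphs above.
    MBEANS = {
        "messages_in_per_sec": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
        "under_replicated":    "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
        "offline_partitions":  "kafka.controller:type=KafkaController,name=OfflinePartitionsCount",
        "active_controller":   "kafka.controller:type=KafkaController,name=ActiveControllerCount",
    }

    def fetch_gauge(broker, mbean):
        """Stand-in: read one JMX gauge from one broker via your metrics pipeline."""
        raise NotImplementedError

    def bad_stuff():
        """Roll the scary gauges into one number, plus the controller-count check."""
        under_replicated = sum(fetch_gauge(b, MBEANS["under_replicated"]) for b in BROKERS)
        offline = sum(fetch_gauge(b, MBEANS["offline_partitions"]) for b in BROKERS)
        controllers = sum(fetch_gauge(b, MBEANS["active_controller"]) for b in BROKERS)
        # Exactly one broker should think it is the controller; 0 or >1 is an incident.
        return {"bad_stuff": under_replicated + offline, "controllers": controllers}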
22. Customer Culture
• Topics : organization :: partitions : scale
• Do not hash to partitions (producer sketch below)
• No ordering requirements
• Namespaces and ownership are required
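What "do not hash to partitions" means for producers: send without a key and let the client spread load, so partition count stays a pure scaling knob that can be raised later without breaking anyone. A sketch with kafka-python; the topic name is made up.

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=["broker1:9092"])

    # No key: the client spreads messages across partitions, so partition count
    # is purely a throughput knob.
    producer.send("metrics.search.latency", b'{"ms": 42}')

    # Keyed sends pin data to one partition and create an implicit ordering
    # contract -- exactly the coupling the guidance above asks producers to avoid.
    # producer.send("metrics.search.latency", key=b"user123", value=b"...")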
23. Success!
• Kafka goes fast (18M+ MPS on 20 brokers)
• Multiple parallel consumption
• Low latency (at high produce rates)
• 0.9 is leaps ahead of 0.8.2 (upgrade!)
• Supportable by a small team (at our scale)
24. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
25. The Future
• Big is fun but has problems
• Open source our tooling
• Moving towards replication
• Automatic up-partitioning and rebalancing
• Expanding auditing to clients
• Low volume latencies
26. Deploying Kafka at Dropbox
• Mark Smith <zorkian@dropbox.com>
• Sean Fellows <fellows@dropbox.com>
We would love to talk with other people who are running Kafka at similar
scales. Email us!
And... questions! (If we have time.)