Software Development & Arch @ LinkedIn
Sid Anand
QCon SF 2014
@r39132
About Me
Current Life…
• Chief Architect @ ClipMine, a video discovery company
• QCon SF Program Committee member
• Dad to a very energetic 2 year old boy
Previous Life…
• Architect in Search and Distributed Data @ LinkedIn
• Cloud Data Architect @ Netflix
• VP Engineering at Etsy
• Software Developer at eBay
A Closer Look @ LinkedIn
LinkedIn
Then
• Created in 2002 in Reid Hoffman’s living room
• In its first month of operation, LinkedIn added 4500 members!
Now
• 332M members in 200 countries
• 2 members sign up every second
• >60% of members overseas
• In Q3’14, 75% of new members came from overseas
• Fastest growing demographic is not geographic, it’s students!
  • Already >10% of the user base and growing!
LinkedIn
Member-growth started to ramp up during 2011, when we IPO’d
• 2010 : 55M
• 2011 : 90M (IPO)
• 2012 : 145M
• Q3’14 : 332M
(note : numbers reflect start of year)
We added about as many users in 2010 as over the previous 6 years!
LinkedIn
Employee-growth also started to ramp up during 2011
• 2010 : 500
• 2011 : 1K (IPO)
• 2012 : 2100
• Q3’14 : 6K (25% in Engineering)
(note : numbers reflect start of year)
Alan Shepard
• 2nd man in space
• 5th person to walk on the moon!
• 1st person to hit a golf ball on the moon!
LinkedIn
When asked by reporters what he thought about while awaiting liftoff, he replied: "The fact that every part of this ship was built by the lowest bidder"
How did LinkedIn scale for company and member growth?
Software Development Challenges
Software Development : Challenges
Circa 2011
• On my first day at LinkedIn, I felt pretty excited!
Linux Desktop
• 8 Core
• 64GB RAM
Mac Air
Software Development : Challenges
Circa 2011
• Then I tried to compile the code on my laptop!
Software Development : Challenges
Circa 2011
• 300+ code projects in a single SVN repo
• SVN checkout world & go-to-lunch
• Needed a server-grade machine to compile it!
• Ant build (world) & go-make-espresso
• Almost every WAR was built from source, not from intermediate JARs
• To test your code locally, you needed to locally deploy every service that your code depended on! (maybe 20)
• So, yes, you needed a machine that typically lives in your data center!
Software Development : Challenges
Circa 2011
• Assume that your code is now
  • Written
  • Compiled
  • Locally Tested
• What Next?
Software Development : Challenges
Circa 2011
• 500+ developers were checking code into the master branch on the single repo!
• So, someone broke master every day!
• So
  • 3 hours to write, build, and locally test code
  • 3 days to commit it!
Software Development : Challenges
Now (Solved)
• Do what the open-source world does, with some improvements!
• Break the monolithic repo into many individual Git repos!
• Have WARs depend on intermediate JARs: do not build the world!
• Do not deploy the world for local testing: just connect your dev machine to a test environment!
• What are the improvements?
Software Development Life Cycle
Software Development : Code Reviews
1. Alice commits code to Git
2. Alice sends a Review Board request to Bob & Cathy, owners of the files!
3. Both Bob & Cathy give ship-its
4. Alice amends her commit message with :
RB=<review board id>
BUILD-WAR=<list of wars to build>
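The commit-message convention in step 4 can be sketched as a small parser. This is only an illustration: the function name `parse_trailers` and the assumption that the WAR list is comma-separated are not from the slides.

```python
import re

# Hypothetical parser for the RB= and BUILD-WAR= lines in a commit message.
TRAILER = re.compile(r"^(RB|BUILD-WAR)=(.*)$")


def parse_trailers(message: str) -> dict:
    """Return the review board id and the list of WARs named in a commit message."""
    result = {"RB": None, "BUILD-WAR": []}
    for line in message.splitlines():
        match = TRAILER.match(line.strip())
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "RB":
            result["RB"] = value
        else:
            # Assumption: WAR names are separated by commas.
            result["BUILD-WAR"] = [w.strip() for w in value.split(",") if w.strip()]
    return result


example = "Fix profile cache eviction\n\nRB=12345\nBUILD-WAR=profile-war, search-war\n"
parsed = parse_trailers(example)
```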
Software Development : Code Push (Git Push)
1. Alice pushes code to our Gitorious server, where the following verifications run:
   1. Pre-push sanity checks! Must pass or the push is rejected!
      1. Have all owners of the changed files given ship-its?
      2. Does the code build?
   2. For JAR builds, also build upstream WARs!
   3. Run integration tests!
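A minimal sketch of the pre-push sanity checks, assuming file ownership is available as a simple mapping. The function names and data shapes here are hypothetical, not LinkedIn's actual hook.

```python
def missing_ship_its(changed_files, owners_by_file, ship_its):
    """Return owners of changed files who have not yet given a ship-it."""
    required = set()
    for path in changed_files:
        required.update(owners_by_file.get(path, ()))
    return sorted(required - set(ship_its))


def pre_push_ok(changed_files, owners_by_file, ship_its, build_passes):
    """The push is accepted only if every owner shipped it and the code builds."""
    return not missing_ship_its(changed_files, owners_by_file, ship_its) and build_passes


# Hypothetical ownership data for two changed files.
owners = {"Profile.java": ["bob"], "Search.java": ["bob", "cathy"]}
changed = ["Profile.java", "Search.java"]
```

With only Bob's ship-it, `missing_ship_its(changed, owners, ["bob"])` reports Cathy as outstanding, so the push would be rejected.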
Software Development : QA Test / Staging
1. Assuming that all checks passed, the WAR is now available
2. Our system automatically deploys all WARs to test servers
3. QA verifies the new builds
Software Development : Production - Canary
1. Service owner Dave canaries the new WAR
2. Our EKG system then compares the canary machine to one control machine for 1 hour of production traffic for the following:
   1. CPU, Memory increase
   2. Fan-in/Fan-out increase
   3. Error rate increase
   4. Latency increase
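The canary-versus-control comparison can be sketched as below. The metric names, the `HostMetrics` type, and the 10% tolerance are assumptions for illustration; the slides do not say how EKG decides that an increase is significant.

```python
from dataclasses import dataclass, fields


@dataclass
class HostMetrics:
    # Hypothetical per-host averages over the comparison window.
    cpu_pct: float
    memory_mb: float
    fan_out_calls_per_sec: float
    error_rate: float
    latency_ms: float


def regressions(canary: HostMetrics, control: HostMetrics, tolerance: float = 0.10):
    """Return the metrics where the canary exceeds the control by more than tolerance."""
    flagged = []
    for field in fields(HostMetrics):
        canary_value = getattr(canary, field.name)
        control_value = getattr(control, field.name)
        if canary_value > control_value * (1 + tolerance):
            flagged.append(field.name)
    return flagged


control = HostMetrics(40.0, 2000.0, 100.0, 0.01, 50.0)
canary = HostMetrics(41.0, 2050.0, 100.0, 0.02, 80.0)
report = regressions(canary, control)
```

An empty report would correspond to a build that looks acceptable for promotion.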
Software Development : Production - Promotion
1. Service owner Dave reviews the EKG report
2. If it looks acceptable, he promotes the build to the rest of the cluster in all data centers
How did LinkedIn scale for company and member growth?
Architectural Practices
LinkedIn Architecture
[Diagram: Web Servers, Oracle]
Prototypical Use Case
• A member updates her profile with new skills, job title, and education
• She also accepts a connection request from another member
Behind the scenes
• Web servers commit data to Oracle
• What Happens Next?
LinkedIn Architecture
[Diagram: Web Servers, Oracle]
What Happens Next?
Profile Updates
• She should become instantly searchable by her new skills, job title, & education!
• New groups and job ads should be recommended to her
Connection Updates
• The news feed should instantly reflect content updates from her new connection!
• Also, based on the new connection, the PYMK (People You May Know) widget should discover a new 2nd degree neighborhood!
LinkedIn Architecture
[Diagram: Web Servers (writers), Oracle, Databus, Search, Caches, Graph, Recommender Systems (PYMK, Jobs), Downstream, Streams, DW]
LinkedIn : Architecture
We also have a data pipeline to capture high-throughput events that we need to count!
Databases are not a good place to do high-throughput atomic counting!
Kafka is!
• This is typically used for ranking signals
• E.g. count member page views to determine who is “hot”
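As a rough illustration of the counting use case, the sketch below tallies page-view events per member. A plain Python list stands in for a Kafka topic (no Kafka client is used), and the event field `viewed_member_id` is a made-up name.

```python
from collections import Counter


def hot_members(page_view_events, top_n=2):
    """Count page views per viewed member and return the most-viewed ones."""
    counts = Counter(event["viewed_member_id"] for event in page_view_events)
    return counts.most_common(top_n)


# Stand-in for a stream of page-view events read from a topic.
events = [
    {"viewed_member_id": 7},
    {"viewed_member_id": 3},
    {"viewed_member_id": 7},
    {"viewed_member_id": 9},
    {"viewed_member_id": 7},
    {"viewed_member_id": 3},
]
ranking = hot_members(events)
```

The point of the design is that the counting happens in a consumer of the event stream rather than as atomic increments in the database.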
LinkedIn Architecture
[Diagram: Web Servers (writers), Oracle, Kafka, Databus, Search Systems, Caches, Graph Systems, Recommender Systems, Downstream, Streams, DW]
LinkedIn Architecture : Single Data Center!
LinkedIn Architecture : Multi-data Center Project
LinkedIn Architecture : Rule 1
Partition your user base across the data centers!
e.g. using Akamai GTM
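One way to picture Rule 1 is a stable mapping from member to home data center. The slides name Akamai GTM for this; the hash-based assignment below is a swapped-in stand-in for illustration only, and the data center names are hypothetical.

```python
import hashlib

DATA_CENTERS = ["dc1", "dc2", "dc3"]  # hypothetical data center names


def home_data_center(member_id: int, data_centers=DATA_CENTERS) -> str:
    """Map a member to one data center, always the same one for the same id."""
    digest = hashlib.sha256(str(member_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(data_centers)
    return data_centers[bucket]


assignments = {member_id: home_data_center(member_id) for member_id in range(1, 6)}
```

Whatever mechanism is used, the property that matters is that a given member is consistently routed to the same data center.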
LinkedIn Architecture : Rule 1
Problem!
User 1 (mapped to DC1) updates his profile! How will User 2 (mapped to DC2) see it?
LinkedIn Architecture : Rule 2
Link your data centers together at the data fabric level!
Not a new concept! Cassandra has been doing it for a few years now in the OLTP database space!
LinkedIn’s Sources of Truth
• We have to make both work across multiple data centers!
• Oracle is fairly easy : we use Oracle GoldenGate!
• Kafka is also pretty easy!
LinkedIn : Kafka Multi-Data Center
[Diagram, built up over several slides: Data Center 1 and Data Center 2 each contain a local Kafka cluster with a producer and a consumer of local events, plus a global Kafka cluster with a consumer of global events.]
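The local/global split can be sketched with in-memory lists standing in for Kafka clusters. The arrows of the original diagram did not survive extraction, so the topology here (every local event is copied into each data center's global cluster) is an assumption, as are all class and function names.

```python
class DataCenter:
    def __init__(self, name):
        self.name = name
        self.local_events = []   # stands in for the local Kafka cluster
        self.global_events = []  # stands in for the global Kafka cluster

    def produce(self, event):
        """Producers only ever write to the local cluster of their own data center."""
        self.local_events.append((self.name, event))


def mirror(data_centers):
    """Assumed topology: copy all local events into every global cluster."""
    merged = [event for dc in data_centers for event in dc.local_events]
    for dc in data_centers:
        dc.global_events = list(merged)


dc1, dc2 = DataCenter("dc1"), DataCenter("dc2")
dc1.produce("profile-view")
dc2.produce("connection-accepted")
mirror([dc1, dc2])
```

A consumer of local events in `dc1` sees only what was produced in `dc1`, while a consumer of global events sees the events of both data centers without making a cross-data-center call at read time.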
LinkedIn Architecture : Rule 3
Don’t make any web service calls between data centers!
It kills latency, which kills availability!
How did LinkedIn scale for company and member growth?
LinkedIn Search
Why is Search important to LinkedIn?
• Search is a significant income driver!
• 332M members that recruiters pay to find! (Recruiter Search)
• 2M+ jobs that companies pay to list so you can find them! (Job Search)
What Makes LinkedIn Search Unique?
LinkedIn Search : Federated
• We index many entities
  • members, jobs, companies, groups, universities, articles, slides, etc.
  • These are separate (vertical) search engines!
• When a user enters “sr software engineer”, which index should we look in?
  • Jobs, members, groups?
• Can we simply send the request to all of the search engines and then show the most relevant results?
  • No
  • Ranks (scores) are not comparable across verticals
• What if we pick a vertical based on a user feature?
  • Job seeker sees jobs, recruiter sees members
  • Intent Detection : done by Federator
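A toy version of picking a vertical from a user feature might look like this. The `role` field, the default vertical, and the stubbed search functions are all assumptions; the slides only say that a job seeker sees jobs and a recruiter sees members.

```python
# Hypothetical stand-ins for vertical search engines: each returns its own hits.
VERTICALS = {
    "jobs": lambda query: [f"job posting matching '{query}'"],
    "members": lambda query: [f"member profile matching '{query}'"],
}


def detect_intent(user: dict) -> str:
    """Pick one vertical from a user feature (assumed field: 'role')."""
    if user.get("role") == "recruiter":
        return "members"
    if user.get("role") == "job_seeker":
        return "jobs"
    return "members"  # assumed default, not stated in the slides


def federate(query: str, user: dict) -> list:
    """Route the query to the single vertical chosen by intent detection."""
    vertical = detect_intent(user)
    return VERTICALS[vertical](query)


results = federate("sr software engineer", {"role": "job_seeker"})
```

Routing to one vertical sidesteps the problem that scores from different verticals cannot be compared directly.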
LinkedIn Search : Query Rewriting
• Say a recruiter searches for “sr software eng”
• There are 20+ ways to represent this title
  • senior swe
  • sr swe
  • senior software engineer
• To solve this, we developed a title standardizer, though not every title may have a canonical form!
• If a standardized title exists, we can rewrite the user query
  • title:sr AND title:software AND title:eng → std_title:sswe234
• Query rewriting helps by expanding the search space through methods such as synonym expansion, spell correction, etc. So we need it!
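The rewrite step can be sketched with a lookup table as the title standardizer. The table contents and function name are hypothetical; only the id `sswe234` and the `title:` / `std_title:` field names come from the slide.

```python
# Hypothetical standardizer table: raw title variants -> standardized title id.
STANDARD_TITLES = {
    "sr software eng": "sswe234",
    "senior swe": "sswe234",
    "sr swe": "sswe234",
    "senior software engineer": "sswe234",
}


def rewrite_title_query(user_query: str) -> str:
    """Rewrite to a std_title query when a canonical form exists, else AND the terms."""
    normalized = " ".join(user_query.lower().split())
    std_id = STANDARD_TITLES.get(normalized)
    if std_id is not None:
        return f"std_title:{std_id}"
    # No canonical form: fall back to matching every term in the title field.
    return " AND ".join(f"title:{term}" for term in normalized.split())


rewritten = rewrite_title_query("sr software eng")
fallback = rewrite_title_query("Chief Happiness Officer")
```

The fallback branch covers the case where a title has no canonical form.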
LinkedIn Search : Flexible Scoring
• We index many entities!
  • Companies, Members, Universities, etc.
• We use different scoring formulas and signals for each vertical
• We need a way to easily plug in different custom scorers!
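One possible shape for pluggable scorers is a registry keyed by vertical. This is a sketch under assumptions: the registry, the decorator, and the scoring formulas and signals below are invented for illustration and are not Galene's API.

```python
from typing import Callable, Dict, List

Scorer = Callable[[dict, str], float]
SCORERS: Dict[str, Scorer] = {}


def register_scorer(vertical: str):
    """Decorator that plugs a scoring function in for one vertical."""
    def decorator(fn: Scorer) -> Scorer:
        SCORERS[vertical] = fn
        return fn
    return decorator


@register_scorer("members")
def score_member(doc: dict, query: str) -> float:
    # Made-up formula: title match plus a small boost per connection.
    title_match = 1.0 if query in doc.get("title", "") else 0.0
    return title_match + 0.01 * doc.get("connections", 0)


@register_scorer("companies")
def score_company(doc: dict, query: str) -> float:
    # Made-up formula: rank companies by follower count.
    return float(doc.get("followers", 0))


def rank(vertical: str, docs: List[dict], query: str) -> List[dict]:
    """Sort documents with whichever scorer is registered for the vertical."""
    scorer = SCORERS[vertical]
    return sorted(docs, key=lambda doc: scorer(doc, query), reverse=True)


companies = [{"name": "a", "followers": 5}, {"name": "b", "followers": 9}]
ranked = rank("companies", companies, "software")
```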
LinkedIn Search : Open Source
• Leading open source alternatives (e.g. Lucene, Elasticsearch, Solr) do not offer these!
  • Search Federation
  • Pluggable Query Rewriting
  • Pluggable and Flexible Scoring
• They DO offer some distributed system management, which we will unfortunately have to re-invent
• So, we created Galene, LinkedIn’s new search architecture!
https://engineering.linkedin.com/search/did-you-mean-galene
Questions?
Bonus Slides
Galene Architecture
Galene Architecture : Querying
[Diagram: Browser, Frontend, Federator, Vertical Broker, Vertical Search Node, Other Verticals]
• Query Rewriting (Pluggable)
• Scatter-gather across shards
• Lucene (optionally sharded)
• Scoring (Pluggable)
• Query Intent Detection
• Result Blending
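Scatter-gather across shards can be sketched as: ask every shard for its top k hits, then merge the partial results into a global top k. The shard layout (a dict of doc id to text and a precomputed score) and the function names are assumptions for illustration; a real search node would query a Lucene index instead.

```python
import heapq


def search_shard(shard: dict, query: str, k: int):
    """Return up to k (score, doc_id) hits from one shard, best first."""
    hits = [(score, doc_id) for doc_id, (text, score) in shard.items() if query in text]
    return heapq.nlargest(k, hits)


def scatter_gather(shards: list, query: str, k: int):
    """Scatter the query to every shard, then gather and merge the top k overall."""
    partial_results = [search_shard(shard, query, k) for shard in shards]
    merged = heapq.nlargest(k, (hit for hits in partial_results for hit in hits))
    return [doc_id for _, doc_id in merged]


# Two hypothetical shards of one vertical: doc_id -> (text, score).
shards = [
    {"d1": ("java engineer", 0.9), "d2": ("python engineer", 0.4)},
    {"d3": ("staff engineer", 0.7), "d4": ("designer", 0.8)},
]
top_docs = scatter_gather(shards, "engineer", 2)
```

Merging by score is safe here because all shards belong to the same vertical and share one scorer.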
Galene Architecture : Indexing (Offline)
[Diagram: Browser, Frontend, Federator, Vertical Broker, Vertical Search Node, Vertical Indexer Node, Hadoop, Index Distribution Service]
Offline Index Building and Distribution
• Batch-oriented, built daily
• Builds offline ranking and rewriting models
• Rebuilds indexes when new fields are added
Galene Architecture
[Diagram: Browser, Frontend, Federator, Vertical Broker, Vertical Search Node, Vertical Indexer Node, Hadoop, Index Distribution Service]
Offline Index Building and Distribution
• BitTorrent-based Index Distribution Service
• Pushes new indexes and models to running services
Galene Architecture
[Diagram: Browser, Frontend, Federator, Vertical Broker, Vertical Search Node, Vertical Live Updater, Vertical Indexer Node, Hadoop, Index Distribution Service, Kafka, Databus, Samza]
Online Index Updates
• Online (near-real-time) indexer
• Updates indexes between Hadoop builds
Galene Architecture
[Diagram: Browser, Frontend, Federator, Vertical Broker, Vertical Search Node, Vertical Live Updater, Vertical Indexer Node, Hadoop, Index Distribution Service, Kafka, Databus, Samza]
Periodic Index Optimization
• Snapshots live data into a compact format
• Sends ss-index to search nodes over BitTorrent
Questions?

Contenu connexe

Tendances

Extending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR BenchmarksExtending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR BenchmarksJamie Grier
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondVasia Kalavri
 
Spark Compute as a Service at Paypal with Prabhu Kasinathan
Spark Compute as a Service at Paypal with Prabhu KasinathanSpark Compute as a Service at Paypal with Prabhu Kasinathan
Spark Compute as a Service at Paypal with Prabhu KasinathanDatabricks
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsLightbend
 
Community Update May 2016 (January - May) | Berlin Apache Flink Meetup
Community Update May 2016 (January - May) | Berlin Apache Flink MeetupCommunity Update May 2016 (January - May) | Berlin Apache Flink Meetup
Community Update May 2016 (January - May) | Berlin Apache Flink MeetupRobert Metzger
 
Apache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini PalthepuApache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini PalthepuSlim Baltagi
 
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Apache Fink 1.0: A New Era  for Real-World Streaming AnalyticsApache Fink 1.0: A New Era  for Real-World Streaming Analytics
Apache Fink 1.0: A New Era for Real-World Streaming AnalyticsSlim Baltagi
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Databricks
 
Spark Summit EU talk by Yiannis Gkoufas
Spark Summit EU talk by Yiannis GkoufasSpark Summit EU talk by Yiannis Gkoufas
Spark Summit EU talk by Yiannis GkoufasSpark Summit
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkFlink Forward
 
Apache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiApache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiSlim Baltagi
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Evan Chan
 
QCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkQCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkRobert Metzger
 
Stateful Stream Processing at In-Memory Speed
Stateful Stream Processing at In-Memory SpeedStateful Stream Processing at In-Memory Speed
Stateful Stream Processing at In-Memory SpeedJamie Grier
 
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...Natan Silnitsky
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Till Rohrmann
 
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloads
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloadsTill Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloads
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloadsFlink Forward
 
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...Databricks
 
What Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesWhat Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesAlexander Dean
 

Tendances (20)

Extending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR BenchmarksExtending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR Benchmarks
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyond
 
Spark Compute as a Service at Paypal with Prabhu Kasinathan
Spark Compute as a Service at Paypal with Prabhu KasinathanSpark Compute as a Service at Paypal with Prabhu Kasinathan
Spark Compute as a Service at Paypal with Prabhu Kasinathan
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive Streams
 
Community Update May 2016 (January - May) | Berlin Apache Flink Meetup
Community Update May 2016 (January - May) | Berlin Apache Flink MeetupCommunity Update May 2016 (January - May) | Berlin Apache Flink Meetup
Community Update May 2016 (January - May) | Berlin Apache Flink Meetup
 
Apache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini PalthepuApache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini Palthepu
 
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Apache Fink 1.0: A New Era  for Real-World Streaming AnalyticsApache Fink 1.0: A New Era  for Real-World Streaming Analytics
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
Spark Summit EU talk by Yiannis Gkoufas
Spark Summit EU talk by Yiannis GkoufasSpark Summit EU talk by Yiannis Gkoufas
Spark Summit EU talk by Yiannis Gkoufas
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of Flink
 
Apache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiApache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim Baltagi
 
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
Spark Job Server and Spark as a Query Engine (Spark Meetup 5/14)
 
QCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache FlinkQCon London - Stream Processing with Apache Flink
QCon London - Stream Processing with Apache Flink
 
Stateful Stream Processing at In-Memory Speed
Stateful Stream Processing at In-Memory SpeedStateful Stream Processing at In-Memory Speed
Stateful Stream Processing at In-Memory Speed
 
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
 
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloads
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloadsTill Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloads
Till Rohrmann - Dynamic Scaling - How Apache Flink adapts to changing workloads
 
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...
SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitorin...
 
What Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registriesWhat Crimean War gunboats teach us about the need for schema registries
What Crimean War gunboats teach us about the need for schema registries
 

En vedette

The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInSam Shah
 
NoSQL, Growing up at Oracle
NoSQL, Growing up at OracleNoSQL, Growing up at Oracle
NoSQL, Growing up at OracleDATAVERSITY
 
Automated Schema Design for NoSQL Databases
Automated Schema Design for NoSQL DatabasesAutomated Schema Design for NoSQL Databases
Automated Schema Design for NoSQL DatabasesMichael Mior
 
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of View
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of ViewNoSQL Plus MySQL From MySQL Practitioner\'s Point Of View
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of ViewAlex Esterkin
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015NoSQLmatters
 
7 Databases in 70 minutes
7 Databases in 70 minutes7 Databases in 70 minutes
7 Databases in 70 minutesKaren Lopez
 
NoSE: Schema Design for NoSQL Applications
NoSE: Schema Design for NoSQL ApplicationsNoSE: Schema Design for NoSQL Applications
NoSE: Schema Design for NoSQL ApplicationsMichael Mior
 
NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersKaren Lopez
 
Introduction to rest.li
Introduction to rest.liIntroduction to rest.li
Introduction to rest.liJoe Betz
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresDATAVERSITY
 
Non-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresNon-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresJoël Perras
 
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)Persistence  Smoothie: Blending SQL and NoSQL (RubyNation Edition)
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition)Michael Bleigh
 
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Alexandre Morgaut
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In DepthFabio Fumarola
 
Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016Sid Anand
 
How to Use LinkedIn to Impact Every Stage of the Marketing Funnel
How to Use LinkedIn to Impact Every Stage of the Marketing FunnelHow to Use LinkedIn to Impact Every Stage of the Marketing Funnel
How to Use LinkedIn to Impact Every Stage of the Marketing FunnelLinkedIn
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachDATAVERSITY
 

En vedette (20)

The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedIn
 
NoSQL, Growing up at Oracle
NoSQL, Growing up at OracleNoSQL, Growing up at Oracle
NoSQL, Growing up at Oracle
 
Automated Schema Design for NoSQL Databases
Automated Schema Design for NoSQL DatabasesAutomated Schema Design for NoSQL Databases
Automated Schema Design for NoSQL Databases
 
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of View
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of ViewNoSQL Plus MySQL From MySQL Practitioner\'s Point Of View
NoSQL Plus MySQL From MySQL Practitioner\'s Point Of View
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
 
7 Databases in 70 minutes
Software Developer and Architecture @ LinkedIn (QCon SF 2014)

  • 1. Software Development & Arch @ LinkedIn. Sid Anand, QCon SF 2014, @r39132
  • 2. About Me
      Current Life…
      • Chief Architect @ ClipMine, a video discovery company
      • QCon SF Program Committee member
      • Dad to a very energetic 2-year-old boy
      Previous Life…
      • Architect in Search and Distributed Data @ LinkedIn
      • Cloud Data Architect @ Netflix
      • VP Engineering at Etsy
      • Software Developer at eBay
  • 3. A Closer Look @ LinkedIn
  • 4. LinkedIn
      Then
      • Created in 2002 in Reid Hoffman’s living room
      • In its first month of operation, LinkedIn added 4500 members!
  • 5. LinkedIn
      Then
      • Created in 2002 in Reid Hoffman’s living room
      • In its first month of operation, LinkedIn added 4500 members!
      Now
      • 332M members in 200 countries
      • 2 members sign up every second
      • >60% of members overseas
      • In Q3’14, 75% of new members came from overseas
  • 6. LinkedIn
      Then
      • Created in 2002 in Reid Hoffman’s living room
      • In its first month of operation, LinkedIn added 4500 members!
      Now
      • 332M members in 200 countries
      • 2 members sign up every second
      • >60% of members overseas
      • In Q3’14, 75% of new members came from overseas
      • Fastest growing demographic is not geographic, it’s students!
      • > 10% of user base already and growing!
  • 7. LinkedIn: member growth started to ramp up during 2011, when we IPO’d
      • 2010 : 55M
      • 2011 : 90M (IPO)
      • 2012 : 145M
      • Q3’14 : 332M
      (note : numbers reflect start of year)
      We added roughly the same number of users in 2010 as over the previous 6 years!
  • 8. LinkedIn: employee growth also started to ramp up during 2011
      • 2010 : 500
      • 2011 : 1K (IPO)
      • 2012 : 2100
      • Q3’14 : 6K (25% in Engineering)
      (note : numbers reflect start of year)
  • 10. Alan Shepard
      • 2nd man in space
      • 5th person to walk on the moon!
      • 1st person to hit a golf ball on the moon!
  • 11. When asked by reporters what he thought about while awaiting liftoff, he replied: "The fact that every part of this ship was built by the lowest bidder"
  • 12. How did LinkedIn scale for company and member growth?
  • 14. Software Development : Challenges (circa 2011)
      • On my first day at LinkedIn, I felt pretty excited!
      • Linux Desktop: 8 cores, 64GB RAM
      • Mac Air
  • 16. Software Development : Challenges (circa 2011)
      • Then I tried to compile the code on my laptop!
      • Linux Desktop: 8 cores, 64GB RAM
      • Mac Air
  • 17. Software Development : Challenges (circa 2011)
      • 300+ code projects in a single SVN repo
      • SVN checkout world & go-to-lunch
      • Needed a server-grade machine to compile it!
      • Ant build (world) & go-make-espresso
      • Almost every WAR was built from source, not from intermediate JARs
      • To test your code locally, you needed to locally deploy every service that your code depended on! (maybe 20)
      • So, yes, you needed a machine that typically lives in your data center!
  • 18. Software Development : Challenges (circa 2011)
      • Assume that your code is now
          • Written
          • Compiled
          • Locally tested
      • What next?
  • 19. Software Development : Challenges (circa 2011)
      • 500+ developers were checking code into the master branch on the single repo!
      • So, someone broke master every day!
      • So
          • 3 hours to write, build, and locally test code
          • 3 days to commit it!
  • 20. Software Development : Challenges
  • 21. Software Development : Challenges, now (solved)
      • Do what the open-source world does, with some improvements!
      • Break the monolithic repo into many individual Git repos!
      • Have WARs depend on intermediate JARs: do not build the world!
      • Do not deploy the world for local testing: just connect your dev machine to a test environment!
      • What are the improvements?
  • 23. Software Development : Code Reviews
      1. Alice commits code to Git
      2. Alice sends a Review Board request to Bob & Cathy, owners of the files!
      3. Both Bob & Cathy give ship-its
      4. Alice amends her commit message with:
          RB=<review board id>
          BUILD-WAR=<list of wars to build>
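A minimal sketch of how tooling might read the two commit-message fields named on this slide. The slide only gives the field names `RB` and `BUILD-WAR`; the one-field-per-line layout, the comma-separated WAR list, the function name `parse_trailers`, and the sample values are all assumptions made for illustration.

```python
import re

# Assumed layout: one "KEY=value" pair per line of the commit message.
TRAILER = re.compile(r"^(RB|BUILD-WAR)=(.+)$")


def parse_trailers(message: str) -> dict:
    """Extract the RB and BUILD-WAR fields from a commit message."""
    found = {}
    for line in message.splitlines():
        match = TRAILER.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2).strip()
    # Assumption: the WAR list is comma-separated.
    wars = [w.strip() for w in found.get("BUILD-WAR", "").split(",") if w.strip()]
    return {"review_id": found.get("RB"), "wars": wars}


# Hypothetical commit message; the id and WAR names are made up.
example = "Fix profile cache\n\nRB=12345\nBUILD-WAR=profile-war, search-war\n"
print(parse_trailers(example))
```

A push hook could reject a commit whose `review_id` comes back as `None`, which is one way the checks on the next slide might get their input.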
  • 24. Software Development : Code Push (Git Push)
      1. Alice pushes code to our Gitorious server, where the following verifications run:
          1. Pre-push sanity checks! These must pass or the push is rejected!
              1. Have all owners of the changed files given ship-its?
              2. Does the code build?
          2. For JAR builds, also build upstream WARs!
          3. Run integration tests!
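Two of the checks above can be sketched in a few lines. This is an illustration under stated assumptions, not LinkedIn's implementation: the ownership map, the WAR-to-JAR dependency map, and the function names `missing_shipits` and `upstream_wars` are hypothetical, and only direct dependencies are considered.

```python
def missing_shipits(changed_files, owners, shipits):
    """Return owners of the changed files who have not yet given a ship-it."""
    required = set()
    for path in changed_files:
        required.update(owners.get(path, ()))
    return sorted(required - set(shipits))


def upstream_wars(changed_jars, war_deps):
    """Return the WARs that directly depend on any changed JAR."""
    changed = set(changed_jars)
    return sorted(war for war, jars in war_deps.items() if changed & set(jars))


# Hypothetical ownership and dependency data.
owners = {"a.java": ["bob"], "b.java": ["bob", "cathy"]}
war_deps = {
    "profile-war": ["core-jar", "profile-jar"],
    "search-war": ["search-jar"],
}

print(missing_shipits(["a.java", "b.java"], owners, ["bob"]))  # ['cathy']
print(upstream_wars(["core-jar"], war_deps))  # ['profile-war']
```

A non-empty result from `missing_shipits` would correspond to a rejected push; the result of `upstream_wars` is the set of WARs a JAR change would trigger.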
  • 25. Software Development : QA Test / Staging
      1. Assuming that all checks passed, the WAR is now available
      2. Our system automatically deploys all WARs to test servers
      3. QA verifies the new builds
  • 26. Software Development : Production - Canary
      1. Service owner Dave canaries the new WAR
      2. Our EKG system then compares the canary machine to one control machine for 1 hour of production traffic for the following:
          1. CPU, memory increase
          2. Fan-in/fan-out increase
          3. Error rate increase
          4. Latency increase
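The canary-versus-control comparison can be illustrated with a small sketch. The slide does not say how EKG decides that a metric has increased, so the 10% tolerance, the metric names, and the function name `canary_regressions` are assumptions.

```python
# Hypothetical metric names, loosely following the list on the slide.
METRICS = ("cpu", "memory", "fan_out", "error_rate", "latency_ms")


def canary_regressions(canary, control, tolerance=0.10):
    """List the metrics where the canary exceeds control by more than tolerance."""
    regressions = []
    for name in METRICS:
        limit = control[name] * (1 + tolerance)
        if canary[name] > limit:
            regressions.append(name)
    return regressions


# Made-up one-hour averages for a control machine and a canary machine.
control = {"cpu": 40.0, "memory": 8.0, "fan_out": 12, "error_rate": 0.01, "latency_ms": 100.0}
canary = {"cpu": 42.0, "memory": 8.1, "fan_out": 12, "error_rate": 0.02, "latency_ms": 125.0}

print(canary_regressions(canary, control))  # ['error_rate', 'latency_ms']
```

An empty list would be the "looks acceptable" case in which the service owner promotes the build.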
  • 27. Software Development : Production - Promotion
      1. Service owner Dave reviews the EKG report
      2. If it looks acceptable, he promotes the build to the rest of the cluster in all data centers
  • 28. How did LinkedIn scale for company and member growth?
  • 30. LinkedIn Architecture (diagram: Web Servers, Oracle)
      Prototypical use case
      • A member updates her profile with new skills, job title, and education
      • She also accepts a connection request from another member
      Behind the scenes
      • Web servers commit data to Oracle
      • What happens next?
  • 31. Web Servers Oracle LinkedIn Architecture @r39132 31 What Happens Next? Profile Updates • She should become instantly searchable by her new skills, job title, & education! • New groups and job ads should be recommended to her Connection Updates • The news feed should instantly reflect content updates from her new connection! • Also, based on the new connection, the PYMK widget should discover a new 2nd degree neighborhood!
  • 33. 33 We also have a data pipeline to capture high-throughput events that we need to count! Databases are not a good place to do high-TP atomic counting! Kafka is! • This is typically used for ranking signals • E.g. counts member page views to determine who are “hot” LinkedIn : Architecture @r39132
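To make the counting use case concrete, here is a minimal sketch that aggregates page-view events into a "hot members" ranking. It uses an in-memory list in place of a Kafka topic, and the event fields (`type`, `viewed_member`) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

def hot_members(events: Iterable[dict], top_n: int = 2) -> List[Tuple[str, int]]:
    """Count page views per viewed member and return the most-viewed ones."""
    counts = Counter(e["viewed_member"] for e in events if e["type"] == "page_view")
    return counts.most_common(top_n)

# Stand-in for a stream of events consumed from a topic.
events = [
    {"type": "page_view", "viewed_member": "alice"},
    {"type": "page_view", "viewed_member": "bob"},
    {"type": "page_view", "viewed_member": "alice"},
    {"type": "search", "viewed_member": "bob"},
]
print(hot_members(events))
```

The point of the design is that the counting happens in a stream consumer rather than as atomic increments against the database.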
  • 35. LinkedIn Architecture : Single Data Center! @r39132 35
  • 36. LinkedIn : Architecture : Single Data Center! @r39132 36
  • 37. LinkedIn : Architecture : Multi-data Center Project @r39132 37
  • 38. LinkedIn Architecture : Rule 1 @r39132 38 Partition your user base across the data centers! e.g. using Akamai GTM
  • 39. LinkedIn Architecture : Rule 1 @r39132 39 Problem! User 1 (mapped to DC1) updates his profile! How will User 2 (mapped to DC2) see it?
  • 40. LinkedIn Architecture : Rule 2 @r39132 40 Link your data centers together at the data fabric level! Not a new concept! Cassandra has been doing it for a few years now in the OLTP database space!
  • 41. LinkedIn Architecture : Rule 2 @r39132 41 Link your data centers together at the data fabric level! Not a new concept! Cassandra has been doing it for a few years now in the OLTP database space! LinkedIn’s Sources of Truth  • We have to make both work across multiple data centers!
  • 42. LinkedIn Architecture : Rule 2 @r39132 42 Link your data centers together at the data fabric level! Not a new concept! Cassandra has been doing it for a few years now in the OLTP database space! LinkedIn’s Sources of Truth  • We have to make both work across multiple data centers! • Oracle is fairly easy : we use Oracle GoldenGate! • Kafka is also pretty easy!
  • 43. LinkedIn : Kafka Multi-Data Center @r39132 43 Kafka Local Producer Consumer of Local Events Kafka Data Center 1
  • 44. LinkedIn : Kafka Multi-Data Center @r39132 44 Kafka Local Producer Consumer of Local Events Kafka Local Producer Consumer of Local Events Kafka Data Center 2 Kafka Data Center 1
  • 45. LinkedIn : Kafka Multi-Colo @r39132 45 Kafka Local Producer Consumer of Local Events Consumer of Global Events Kafka Local Producer Consumer of Local Events Kafka Data Center 2 Kafka Data Center 1
  • 46. LinkedIn : Kafka Multi-Colo @r39132 46 Kafka Local Producer Kafka Global Consumer of Local Events Consumer of Global Events Kafka Local Producer Consumer of Local Events Kafka Data Center 2 Kafka Data Center 1
  • 47. LinkedIn : Kafka Multi-Colo @r39132 47 Kafka Local Producer Kafka Global Consumer of Local Events Consumer of Global Events Kafka Local Producer Kafka Global Consumer of Local Events Consumer of Global Events Kafka Data Center 2 Kafka Data Center 1
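The diagrams above show each data center with a local Kafka (fed by local producers) and a global Kafka (read by consumers of global events). The toy model below sketches one way such a topology could behave, assuming that every local log is mirrored into every data center's global log; the mirroring step, class, and method names are assumptions, since the slides only show the boxes.

```python
from typing import List

class DataCenter:
    """One data center with a local log and a global (aggregate) log."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local: List[str] = []       # events produced in this data center
        self.global_log: List[str] = []  # events mirrored in from all data centers

    def produce(self, event: str) -> None:
        """Producers only ever write to their own data center's local log."""
        self.local.append(f"{self.name}:{event}")

def mirror(data_centers: List[DataCenter]) -> None:
    """Rebuild every global log from all local logs (assumed mirroring step)."""
    merged = [event for dc in data_centers for event in dc.local]
    for dc in data_centers:
        dc.global_log = list(merged)

dc1, dc2 = DataCenter("dc1"), DataCenter("dc2")
dc1.produce("profile_update")
dc2.produce("connection_accept")
mirror([dc1, dc2])
print(dc2.global_log)  # a consumer of global events in DC2 sees DC1's update
```

Under this model, consumers never read across data centers; they read whichever local or global log lives in their own data center.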
  • 48. LinkedIn Architecture : Rule 3 @r39132 48 Don’t make any web service calls between data centers! It kills latency, which kills availability!
  • 50. How did LinkedIn scale for company and member growth? 50@r39132 50
  • 52. 52 LinkedIn Search @r39132 Why is Search important to LinkedIn? • Search is a significant income driver! • 332M members that recruiters pay to find! (Recruiter Search) • 2M+ jobs that companies pay to list so you can find them! (Job Search)
  • 53. What Makes LinkedIn Search Unique? 53@r39132 53
  • 54. 54 LinkedIn Search : Federated @r39132
  • 55. LinkedIn Search : Federated @r39132 55 • We index many entities • members, jobs, companies, groups, universities, articles, slides, etc. • These are separate (vertical) search engines!
  • 56. LinkedIn Search : Federated @r39132 56 • We index many entities • members, jobs, companies, groups, universities, articles, slides, etc. • These are separate (vertical) search engines! • When a user enters “sr software engineer”, which index should we look in? • Jobs, members, groups?
  • 57. LinkedIn Search : Federated @r39132 57 • We index many entities • members, jobs, companies, groups, universities, articles, slides, etc. • These are separate (vertical) search engines! • When a user enters “sr software engineer”, which index should we look in? • Jobs, members, groups? • Can we simply send the request to all of the search engines and then show the most relevant results? • No • Ranks (scores) are not comparable across verticals
  • 58. LinkedIn Search : Federated @r39132 58 • We index many entities • members, jobs, companies, groups, universities, articles, slides, etc. • These are separate (vertical) search engines! • When a user enters “sr software engineer”, which index should we look in? • Jobs, members, groups? • Can we simply send the request to all of the search engines and then show the most relevant results? • No • Ranks (scores) are not comparable across verticals • What if we pick a vertical based on a user feature? • Job seeker sees jobs, recruiter sees members • Intent Detection : done by Federator
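Picking a vertical from a user feature, as described above, can be sketched as a lookup followed by a single routed call. The role names, the default vertical, and the `federate` function are hypothetical; the real Federator's intent detection is not specified in the slides.

```python
from typing import Callable, Dict, List, Tuple

# Assumed mapping from a user feature (role) to the vertical to query.
VERTICAL_BY_ROLE = {"job_seeker": "jobs", "recruiter": "members"}

def federate(query: str, user_role: str,
             verticals: Dict[str, Callable[[str], List[str]]],
             default: str = "members") -> Tuple[str, List[str]]:
    """Route the query to one vertical chosen from the user's role."""
    vertical = VERTICAL_BY_ROLE.get(user_role, default)
    return vertical, verticals[vertical](query)

# Stub search engines standing in for the separate vertical indexes.
verticals = {
    "jobs": lambda q: [f"job:{q}"],
    "members": lambda q: [f"member:{q}"],
}
print(federate("sr software engineer", "job_seeker", verticals))
```

Routing to a single vertical sidesteps the problem that scores from different verticals cannot be compared.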
  • 59. LinkedIn Search : Query Rewriting @r39132 59 • Say a recruiter searches for “sr software eng” • There are 20+ ways to represent this title • senior swe • sr swe • senior software engineer
  • 60. LinkedIn Search : Query Rewriting @r39132 60 • Say a recruiter searches for “sr software eng” • There are 20+ ways to represent this title • senior swe • sr swe • senior software engineer • To solve this, we can use a title standardizer, though not every title may have a canonical form! • If a standardized title exists, we can rewrite the user query • title:sr AND title:software AND title:eng → std_title:sswe234
  • 61. LinkedIn Search : Query Rewriting @r39132 61 • Say a recruiter searches for “sr software eng” • There are 20+ ways to represent this title • senior swe • sr swe • senior software engineer • To solve this, we developed a title standardizer! • If a standardized title exists, we can rewrite the user query • title:sr AND title:software AND title:eng → std_title:sswe234 • Query Rewriting helps by expanding the search space by methods such as synonym expansion, spell correction, etc… So we need it!
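The rewrite rule on these slides can be sketched as a dictionary lookup with a fallback to the per-token query. The lookup table and function name are hypothetical; only the `title:` and `std_title:` field names and the `sswe234` id come from the slide's example.

```python
from typing import Dict, Tuple

# Hypothetical standardizer table: token tuples mapped to a standardized title id.
STANDARD_TITLES: Dict[Tuple[str, ...], str] = {
    ("sr", "software", "eng"): "sswe234",
    ("senior", "software", "engineer"): "sswe234",
    ("sr", "swe"): "sswe234",
    ("senior", "swe"): "sswe234",
}

def rewrite(query: str) -> str:
    """Rewrite to std_title:<id> when a standardized title exists, else AND the tokens."""
    tokens = tuple(query.lower().split())
    std_id = STANDARD_TITLES.get(tokens)
    if std_id is not None:
        return f"std_title:{std_id}"
    return " AND ".join(f"title:{token}" for token in tokens)

print(rewrite("sr software eng"))
print(rewrite("chief bottle washer"))
```

The fallback matters because, as the slide notes, not every title has a canonical form.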
  • 62. LinkedIn Search : Flexible Scoring @r39132 62 • We index many entities! • Companies, Members, Universities, etc… • We use different scoring formulas and signals for each vertical • We need a way to easily plug-in different custom scorers!
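One common way to make scorers pluggable per vertical is a registry keyed by vertical name. The sketch below assumes hypothetical signal names (`text_match`, `network_signal`) and made-up weights; it illustrates the plug-in idea, not LinkedIn's scoring formulas.

```python
from typing import Callable, Dict, List

Scorer = Callable[[dict, dict], float]
SCORERS: Dict[str, Scorer] = {}

def scorer(vertical: str) -> Callable[[Scorer], Scorer]:
    """Decorator that registers a scoring function for one vertical."""
    def register(fn: Scorer) -> Scorer:
        SCORERS[vertical] = fn
        return fn
    return register

@scorer("members")
def score_member(query: dict, doc: dict) -> float:
    # Assumed blend of a text signal and a social-network signal.
    return 0.7 * doc["text_match"] + 0.3 * doc["network_signal"]

@scorer("companies")
def score_company(query: dict, doc: dict) -> float:
    # Assumed: companies are ranked on text match alone.
    return doc["text_match"]

def rank(vertical: str, query: dict, docs: List[dict]) -> List[dict]:
    """Sort documents with whichever scorer is registered for the vertical."""
    return sorted(docs, key=lambda d: SCORERS[vertical](query, d), reverse=True)

docs = [
    {"id": "a", "text_match": 0.9, "network_signal": 0.0},
    {"id": "b", "text_match": 0.5, "network_signal": 1.0},
]
print([d["id"] for d in rank("members", {}, docs)])
print([d["id"] for d in rank("companies", {}, docs)])
```

Adding a vertical then means registering one more function, with no change to `rank`.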
  • 63. LinkedIn Search : Open Source @r39132 63 • Leading open source alternatives (e.g. Lucene, ElasticSearch, SOLR) do not offer these! • Search Federation • Pluggable Query Rewriting • Pluggable and Flexible Scoring • They DO offer some distributed system management, which we will have to re-invent unfortunately
  • 64. LinkedIn Search : Open Source @r39132 64 • Leading open source alternatives (e.g. Lucene, ElasticSearch, SOLR) do not offer these! • Search Federation • Pluggable Query Rewriting • Pluggable and Flexible Scoring • They DO offer some distributed system management, which we will have to re-invent unfortunately • So, we created Galene, LinkedIn’s new search architecture! https://engineering.linkedin.com/search/did-you-mean-galene
  • 68. 68 Galene Architecture : Querying @r39132 Federator Frontend Browser Vertical Search Node Vertical Broker • Query Rewriting (Pluggable) • Scatter-gather across shards • Lucene (optionally sharded) • Scoring (Pluggable) • Query Intent Detection • Result Blending Other Verticals ….
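The broker's scatter-gather step can be sketched as a parallel fan-out to shards followed by a top-k merge. This assumes in-memory dictionaries as shards, substring matching, and precomputed scores; all names are hypothetical, and merging by score is only valid here because every shard belongs to the same vertical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Shard = Dict[str, Tuple[str, float]]  # doc_id -> (text, precomputed score)
Hit = Tuple[float, str]               # (score, doc_id)

def search_shard(shard: Shard, term: str, k: int) -> List[Hit]:
    """Return the top-k matching hits from a single shard."""
    hits = [(score, doc_id) for doc_id, (text, score) in shard.items() if term in text]
    return heapq.nlargest(k, hits)

def scatter_gather(shards: List[Shard], term: str, k: int = 3) -> List[Hit]:
    """Query all shards in parallel, then merge the partial results into a global top-k."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: search_shard(s, term, k), shards))
    return heapq.nlargest(k, (hit for part in partials for hit in part))

shards = [
    {"m1": ("senior software engineer", 0.9), "m2": ("designer", 0.8)},
    {"m3": ("software architect", 0.95)},
]
print(scatter_gather(shards, "software", k=2))
```

Each shard returns at most k hits, so the merge step handles at most k times the number of shards.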
  • 69. 69 Galene Architecture : Indexing (Offline) @r39132 Federator Frontend Browser Vertical Search Node Hadoop Vertical Indexer Node Vertical Broker Index Distribution Service Offline Index Building and Distribution • Batch-oriented, built daily • Builds offline ranking and rewriting models • Rebuilds Indexes when new fields added
  • 70. 70 Galene Architecture @r39132 Federator Frontend Browser Vertical Search Node Hadoop Vertical Indexer Node Vertical Broker Index Distribution Service Offline Index Building and Distribution • BitTorrent-based Index Distribution Service • Pushes new indexes and models to running services

Editor's Notes

  1. For us, fundamentally changing the way the world works begins with our mission statement: To connect the world’s professionals to make them more productive and successful. This means not only helping people to find their dream jobs, but also enabling them to be great at the jobs they’re already in.
  4. It could take 3 hours to write and test code and 3 days to get it successfully committed and built!
  10. Two major reasons to have multiple data centers (MDC): (1) What happens if a DC goes down? Planned: PL/SQL (triggers, stored procedures) cannot be updated online. Can’t do a rolling push for everything! MySQL upgrade! Unplanned: bad code push, power outage, network outage. (2) Need to scale beyond your current DC: a DC has a fixed maximum capacity! Also, sometimes it becomes important to be in close proximity to users.
  11. When I think about data centers going down, I think about drinking... Dos Equis beer specifically! And then I listen to the advice of this man!
  15. We approach this by targeting specific verticals using our Intent Detection Algorithm! E.g. “john smith” looks like a person → only hit the member index. E.g. “IBM” looks like a company or a job title in members or in jobs → hit company, member, and jobs. E.g. “Stanford” looks like a university that someone works at or attended → hit university, company, member. E.g. “Software Engineer”: if we infer the user is a recruiter or hiring manager, show candidates, else show jobs!
  19. R2: A user types in “Senior software Engineer”, but members represent this title in 20+ different ways: sr swe, senior swe, senior software eng, etc. To solve this, we have a title standardizer! However, it may not standardize every title. So, we need to rewrite the query before hitting Lucene, our core search engine! title:senior AND title:software AND title:engineer → if a standardized title exists → stdtitle:sswe2331. This is also called synonym expansion! So, we need some smart rewriter! Rewriters also handle spelling correction: “Senior sofware Enginer” → “Senior software Engineer”. The goal is to expand the search space and also, sometimes, to target results for more efficient retrieval.
  22. R3: Flexible scoring/ranking per vertical! For example, if you have heard about search, then you have heard about TF-IDF, a measure of how well a document matches a search query based on the text in both query and document! This is typically used for ranking in web search. However, we are doing entity search across a social graph! If you search for “Mark”, we don’t just show all people named Mark! We can only show results in your 1st, 2nd, and 3rd degree network! We prioritize results based on a combination of social network signals and text similarity (TF-IDF). For example, we sort results by connect weight: the % of first-degree connections the searcher has in common with each retrieved member!
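The connect-weight signal described in this note can be written down directly from its definition. The sketch below assumes connections are available as plain sets of member ids; the function names and sample data are hypothetical.

```python
from typing import Dict, List, Set

def connect_weight(searcher: Set[str], member: Set[str]) -> float:
    """Fraction of the searcher's first-degree connections shared with a member."""
    if not searcher:
        return 0.0
    return len(searcher & member) / len(searcher)

def rank_by_connect_weight(searcher: Set[str],
                           candidates: Dict[str, Set[str]]) -> List[str]:
    """Order retrieved members by descending connect weight."""
    return sorted(candidates,
                  key=lambda m: connect_weight(searcher, candidates[m]),
                  reverse=True)

searcher = {"a", "b", "c", "d"}
candidates = {"mark1": {"a"}, "mark2": {"a", "b", "c"}}
print(rank_by_connect_weight(searcher, candidates))
```

In practice this signal would be combined with text similarity, as the note says, rather than used on its own.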