ORGANIZATION NAME©2013 LinkedIn Corporation. All Rights Reserved.
 Vijay Aruswamy,
 Staff Engineer, Big Data Operations,
 LinkedIn Corporation
 https://www.linkedin.com/in/vijayaruswamy
Outline
 LinkedIn Overview
 Why Data is important for LinkedIn
 LinkedIn’s Big Data Ecosystem
 How Automic tools are helping LinkedIn
Our Mission
 Connect the world's professionals to make
them more productive and successful.
LinkedIn – World's Largest Professional Network
“What gets measured, gets fixed”
-David Henke, Former SVP Operations, LinkedIn
A Few Data-Driven Products
 People You May Know (PYMK)
 Companies You May Be Interested In
 Jobs You May Be Interested In
 Groups You May Like
 Who Viewed Your Profile
 Economic Graph Challenge
Types of Data at LinkedIn
Identity Data + Behavioral Data + Social Data
What does “Big Data” mean at LinkedIn?
[Chart. Axes: Data Volume; Analytical Challenges & Complexity. Data sources shown: Member Data, CRM Data, Web/Behavior Data, Social Media Data]
High Level Data Flow
Camus
 Camus is a MapReduce job that loads data from Kafka into HDFS. It can copy data
from Kafka into HDFS incrementally.
 http://etl.svbtle.com/setting-up-camus-linkedins-kafka-to-hdfs-pipeline
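The idea of incremental copying can be illustrated with a minimal sketch. This is not Camus code: the in-memory list standing in for a Kafka topic, the list standing in for HDFS, and the function name `incremental_copy` are all assumptions made for illustration. Each run copies only the records that arrived after the last saved offset.

```python
from typing import List


def incremental_copy(topic_log: List[str], last_offset: int, sink: List[str]) -> int:
    """Copy records at or after last_offset into sink and return the new offset.

    topic_log stands in for a Kafka topic and sink for an HDFS directory;
    both are hypothetical in-memory substitutes.
    """
    new_records = topic_log[last_offset:]
    sink.extend(new_records)
    return last_offset + len(new_records)


# First run copies everything seen so far.
log = ["event-1", "event-2", "event-3"]
hdfs_sink: List[str] = []
offset = incremental_copy(log, 0, hdfs_sink)

# A later run copies only what arrived since the saved offset.
log.append("event-4")
offset = incremental_copy(log, offset, hdfs_sink)
```

Saving the offset between runs is what keeps a rerun from copying the same records twice.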
Gobblin
 A unified data ingestion system for internal and external data sources. Gobblin
uses a worker framework in which each record runs through four stages:
extraction, conversion, quality checking, and writing.
 https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease
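The four stages named for Gobblin can be sketched as a small pipeline. This does not use Gobblin's API: the record format (`id,name` lines), the function names, and the in-memory source and sink are assumptions chosen only to show the order of the stages.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Record = Dict[str, str]


def extract(lines: Iterable[str]) -> Iterator[str]:
    """Stage 1: pull raw records from a source (here, an in-memory list)."""
    for line in lines:
        yield line.strip()


def convert(raw: str) -> Optional[Record]:
    """Stage 2: turn a raw 'id,name' line into a structured record."""
    parts = raw.split(",")
    if len(parts) != 2:
        return None
    return {"id": parts[0], "name": parts[1]}


def passes_quality_check(record: Optional[Record]) -> bool:
    """Stage 3: reject records that failed conversion or have empty fields."""
    return record is not None and all(record.values())


def ingest(lines: Iterable[str], sink: List[Record]) -> int:
    """Run every record through all four stages; return how many were written."""
    written = 0
    for raw in extract(lines):
        record = convert(raw)
        if passes_quality_check(record):
            sink.append(record)  # Stage 4: write
            written += 1
    return written
```

Records that fail conversion or the quality check are simply skipped here; a real ingestion system would more likely route them somewhere for inspection.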
High Level Data Flow (Cont.)
Automic
 Data-driven scheduling: a process will not execute before its data dependency is
satisfied.
 Typical time-series roll-up hierarchies (hour :: day :: week :: month :: quarter :: year) are
handled by Azkaban.
 Processes should execute only when their input data sets are available.
 Grouping: organize components and workflows into a common area for maintenance and
enhancements.
 Supports external dependencies.
 Use of global variables: store commonly used passwords in one place.
 Throttling: assign jobs to queues, schedule when jobs run throughout the day, and
hold jobs under the same flow.
 Load balancing: assign queues to run on a particular server.
 Monitoring: Graphical Explorer.
Azkaban
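Data-driven scheduling for a roll-up hierarchy can be illustrated with one level, hour to day. This sketch uses neither Automic nor Azkaban: the partition naming (`YYYY-MM-DD/HH`) and the function names are assumptions. A daily roll-up is allowed to run only once all 24 hourly inputs have landed.

```python
from datetime import date
from typing import Set


def hourly_partitions(day: date) -> Set[str]:
    """Names of the 24 hourly input partitions for one day (hypothetical naming)."""
    return {f"{day.isoformat()}/{hour:02d}" for hour in range(24)}


def day_rollup_ready(day: date, available: Set[str]) -> bool:
    """A daily roll-up may run only when every hourly partition is available."""
    return hourly_partitions(day) <= available


def run_if_ready(day: date, available: Set[str]) -> str:
    """Return 'ran' if the data dependency is satisfied, else 'waiting'."""
    if not day_rollup_ready(day, available):
        return "waiting"
    # The real roll-up job would be launched here.
    return "ran"
```

The same check repeats up the hierarchy: a weekly roll-up would wait on seven daily outputs, a monthly one on its weeks or days, and so on.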
LinkedIn’s Big Data Architecture
[Architecture diagram. Recoverable labels: Online DBs (Prod DCs), Espresso, Service Metrics, Web Tracking]
LinkedIn’s Application Manager
Types of Jobs Scheduled by Automic
 External ETL
 ODS ETL
 Hadoop ETL
 Teradata ETL
 User Input ETL
 Historical Loads
 One-time data fixes
Data Volume
 How many Kafka topics (tracking + service) do we dump into Hadoop?
– 900+ topics: 300 tracking (/data/tracking) + 682 service (/data/service)
– Data size per day: ~10 TB
 How many online DB tables do we have on Hadoop?
– 300+ tables (Oracle, Espresso, MySQL)
– Data size: ~8 TB
 Capacity of the DWH on Teradata
– ~186 TB overall with 6-month retention, ~3 TB every day
– ~340K unique queries/day (248K from users and ~90K from ETL)
 Capacity of Hadoop
– Biggest cluster: 5 PB with 2,500+ nodes
– ETL clusters: 3.1 PB with 360+ nodes
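A quick back-of-the-envelope check of the figures above. The totals follow directly from the slide; the per-node averages are derived here, not stated in the talk, and assume decimal units (1 PB = 1,000 TB) and the minimum node counts quoted, so they are rough upper bounds.

```python
# Figures copied from the slide; "~" and "+" qualifiers are dropped.
tracking_topics = 300
service_topics = 682
total_topics = tracking_topics + service_topics

user_queries = 248_000
etl_queries = 90_000
total_queries = user_queries + etl_queries

# Assumption: decimal units, 1 PB = 1,000 TB.
biggest_cluster_tb = 5_000
biggest_cluster_nodes = 2_500
etl_clusters_tb = 3_100
etl_clusters_nodes = 360

biggest_tb_per_node = biggest_cluster_tb / biggest_cluster_nodes
etl_tb_per_node = etl_clusters_tb / etl_clusters_nodes

print(f"Kafka topics: {total_topics}")
print(f"Teradata queries/day: {total_queries}")
print(f"Biggest cluster: {biggest_tb_per_node:.1f} TB per node")
print(f"ETL clusters: {etl_tb_per_node:.1f} TB per node")
```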
Q & A
29

Contenu connexe

Tendances

ARA - More than Continuous Integrations and Continuous Delivery
ARA - More than Continuous Integrations and Continuous DeliveryARA - More than Continuous Integrations and Continuous Delivery
ARA - More than Continuous Integrations and Continuous DeliveryCA | Automic Software
 
Power of ONE Automation through Web Services
Power of ONE Automation through Web ServicesPower of ONE Automation through Web Services
Power of ONE Automation through Web ServicesCA | Automic Software
 
Automic Banner Lessons from the Field
Automic Banner Lessons from the FieldAutomic Banner Lessons from the Field
Automic Banner Lessons from the FieldCA | Automic Software
 
Automating Banner Financial Aid Dataload - San Jacinto College
Automating Banner Financial Aid Dataload - San Jacinto CollegeAutomating Banner Financial Aid Dataload - San Jacinto College
Automating Banner Financial Aid Dataload - San Jacinto CollegeCA | Automic Software
 
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid Cloud
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid CloudAppSphere 15 - What's New in Java: Leveraging Java in Hybrid Cloud
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid CloudAppDynamics
 
AppSphere 15 - Driving APM Adoption in Complex, Global Environments
AppSphere 15 - Driving APM Adoption in Complex, Global EnvironmentsAppSphere 15 - Driving APM Adoption in Complex, Global Environments
AppSphere 15 - Driving APM Adoption in Complex, Global EnvironmentsAppDynamics
 
AppSphere 15 - Deep Dive into AppDynamics Application Analytics
AppSphere 15 - Deep Dive into AppDynamics Application AnalyticsAppSphere 15 - Deep Dive into AppDynamics Application Analytics
AppSphere 15 - Deep Dive into AppDynamics Application AnalyticsAppDynamics
 
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at ScaleAppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at ScaleAppDynamics
 
AppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideAppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideTakipi
 
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...AppDynamics
 
New Relic + Apprenda Webinar
New Relic + Apprenda WebinarNew Relic + Apprenda Webinar
New Relic + Apprenda WebinarPeter Duke
 
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...AppDynamics
 
The Future of APM and Why It Requires Analytics Everywhere!
The Future of APM and Why It Requires Analytics Everywhere!The Future of APM and Why It Requires Analytics Everywhere!
The Future of APM and Why It Requires Analytics Everywhere!New Relic
 
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...AppDynamics
 
Building a System That Never Stops New Relic at Scale
Building a System That Never Stops New Relic at ScaleBuilding a System That Never Stops New Relic at Scale
Building a System That Never Stops New Relic at ScaleNew Relic
 
Measuring and Maximizing the Business Impact of Network Automation
Measuring and Maximizing the Business Impact of Network AutomationMeasuring and Maximizing the Business Impact of Network Automation
Measuring and Maximizing the Business Impact of Network AutomationItential
 
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...AppDynamics
 
Apama and Terracotta World: Getting Started in Predictive Analytics
Apama and Terracotta World: Getting Started in Predictive Analytics Apama and Terracotta World: Getting Started in Predictive Analytics
Apama and Terracotta World: Getting Started in Predictive Analytics Software AG
 
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...Itential
 
Oracle Management Cloud, OMC architecture
Oracle Management Cloud, OMC architecture Oracle Management Cloud, OMC architecture
Oracle Management Cloud, OMC architecture Samir El-Nabawy
 

Tendances (20)

ARA - More than Continuous Integrations and Continuous Delivery
ARA - More than Continuous Integrations and Continuous DeliveryARA - More than Continuous Integrations and Continuous Delivery
ARA - More than Continuous Integrations and Continuous Delivery
 
Power of ONE Automation through Web Services
Power of ONE Automation through Web ServicesPower of ONE Automation through Web Services
Power of ONE Automation through Web Services
 
Automic Banner Lessons from the Field
Automic Banner Lessons from the FieldAutomic Banner Lessons from the Field
Automic Banner Lessons from the Field
 
Automating Banner Financial Aid Dataload - San Jacinto College
Automating Banner Financial Aid Dataload - San Jacinto CollegeAutomating Banner Financial Aid Dataload - San Jacinto College
Automating Banner Financial Aid Dataload - San Jacinto College
 
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid Cloud
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid CloudAppSphere 15 - What's New in Java: Leveraging Java in Hybrid Cloud
AppSphere 15 - What's New in Java: Leveraging Java in Hybrid Cloud
 
AppSphere 15 - Driving APM Adoption in Complex, Global Environments
AppSphere 15 - Driving APM Adoption in Complex, Global EnvironmentsAppSphere 15 - Driving APM Adoption in Complex, Global Environments
AppSphere 15 - Driving APM Adoption in Complex, Global Environments
 
AppSphere 15 - Deep Dive into AppDynamics Application Analytics
AppSphere 15 - Deep Dive into AppDynamics Application AnalyticsAppSphere 15 - Deep Dive into AppDynamics Application Analytics
AppSphere 15 - Deep Dive into AppDynamics Application Analytics
 
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at ScaleAppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
 
AppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideAppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete Guide
 
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...
AppSphere 15 - How AppDynamics is Shaking up the Synthetic Monitoring Product...
 
New Relic + Apprenda Webinar
New Relic + Apprenda WebinarNew Relic + Apprenda Webinar
New Relic + Apprenda Webinar
 
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...
AppSphere 15 - Preparing for System Failure: How Pearson used AppDynamics to ...
 
The Future of APM and Why It Requires Analytics Everywhere!
The Future of APM and Why It Requires Analytics Everywhere!The Future of APM and Why It Requires Analytics Everywhere!
The Future of APM and Why It Requires Analytics Everywhere!
 
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...
AppSphere 15 - DevOps and Agile: AppDynamics in Continuous Integration Enviro...
 
Building a System That Never Stops New Relic at Scale
Building a System That Never Stops New Relic at ScaleBuilding a System That Never Stops New Relic at Scale
Building a System That Never Stops New Relic at Scale
 
Measuring and Maximizing the Business Impact of Network Automation
Measuring and Maximizing the Business Impact of Network AutomationMeasuring and Maximizing the Business Impact of Network Automation
Measuring and Maximizing the Business Impact of Network Automation
 
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...
AppSphere 15 - Process, Culture and Tools: The Transformation of Gannett and ...
 
Apama and Terracotta World: Getting Started in Predictive Analytics
Apama and Terracotta World: Getting Started in Predictive Analytics Apama and Terracotta World: Getting Started in Predictive Analytics
Apama and Terracotta World: Getting Started in Predictive Analytics
 
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...
[Webinar] Modern Network Compliance: How to Get Proactive with Compliance Val...
 
Oracle Management Cloud, OMC architecture
Oracle Management Cloud, OMC architecture Oracle Management Cloud, OMC architecture
Oracle Management Cloud, OMC architecture
 

En vedette

Application Catalog and Approval Runbooks Sample
Application Catalog and Approval Runbooks SampleApplication Catalog and Approval Runbooks Sample
Application Catalog and Approval Runbooks SampleJames Donnelly
 
Eating our Own Dogfood - How Automic Automates
Eating our Own Dogfood - How Automic AutomatesEating our Own Dogfood - How Automic Automates
Eating our Own Dogfood - How Automic AutomatesCA | Automic Software
 
Run Book Automation versus WorkLoad Automation
Run Book Automation versus WorkLoad AutomationRun Book Automation versus WorkLoad Automation
Run Book Automation versus WorkLoad AutomationAnne Plancius
 
Network ESC - Media, Arts & Entertainment Staffing
Network ESC - Media, Arts & Entertainment StaffingNetwork ESC - Media, Arts & Entertainment Staffing
Network ESC - Media, Arts & Entertainment StaffingScot Sherman
 
Fnbe september 2015 assignment 1
Fnbe september 2015 assignment 1Fnbe september 2015 assignment 1
Fnbe september 2015 assignment 1Li Jing
 
Sales force developer_course_outline
Sales force developer_course_outlineSales force developer_course_outline
Sales force developer_course_outlineAbdul Ghani
 
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffing
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffingNetwork ESC - Advertising, Marketing, Digital/Creative & P.R. staffing
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffingScot Sherman
 
Saianand Natarajan Cv
Saianand Natarajan CvSaianand Natarajan Cv
Saianand Natarajan CvSai Natarajan
 
Amy Mackey_Resume 2015
Amy Mackey_Resume 2015Amy Mackey_Resume 2015
Amy Mackey_Resume 2015Amy Mackey
 
Network ESC - Human Resources Staffing
Network ESC - Human Resources StaffingNetwork ESC - Human Resources Staffing
Network ESC - Human Resources StaffingScot Sherman
 
MichaelDawson_20150611
MichaelDawson_20150611MichaelDawson_20150611
MichaelDawson_20150611Michael Dawson
 
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...Neeraj Singh
 
AnnaLapushnerFullResume
AnnaLapushnerFullResumeAnnaLapushnerFullResume
AnnaLapushnerFullResumeAnna Lapushner
 

En vedette (19)

Application Catalog and Approval Runbooks Sample
Application Catalog and Approval Runbooks SampleApplication Catalog and Approval Runbooks Sample
Application Catalog and Approval Runbooks Sample
 
Eating our Own Dogfood - How Automic Automates
Eating our Own Dogfood - How Automic AutomatesEating our Own Dogfood - How Automic Automates
Eating our Own Dogfood - How Automic Automates
 
Automic World 2016 Announcement
Automic World 2016 AnnouncementAutomic World 2016 Announcement
Automic World 2016 Announcement
 
Run Book Automation versus WorkLoad Automation
Run Book Automation versus WorkLoad AutomationRun Book Automation versus WorkLoad Automation
Run Book Automation versus WorkLoad Automation
 
Network ESC - Media, Arts & Entertainment Staffing
Network ESC - Media, Arts & Entertainment StaffingNetwork ESC - Media, Arts & Entertainment Staffing
Network ESC - Media, Arts & Entertainment Staffing
 
Fnbe september 2015 assignment 1
Fnbe september 2015 assignment 1Fnbe september 2015 assignment 1
Fnbe september 2015 assignment 1
 
Muthu resume
Muthu resumeMuthu resume
Muthu resume
 
Sales force developer_course_outline
Sales force developer_course_outlineSales force developer_course_outline
Sales force developer_course_outline
 
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffing
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffingNetwork ESC - Advertising, Marketing, Digital/Creative & P.R. staffing
Network ESC - Advertising, Marketing, Digital/Creative & P.R. staffing
 
Saianand Natarajan Cv
Saianand Natarajan CvSaianand Natarajan Cv
Saianand Natarajan Cv
 
Amy Mackey_Resume 2015
Amy Mackey_Resume 2015Amy Mackey_Resume 2015
Amy Mackey_Resume 2015
 
Sidhanth profile
Sidhanth profileSidhanth profile
Sidhanth profile
 
VaibhavTambde_cv
VaibhavTambde_cvVaibhavTambde_cv
VaibhavTambde_cv
 
Network ESC - Human Resources Staffing
Network ESC - Human Resources StaffingNetwork ESC - Human Resources Staffing
Network ESC - Human Resources Staffing
 
MichaelDawson_20150611
MichaelDawson_20150611MichaelDawson_20150611
MichaelDawson_20150611
 
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...
Neeraj Singh(Java Developer)(Imagination Learning System)(2 Years 3 Months)(2...
 
AnnaLapushnerFullResume
AnnaLapushnerFullResumeAnnaLapushnerFullResume
AnnaLapushnerFullResume
 
LindaResume2B
LindaResume2BLindaResume2B
LindaResume2B
 
Rajendra Thota CV
Rajendra Thota CVRajendra Thota CV
Rajendra Thota CV
 

Similaire à How Linkedin uses Automic for Big Data Processes

Big data arch_analytics
Big data arch_analyticsBig data arch_analytics
Big data arch_analyticsSrinu Adira
 
Big Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedInBig Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedInMinh-Hoang Nguyen
 
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...Vitaly Gordon
 
Computing Professional Identity for the Economic Graph
Computing Professional Identity for the Economic GraphComputing Professional Identity for the Economic Graph
Computing Professional Identity for the Economic GraphVitaly Gordon
 
RoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarRoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarSmart Insights
 
Hive at LinkedIn
Hive at LinkedIn Hive at LinkedIn
Hive at LinkedIn mislam77
 
Data, Interconnectedness & The Internet of Things
Data, Interconnectedness & The Internet of Things Data, Interconnectedness & The Internet of Things
Data, Interconnectedness & The Internet of Things Software AG
 
Building a Self-Service Hadoop Platform at Linkedin with Azkaban
Building a Self-Service Hadoop Platform at Linkedin with AzkabanBuilding a Self-Service Hadoop Platform at Linkedin with Azkaban
Building a Self-Service Hadoop Platform at Linkedin with AzkabanDataWorks Summit
 
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...David Chen
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyTamrMarketing
 
Future of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnFuture of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnIBM Danmark
 
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...IRJET Journal
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsPeter Skomoroch
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...Dr. Wilfred Lin (Ph.D.)
 
Data centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data developmentData centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data developmentKevin Lee
 
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...Usama Fayyad
 
GDPR: Is Your Organization Ready for the General Data Protection Regulation?
GDPR: Is Your Organization Ready for the General Data Protection Regulation?GDPR: Is Your Organization Ready for the General Data Protection Regulation?
GDPR: Is Your Organization Ready for the General Data Protection Regulation?DATUM LLC
 
Data Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceData Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceDATAVERSITY
 

Similaire à How Linkedin uses Automic for Big Data Processes (20)

Big data arch_analytics
Big data arch_analyticsBig data arch_analytics
Big data arch_analytics
 
Big Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedInBig Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedIn
 
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...
Big Data World 2013 - How LinkedIn leveraged its data to become the world's l...
 
EMA Analyst Slides: 2013 Big Data Research Results
EMA Analyst Slides: 2013 Big Data Research ResultsEMA Analyst Slides: 2013 Big Data Research Results
EMA Analyst Slides: 2013 Big Data Research Results
 
The value of our data
The value of our dataThe value of our data
The value of our data
 
Computing Professional Identity for the Economic Graph
Computing Professional Identity for the Economic GraphComputing Professional Identity for the Economic Graph
Computing Professional Identity for the Economic Graph
 
RoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarRoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology Webinar
 
Hive at LinkedIn
Hive at LinkedIn Hive at LinkedIn
Hive at LinkedIn
 
Data, Interconnectedness & The Internet of Things
Data, Interconnectedness & The Internet of Things Data, Interconnectedness & The Internet of Things
Data, Interconnectedness & The Internet of Things
 
Building a Self-Service Hadoop Platform at Linkedin with Azkaban
Building a Self-Service Hadoop Platform at Linkedin with AzkabanBuilding a Self-Service Hadoop Platform at Linkedin with Azkaban
Building a Self-Service Hadoop Platform at Linkedin with Azkaban
 
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
 
Future of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnFuture of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren Ravn
 
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...
Semantically-Interlinked Based on Rich Site Summary Bank for Sites of Indones...
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data Products
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...
 
Data centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data developmentData centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data development
 
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...
Keynote talk at Financial Times Forum - BigData and Advanced Analytics at SIB...
 
GDPR: Is Your Organization Ready for the General Data Protection Regulation?
GDPR: Is Your Organization Ready for the General Data Protection Regulation?GDPR: Is Your Organization Ready for the General Data Protection Regulation?
GDPR: Is Your Organization Ready for the General Data Protection Regulation?
 
Data Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceData Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and Governance
 

How LinkedIn uses Automic for Big Data Processes

  • 1. ORGANIZATION NAME©2013 LinkedIn Corporation. All Rights Reserved.
  • 2. Vijay Aruswamy, Staff Engineer, Big Data Operations, LinkedIn Corporation, https://www.linkedin.com/in/vijayaruswamy
  • 3. Outline
      - LinkedIn Overview
      - Why Data Is Important for LinkedIn
      - LinkedIn’s Big Data Ecosystem
      - How Automic Tools Are Helping LinkedIn
  • 4. Our Mission: Connect the world's professionals to make them more productive and successful.
  • 6. LinkedIn: World’s Largest Professional Network
  • 8. “What gets measured, gets fixed.” - David Henke, former SVP Operations, LinkedIn
  • 10. A Few Data-Driven Products
      - People You May Know (PYMK)
      - Companies You May Be Interested In
      - Jobs You May Be Interested In
      - Groups You May Like
      - Who Viewed Your Profile
      - Economic Graph Challenge
  • 18. Types of Data at LinkedIn: Behavioral Data + Identity Data + Social Data
  • 19. What Does “Big Data” Mean at LinkedIn? (Chart; axes: Analytical Challenges & Complexity and Data Volume, each extending to +∞; data types shown: Social Media Data, Web/Behavior Data, CRM Data, Member Data.)
  • 20. High Level Data Flow
  • 21. Camus
      - Camus is a MapReduce job that loads data from Kafka into HDFS. It can copy data from Kafka into HDFS incrementally.
      - http://etl.svbtle.com/setting-up-camus-linkedins-kafka-to-hdfs-pipeline
    Gobblin
      - A unified data ingestion system for internal and external data sources. Gobblin uses a worker framework in which each record runs through four stages: extraction, conversion, quality checking, and writing.
      - https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease
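
The four stages named for Gobblin above can be pictured with a small Python sketch. This is only an illustration of the extract, convert, quality-check, write sequence; it is not Gobblin's API (Gobblin itself is a Java framework), and the `Task` class, its field names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

Record = dict  # hypothetical record shape used only for this sketch


@dataclass
class Task:
    """One unit of ingestion work: every record passes through four stages."""
    extract: Callable[[], Iterable[Record]]   # stage 1: pull records from a source
    convert: Callable[[Record], Record]       # stage 2: reshape or normalize a record
    is_valid: Callable[[Record], bool]        # stage 3: quality check
    write: Callable[[Record], None]           # stage 4: persist the record
    rejected: List[Record] = field(default_factory=list)

    def run(self) -> int:
        """Run all records through the stages and return how many were written."""
        written = 0
        for raw in self.extract():
            record = self.convert(raw)
            if not self.is_valid(record):
                # Records failing the quality check are set aside, not written.
                self.rejected.append(record)
                continue
            self.write(record)
            written += 1
        return written


def run_example():
    """Feed three made-up tracking events through the pipeline."""
    source = [
        {"member_id": "1", "event": "page_view"},
        {"member_id": "", "event": "page_view"},   # fails the quality check
        {"member_id": "7", "event": "search"},
    ]
    sink: List[Record] = []
    task = Task(
        extract=lambda: iter(source),
        convert=lambda r: {**r, "event": r["event"].upper()},
        is_valid=lambda r: r["member_id"] != "",
        write=sink.append,
    )
    written = task.run()
    return written, sink, task.rejected
```

Calling `run_example()` writes the two valid records to the in-memory sink and sets the record with the empty `member_id` aside. Keeping each stage as a separate callable is one way to let sources, converters, and checks be swapped independently.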
  • 22. High Level Data Flow (cont.)
  • 23. Automic
      - Data-driven scheduling: a process will not execute before its data dependency is satisfied.
      - Typical time series roll-up hierarchies (hour :: day :: week :: month :: quarter :: year) are handled by Azkaban.
      - Processes should execute only when the input data sets are available.
      - Grouping: organize components and workflows into a common area for maintenance and enhancements.
      - Supports external dependencies.
      - Use of global variables: store commonly used passwords in one place.
      - Throttling: assign jobs to queues, schedule when jobs run throughout the day, and hold jobs under the same flow.
      - Load balancing: assign queues to run on a particular server.
      - Monitoring: Graphical Explorer.
    Azkaban
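
The data-driven scheduling rule above (run a process only once its input data sets are available) can be sketched for the first step of the roll-up hierarchy, hour to day. This is a minimal Python sketch under assumptions: the partition naming (`YYYY/MM/DD/HH`) and all function names are hypothetical, and it does not reflect how Automic or Azkaban implement dependency checks.

```python
from __future__ import annotations

from datetime import date, datetime, timedelta


def hourly_partitions(day: date) -> list[str]:
    """Names of the 24 hourly inputs a daily roll-up needs (hypothetical naming)."""
    start = datetime(day.year, day.month, day.day)
    return [(start + timedelta(hours=h)).strftime("%Y/%m/%d/%H") for h in range(24)]


def missing_inputs(required: list[str], available: set[str]) -> list[str]:
    """Required partitions that have not landed yet, in order."""
    return [p for p in required if p not in available]


def can_run_daily_rollup(day: date, available: set[str]) -> bool:
    """The daily job is eligible only when every hourly input is present."""
    return not missing_inputs(hourly_partitions(day), available)


def run_example():
    day = date(2015, 3, 1)
    available = set(hourly_partitions(day))
    ready_when_complete = can_run_daily_rollup(day, available)

    # Simulate the last hour arriving late: the daily job must now wait.
    available.discard("2015/03/01/23")
    ready_when_missing = can_run_daily_rollup(day, available)
    still_missing = missing_inputs(hourly_partitions(day), available)
    return ready_when_complete, ready_when_missing, still_missing
```

The same check could be stacked for the higher levels, for example a weekly roll-up that waits on seven daily outputs.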
  • 24. LinkedIn’s Big Data Architecture (diagram labels: Online DBs - Prod DCs, Espresso, Service Metrics, Web Tracking)
  • 25. LinkedIn’s Application Manager
  • 26. Types of Jobs Scheduled by Automic
      - External ETL
      - ODS ETL
      - Hadoop ETL
      - Teradata ETL
      - User Input ETL
      - Historical Loads
      - One-time data fixes
  • 28. Data Volume
      - How many Kafka topics (tracking + service) do we dump on Hadoop?
          – ~900+ topics: 300 tracking (/data/tracking) + 682 service (/data/service)
          – Data size per day: ~10 TB
      - How many online DB tables do we have on Hadoop?
          – ~300+ tables (Oracle, Espresso, MySQL)
          – Data size: ~8 TB
      - Capacity of DWH on Teradata
          – ~186 TB overall with 6-month retention, ~3 TB every day
          – ~340k unique queries/day (248k from users and ~90k from ETL)
      - Capacity of Hadoop
          – Biggest cluster: 5 PB with 2500+ nodes
          – ETL clusters: 3.1 PB with 360+ nodes
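
As a rough cross-check of the figures on this slide, the arithmetic can be worked through in a few lines of Python. The constants are the slide's own numbers with the "~" and "+" qualifiers dropped; the 1 PB = 1000 TB conversion and the derived per-topic and per-node averages are assumptions made for this sketch, not numbers stated in the talk.

```python
# Figures quoted on the slide (approximate; "~" and "+" qualifiers dropped).
TRACKING_TOPICS = 300
SERVICE_TOPICS = 682
DAILY_KAFKA_TB = 10
USER_QUERIES_PER_DAY = 248_000
ETL_QUERIES_PER_DAY = 90_000
BIGGEST_CLUSTER_TB = 5_000      # 5 PB, assuming 1 PB = 1000 TB
BIGGEST_CLUSTER_NODES = 2_500
ETL_CLUSTERS_TB = 3_100         # 3.1 PB, same assumption
ETL_CLUSTER_NODES = 360


def summarize() -> dict:
    """Derive totals and simple averages from the slide's figures."""
    total_topics = TRACKING_TOPICS + SERVICE_TOPICS
    return {
        "kafka_topics": total_topics,
        # Average only; real topics vary widely in size.
        "avg_gb_per_topic_per_day": round(DAILY_KAFKA_TB * 1000 / total_topics, 1),
        "queries_per_day": USER_QUERIES_PER_DAY + ETL_QUERIES_PER_DAY,
        "biggest_cluster_tb_per_node": BIGGEST_CLUSTER_TB / BIGGEST_CLUSTER_NODES,
        "etl_clusters_tb_per_node": round(ETL_CLUSTERS_TB / ETL_CLUSTER_NODES, 1),
    }
```

The topic counts sum to 982, consistent with "~900+", and the two query figures sum to 338k, consistent with "~340k". Because the node counts are given as lower bounds ("2500+", "360+"), the per-node values are upper estimates.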
  • 29. Q & A