Augmenting MySQL with Big
Data and NoSQL options
The Data Lifecycle
Lead DBA @ Data Services / ObjectRocket by Rackspace
15+ years in data and information systems, spanning application development,
data architecture, system design, and more.
Primary focus – Helping businesses focus on using data, not managing
and storing it.
David Murphy
@davidmurphy_data
www.linkedin.com/in/davidbmurphy/
True genius resides in the capacity for
evaluation of uncertain, hazardous,
and conflicting information.
- Winston Churchill
EVERYONE’S GOT TO HAVE A
GREAT DATA QUOTE RIGHT?!
Outcomes
We want you to leave here understanding:
• Lifecycle, say what?
• Where are the technologies
• Why one isn't enough
• How to fit them together
What We Will Cover
This is NOT…
• a deep dive on any technology
• a comprehensive list
• a roadmap discussion
• the end of the journey
What We’ll Cover
Concepts
• What are the lifecycle stages
• How to classify your workloads
• Terminology
Actions
• What technologies are there
• When to use them
• Fitting them together
• Why is this better
What are the lifecycle stages
Transient
• Sessions
• Logins
• Shop Cart
Short - Medium
• Feeds
• E-Commerce
• Video Game Stats
Analytics
• Reports
• Summary Data
• Dashboards
Archival
• Cold Storage
• Seldom Access
• Governance
Lifecycle
Workloads - Transient
IS
• Updated frequently
• Ultra-fast retrieval
• OK if missing
IS NOT
• Rich query-able
• Durable
• Point of truth
Workloads - Short to Medium
IS
• Some to many updates
• Rich query-able
• Durable + point of truth
IS NOT
• Built for short term
• 99% writes, 1% reads
• Heavy aggregations
Workloads - Analytics
IS
• Heavy aggregations
• More latency
• Massively parallelized
IS NOT
• Rich query-able
• Good for many updates
• Point of truth
Workloads - Archival
IS
• High / extreme latency
• Ultra cheap
• Built for retention
IS NOT
• Rich query-able
• Updateable
• Short-term storage
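The IS / IS NOT traits above can be turned into a rough classifier. The following is only a minimal sketch: the `Workload` fields, the `classify` function, and the order in which the traits are checked are assumptions made for illustration, not something stated on the slides.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical traits used to place a workload in a lifecycle stage."""
    loss_tolerable: bool      # is it OK if the data goes missing?
    point_of_truth: bool      # is this the authoritative copy?
    heavy_aggregations: bool  # mostly large scans and roll-ups?
    rarely_accessed: bool     # kept mainly for retention?


def classify(w: Workload) -> str:
    """Map traits to one of the four stage names used on the slides."""
    if w.rarely_accessed:
        return "Archival"
    if w.heavy_aggregations:
        return "Analytics"
    if w.loss_tolerable and not w.point_of_truth:
        return "Transient"
    return "Short - Medium"


# A login session: fine to lose, not authoritative, no aggregations.
session = Workload(loss_tolerable=True, point_of_truth=False,
                   heavy_aggregations=False, rarely_accessed=False)
print(classify(session))  # Transient
```

The check order (retention first, then aggregation, then loss tolerance) is one possible choice; a real classification would likely weigh more traits than these four.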
Terminology:
Documents
Columns
Rows
Backups
Partitions
Scaling
Geo & DR
The dreaded polyglot persistence
Technologies
Transient
• Memcached
• Couchbase
• Redis
• SQLite
Medium
• MySQL
• MariaDB
• PostgreSQL
• MongoDB
• Percona XtraDB Cluster
• NDB
Analytics
• Hadoop
• Infobright
• Cassandra
• Teradata
Archival
• Hadoop + External
• Hadoop Snapshots
• Cassandra using S3
Fitting it together
• What are the fewest technologies we can use?
• What will work for new requests?
• Do I have plans to handle each stage of data?
• If not, can the technologies do a decent job on the odd case?
• Do I have the talent now? Can I get a service or person easily?
Fitting it together - tools
Build a matrix with
• Feature needs (transactions, persistence, geo, …)
• Importance (1-5)
• Current or attainable talent (1-5)
• Does its licensing work for this project? (0 or 1)
(Features * Importance * Talent * License) = Combined Rank
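The matrix formula above can be sketched in a few lines of Python. The slide does not say how the "Features" column is scored, so this sketch assumes a 1-5 feature-fit score; the `Candidate` class, the placeholder store names, and all the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    features: int    # assumed: how well it covers the needed features, 1-5
    importance: int  # importance of those features to the project, 1-5
    talent: int      # current or attainable talent, 1-5
    license_ok: int  # 1 if the licensing works for this project, else 0


def combined_rank(c: Candidate) -> int:
    """Features * Importance * Talent * License, as on the slide."""
    return c.features * c.importance * c.talent * c.license_ok


def rank(candidates):
    """Return candidates ordered from highest to lowest combined rank."""
    return sorted(candidates, key=combined_rank, reverse=True)


# Placeholder names and scores, not real product evaluations.
options = [
    Candidate("store_a", features=4, importance=5, talent=3, license_ok=1),
    Candidate("store_b", features=5, importance=5, talent=5, license_ok=0),
    Candidate("store_c", features=3, importance=4, talent=4, license_ok=1),
]

for c in rank(options):
    print(c.name, combined_rank(c))
```

Because the license term is 0 or 1, a technology whose licensing does not fit drops to a rank of 0 no matter how well it scores elsewhere, which is why `store_b` sorts last here.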
Klout’s great example, but it’s polyglot!
Appboy getting better!
How it should be…
How to scale – focus on what you know
You scale your app by letting someone else
• Build the hardware
• Know the Ops side for the technology
• Make the technologies pass data as it ages rather than duplicating
the data
• Be the experts
• You just focus on the features of your app and make $$$
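One way to read "pass data as it ages rather than duplicating" is that a record moves from one stage's store to the next and is removed from the first. The sketch below illustrates that idea only; the dicts stand in for real data stores, and the function name, keys, and 90-day cut-off are all hypothetical.

```python
from datetime import timedelta


def pass_aged_records(source, target, ages, max_age):
    """Move records older than max_age from source to target.

    source and target are plain dicts standing in for two data stores;
    ages maps each key to a timedelta. All of these are stand-ins.
    """
    moved = sorted(key for key in source if ages[key] > max_age)
    for key in moved:
        target[key] = source.pop(key)  # pop removes it, so no duplicate remains
    return moved


short_term = {"order-1": {"total": 20}, "order-2": {"total": 35}}
analytics = {}
ages = {"order-1": timedelta(days=120), "order-2": timedelta(days=3)}

moved = pass_aged_records(short_term, analytics, ages, timedelta(days=90))
print(moved)  # ['order-1']
```

After the call, `order-1` lives only in `analytics` and `order-2` only in `short_term`, so each record has a single home at any point in its lifecycle.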
Questions?
WE ARE HIRING! (DBA, DevOps, and more)
https://rackertalent.com
https://www.objectrocket.com/careers
Twitter: @dmurphy_data @rackspace @objectrocket
Email: david@objectrocket.com
Github: https://github.com/dbmurphy
SlideDeck: https://github.com/dbmurphy/presentations


Editor's notes

  • Atomic: Everything in a transaction succeeds, or the entire transaction is rolled back.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart.
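The atomic property in the notes above can be seen with a small, self-contained sketch. It uses Python's built-in `sqlite3` module with an in-memory database purely for convenience (SQLite appears in the deck only as a transient-stage option); the `accounts` table, the names, and the `transfer` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
print(ok, balances(conn))
```

The credit to `bob` runs first and succeeds, but the debit violates the `CHECK` constraint, so the whole transaction is rolled back and both balances stay as they were.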