What ya gonna do?
  without the help of Moore’s Law?
Scope

• Internet effect on corporate data centre
• End of Moore’s law
• Scaling on and off CPU
Internet emerges
• 1990s - Connections
  •   Broadband connectivity at work, modem @ home
  •   Beginnings of e-Commerce (Amazon’s reader recommendations
      show the way)

• 2000s - Few Publishers
  •   Internet bubble
  •   Rise of Search (Google shows the way)
  •   Start of consumer publications (blogs / wikis)
Read-write Internet

• Good connectivity / reach
• Social networking = publication explosion
• Smartphones with Wi-Fi / 3G
Outputs

• More, much more data
• Content is rich (read BIG!!)
 • audio, video, photo
• Data is unstructured or semi-structured
 • users don’t do DBs
We ain’t Twitter
•   OK, but wouldn’t you like to mine all of that
    public information?

    •   See what they are saying about your
        products / competitors / their requirements?

•   Is there any possibility of turning on an internal
    fire hose?

    •   How many fine-grained business events
        happen in your company that you would like
        to track / analyse? Someone will....
Fire Hydrants
• They’re coming - more data, from more people
  and more devices

• Use data to improve decisions
• Gain insight to the organisation
• Jump competition or at least maintain pace
Numbers
• Facebook serves 250k unique pages per
    second (June 2010)

• Twitter has seen a rise from 10m to 50m
    tweets per day in the last year (July 2010)

• 1GB of disk: $700k (1980) --- 10c (2010)
•   “Between the birth of the world and 2003, there
    were 5 exabytes of information created. We [now]
    create 5 exabytes every 2 days.” Eric Schmidt,
    CEO, Google
So what?
•   As people share more, they will change the way
    they form their opinions

•   Existing media channels are struggling to adapt
    their business models

•   Traditional market research, product marketing
    and after-sales channels become less relevant to
    these consumers

•   Being out of the loop is bad for business
How bad for business?
• Now: data is a key asset of business
• Future: business data is not only private
 • public content is integrated into analysis
• Maintaining secrecy will rise in cost
 • internal systems management
 • governance as you join the conversation
Effects

• Conventional platforms cannot
   • store so much data cost effectively
   • process the data cost effectively
   • derive meaning from unstructured sources
Hardware Now

• SMP x86-64 & bit players
• Large local RAM (<=2TB)
• NAS for high capacity storage (<=14PB)
• On-premise
Today’s Big Boxes

• Indicate trends and influences
• Use 50k-250k CPU cores
• All Top 10 supercomputers run Linux
• Algorithms must be fault tolerant
Moore’s is Less

•   Moore’s law was the software developer’s friend
    •   30 years of good times, speed ups “for free”
•   Outward effect of Moore’s law only
    maintained if exploiting multiple cores
•   Standard programming models need to adapt
    to use multiple cores
Hardware Horizon
• Fast inter-core buses and networks
 •   InfiniBand: 10Gb/s - 120Gb/s

• Networked memory
 •   NUMA - not homogeneous

• Exabyte disk clusters
• Elastic scaling
• On and off-premise integrated
Distributed Disruption

• Existing clustering options do not work
• Existing software models do not work
• Existing data models do not work
Old Skool
• Traditional clustering enables all machines
  in a cluster to behave as if they are one in
  space and time
• Not physically possible to cluster online
  access to all data globally with today’s
  hardware and networks (ask Google)
  •   Not news: traditional corporations do not
      have real-time, coherent global BI databases
I’m gonna pop a CAP in your head
• Repeat: clustering does not scale
• So, you can have 2 out of 3 of:
 • Consistency
 • Availability
 • Partition Tolerance
AC / DC

• One needs Partition tolerance to scale, so
  you can only have:
  • Availability OR
  • Consistency
• All attempts to scale out conventional
  databases and application servers prove the
  theorem (who still believes in sharding?)
Availability

• Enables high service levels so the site stays
  alive
• Lose global consistency for periods
  (seconds or less)
Consistency

• Focus of RDBMS today
• High cost only appropriate for high value
• Remains the default for non-scaling cases
Eventual Consistency
• A datastore guarantees to eventually
  provide updates to all cluster members
• Some desirable properties
 • Read your own writes
   •   Limited form of cursor stability

 • Monotonic read consistency
   •   Only see updates in the order they happened
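
The two session properties above can be sketched in a few lines. This is a minimal illustration, assuming a single-key store with integer versions; `Replica`, `Session`, `StaleRead` and `replicate` are hypothetical names, not the API of any real datastore.

```python
class StaleRead(Exception):
    """Raised when a replica is older than what this session has already seen."""


class Replica:
    """One cluster member holding a single value and its version (assumption)."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Only newer updates are accepted, so state never moves backwards.
        if version > self.version:
            self.version = version
            self.value = value


class Session:
    """Client session that remembers the newest version it wrote or read."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = 0  # reads older than this are rejected

    def write(self, value):
        # The write lands on one replica only; the others catch up later.
        primary = self.replicas[0]
        new_version = primary.version + 1
        primary.apply(new_version, value)
        self.min_version = max(self.min_version, new_version)
        return new_version

    def read(self, replica_index):
        replica = self.replicas[replica_index]
        # Read your own writes + monotonic reads: never go back in time.
        if replica.version < self.min_version:
            raise StaleRead(f"replica {replica_index} is behind this session")
        self.min_version = replica.version
        return replica.value


def replicate(replicas):
    """The 'eventual' part: push the newest state to every member."""
    newest = max(replicas, key=lambda r: r.version)
    for replica in replicas:
        replica.apply(newest.version, newest.value)
```

A session that writes and then reads from a lagging replica gets `StaleRead` until `replicate` has run; a real client would retry or pick another replica instead of failing.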
Sclerotic Software
• Early (mostly static) binding of everything
  to everything else
• Point to point traffic routing
• Application to server
• Single thread model of control
• Program language to runtime
• Object models to SQL
Shapeability

• Dynamic data routing
• Runtime, in-place upgrades
• Languages that support parallel functions
• Multiple evolving and coexisting schemas
• Zero impedance mismatching
Dynamic Data Routing
• Cannot rely on per input solutions
• Data transfer protocols should have
  minimal impact on programming models
 • Law of leaky abstractions
• Bus required to allow evolution and to add
  intelligence to routes
Upgrades
• Software must be upgradeable in parts
• Software must stay up while upgrade is
  ongoing
• Modular, transitive upgrades (Maven, OSGi)
• Hypervisor VM mobility (vMotion,
  Teleportation)
SCAlable LAnguages
• Java 7 comes with more concurrency
  support (fork/join ... due mid 2011)
• Functional languages have support for high
  concurrency
  • JVM Languages: Scala, Clojure
  • .NET: F#
  • Others: Erlang, Haskell, OCaml
Schema Shmeema

• Easiest schema evolution is with no schema
  (NoSQL data stores)
• Where a schema is needed, data can travel with
  its schema (Avro, Riak, CouchDB)
• Data can be shared via REST, JMS or trickle
  to RDBMS
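
As a rough illustration of data travelling with its schema, the sketch below wraps each record in a JSON envelope that carries the schema used to write it. The envelope layout and the names `pack` and `read` are assumptions made for this example; this is not the Avro wire format.

```python
import json

# Hypothetical type names for this sketch, mapped to Python types.
TYPES = {"string": str, "int": int, "double": float}


def pack(schema, record):
    """Validate a record against its schema and ship both together."""
    for name, type_name in schema["fields"].items():
        if not isinstance(record[name], TYPES[type_name]):
            raise TypeError(f"field {name!r} is not a {type_name}")
    return json.dumps({"schema": schema, "record": record})


def read(message, wanted):
    """Read the fields a consumer wants, given as {field: default}.

    The consumer consults the writer's schema inside the message, so old
    and new versions of a record can coexist without a shared registry.
    """
    envelope = json.loads(message)
    written_fields = envelope["schema"]["fields"]
    record = envelope["record"]
    return {
        name: record[name] if name in written_fields else default
        for name, default in wanted.items()
    }


# A version 1 writer knows nothing about "currency".
order_v1 = {"name": "order", "version": 1,
            "fields": {"id": "int", "total": "double"}}
message = pack(order_v1, {"id": 7, "total": 9.5})

# A newer reader supplies a default for the field that was added later.
order = read(message, {"id": None, "total": 0.0, "currency": "EUR"})
```

The reader never needs the writer's code or a central schema definition, which is what lets multiple schema versions evolve side by side.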
Objections to models

• Remove the RM from ORM
• Externalize schemas, don’t internalize them
• Prefer simple persistence options
 • key/value, graph or document-oriented
What scales

• HTTP - it’s stateless . . . but:
 • Caching layers need to be added
 • Protocol can go faster (Google et al
    proposing updates for 1.2)
• Er, that’s it from the current stack
App server scale FAIL

• Threads too coarse grained and expensive
• Need Actor model to be reliable and scale
  out to exploit the hardware
• CAP based design patterns over data
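
To make the Actor model concrete, here is a minimal sketch: each actor owns its state and a mailbox, and handles one message at a time, so callers never share memory or take locks. The `Actor` and `Counter` classes are hypothetical. For simplicity each actor gets its own OS thread here, which is exactly the cost the slide warns about; runtimes such as Erlang schedule many lightweight actors over a small number of threads.

```python
import queue
import threading

_STOP = object()  # sentinel message that ends an actor's loop


class Actor:
    """Base class: a mailbox plus a loop that processes messages in order."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        # Asynchronous: the sender does not wait for the message to be handled.
        self._mailbox.put(message)

    def stop(self):
        # Messages queued before the sentinel are still processed.
        self._mailbox.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class Counter(Actor):
    """State is touched only by the actor's own loop, so no lock is needed."""

    def __init__(self):
        self.count = 0  # set before the loop starts
        super().__init__()

    def receive(self, message):
        if message == "increment":
            self.count += 1
```

Several threads can call `counter.send("increment")` concurrently; after `counter.stop()` the count equals the number of messages sent.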
Software scaling limits
• Amdahl’s law still applies:
 • Speed-up is capped by the part that must
    run serially
 • Worse if that task blocks others, which it
    often does
• Software needed to support cloud design
  and testing
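
Amdahl’s law, mentioned above, can be written as speed-up = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. A small sketch with hypothetical function names:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speed-up predicted by Amdahl's law for a given number of workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_ceiling(parallel_fraction: float) -> float:
    """Upper bound on speed-up, however many workers are added."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0 else 1.0 / serial_fraction
```

If 95% of a job parallelises, the ceiling is about 20x: even the 250k cores of the big boxes cannot push past it, because the remaining 5% still runs on one core.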
RDBMS scale FAIL
•   Index updates do not scale linearly with data

•   Normalize to reduce data volumes but then joins
    become too expensive

•   Transactions are costly and often not needed
    (especially for READ)

•   Hard to manage xx,000 MySQL instances (ask
    Yahoo! and Facebook)

•   License fees scale with load ($1m+ / month for
    Facebook just to serve photos)
NoSQL

• Different flavours (with examples)
 • column oriented (Hadoop, HBase)
 • document store (Couch, Mongo)
 • key value store (Riak, Redis)
 • eventually consistent (Dynamo, Voldemort)
 • graph database (Neo4j, InfiniteGraph)
NoSQL gains
• Scale
• Performance
• Reliability and uptime
• Simpler application persistence API
• Some SQL syntax for aggregate operations
• Zero backup, if using HA file system
NoSQL loses
• SQL - especially joins
• Schema
• Transactions
• Consistency (for some coarse-grained
  aspects at least)
• Query tools are immature / low-level
NoSQL


• For the diplomats: No(t Only) SQL
• SQL will live on in many applications and
  use cases

What ya gonna do?

  • 1. What ya gonna do? without the help of Moore’s Law?
  • 2. Scope • Internet effect on corporate data centre • End of Moore’s law • Scaling on and off CPU
  • 3. Internet emerges • 1980s - Connections • Broadband connectivity at work, modem @ home • Beginnings of e-Commerce (Amazon’s readers recommendations shows the way) • 1990s - Few Publishers • Internet bubble • Rise of Search (Google shows the way) • Start of consumer publications (Blogs / WIKIs)
  • 4. Read-write Internet • Good connectivity / reach • Social networking = publication explosion • Smart phones WIFI / 3G
  • 5. Outputs • More, much more data • Content is rich (read BIG!!) • audio, video, photo • Data is unstructured or semi-structured • users don’t do DBs
  • 6. We ain’t Twitter • OK, but wouldn’t you like to mine all of that public information? • See what they are saying about your products / competitors / their requirements? • Is there any possibility of turning on an internal fire hose? • How many fine-grained business events happen in your company that you would like to track / analyse? Someone will....
  • 7. Fire Hydrants • They’re coming - more data, from more people and more devices • Use data to improve decisions • Gain insight to the organisation • Jump competition or at least maintain pace
  • 8. Numbers • Facebook serves 250k unique pages per second (June 2010) • Twitter has seen a rise from 10m to 50m tweets per day in the last year (July 2010) • 1 GB of disk: $700k (1980) vs 10c (2010) • “Between the birth of the world and 2003, there were 5 exabytes of information created. We [now] create 5 exabytes every 2 days.” Eric Schmidt, CEO, Google
  • 9. So what? • As people share more, they will change the way they form their opinions • Existing media channels are struggling to adapt their business models • Traditional market research, product marketing and after-sales channels become less relevant to these consumers • Being out of the loop is bad for business
  • 10. How bad for business? • Now: data is a key asset of business • Future: business data is not only private • as public content integrated into analysis • Maintaining secrecy will rise in cost • internal systems management • governance as you join the conversation
  • 11. Effects • Conventional platforms cannot • store so much data cost effectively • process the data cost effectively • derive meaning from unstructured sources
  • 12. Hardware Now • SMP x86-64 & bit players • Large local RAM (<=2TB) • NAS for high capacity storage (<=14PB) • On-premise
  • 13. Today’s Big Boxes • Indicate trends and influences • Use 50k-250k CPU cores • All Top 10 supercomputers run Linux • Algorithms must be fault tolerant
  • 14. Moore’s is Less • Moore’s law was the software developer’s friend • 30 years of good times, speed-ups “for free” • Outward effect of Moore’s law only maintained if exploiting multiple cores • Standard programming models need to adapt to use multiple cores
  • 15. Hardware Horizon • Fast inter-core buses and networks • InfiniBand: 10Gb/s - 120Gb/s • Networked memory • NUMA - not homogeneous • Exabyte disk clusters • Elastic scaling • On and off-premise integrated
  • 16. Distributed Disruption • Existing clustering options do not work • Existing software models do not work • Existing data models do not work
  • 17. Old Skool • Traditional clustering enables all machines in a cluster to behave as if they are one in space and time • Not physically possible to cluster online access to all data globally with today’s hardware and networks (ask Google) • Not news: traditional corporations do not have real-time, coherent global BI databases
  • 18. I’m gonna pop a CAP in your head • Repeat: clustering does not scale • So, you can have 2 from 3 of: • Consistency • Availability • Partition Tolerance
  • 19. AC / DC • One needs Partition tolerance to scale, so you can only have: • Availability OR • Consistency • All attempts to scale out conventional databases and application servers prove the theorem (who still believes in sharding?)
  • 20. Availability • Enables high service levels so the site stays alive • Lose global consistency for periods (seconds or less)
  • 21. Consistency • Focus of RDBMS today • High cost only appropriate for high value • Remains the default for non-scaling cases
  • 22. Eventual Consistency • A datastore guarantees to eventually provide updates to all cluster members • Some desirable properties • Read your own writes • Limited form of cursor stability • Monotonic read consistency • Only see updates in the order they happened
  • 23. Sclerotic Software • Early (mostly static) binding of everything to everything else • Point to point traffic routing • Application to server • Single thread model of control • Program language to runtime • Object models to SQL
  • 24. Shapeability • Dynamic data routing • Runtime, in-place upgrades • Languages that support parallel functions • Multiple evolving and coexisting schemas • Zero impedance mismatching
  • 25. Dynamic Data Routing • Cannot rely on per input solutions • Data transfer protocols should have minimal impact on programming models • Law of leaky abstractions • Bus required to allow evolution and to add intelligence to routes
  • 26. Upgrades • Software must be upgradeable in parts • Software must stay up while upgrade is ongoing • Modular, transitive upgrades (Maven, OSGi) • Hypervisor VM mobility (vMotion, Teleportation)
  • 27. SCAlable LAnguages • Java 7 comes with more concurrency support (fork/join ... due mid 2011) • Functional languages have support for high concurrency • JVM Languages: Scala, Clojure • .NET: F# • Others: Erlang, Haskell, OCaml
  • 28. Schema Shmeema • Easiest schema evolution is with no schema (NoSQL data stores) • Where schema needed, data can travel with its schema (Avro, Riak, CouchDB) • Data can be shared via REST, JMS or trickle to RDBMS
  • 29. Objections to models • Remove the RM from ORM • Externalize schemas, don’t internalize them • Prefer simple persistence options • key/value, graph or document-oriented
  • 30. What scales • HTTP - it’s stateless . . . but: • Caching layers need to be added • Protocol can go faster (Google et al proposing updates for 1.2) • Er, that’s it from the current stack
  • 31. App server scale FAIL • Threads too coarse grained and expensive • Need Actor model to be reliable and scale out to exploit the hardware • CAP based design patterns over data
  • 32. Software scaling limits • Amdahl’s law still applies: • Can only go as fast as the slowest serial (non-parallelisable) task • Worse if that task blocks others, which it often does • Software needed to support cloud design and testing
  • 33. RDBMS scale FAIL • Index updates do not scale linearly with data • Normalize to reduce data volumes but then joins become too expensive • Transactions are costly and often not needed (especially for READ) • Hard to manage xx,000 MySQL instances (ask Yahoo! and Facebook) • License fees scale with load ($1m+ / month for Facebook just to serve photos)
  • 34. NoSQL • Different flavours (with examples) • column oriented (Hadoop, HBase) • document store (Couch, Mongo) • key value store (Riak, Redis) • eventually consistent (Dynamo, Voldemort) • graph database (Neo4j, InfiniteGraph)
  • 35. NoSQL gains • Scale • Performance • Reliability and uptime • Simpler application persistence API • Some SQL syntax for aggregate operations • Zero backup, if using HA file system
  • 36. NoSQL loses • SQL - especially joins • Schema • Transactions • Consistency (for some coarse-grained aspects at least) • Query tools are immature / low-level
  • 37. NoSQL • For the diplomats: No(t Only) SQL • SQL will live on in many applications and Use Cases
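
Slide 32 leans on Amdahl's law without putting numbers on it. A minimal sketch of the arithmetic follows. The 5% serial fraction is an assumption chosen purely for illustration, and the function name `amdahl_speedup` is hypothetical; the 250k core count echoes the "Big Boxes" slide.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelised."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part always runs at full cost; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed workload: 5% of the run time is inherently serial.
    for cores in (1, 8, 64, 250_000):
        bound = amdahl_speedup(0.05, cores)
        print(f"{cores:>7} cores -> at most {bound:.2f}x faster")
```

Under that assumption the bound never reaches 20x, however many cores are added, which is the point the slide is making: the serial part sets the ceiling.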

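The eventual consistency described on slide 22 can be illustrated with a toy model. This is a sketch under loud assumptions, not how any of the named stores work: the `Replica` class and `anti_entropy` function are hypothetical, and conflicts are resolved by "highest version wins" using a caller-supplied version number. Real systems in the Dynamo family typically use richer mechanisms such as vector clocks.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One cluster member holding key -> (version, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, version):
        # Assumed conflict rule: keep whichever update has the higher version.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Pull every entry from a peer; older versions are ignored by write().
        for key, (version, value) in other.store.items():
            self.write(key, value, version)


def anti_entropy(replicas):
    """Background exchange that eventually delivers all updates to all members."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.sync_from(source)


if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    a.write("price", 10, version=1)
    print(a.read("price"), b.read("price"))  # the writer sees its write; b is still stale
    anti_entropy([a, b])
    print(a.read("price"), b.read("price"))  # both replicas now agree
```

Between the write and the exchange, the site stays available but readers of `b` see old data, which is the trade described on the Availability slide. Because a replica never replaces a newer version with an older one, reads from any single replica never go backwards.
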
Editor's Notes

  13. #1 owned by USA, #2 owned by PRC
  18. CAP theorem proposed in June 2000 by Eric Brewer
  22. Robert Patrick
  27. Java: Fork/Join, parallel arrays, tail call recursion improvements, not closures