NoSQL Beyond the Key:Value Store
By Robert Greene




Versant Corporation U.S. Headquarters
255 Shoreline Dr., Suite 450, Redwood City, CA 94065
www.versant.com | 650-232-2400




                                                   #NoSQLVersant
Overview
           The Genesis of NoSQL
               The Sky is Falling

           NoSQL at its Core
               Shift in Architecture

           Shift Innovation
               Domain Models, Distribution, SOA

           Enterprise Needs and NoSQL

           Application Development with NoSQL

           NoSQL 2.0 - Leveraging the Knowledge Base




Genesis of NoSQL
►   The Sky is Falling
    Early Web 2.0 social computing drives innovation

►   End of the Hammer Era
    One relational tool for every data problem fails
    Agility and cost usher in reason and innovation



NoSQL at its Core
An Increasingly Crowded Space
  To “shift”, is to be NoSQL


                       No “shift” Inside




Traditional DBMS Scale Architecture
  INEFFICIENT
  CPU-destroying mapping

  EXPENSIVE
  Repetitive data movement and JOIN calculation



NoSQL at its Core
A Shift In Application Architecture
  UNIFIED
  Application-driven schema

  COMMODITY HW
  Horizontal scale-out, distribution and partitioning

     •   Google – Soft-Schema
     •   IBM – Schema-Less




A Shift is Needed

► How Often do Relations Change?
  Blog : BlogEntry , Order : OrderItem , You : Friend


► Relations Rarely Change, Stop Recalculating Them

► Do you need ALL of your data in one place?

          ► You don’t. You can distribute it.




NoSQL

Innovation and the Shift




Domain Model Thinking

►   Business Model is Schema
     Not Data Model under Entities

►   Movement of Responsibility
    Soft-Schema (vs) Schema-less

►   Enables changing Nature of Analytics
     SQL/MapReduce – “give me top 20 performers”
     NoSQL – “find 3 dimensional protein pattern match”



Distributed Thinking
►   Scale-out, with fallout

►   Partition Impact – Implementation, Algorithms
    Different design considerations
       ►   Key Driven access impacts
       ►   Embedded Models
       ►   Enterprise Reference Data




SOA Thinking

►   Business Processes and Service Orchestration
    The Drivers of Business Agility
       ►   NoSQL enables increased speed of agility
       ►   Faster Time to Market, Competitive Edge


►   Raw Data Manipulation and Mining
    Typically done outside of day-to-day business
    ETL strategy essential
       ►   Feedback loop for BPM/O layers




NoSQL and the Enterprise
Responsibly, taking advantage of the “Shift”




Embedded Models
                       NoSQL 1.0
►   Document Store Characteristics
    Blogs have Articles


►   Patterns of Access
    Only access sub elements from root
    Good candidate for simple web system
       ►   Query on Articles content to get similar Blogs
       ►   Display Blogs and their Articles
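
The embedded model above can be sketched in Java. This is a minimal illustration, not any product's API: the `Blog`, `Article` and `EmbeddedStore` classes are hypothetical names introduced here. Articles live inside their Blog and are reachable only from that root, so a query on Article content has to walk every Blog.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical embedded (document-style) model: Articles exist only inside a Blog.
class Article {
    final String title;
    final String content;
    Article(String title, String content) { this.title = title; this.content = content; }
}

class Blog {
    final String name;
    final List<Article> articles = new ArrayList<>(); // embedded sub-elements
    Blog(String name) { this.name = name; }
}

// Hypothetical in-memory stand-in for a document store.
class EmbeddedStore {
    private final List<Blog> blogs = new ArrayList<>();

    void save(Blog blog) { blogs.add(blog); }

    // "Query on Articles content to get similar Blogs": every access starts at the root.
    List<Blog> blogsMentioning(String term) {
        List<Blog> hits = new ArrayList<>();
        for (Blog blog : blogs) {
            for (Article article : blog.articles) {
                if (article.content.contains(term)) {
                    hits.add(blog);
                    break; // one matching Article is enough to return the Blog
                }
            }
        }
        return hits;
    }
}
```

The sketch works well for "display Blogs and their Articles", but it has no way to reach an Article without first loading its Blog.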




Enterprise Models
                      NoSQL 2.0
►   Many to Many
    Blogs get Tags - search based on tag
    Tags weighted, Similarity Meta Data


►   Faster algorithmic searching
    Narrow Blogs via back reference
       ►   Sub queries on collection contents
    Can leverage Articles in addition to Blogs
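
A many-to-many model with a back reference might look like the following Java sketch. The `TaggedBlog`, `Tag` and `TagSearch` names are assumptions made for illustration. The point is that a search starts from the Tag's back reference and narrows from there, rather than scanning all Blogs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical many-to-many model: a Tag keeps a back reference to its Blogs.
class TaggedBlog {
    final String name;
    final Set<Tag> tags = new LinkedHashSet<>();
    TaggedBlog(String name) { this.name = name; }

    void addTag(Tag tag) {
        tags.add(tag);
        tag.blogs.add(this); // maintain the back reference on the other side
    }
}

class Tag {
    final String label;
    final Set<TaggedBlog> blogs = new LinkedHashSet<>(); // back reference
    Tag(String label) { this.label = label; }
}

class TagSearch {
    // Narrow the candidates via the first Tag's back reference,
    // then apply a sub-query on each candidate's tag collection.
    static Set<TaggedBlog> withBothTags(Tag first, Tag second) {
        Set<TaggedBlog> result = new LinkedHashSet<>();
        for (TaggedBlog blog : first.blogs) {
            if (blog.tags.contains(second)) {
                result.add(blog);
            }
        }
        return result;
    }
}
```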




Operational Features
                       NoSQL 1.0
►   Transactions – The 20:80 Rule (ACID:CAP)
    Most prevalent NoSQL 1.0 approach
       ►   Give up transactions for better scalability
       ►   Compensating application code needed
             Code Complexity, Manual Processes
             High Operational Cost
       ►   Weak Transactions
            It’s a start, gets us to 20%, demonstrates the need
    From Key to Criteria Based Query




Enterprise Operational Features
                       NoSQL 2.0
►   Transactions – The 80:20 Rule ( ACID:CAP )
    Algorithm, Tagged Blogs via Tag
       ►   No Transactions = lost Blog, no results from Algorithm
►   Cascading Operations
    Network essential
►   External Access
    JDBC/ODBC tooling support




Operating NoSQL 1.0

►   DevOps – Dev builds it, Dev owns it.
    Schema-less implementation
       ►   Evolution directly impacts application space ( Development )


►   Data Backup
    Largely file dumps, mostly systems off-line


►   Custom tooling for out of band needs
    Operational need, write a custom access
    Non-centralized, scripted monitoring

Enterprise Operations
                     NoSQL 2.0
►   DevOps – Dev builds it, IT owns it eventually.
    IT System Management
       ►   Centralized monitoring
       ►   Integrated with SNMP / system management


►   Availability, Governance, Data Backup
    Enterprise point-in-time recovery, SOX, HIPAA, etc.
    Fault tolerant, globally replicated
    Online and distributed backup

►   Cloud Enabled - utility efficiency
    Automated SLA based Provisioning
    Mobility of Processes
Web Development
                   NoSQL 1.0
►   Requires completely new skill set

►   Lack of ecosystem integration
    IDE tooling
    Immature integration
    Non-standard connectivity


►   Custom, custom and more custom
    Each 1st generation product unique / proprietary


Enterprise Development
                   NoSQL 2.0
►   Leverages existing enterprise skill set

►   Mature development platforms
    Tomcat, Spring, Hudson, Eclipse enabled


►   Industry standard APIs
    Java – JPA ( 10 years of ORM experts )
    Ruby – on Rails; it’s the shift that matters




Application Development
    The Things You Will Build


          NoSQL 1.0




          NoSQL 2.0

Need Proxy Pattern
                     NoSQL 1.0
►   Avoid overhead of extraneous loading
    You want all Blog Articles to get 1 Article?


►   Model must change to use References
    Blog:owner(User)  becomes
    Blog:owner_id(long)


►   Proxy pattern for long to User swizzle
    Object to Value, Value to Object
       ►   Maybe Document store BasicDBObject
       ►   Maybe Key:Value store BSON
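
The long-to-User swizzle can be illustrated with a small lazy proxy. This is a sketch under stated assumptions: `User`, `UserStore`, `UserProxy` and `ProxiedBlog` are hypothetical classes, and `UserStore` is an in-memory stand-in for a lookup by key, not a real driver API. The Blog persists only the owner id, and the proxy resolves it to a User on first access.

```java
import java.util.HashMap;
import java.util.Map;

class User {
    final long id;
    final String name;
    User(long id, String name) { this.id = id; this.name = name; }
}

// Hypothetical stand-in for a key:value lookup by id.
class UserStore {
    private final Map<Long, User> byId = new HashMap<>();
    int loads = 0; // counts store reads, to show that loading is lazy

    void put(User user) { byId.put(user.id, user); }

    User load(long id) {
        loads++;
        return byId.get(id);
    }
}

// Proxy that holds only the owner id and swizzles it to a User on demand.
class UserProxy {
    private final long ownerId;
    private final UserStore store;
    private User resolved; // null until the first get()

    UserProxy(long ownerId, UserStore store) { this.ownerId = ownerId; this.store = store; }

    long id() { return ownerId; } // Object to Value

    User get() {                  // Value to Object
        if (resolved == null) {
            resolved = store.load(ownerId);
        }
        return resolved;
    }
}

class ProxiedBlog {
    final String name;
    final UserProxy owner; // persisted as Blog:owner_id(long)
    ProxiedBlog(String name, UserProxy owner) { this.name = name; this.owner = owner; }
}
```

Loading a `ProxiedBlog` costs no User read until `owner.get()` is called, which is the overhead the slide wants to avoid.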
Serializable
                   NoSQL 1.0
►   You don’t write code in JSON or XML
    Programming models need transformation


►   Non-Vendor transformation limits
    Create binary format value, cannot query it


►   Not all programming structures are supported
    Map -- Need to break down programming model
    Lists -- Arrays need Serializable


Reference System
                    NoSQL 1.0
►   Avoid object duplicates
     Load a User’s Personal Blog, Search Tagged Blog
        ► Inconsistencies during runtime



►   Materialization of bi-directional relations
    Need to avoid circular references
       ► Load Blog*, blog has an Owner:User

       ► Load User, user has a Personal Blog*
       ►       …..repeat
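
One common way to get both properties (no duplicates, no endless Blog/User loading) is an identity map that registers each object before following its references. The Java sketch below assumes hypothetical `Member`, `PersonalBlog` and `ReferenceSystem` classes, and uses in-memory maps as a stand-in for stored data.

```java
import java.util.HashMap;
import java.util.Map;

class Member {
    final long id;
    PersonalBlog personalBlog; // User -> Blog
    Member(long id) { this.id = id; }
}

class PersonalBlog {
    final long id;
    Member owner; // Blog -> User (bi-directional)
    PersonalBlog(long id) { this.id = id; }
}

// Hypothetical reference system: at most one in-memory instance per stored id.
class ReferenceSystem {
    private final Map<Long, Member> members = new HashMap<>();
    private final Map<Long, PersonalBlog> blogs = new HashMap<>();

    // Stand-in for stored data: which member owns which blog, and vice versa.
    private final Map<Long, Long> ownerOfBlog = new HashMap<>();
    private final Map<Long, Long> blogOfMember = new HashMap<>();

    void link(long blogId, long memberId) {
        ownerOfBlog.put(blogId, memberId);
        blogOfMember.put(memberId, blogId);
    }

    PersonalBlog loadBlog(long blogId) {
        PersonalBlog cached = blogs.get(blogId);
        if (cached != null) return cached; // already materialized: the cycle stops here
        PersonalBlog blog = new PersonalBlog(blogId);
        blogs.put(blogId, blog);           // register BEFORE following references
        Long ownerId = ownerOfBlog.get(blogId);
        if (ownerId != null) blog.owner = loadMember(ownerId);
        return blog;
    }

    Member loadMember(long memberId) {
        Member cached = members.get(memberId);
        if (cached != null) return cached;
        Member member = new Member(memberId);
        members.put(memberId, member);
        Long blogId = blogOfMember.get(memberId);
        if (blogId != null) member.personalBlog = loadBlog(blogId);
        return member;
    }
}
```

Because every load goes through the same maps, a Blog reached through its owner and the same Blog reached through a search are one object, which removes the runtime inconsistencies the slide mentions.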



Need Lifecycle Tracking
                       NoSQL 1.0
►   New, Changed, Deleted
    On store, update: Slow overhead to replace all objects
       ►   If not dirty, do not traverse and update
       ►   If new, add to the reference system
       ►   If null, delete underlying element


►   Need to manage the reference system
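
A rough Java sketch of lifecycle tracking follows. All names (`LifecycleState`, `TrackedArticle`, `LifecycleTracker`) are hypothetical, and the sketch uses an explicit DELETED state where the slide describes a null check. It shows the idea that a flush writes only new, changed or deleted objects and skips clean ones.

```java
import java.util.ArrayList;
import java.util.List;

enum LifecycleState { NEW, CLEAN, DIRTY, DELETED }

class TrackedArticle {
    final String id;
    private String content;
    LifecycleState state = LifecycleState.NEW;

    TrackedArticle(String id, String content) { this.id = id; this.content = content; }

    String content() { return content; }

    void setContent(String content) {
        this.content = content;
        // A NEW object stays NEW; only a previously stored object becomes DIRTY.
        if (state == LifecycleState.CLEAN) state = LifecycleState.DIRTY;
    }

    void delete() { state = LifecycleState.DELETED; }
}

class LifecycleTracker {
    // Returns the operations a store would need; clean objects are skipped entirely.
    static List<String> flush(List<TrackedArticle> articles) {
        List<String> operations = new ArrayList<>();
        for (TrackedArticle article : articles) {
            switch (article.state) {
                case NEW:     operations.add("insert " + article.id); break;
                case DIRTY:   operations.add("update " + article.id); break;
                case DELETED: operations.add("delete " + article.id); break;
                default:      break; // CLEAN: do not traverse or update
            }
            if (article.state != LifecycleState.DELETED) {
                article.state = LifecycleState.CLEAN;
            }
        }
        return operations;
    }
}
```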




NoSQL 1.0
                          (observations)


►   Mapping layer is forming
    Why re-invent the wheel
       ►   ‘O’RM – Object Relational Mapping
       ►   ‘O’DM – Object Document Mapping
       ►   ‘O’CM – Object Column Mapping
    Software Industry knows where this leads
       ►   Mapping Complexity, brittle code base, non-agility
       ►   The ‘O’ is what matters, ‘O’bject Lifecycle Management




NoSQL 2.0

►   Leverage NoSQL 1.0 architectural shift
    Scale out with performance
       ►   Key partitioned data distribution
       ►   The good stuff from NoSQL 1.0
►   Eliminate mapping complexity
    Handle modern information models
       ►   Eliminate domain model mapping
       ►   Enable development agility
       ►   Leverage existing enterprise skills
            ‘O’ in a standard (e.g. JPA), without RM,DM,CM
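
Key partitioned data distribution, mentioned above, can be illustrated with a toy Java sketch. The `PartitionedStore` class is hypothetical, and simple hash-modulo placement is an assumption chosen for brevity, not a description of how any particular product distributes data. Each key is owned by exactly one node, so a lookup by key touches a single node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key-partitioned store: each key is owned by exactly one node.
class PartitionedStore {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    PartitionedStore(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Math.floorMod keeps the index non-negative even for negative hash codes.
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, String value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    String get(String key) {
        return nodes.get(nodeFor(key)).get(key);
    }
}
```

With modulo placement, changing the node count moves most keys, which is one of the design considerations that partitioning introduces.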



Verite Group Case Study




Verite Group

►   Value Proposition
    Line Level I.P. Analytics
       ►   Answers the question: What is happening?
            Not: What has happened?


    Activity Correlation
       ►   Capturing time related sequences of activity
            Not capturing discrete “product” on the wire




Verite Group

►   Core netScope Use Case
    Pipeline Monitor and capture
       ►   In-flight I.P. traffic content


    Apply target rules and populate meta models
       ►   High network traffic, content, equipment variation


    Present analyst visualization and alerts
       ►   Customize new target rules
            Insert into Pipeline and iterate




Verite Group
►   Technology Adoption Process
    IBM DB2 – Pure XML store
      ►   Driver: fast ingestion, excellent reg_exp query support
      ►   Failure: huge CPU issues pulling query results
           Analytic model too complex, need objects from results
    Hibernate – Postgres, MySQL
      ►   Driver: binary protocol to analytic model up front
           Soft-Schema driven, still supports reg_exp query
      ►   Failure: data ingestion too slow, CPU max, high disk spin
    Versant – NoSQL 2.0
      ►   Driver: speed data ingestion
      ►   Success: high speed data ingestion, low CPU, low disk spin
           Direct soft-schema storage, still supports reg_exp query
           Scale-out capability for large data analytics




Verite Group
►   Discovered Value, Lessons Learned
    Changing nature of analytics
       ►   Model driven algorithmic, not iterative query
            E.g. eliminated many reg_exp queries and moved to model
                ►   Significant increase in performance of analytic


    Operational efficiencies
       ►   Soft-Schema is database schema
            Faster analytic model evolution ( less DBA )
            Lower CPU cost to marshal type systems ( mapping )
            Less disk space and fast I/O ( less duplication, disk seeking )




Q&A


Contact

      Robert Greene
Vice President, Technology
   rgreene@versant.com

  NoSQL Now! – Booth #14





Contenu connexe

Tendances

Tendances (20)

Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprise
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
 
Cheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduceCheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduce
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
 
Speeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachSpeeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT Approach
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
 
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
 
2022 02 Integration Bootcamp
2022 02 Integration Bootcamp2022 02 Integration Bootcamp
2022 02 Integration Bootcamp
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse Design
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Data warehouse con azure synapse analytics
Data warehouse con azure synapse analyticsData warehouse con azure synapse analytics
Data warehouse con azure synapse analytics
 

En vedette

Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011
DstroyAllModels
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
Marc Seeger
 
Tableau presentation
Tableau presentationTableau presentation
Tableau presentation
kt166212
 

En vedette (11)

Data visualization
Data visualizationData visualization
Data visualization
 
Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)
 
Schema Design for Riak
Schema Design for RiakSchema Design for Riak
Schema Design for Riak
 
Relational Databases to Riak
Relational Databases to RiakRelational Databases to Riak
Relational Databases to Riak
 
Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011
 
Introduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf TrainingIntroduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf Training
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
 
Microsoft Modern Analytics
Microsoft Modern AnalyticsMicrosoft Modern Analytics
Microsoft Modern Analytics
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualization
 
Tableau presentation
Tableau presentationTableau presentation
Tableau presentation
 

Similaire à NoSQL – Beyond the Key-Value Store

001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introduction
Scott Miao
 
Cloud & Big Data: Lessons Learnt
Cloud & Big Data: Lessons LearntCloud & Big Data: Lessons Learnt
Cloud & Big Data: Lessons Learnt
philipbalinov
 

Similaire à NoSQL – Beyond the Key-Value Store (20)

NoSQL and ACID
NoSQL and ACIDNoSQL and ACID
NoSQL and ACID
 
DataStax C*ollege Credit: What and Why NoSQL?
DataStax C*ollege Credit: What and Why NoSQL?DataStax C*ollege Credit: What and Why NoSQL?
DataStax C*ollege Credit: What and Why NoSQL?
 
Big Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQLBig Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQL
 
DAT101 Understanding AWS Database Options - AWS re: Invent 2012
DAT101 Understanding AWS Database Options - AWS re: Invent 2012DAT101 Understanding AWS Database Options - AWS re: Invent 2012
DAT101 Understanding AWS Database Options - AWS re: Invent 2012
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQL
 
Internet Scale Architecture
Internet Scale ArchitectureInternet Scale Architecture
Internet Scale Architecture
 
Db trends final
Db trends   finalDb trends   final
Db trends final
 
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
 
FoundationDB - NoSQL and ACID
FoundationDB - NoSQL and ACIDFoundationDB - NoSQL and ACID
FoundationDB - NoSQL and ACID
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
 
Strudel: Framework for Transaction Performance Analyses on SQL/NoSQL Systems
Strudel: Framework for Transaction Performance Analyses on SQL/NoSQL SystemsStrudel: Framework for Transaction Performance Analyses on SQL/NoSQL Systems
Strudel: Framework for Transaction Performance Analyses on SQL/NoSQL Systems
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introduction
 
The Crown Jewels: Is Enterprise Data Ready for the Cloud?
The Crown Jewels: Is Enterprise Data Ready for the Cloud?The Crown Jewels: Is Enterprise Data Ready for the Cloud?
The Crown Jewels: Is Enterprise Data Ready for the Cloud?
 
SQL and NoSQL in SQL Server
SQL and NoSQL in SQL ServerSQL and NoSQL in SQL Server
SQL and NoSQL in SQL Server
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
NoSQL: An Architects Perspective
NoSQL: An Architects PerspectiveNoSQL: An Architects Perspective
NoSQL: An Architects Perspective
 
Clustrix Database Overview
Clustrix Database OverviewClustrix Database Overview
Clustrix Database Overview
 
Percona presentation v2
Percona presentation v2Percona presentation v2
Percona presentation v2
 
Cloud & Big Data: Lessons Learnt
Cloud & Big Data: Lessons LearntCloud & Big Data: Lessons Learnt
Cloud & Big Data: Lessons Learnt
 
Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk
 

Plus de DATAVERSITY

The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
DATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
DATAVERSITY
 

Plus de DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Dernier

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation

NoSQL – Beyond the Key-Value Store

  • 1. NoSQL Beyond the Key:Value Store By Robert Greene Versant Corporation U.S. Headquarters 255 Shoreline Dr., Suite 450, Redwood City, CA 94065 www.versant.com | 650-232-2400 #NoSQLVersant
  • 2. The Genesis of NoSQL Overview The Sky is Falling NoSQL at its Core Shift in Architecture Shift Innovation Domain Models, Distribution, SOA Enterprise Needs and NoSQL Application Development with NoSQL NoSQL 2.0 - Leveraging the Knowledge Base
  • 3. Genesis of NoSQL ► The Sky is Falling Early Web 2.0 Social Computing drives innovation ► End of the Hammer Era One relational tool for every data problem fails. Agility and Cost usher in reason and innovation
  • 4. NoSQL at its Core An Increasingly Crowded Space To “shift” is to be NoSQL No “shift” Inside
  • 5. Traditional DBMS Scale Architecture INEFFICIENT CPU destroying Mapping EXPENSIVE Repetitive data movement and JOIN calculation
  • 6. NoSQL at its Core A Shift In Application Architecture UNIFIED Application driven schema COMMODITY HW Horizontal scale out, distribution and partitioning • Google – Soft-Schema • IBM – Schema-Less
  • 7. A Shift is Needed ► How Often do Relations Change? Blog : BlogEntry , Order : OrderItem , You : Friend ► Relations Rarely Change, Stop Recalculating Them ► Do you need ALL of your data in one place? ► You don’t. You can distribute it.
  • 8. NoSQL Innovation and the Shift
  • 9. Domain Model Thinking ► Business Model is Schema Not Data Model under Entities ► Movement of Responsibility Soft-Schema (vs) Schema-less ► Enables changing Nature of Analytics SQL/MapReduce – “give me top 20 performers” NoSQL – “find 3 dimensional protein pattern match”
  • 10. Distributed Thinking ► Scale-out, with fall out ► Partition Impact – Implementation, Algorithms Different design considerations ► Key Driven access impacts ► Embedded Models ► Enterprise Reference Data
  • 11. SOA Thinking ► Business Processes and Service Orchestration The Drivers of Business Agility ► NoSQL enables increased speed of agility ► Faster Time to Market, Competitive Edge ► Raw Data Manipulation and Mining Typically done outside of day to day business ETL strategy essential ► Feedback loop for BPM/O layers
  • 12. NoSQL and the Enterprise Responsibly taking advantage of the “Shift”
  • 13. Embedded Models NoSQL 1.0 ► Document Store Characteristics Blogs have Articles ► Patterns of Access Only access sub elements from root Good candidate for simple web system ► Query on Articles content to get similar Blogs ► Display Blogs and their Articles
  • 14. Enterprise Models NoSQL 2.0 ► Many to Many Blogs get Tags - search based on tag Tags weighted, Similarity Meta Data ► Faster algorithmic searching Narrow Blogs via back reference ► Sub queries on collection contents Can leverage Articles in addition to Blogs
  • 15. Operational Features NoSQL 1.0 ► Transactions – The 20:80 Rule (ACID:CAP) Most prevalent NoSQL 1.0 approach ► Give up transactions for better scalability ► Compensating application code needed Code Complexity, Manual Processes High Operational Cost ► Weak Transactions It’s a start, gets us to 20%, demonstrates the need From Key to Criteria Based Query
  • 16. Enterprise Operational Features NoSQL 2.0 ► Transactions – The 80:20 Rule (ACID:CAP) Algorithm, Tagged Blogs via Tag ► No Transactions = lost Blog, no results from Algorithm ► Cascading Operations Network essential ► External Access JDBC/ODBC tooling support
  • 17. Operating NoSQL 1.0 ► DevOps – Dev builds it, Dev owns it. Schema-less implementation ► Evolution directly impacts application space (Development) ► Data Backup Largely file dumps, mostly systems off-line ► Custom tooling for out of band needs Operational need, write a custom access Non-centralized, scripted monitoring
  • 18. Enterprise Operations NoSQL 2.0 ► DevOps – Dev builds it, IT owns it eventually. IT System Management ► Centralized monitoring ► Integrated with SNMP / system management ► Availability, Governance, Data Backup Enterprise point in time recovery, SOX, HIPAA, etc. Fault tolerant, globally replicated Online and distributed back up ► Cloud Enabled - utility efficiency Automated SLA based Provisioning Mobility of Processes
  • 19. Web Development NoSQL 1.0 ► Requires completely new skill set ► Lack of ecosystem integration IDE tooling Immature integration Non standard connectivity ► Custom, custom and more custom Each 1st generation product unique / proprietary
  • 20. Enterprise Development NoSQL 2.0 ► Leverages existing enterprise skill set ► Mature development platforms Tomcat, Spring, Hudson, Eclipse enabled ► Industry standard APIs Java – JPA (10 years of ORM experts) Ruby – OnRails, it’s the shift that matters
  • 21. Application Development The Things You Will Build NoSQL 1.0 NoSQL 2.0
  • 22. Need Proxy Pattern NoSQL 1.0 ► Avoid overhead of extraneous loading You want all Blog Articles to get 1 Article? ► Model must change to use References Blog:owner(User) becomes Blog:owner_id(long) ► Proxy pattern for long to User swizzle Object to Value, Value to Object ► Maybe Document store BasicDBObject ► Maybe Key:Value store BSON
  • 23. Serializable NoSQL 1.0 ► You don’t write code in JSON or XML Programming models need transformation ► Non-Vendor transformation limits Create binary format value, cannot query it ► Not all programming structures are supported Map -- Need to break down programming model Lists -- Array need Serializable
  • 24. Reference System NoSQL 1.0 ► Avoid object duplicates Load a User’s Personal Blog, Search Tagged Blog ► Inconsistencies during runtime ► Materialization of bi-directional relations Need to avoid circular references ► Load Blog*, blog has an Owner:User ► Load User, user has a Personal Blog* ► …..repeat
  • 25. Need Lifecycle Tracking NoSQL 1.0 ► New, Changed, Deleted On store, update: Slow overhead to replace all objects ► If not dirty, do not traverse and update ► If new, add to the reference system ► If null, delete underlying element ► Need to manage the reference system
  • 26. NoSQL 1.0 (observations) ► Mapping layer is forming Why re-invent the wheel ► ‘O’RM – Object Relational Mapping ► ‘O’DM – Object Document Mapping ► ‘O’CM – Object Column Mapping Software Industry knows where this leads ► Mapping Complexity, brittle code base, non-agility ► The ‘O’ is what matters, ‘O’bject Lifecycle Management
  • 27. NoSQL 2.0 ► Leverage NoSQL 1.0 architectural shift Scale out with performance ► Key partitioned data distribution ► The good stuff from NoSQL 1.0 ► Eliminate mapping complexity Handle modern information models ► Eliminate domain model mapping ► Enable development agility ► Leverage existing enterprise skills ‘O’ in a standard (e.g. JPA), without RM, DM, CM
  • 28. Verite Group Case Study
  • 29. Verite Group ► Value Proposition Line Level I.P. Analytics ► Answers the question: What is happening? Not: What has happened? Activity Correlation ► Capturing time related sequences of activity Not capturing discrete “product” on the wire
  • 30. Verite Group ► Core netScope Use Case Pipeline Monitor and capture ► In-flight I.P. traffic content Apply target rules and populate meta models ► High network traffic, content, equipment variation Present analyst visualization and alerts ► Customize new target rules Insert into Pipeline and iterate
  • 31. Verite Group ► Technology Adoption Process IBM DB2 – Pure XML store ► Driver: fast ingestion, excellent reg_exp query support ► Failure: huge CPU issues pulling query results Analytic model too complex, need objects from results Hibernate – Postgres, MySQL ► Driver: binary protocol to analytic model up front Soft-Schema driven, still supports reg_exp query ► Failure: data ingestion too slow, CPU max, high disk spin Versant – NoSQL 2.0 ► Driver: speed data ingestion ► Success: high speed data ingestion, low CPU, low disk spin Direct soft-schema storage, still supports reg_exp query Scale-out capability for large data analytics
  • 32. Verite Group ► Discovered Value, Lessons Learned Changing nature of analytics ► Model driven algorithmic, not iterative query E.g. eliminated many reg_exp queries and moved to model ► Significant increase in performance of analytic Operational efficiencies ► Soft-Schema is database schema Faster analytic model evolution (less DBA) Lower CPU cost to marshal type systems (mapping) Less Disk space and fast I/O (less duplication, disk seeking)
  • 33. Q&A
  • 34. Contact Robert Greene Vice President, Technology rgreene@versant.com NoSQL Now! – Booth #14
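
Slides 22 and 24 describe plumbing that a NoSQL 1.0 application ends up writing by hand: a proxy that swizzles a stored owner_id(long) back into a User on demand, and a reference system that keeps two Blogs with the same owner from producing duplicate User objects. The Java sketch below is only an illustration of that idea, not code from the talk or from any product. The names ReferenceSketch, IdentityMap, Ref and the lambda used as a loader are hypothetical, and the "store" is simulated in memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical sketch: lazy reference proxy plus identity map. */
public class ReferenceSketch {

    /** Domain object standing in for the slide's User. */
    static final class User {
        final long id;
        final String name;
        User(long id, String name) { this.id = id; this.name = name; }
    }

    /** Reference system: at most one in-memory instance per stored id. */
    static final class IdentityMap<T> {
        private final Map<Long, T> loaded = new HashMap<>();
        private final LongFunction<T> loader; // assumed hook that fetches by id
        int loads = 0;                         // counts real fetches, for illustration

        IdentityMap(LongFunction<T> loader) { this.loader = loader; }

        T resolve(long id) {
            T existing = loaded.get(id);
            if (existing != null) {
                return existing;               // already materialized, no duplicate
            }
            loads++;
            T fresh = loader.apply(id);
            loaded.put(id, fresh);
            return fresh;
        }
    }

    /** Proxy: holds only the id (the value) until get() swizzles it to an object. */
    static final class Ref<T> {
        private final long id;
        private final IdentityMap<T> map;

        Ref(long id, IdentityMap<T> map) { this.id = id; this.map = map; }

        long id() { return id; }               // object to value
        T get() { return map.resolve(id); }    // value to object, on first use
    }

    /** Blog:owner(User) modeled as Blog:owner_id(long) behind a Ref. */
    static final class Blog {
        final String title;
        final Ref<User> owner;
        Blog(String title, Ref<User> owner) { this.title = title; this.owner = owner; }
    }

    public static void main(String[] args) {
        // Simulated store: fabricates a User for any id.
        IdentityMap<User> users = new IdentityMap<>(id -> new User(id, "user-" + id));

        Blog personal = new Blog("Personal", new Ref<>(7L, users));
        Blog tagged = new Blog("Tagged", new Ref<>(7L, users));

        boolean same = personal.owner.get() == tagged.owner.get();
        System.out.println("same owner instance: " + same + ", loads: " + users.loads);
    }
}
```

Nothing is fetched until get() is called, and both blogs resolve to the same User instance. A real mapping layer would still need the dirty tracking from slide 25 on top of this, which is the complexity the NoSQL 2.0 slides argue should not live in application code.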