Most NoSQL technologies achieve scale-out by building their architecture around a simple distributed hash: a key-value store. This works well for partitioning simple data, but real information models are rarely simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. In this session, we look at a NoSQL solution that lets you store naturally clustered, richly linked object networks beneath your key-partitioned roots. The result is that you do not have to write extensive code to handle physical structure in the persistence tier, even for complex information models such as predictive models, time series, recursive relations, and compositions. We will explore how such an implementation works in practice through a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database Versant Database Engine.
NoSQL – Beyond the Key-Value Store
1. NoSQL Beyond the Key:Value Store
By Robert Greene
Versant Corporation U.S. Headquarters
255 Shoreline Dr. Suite 450, Redwood City, CA 94065
www.versant.com | 650-232-2400
#NoSQLVersant
2. Overview
The Genesis of NoSQL
The Sky is Falling
NoSQL at its Core
Shift in Architecture
Shift Innovation
Domain Models, Distribution, SOA
Enterprise Needs and NoSQL
Application Development with NoSQL
NoSQL 2.0 - Leveraging the Knowledge Base
3. Genesis of NoSQL
► The Sky is Falling
Early Web 2.0 Social Computing drives innovation
► End of the Hammer Era
One relational tool for every data problem fails.
Agility and cost usher in reason and innovation
4. NoSQL at its Core
An Increasingly Crowded Space
To “shift”, is to be NoSQL
No “shift” Inside
5. Traditional DBMS Scale Architecture
INEFFICIENT: CPU-destroying mapping
EXPENSIVE: Repetitive data movement and JOIN calculation
6. NoSQL at its Core
A Shift In Application Architecture
UNIFIED: Application-driven schema
COMMODITY HW: Horizontal scale-out, distribution and partitioning
• Google – Soft-Schema
• IBM – Schema-Less
7. A Shift is Needed
► How Often do Relations Change?
Blog : BlogEntry , Order : OrderItem , You : Friend
► Relations Rarely Change, Stop Recalculating Them
► Do you need ALL of your data in one place?
► You don’t. You can distribute it.
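The point about recalculating relations can be illustrated with a minimal sketch. It is not taken from the talk: the row layout, class names, and helper functions below are hypothetical, and plain in-memory Python stands in for both a relational table and an object store.

```python
from dataclasses import dataclass, field

# Relational-style rows: the Order : OrderItem relation exists only as a
# foreign key, so it is recomputed (joined) on every read.
item_rows = [
    {"order_id": 1, "sku": "A"},
    {"order_id": 1, "sku": "B"},
    {"order_id": 2, "sku": "C"},
]


def items_by_join(order_id):
    """Scan and match foreign keys each time the relation is needed."""
    return [row["sku"] for row in item_rows if row["order_id"] == order_id]


# Object-style model: the relation is stored once as direct references.
@dataclass
class OrderItem:
    sku: str


@dataclass
class Order:
    id: int
    items: list = field(default_factory=list)


order = Order(1, [OrderItem("A"), OrderItem("B")])


def items_by_reference(o):
    """Follow the stored references; nothing is recalculated."""
    return [item.sku for item in o.items]
```

Both functions return the same items for order 1. The difference is that the first repeats the matching work on every call, while the second only walks links that were fixed when the order was created.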
9. Domain Model Thinking
► Business Model is Schema
Not Data Model under Entities
► Movement of Responsibility
Soft-Schema (vs) Schema-less
► Enables changing Nature of Analytics
SQL/MapReduce – “give me top 20 performers”
NoSQL – “find 3 dimensional protein pattern match”
10. Distributed Thinking
► Scale-out, with fall out
► Partition Impact – Implementation, Algorithms
Different design considerations
► Key Driven access impacts
► Embedded Models
► Enterprise Reference Data
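As a rough illustration of key-driven partitioning and its impact on access, here is a small sketch under stated assumptions: the node names, the in-memory `storage` dicts, and the `put`/`get`/`find` helpers are all hypothetical, and SHA-256 modulo the node count is just one simple placement scheme, not the one used by any particular product.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2"]  # hypothetical cluster members
storage = {name: {} for name in NODES}  # one in-memory dict per node


def node_for(root_key):
    """Pick a node from a stable hash of the root key."""
    digest = hashlib.sha256(root_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def put(root_key, value):
    """Store a root and everything embedded under it on a single node."""
    storage[node_for(root_key)][root_key] = value


def get(root_key):
    """Key-driven access touches exactly one node."""
    return storage[node_for(root_key)].get(root_key)


def find(predicate):
    """Criteria-based access has no key, so every node must be scanned."""
    return sorted(
        key
        for node in storage.values()
        for key, value in node.items()
        if predicate(value)
    )


put("blog:1", {"title": "First", "articles": ["a1", "a2"]})
put("blog:2", {"title": "Second", "articles": []})
```

A call such as `get("blog:1")` goes to one node, whereas `find(lambda v: v["articles"])` has to visit all of them, which is one reason partitioning changes algorithm design.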
11. SOA Thinking
► Business Processes and Service Orchestration
The Drivers of Business Agility
► NoSQL enables increased speed of agility
► Faster Time to Market, Competitive Edge
► Raw Data Manipulation and Mining
Typically done outside of day-to-day business
ETL strategy essential
► Feedback loop for BPM/O layers
12. NoSQL and the Enterprise
Responsibly, taking advantage of the “Shift”
13. Embedded Models
NoSQL 1.0
► Document Store Characteristics
Blogs have Articles
► Patterns of Access
Only access sub elements from root
Good candidate for simple web system
► Query on Articles content to get similar Blogs
► Display Blogs and their Articles
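The embedded model described on this slide can be sketched as follows. The data and function names are hypothetical, and a plain dict stands in for a document store; the sketch only shows the access pattern, where Articles are reachable solely through their Blog root.

```python
# Each Blog is one document; its Articles are embedded, not stored separately.
blogs = {
    "blog:1": {
        "title": "Databases",
        "articles": [
            {"title": "Intro", "body": "key value stores and hashing"},
            {"title": "Joins", "body": "why joins cost CPU"},
        ],
    },
    "blog:2": {
        "title": "Cooking",
        "articles": [{"title": "Bread", "body": "flour water salt"}],
    },
}


def blogs_with_article_text(word):
    """Query on Article content, but return the owning Blog roots."""
    return sorted(
        key
        for key, blog in blogs.items()
        if any(word in article["body"] for article in blog["articles"])
    )


def display(key):
    """Show a Blog and its Articles by walking down from the root."""
    blog = blogs[key]
    return [blog["title"]] + ["  " + a["title"] for a in blog["articles"]]
```

This works well as long as every access starts at a Blog. There is no way to address an Article on its own, which is the limitation the next slides build on.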
14. Enterprise Models
NoSQL 2.0
► Many to Many
Blogs get Tags - search based on tag
Tags weighted, Similarity Meta Data
► Faster algorithmic searching
Narrow Blogs via back reference
► Sub queries on collection contents
Can leverage Articles in addition to Blogs
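A many-to-many model with weighted tags and a back reference might look like the sketch below. The `Tag` and `Blog` classes, the weight values, and `blogs_for_tag` are illustrative assumptions, not the schema used in the talk.

```python
class Tag:
    def __init__(self, name):
        self.name = name
        self.blogs = []  # back reference: Tag -> Blogs


class Blog:
    def __init__(self, title):
        self.title = title
        self.tags = {}  # Tag -> weight (similarity metadata)

    def add_tag(self, tag, weight):
        """Maintain both directions of the many-to-many relation."""
        self.tags[tag] = weight
        if self not in tag.blogs:
            tag.blogs.append(self)


def blogs_for_tag(tag, min_weight=0.0):
    """Narrow to candidate Blogs via the back reference, then filter by weight."""
    return [blog.title for blog in tag.blogs if blog.tags[tag] >= min_weight]


nosql = Tag("nosql")
scaling = Tag("scaling")
first = Blog("Beyond key-value")
second = Blog("Weekend notes")
first.add_tag(nosql, 0.9)
first.add_tag(scaling, 0.6)
second.add_tag(nosql, 0.2)
```

Here `blogs_for_tag(nosql, 0.5)` only inspects the Blogs linked from the tag, rather than scanning every Blog in the store.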
15. Operational Features
NoSQL 1.0
► Transactions – The 20:80 Rule (ACID:CAP)
Most prevalent NoSQL 1.0 approach
► Give up transactions for better scalability
► Compensating application code needed
Code Complexity, Manual Processes
High Operational Cost
► Weak Transactions
It’s a start, gets us to 20%, demonstrates the need
From Key to Criteria Based Query
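The compensating application code mentioned above could look roughly like this. Everything here is hypothetical: `FlakyStore` is an in-memory stand-in for a key-value store without transactions, and `save_tagged_blog` shows one hand-written undo step.

```python
class WriteFailed(Exception):
    pass


class FlakyStore(dict):
    """In-memory stand-in for a key-value store with no transactions."""

    def __init__(self, fail_on=()):
        super().__init__()
        self.fail_on = set(fail_on)  # keys whose writes are made to fail

    def put(self, key, value):
        if key in self.fail_on:
            raise WriteFailed(key)
        self[key] = value


def save_tagged_blog(store, blog_key, blog, tag_key):
    """Write the blog, then the tag index; undo the first write if the second fails."""
    store.put(blog_key, blog)
    try:
        tagged = list(store.get(tag_key, [])) + [blog_key]
        store.put(tag_key, tagged)
    except WriteFailed:
        del store[blog_key]  # compensating action written by hand
        raise
```

Every multi-key update needs its own version of this undo logic, which is where the code complexity and operational cost come from.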
16. Enterprise Operational Features
NoSQL 2.0
► Transactions – The 80:20 Rule (ACID:CAP)
Algorithm, Tagged Blogs via Tag
► No Transactions = lost Blog, no results from Algorithm
► Cascading Operations
Network essential
► External Access
JDBC/ODBC tooling support
17. Operating NoSQL 1.0
► DevOps – Dev builds it, Dev owns it.
Schema-less implementation
► Evolution directly impacts application space ( Development )
► Data Backup
Largely file dumps, mostly systems off-line
► Custom tooling for out of band needs
Operational need, write a custom access
Non-centralized, scripted monitoring
18. Enterprise Operations
NoSQL 2.0
► DevOps – Dev builds it, IT owns it eventually.
IT System Management
► Centralized monitoring
► Integrated with SNMP / system management
► Availability, Governance, Data Backup
Enterprise point-in-time recovery, SOX, HIPAA, etc.
Fault tolerant, globally replicated
Online and distributed backup
► Cloud Enabled - utility efficiency
Automated SLA based Provisioning
Mobility of Processes
19. Web Development
NoSQL 1.0
► Requires completely new skill set
► Lack of ecosystem integration
IDE tooling
Immature integration
Non-standard connectivity
► Custom, custom and more custom
Each 1st generation product unique / proprietary
20. Enterprise Development
NoSQL 2.0
► Leverages existing enterprise skill set
► Mature development platforms
Tomcat, Spring, Hudson, Eclipse enabled
► Industry standard APIs
Java – JPA (10 years of ORM experts)
Ruby – OnRails, it’s the shift that matters
22. Need Proxy Pattern
NoSQL 1.0
► Avoid overhead of extraneous loading
You want all Blog Articles to get 1 Article?
► Model must change to use References
Blog:owner(User) becomes
Blog:owner_id(long)
► Proxy pattern for long to User swizzle
Object to Value, Value to Object
► Maybe Document store BasicDBObject
► Maybe Key:Value store BSON
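A minimal sketch of the proxy pattern for the long-to-User swizzle follows. The `UserProxy` class, the `loader` callback, and the sample data are hypothetical; the sketch assumes the stored Blog carries only `owner_id` and that the User is fetched on first use.

```python
class User:
    def __init__(self, user_id, name):
        self.id = user_id
        self.name = name


class UserProxy:
    """Holds only owner_id; loads the real User on first attribute access."""

    def __init__(self, owner_id, loader):
        self._owner_id = owner_id
        self._loader = loader
        self._target = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the proxy itself.
        if self._target is None:
            self._target = self._loader(self._owner_id)  # value to object
        return getattr(self._target, attr)


class Blog:
    def __init__(self, title, owner_id, loader):
        self.title = title
        self.owner_id = owner_id  # the value actually stored
        self.owner = UserProxy(owner_id, loader)  # the object the program sees


loads = []  # records every trip to the (pretend) database
USERS = {7: User(7, "robert")}


def load_user(user_id):
    loads.append(user_id)
    return USERS[user_id]


blog = Blog("NoSQL notes", 7, load_user)
```

Creating the Blog triggers no load. The first read of `blog.owner.name` calls `load_user` once, and later reads reuse the cached User.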
23. Serializable
NoSQL 1.0
► You don’t write code in JSON or XML
Programming models need transformation
► Non-Vendor transformation limits
Create binary format value, cannot query it
► Not all programming structures are supported
Map -- Need to break down programming model
Lists -- Arrays need Serializable
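The limit of non-vendor binary serialization can be shown with a short sketch. It uses Python's standard `pickle` and `json` modules as stand-ins for a language-specific binary format and a store-readable document format; `store_can_match` is a hypothetical model of a store that only understands JSON.

```python
import json
import pickle

blog = {"title": "NoSQL notes", "tags": ["nosql", "jpa"]}

opaque_value = pickle.dumps(blog)  # language-specific binary blob
document_value = json.dumps(blog)  # structured text a store can parse


def store_can_match(value, field, expected):
    """Mimic a store that can only look inside JSON documents."""
    try:
        return json.loads(value).get(field) == expected
    except (ValueError, AttributeError):
        return False  # the blob is opaque, so the criteria cannot be evaluated
```

The application can still round-trip the pickled value, but the store sees only bytes, so a query on `title` finds the JSON document and misses the binary one.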
24. Reference System
NoSQL 1.0
► Avoid object duplicates
Load a User’s Personal Blog, Search Tagged Blog
► Inconsistencies during runtime
► Materialization of bi-directional relations
Need to avoid circular references
► Load Blog*, blog has an Owner:User
► Load User, user has a Personal Blog*
► …..repeat
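One common way to build such a reference system is an identity map, sketched below under stated assumptions: the `Session` class, the `refs` record layout, and the sample rows are hypothetical. Registering each object before following its references is what stops the Blog to User to Blog loop.

```python
class Session:
    """Identity map: at most one in-memory object per stored key."""

    def __init__(self, rows):
        self.rows = rows  # key -> raw stored record
        self.loaded = {}  # key -> materialized object

    def load(self, key):
        if key in self.loaded:
            return self.loaded[key]  # reuse, never duplicate
        row = self.rows[key]
        obj = {"name": row["name"]}
        self.loaded[key] = obj  # register BEFORE following references
        for field_name, target_key in row.get("refs", {}).items():
            obj[field_name] = self.load(target_key)
        return obj


rows = {
    "user:1": {"name": "robert", "refs": {"personal_blog": "blog:1"}},
    "blog:1": {"name": "Personal", "refs": {"owner": "user:1"}},
}
session = Session(rows)
blog = session.load("blog:1")
user = session.load("user:1")
```

Both loads return the same two objects, so `blog["owner"]` and `user` are one instance and a change made through either path is visible through the other.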
25. Need Lifecycle Tracking
NoSQL 1.0
► New, Changed, Deleted
On store, update: Slow overhead to replace all objects
► If not dirty, do not traverse and update
► If new, add to the reference system
► If null, delete underlying element
► Need to manage the reference system
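The new / changed / deleted rules above can be sketched as a small dirty-tracking layer. The `Tracked` class and `flush` function are hypothetical and a plain dict plays the store; the sketch treats a field set to `None` as the "null" case.

```python
class Tracked:
    """Records which fields changed since the last flush."""

    def __init__(self, key, **fields):
        # Bypass __setattr__ so construction does not mark fields dirty.
        object.__setattr__(self, "key", key)
        object.__setattr__(self, "_dirty", set())
        object.__setattr__(self, "_new", True)
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        self._dirty.add(name)


def flush(objects, store):
    """Write new and dirty objects, drop fields set to None, skip clean objects."""
    written = []
    for obj in objects:
        if obj._new:
            names = [n for n in vars(obj) if n != "key" and not n.startswith("_")]
        elif obj._dirty:
            names = sorted(obj._dirty)
        else:
            continue  # not dirty: do not traverse or update
        record = store.setdefault(obj.key, {})
        for name in names:
            value = getattr(obj, name)
            if value is None:
                record.pop(name, None)  # null: delete the underlying element
            else:
                record[name] = value
        object.__setattr__(obj, "_new", False)
        obj._dirty.clear()
        written.append(obj.key)
    return written
```

A second `flush` on an unchanged object writes nothing, which avoids the overhead of replacing every object on each store or update.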
26. NoSQL 1.0 (observations)
► Mapping layer is forming
Why re-invent the wheel?
► ‘O’RM – Object Relational Mapping
► ‘O’DM – Object Document Mapping
► ‘O’CM – Object Column Mapping
Software Industry knows where this leads
► Mapping Complexity, brittle code base, non-agility
► The ‘O’ is what matters, ‘O’bject Lifecycle Management
27. NoSQL 2.0
► Leverage NoSQL 1.0 architectural shift
Scale out with performance
► Key partitioned data distribution
► The good stuff from NoSQL 1.0
► Eliminate mapping complexity
Handle modern information models
► Eliminate domain model mapping
► Enable development agility
► Leverage existing enterprise skills
‘O’ in a standard (e.g. JPA), without RM, DM, CM
29. Verite Group
► Value Proposition
Line Level I.P. Analytics
► Answers the question: What is happening?
Not: What has happened?
Activity Correlation
► Capturing time related sequences of activity
Not capturing discrete “product” on the wire
30. Verite Group
► Core netScope Use Case
Pipeline Monitor and capture
► In-flight I.P. traffic content
Apply target rules and populate meta models
► High network traffic, content, equipment variation
Present analyst visualization and alerts
► Customize new target rules
Insert into Pipeline and iterate
31. Verite Group
► Technology Adoption Process
IBM DB2 – Pure XML store
► Driver: fast ingestion, excellent reg_exp query support
► Failure: huge CPU issues pulling query results
Analytic model too complex, need objects from results
Hibernate – PostgreSQL, MySQL
► Driver: binary protocol to analytic model up front
Soft-Schema driven, Still supports reg_exp query
► Failure: data ingestion too slow, CPU max, high disk spin
Versant – NoSQL 2.0
► Driver: speed data ingestion
► Success: high speed data ingestion, low CPU, low disk spin
Direct soft-schema storage, still supports reg_exp query
Scale-out capability for large data analytics
32. Verite Group
► Discovered Value, Lessons Learned
Changing nature of analytics
► Model driven algorithmic, not iterative query
E.g. eliminated many reg_exp queries and moved to model
► Significant increase in performance of analytic
Operational efficiencies
► Soft-Schema is database schema
Faster analytic model evolution ( less DBA )
Lower CPU cost to marshal type systems ( mapping )
Less Disk space and fast I/O ( less duplication, disk seeking )