Informix NoSQL & Hybrid SQL detailed deep dive

Informix NoSQL
-- Very Deep Dive
John F. Miller III, IBM miller3@us.ibm.com
Keshava Murthy, IBM rkeshav@us.ibm.com

© 2013 IBM Corporation

Speaker Introduction
John F. Miller III
Lead Architect, STSM

Keshava Murthy
NoSQL & SQL Architect, STSM

Jef Treece
Product Manager

Overview
Business phenomenon of NoSQL
− Value: Open Source vs Enterprise Class

What is NoSQL, JSON and BSON?
− Quick overview
− IBM/Mongo Partnership

New Informix NoSQL/JSON Capabilities
− Solution Highlights

Mongo Compatibility
− The Power of a Hybrid Solution
− Analytics

Demo Deep Dive
− Putting together an application
− Seeing it live

NewSQL

BUSINESS PHENOMENON OF NOSQL


Informix NoSQL Capabilities
Key-value Stores
– Hashed Key value model
– Capable of massive data storage

Column Family Stores
– Keys that point to multiple columns and column families

Document Data Base
– JSON / BSON Formatted documents
– Collections of Documents with Key-value stored versions

Graph Databases
– Nodes / relationship between nodes
– Properties of nodes
– Data values
– Supports SPARQL ?


Business Value of Informix NoSQL
Increases business opportunity through capture and analysis of
“interactive / interaction data” for a full understanding of customers
and markets
Lowers business barriers and reduces IT costs through Rapid
Application Deployment
Increases business ability to respond to volume changes through
Scalable / extensible architecture
Lowers IT overhead through use of Low Cost Commodity Hardware
Changes business view to “what is happening” versus “what
happened”


Use Cases for Informix NoSQL
Session Store
– High volume ingestion / low latency access requirements
– Linkage to a specific user provides immediate recognition of customer
preferences
– Session restore after a “break” immediately returns to the last functions performed, increasing
customer confidence and connection to their desire(s)

User Profile Store
– Customer profiles, orders, and shipment status are immediately available and searchable
– Fast access for Authentication and preferences
– Historical access and click streams increase personalization & targeting

Content and Metadata Store
– Document database for digital content, articles, instruction manuals
– Key-value linkages connect content to customer profiles and history
– Support for multiple image/data types provides fast access to different data types


Use Case for Informix NoSQL (cont)
Mobile Apps
– Ability to store content and user data in schema-less formats increases
the speed of development, deployment, and changes to existing apps
– Supports multi-format data storage associated with differing device types
– Scalable storage provides for document, image, text oriented storage and
retrieval of apps data / information

Third Party Aggregation
– High speed ingestion of data feed from multiple sources:
• Retail store loyalty programs, social media, marketing information, purchase histories by groups,
person, industries

– Ease of format management and analysis

High Availability Cache
– Storage for popular / frequent search results
– Session information
– High Frequency Ads tailored for User Profiles, Locations, searches
– Dual function cache for data store and fast response time for retrieval


Informix NoSQL Use Case (cont)
Globally Distributed Data Repository
– Scalable nodes and access across distributed systems
– Location affinity for workload optimization
– Capture of differing formats and languages to provide global and discrete views
of the business and analytics of searches, purchases, etc.

eCommerce
– Ability to handle purchase spikes / peak time through scalable system
architecture and low cost commodity hardware
– Fast / rapid application deployment through schema-less development model
– Ability to combine transaction data with user action reduces cost of upgrading
business apps for transaction management

Social Media & Gaming
– Rapid app development, deployment, and change implementation increase the
ability to grow the customer base and respond to trends – or create them
– Fast access to user profile for authentication and preferences / historical
information
– Real time Analytics and trend identification


Informix NoSQL Use Cases (cont)
Ad Targeting
– Fast access to user profile and histories permit well managed Ad placement
increasing revenue and buy decision
– Click history and psychographic-based suggestions
– Ability to ingest other feeds and quickly relate to a specific customer
• Loyalty programs
• Social Media
• Associations (LinkedIn, Dating sites, Industry Groups etc.)


Informix and MongoDB Have Free Editions

Editions        Informix                                  MongoDB
Free            Developer, Innovator-C                    Basic
For Purchase    Express, Workgroup, Advanced Workgroup,   Standard, Enterprise
                Enterprise, Advanced Enterprise

MongoDB Subscriptions

                    Basic                    Standard                 Enterprise
Edition             MongoDB                  MongoDB                  MongoDB Enterprise
Support             9am-9pm local, M-F       24x7x365                 24x7x365
License             AGPL                     Commercial               Commercial
Emergency Patches   Not Included             Included                 Included
Price               $2,500 / Server / Year   $5,000 / Server / Year   $7,500 / Server / Year

Additional monthly charges for backup services.

Subscription information obtained from 10Gen site, June 26, 2013.

Price Point Comparison Estimate, 3-year cost
(Dual Core Intel Nehalem)

                              Innovator-C   Express           Workgroup
                                            (4 core, 8 GB,    (16 core, 16 GB,
                                            2 nodes)          unlimited nodes)
Product Cost                  $0            $8,540            $19,740
Support Subscription Year 1   $1,680        Included          Included
Support Renewal Year 2        $1,680        $1,708            $3,948
Support Renewal Year 3        $1,680        $1,708            $3,948
Total                         $5,040        $11,956           $27,636

Support Subscription Year 1: 24 x 7 x 365, Production System Down,
Development Call, Emergency Patches, Free Upgrades

MongoDB Enterprise, 3-year cost: $22,500
Subscription information obtained from 10Gen site, June 26, 2013.
Retail U.S. prices subject to change, valid as of June 26, 2013.

Back up


Informix NoSQL Key Messages
Mobile computing represents one of the greatest opportunities for
organizations to expand their business, requiring mobile apps to have
access to all relevant data
DB2 and Informix have embraced the JSON standard and MongoDB API
to give mobile app developers the ability to combine critical data managed
by DB2 or Informix with the next generation mobile apps
DB2 JSON and Informix JSON let developers leverage the abilities of
relational DBMS together with document store systems


IBM Confidential


DB2/Informix JSON for Install Base Clients
Leverage all relevant enterprise data for your mobile app development

The Offer

Any for-purchase edition of
• Latest version of DB2 LUW 10.5
• Latest version of Informix 12.1

Includes modern interface with JSON/BSON support
• Lets DB2/Informix be compatible with existing
MongoDB apps with minimal/no application changes
• Ability to handle a mix of structured and unstructured
data
• Flexible schema allows rapid delivery of apps
• Compatible with MongoDB programming interfaces

Client Value

One solution for relational and non-relational data
• Relational tables and NoSQL collections co-exist in the
same database
• Allows joins between NoSQL and Relational tables
• Joins utilize indexes on both relational and NoSQL

Brings Enterprise level performance, reliability, and
security to mobile apps
• Includes ACID compliance

Includes built-in online backup and restore capabilities

Automatic Document Compression (only if they buy
Advanced Enterprise or add-on compression with
Enterprise Edition?)


DB2 JSON Go-to-Market Timeline
Phase 1 – June through August 15

Phase 2 – eGA – Q4

Timing
Target

Phase 3 - 2014
Jan 1, 2014 – TBD

Targeted clients/partners
Opportunistic white space

Install Base – Top xxxx accounts
Key partners TBD

DB2 Trial code with Tech Preview
Content
Tech preview on dW
A series of how-to articles on dW:
Part 1, Part 2, Part 3, Part 4.
Video: DB2 JSON Overview
Video: Getting started

Update NoSQL product page with
JSON content
Re-promote DB2 Trial code
IBM Champions for blogs
dW DM for JSON Community covering
both DB2 and Informix

Who

Deliverables

Events


MongoDB World

Tech Bits and Tech Talks
Developer Days
DB2 Night Show via BP
Leverage existing 10Gen event
calendar to insert content
(these are potential – not confirmed)


On-going

Connect with AIM, Rational for joint
presence at 3rd conferences and
events.


Informix JSON Go-to-Market Timeline
Timing
Phase 1 – EVP: Aug. 5, 2013 – Sep. 13, 2013
Phase 2 – eGA – Q4: Sep. 13, 2013 – Dec. 31, 2013
Phase 3 – 2014: Jan 1, 2014 – TBD

Target


Who

Confirmed EVP participants
Begooden (BP)
Moukouri (BP)
Consult-IX (BP)
ADT (BP)
Openbet (Cust)
Technicolor (White space cust)

Targeted clients/partners

Install Base – Top xxxx accounts
Key partners TBD
Top xxxx accounts
Key partners TBD
(criteria to be determined for each)

TBD

Potential EVP participants
Nuvento (White space cust)
SRCE (BP)
Cisco (OEM cust)
Zebra Fish (BP)
Deliverables

Events


NoSQL Client Business Value Deck
NoSQL Technical White Paper
12.10.xC2 Release White Paper
developerWorks article on NoSQL Client
Updated NoSQL demo


Content on Informix marketing page
with re-direct to DevWorks page
Promote Informix trial code?

Chat with the Labs


IOD sessions
IOD Lab


Connect with AIM, Rational
for joint presence at 3rd
conferences and events.


IOD 2013 Roadmap:
Extend Database Application Versatility with NoSQL

Session No.   Session Title
IPT-2059      JSON Document Support in IBM DB2 Schema Flexibility in a Trusted Database Environment
IPT-2909      Embracing NoSQL: IBM and the Community
IPT-3149      Big Data, Hadoop, and NoSQL: A Crash Course for a Fearless IBM DB2 Practitioner
IDB-1943      Firefighter Safety Enhancement through NoSQL Information Integration Architectures with IBM DB2 Graph Store
IDB-2075      Competitive Product Delivery with IBM DB2 JSON
IDB-1878      Agile Product Development Using IBM DB2 with JSON
IDX-3691      Get the Best of Both Worlds: Bring NoSQL to Your SQL Database
IDX-3722      Balancing Big Data Workloads with NoSQL Auto-Sharding
IDX-2551      Developing "Hybrid" NoSQL/SQL Applications using JSON/BSON with Informix
IPT-3070      Extending Information Management Solutions with NoSQL Capabilities--XML, RDF and JSON
IDZ-2599      NoSQL and DB2 for z/OS? Receiving Agility from a Trusted Enterprise Database


DB2/Informix Joint Events 2014
Need to identify events where we can submit topics/speakers for
maximum exposure (not just booth participation)



Some Typical NoSQL Use Cases
- Mostly Interactive Web/Mobile

Online/Mobile Gaming
− Leaderboard (high score table) management
− Dynamic placement of visual elements
− Game object management
− Persisting game/user state information
− Persisting user generated data (e.g. drawings)

Display Advertising on Web Sites
− Ad Serving: match content with profile and present
− Real-time bidding: match cookie profile with ad inventory, obtain bids, and present ad

Communications
− Device provisioning

Social Networking/Online Communities

E-commerce/Social Commerce
– Storing frequently changing product catalogs

Logging/message passing
– Drop Copy service in Financial Services (streaming copies of trade
execution messages into (for example) a risk or back office system)

Dynamic Content Management and Publishing (News & Media)
– Store content from distributed authors, with fast retrieval and placement
– Manage changing layouts and user generated content

NoSQL = RDBMS + JSON + Sharding
NoSQL = JSON + Sharding
InformixNoSQL = JSON + Sharding + Transaction + RDBMS(Hybrid) +
HA(shared Disk) + HDR



NOSQL, JSON AND BSON
OVERVIEW
Technical Opportunities/ Motivation
What are NoSQL Databases?
Quick overview of JSON
What is sharding?

New Era in Application Requirements
Store data from web/mobile application in their
native form
− New web applications use JSON for storing and
exchanging information
− Very lightweight – write more efficient applications
− It is also the preferred data format for mobile
application back-ends

Move from development to production in no
time!
− Ability to create and deploy flexible JSON schema
− Gives power to application developers by reducing
dependency on IT
− Ideal for agile, rapid development and continuous
integration

Why NoSQL?
Non-traditional data management requirements driven by Web
2.0 companies
− Document stores, key-value stores, graph and columnar dbms

The Three Vs:
− Velocity – high frequency of data arrivals
− Volume – BigData
− Variability – unstructured data, flexibility in
schema design

New data interchange formats – like JSON
(JavaScript Object Notation) and BSON
(Binary JSON)
Scale-out requirements across
heterogeneous environment – Cloud
computing

What is a NoSQL Database?
Not Only SQL or NOt allowing SQL
A non-relational database management system
− Does not require a fixed schema
− Avoids join operations
− Scales horizontally
− No ACID (eventually consistent)

Good with distributing data and fast application development

Provides a mechanism for storage and retrieval of
data while providing horizontal scaling.

IBM Use Case Characteristics for JSON

Why Most Commercial Relational Databases cannot
meet these Requirements

NoSQL Database Philosophy Differences

No ACID
– No ACID (Atomicity, Consistency, Isolation, Durability)
– An eventual consistency model

No Joins
– Generally single row/document lookups

Flexible Schema
– No rigid format (a fixed schema is not required)



Partnership with IBM and MongoDB
MongoDB and IBM announced a partnership in June 2013

There are many common use cases of interest addressed by the
partnership
− Accessing JSON Data in DB2, Informix, and MongoDB using JSON query
− Schema-less JSON Data for a variety of applications
− Making JSON data available for a variety of applications
− Securing JSON Data

IBM and MongoDB are collaborating in 3 areas:
− Open Governance: Standards and Open Source
− Technology areas of mutual interest
− Products and solutions

Basic Translation Terms/Concepts

Mongo/NoSQL Terms   Traditional SQL Terms
Database            Database
Collection          Table
Document            Row
Field               Column

Collection (each line is a document made of key/value pairs)
{"name":"John","age":21}
{"name":"Tim","age":28}
{"name":"Scott","age":30}

Table (each line is a row)
Name    Age
John    21
Tim     28
Scott   30
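The term mapping above can be sketched in a few lines of JavaScript. This is only an illustration of the concept, not how Informix or MongoDB store data; the helper name rowToDocument is hypothetical. Column names become keys (fields), each row becomes a document, and the table becomes a collection.

```javascript
// Relational view: column names plus rows of values.
const columns = ["name", "age"];
const rows = [["John", 21], ["Tim", 28], ["Scott", 30]];

// Hypothetical helper: pair each column name with the value in the same position.
function rowToDocument(columnNames, row) {
  const doc = {};
  columnNames.forEach((col, i) => { doc[col] = row[i]; });
  return doc;
}

// Document view: one JSON document per row.
const collection = rows.map((row) => rowToDocument(columns, row));
const asJson = collection.map((doc) => JSON.stringify(doc));

console.log(asJson.join("\n"));
```

The printed lines match the three documents shown on the slide.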

JSON Details
JSON Syntax Rules
– JSON syntax is a subset of the JavaScript object notation syntax:
– Data is in name/value pairs
– Data is separated by commas
– Curly braces hold objects
– Square brackets hold arrays

JSON Name/Value Pairs
– JSON data is written as key/value pairs.
– A key/value pair consists of a field name (in double quotes), followed by a
colon, followed by a value:

"name":"John Miller"

The 6 types of JSON Values:
– A number (integer or floating point)
– A string (in double quotes)
– A Boolean (true or false)
– An array (in square brackets)
– An object (in curly brackets)
– Null

Example of Supported JSON Types

Example of each JSON type
Mongo-specific JSON types (shown in blue on the original slide)
– date

{
"string":"John",
"number":123.45,
"boolean":true,
"array":[ "a", "b", "c" ],
"object": { "str":"Miller", "num":711 },
"value": null,
"date": ISODate("2013-10-01T00:33:14.000Z")
}
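As a rough illustration of the six JSON value types, the sketch below parses a document with standard JSON.parse and reports the type of each value. The helper jsonType is hypothetical. The Mongo-specific ISODate(...) value is left out on purpose, because it is shell syntax rather than standard JSON and JSON.parse would reject it.

```javascript
// Hypothetical helper: classify a parsed value as one of the six JSON types.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "number", "string", "boolean" or "object"
}

const text = '{"string":"John","number":123.45,"boolean":true,' +
  '"array":["a","b","c"],"object":{"str":"Miller","num":711},"value":null}';
const doc = JSON.parse(text);

const types = {};
for (const [key, value] of Object.entries(doc)) {
  types[key] = jsonType(value);
}

console.log(types);
```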

JSON: JavaScript Object Notation
What is JSON?
− JSON is a lightweight text-data interchange format
− JSON is language independent
− JSON is "self-describing" and easy to understand

JSON is a syntax for storing and exchanging text information, much like
XML. However, JSON is smaller than XML, and faster and easier to
parse.
{
"name":"John Miller",
"age":21,
"count":27,
"employees": [
{ "f_name":"John" , "l_name":"Doe" },
{ "f_name":"Anna","m_name" : "Marie","l_name":"Smith" },
{ "f_name":"Peter" , "l_name":"Jones" }
]
}

BSON is a binary form of JSON.


The Power of JSON Drives Flexible Schema
JSON key value pair enables a flexible schema

Data Access Examples
Relational Representation
LName    FName   Address
Miller   John    123 Blazer St

JSON Representation
JSON_string = '{"LName":"Miller","FName":"John","Address":"123 Blazer St"}';

Javascript data access
var JSONelems = JSON.parse( JSON_string );
f_name = JSONelems.FName;
l_name = JSONelems.LName;
l_addr = JSONelems.Address;

Simple Code Example

db.posts.insert( { "author":"John", "date":"2013-04-20", "post":"mypost" } )

Creates the database “db” if it does not exist
Creates the collection “posts” if it does not exist
Inserts a record into a blog post by user John

db.posts.find( { "author":"John" } )

Retrieves all posts by user John

Dynamic Elasticity

Rapid horizontal scalability
− Ability for the application to grow by adding low cost hardware
to the solution
− Ability to add or delete nodes dynamically
− Ability to rebalance the data dynamically

Application transparent elasticity

Why Scale Out Instead of Up?
Scaling Out
− Adding more servers with fewer processors and less RAM
− Advantages
  Startup costs much less
  Can grow in step with the application
  Individual servers cost less
  − Several less expensive servers rather than fewer high cost servers
  − Redundant servers cost more
  Greater chance of isolating catastrophic failures

Scaling Up
− Adding more processors and RAM to a single server
− Advantages
  Less power consumption than running multiple servers
  Less infrastructure (network, licensing,..)

Difference between Sharding Data vs Replication

Sharding
– Each node holds a portion of the data
  • Hash
  • Expression
– Inserted data is placed on the correct node
– Actions are shipped to applicable nodes
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR".)

Replication
– Same data on each node
– Data is copied to all nodes
– Work on local copy and modifications are propagated
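The contrast can be sketched with two small insert routines. This is a toy in-memory model under stated assumptions, not Informix or MongoDB code: the node names and the functions shardInsert and replicateInsert are hypothetical, and the shard expression is simply the value of the state field.

```javascript
// Each cluster is a map from node name to the documents that node holds.
function makeCluster() {
  return { CA: [], WA: [], OR: [] };
}

// Expression-based sharding: exactly one node stores the document.
function shardInsert(cluster, doc) {
  const target = cluster[doc.state];
  if (!target) throw new Error("no shard for state " + doc.state);
  target.push(doc);
}

// Replication: every node stores a copy of the document.
function replicateInsert(cluster, doc) {
  for (const name of Object.keys(cluster)) {
    cluster[name].push(doc);
  }
}

const sharded = makeCluster();
const replicated = makeCluster();
const car = { state: "OR", model: "sedan" };

shardInsert(sharded, car);
replicateInsert(replicated, car);

console.log(Object.values(sharded).map((docs) => docs.length));    // one node holds the row
console.log(Object.values(replicated).map((docs) => docs.length)); // every node holds the row
```

Sharding divides the data so the system can grow; replication multiplies copies of the same data.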

Motivation of Sharding

Synergistic with the application strategy
Start small and grow with commodity hardware

Sharding is not for Data Availability
Sharding is for growth, not availability
Redundancy of a node provides high availability for the data
− Both Mongo and Informix allow for multiple redundant nodes
− Mongo refers to this as Replica Sets and the additional nodes as slaves
− Informix refers to this as MACH, and the additional nodes as secondary servers

With Informix the secondary server can:
− Provide high availability
− Scale out
  Execute select
  Allow Insert/Update/Deletes on the secondary servers
  Share Disks with the master/primary node

Basic Data Distribution/Replication Terms

Term              Description                                           Informix Term
Shard             A single node or a group of nodes holding the same    Instance
                  data (replica set)
Replica Set       A collection of nodes containing the same data        MACH Cluster
Shard Key         The field that dictates the distribution of the       Shard Key
                  documents. Must always exist in a document.
Sharded Cluster   A group of shards where each shard contains a         Grid/Region
                  portion of the data.
Slave             A server which contains a second copy of the data     Secondary Server,
                  for read-only processing.                             Remote Secondary

Sharding is not for Data Availability
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR".)

NEW INFORMIX NOSQL/JSON
CAPABILITIES

Flexible Schema, Native JSON and BSON
Provide native JSON & BSON support in the Informix Database Server
Two new built in “first-class” data types called JSON and BSON
Support for MongoDB client side APIs throughout the entire database
Enhance horizontal scaling by enabling sharding on all database objects
and models
Adaptive default system initialization
− Up and running in seconds
− Adapts to the computer and environment

Two New Data Types JSON and BSON
Native JSON and BSON data types
Index support for NoSQL data types
Native operators and comparator functions
allow for direct manipulation of the BSON
data type
Database Server seamlessly converts to
and from
− JSON
− BSON
− Character data

JSON and BSON Data Type Details
Row locks the individual BSON/JSON document
− MongoDB must lock the database

Bigger documents – 2GB maximum size
− MongoDB caps at 16MB

Ability to compress documents
− Currently not available in MongoDB

Ability to intelligently cache commonly used documents
− Currently not available in MongoDB

Client Applications
New Wire Protocol Listener supports
existing MongoDB drivers
Connect to MongoDB or Informix with same
application!

(Diagram: applications such as a MongoDB native client, a MongoDB web browser application, the MongoDB shell, and mobile apps use a MongoDB driver and the MongoDB wire protocol to reach the IBM NoSQL Wire Protocol Listener, which connects through the JDBC driver to the Informix DB.)

MongoDB Application Compatibility
Ability to use any of the MongoDB client drivers and frameworks
against the Informix Database Server
− Little to no change required when running MongoDB programs
− Informix listens on the same default port as mongo, no need to change.

Leverage the different programming languages available
− Language examples: C, C#, Erlang, Java, node.js, Perl, Python, Ruby

Mongo Action: db.customer.insert( { name: "John", age: 21 } )
Description:  Insert the associated document into the customer
              collection of database “db”.

Mongo Action: db.customer.find( {age: { $gt:21 } } )
Description:  Retrieve from the customer collection of database “db”
              all documents with an age greater than 21.

Basic Mongo Operations
Conceptual Operations

Mongo Action:           db.customer.insert( { name: "John", age: 21 } )
Traditional SQL Action: CREATE DATABASE if not exist db
                        CREATE TABLE if not exist customer
                        INSERT INTO customer
                        VALUES ( { "name":"John", "age":21 } )

Mongo Action:           db.customer.find()
Traditional SQL Action: SELECT bson_doc FROM customer

Mongo Action:           db.customer.find( {age: { $gt:21 } } )
Traditional SQL Action: SELECT * FROM customer WHERE age > 21

Mongo Action:           db.customer.drop()
Traditional SQL Action: DROP TABLE customer

Mongo Action:           db.customer.ensureIndex( { name : 1, age : -1 } )
Traditional SQL Action: CREATE INDEX idx_1 on customer(name, age DESC)

Mongo Action:           db.customer.remove( {age: { $gt:21 } } )
Traditional SQL Action: DELETE FROM customer where age > 21

Mongo Action:           db.customer.update( { age: { $gt: 20 } }, { $set: { status: "Drink" } }, { multi: true } )
Traditional SQL Action: UPDATE customer
                        SET bson_doc_field (status) = { "status":"Drink" }
                        WHERE age > 20
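The conceptual mapping can be sketched as a tiny translator from a Mongo-style filter to a SQL string. This is an illustration only, under stated assumptions: the functions filterToWhere and findToSql are hypothetical, only a handful of comparison operators are handled, and identifiers are not validated. It is not the wire listener's actual translation logic.

```javascript
// Mongo comparison operators and their conceptual SQL counterparts.
const OPERATORS = { $gt: ">", $gte: ">=", $lt: "<", $lte: "<=", $ne: "<>" };

// Render a JavaScript value as a SQL literal (numbers bare, strings quoted).
function literal(value) {
  if (typeof value === "number") return String(value);
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Hypothetical helper: turn { age: { $gt: 21 } } into " WHERE age > 21".
function filterToWhere(filter) {
  const terms = [];
  for (const [field, cond] of Object.entries(filter)) {
    if (cond !== null && typeof cond === "object") {
      for (const [op, value] of Object.entries(cond)) {
        if (!(op in OPERATORS)) throw new Error("unsupported operator " + op);
        terms.push(`${field} ${OPERATORS[op]} ${literal(value)}`);
      }
    } else {
      terms.push(`${field} = ${literal(cond)}`);
    }
  }
  return terms.length ? " WHERE " + terms.join(" AND ") : "";
}

// Hypothetical helper: conceptual SQL for db.<collection>.find(filter).
function findToSql(collection, filter = {}) {
  return `SELECT * FROM ${collection}${filterToWhere(filter)}`;
}

console.log(findToSql("customer", { age: { $gt: 21 } }));
console.log(findToSql("customer"));
```

The first call prints the same statement the slide pairs with db.customer.find( {age: { $gt:21 } } ).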

Scaling Out Using Sharded Queries
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR"; the client request is "Find sold cars for all states".)

1. Request data from local shard
2. Automatically sends request to other shards requesting data
3. Returns results to client


Scaling Out Using Sharded Inserts
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR"; the client inserts a row with state = "OR".)

1. Insert row sent to your local shard
2. Automatically forward the data to the proper shard


Scaling Out Using Sharded Delete
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR"; the client request is Delete state = "OR" AND Dnum = "123123".)

1. Delete condition sent to local shard
2. Local shard determines which shard(s) to forward the operation to
3. Execute the operation


Scaling Out Using Sharded Update
(Diagram: three shards with shard keys state = "CA", state = "WA", and state = "OR"; a row with state = "OR" moves from the local shard to the "OR" shard.)

1. Insert a row on your local shard
2. Automatically forward the data to the proper shard


Scaling Out Adding a Shard
(Diagram: shards with shard keys state = "CA", state = "WA", and state = "OR"; the command Add Shard "NV" adds a shard with shard key state = "NV".)

1. Send command to local node
2. New shard dynamically added, data re-distributed (if required)

Sharding with Hash

Hash based sharding simplifies the
partitioning of data across the shards
Benefits
− No data layout planning is required
− Adding additional nodes is online and
dynamic

Cons
− Adding additional node requires data to be
moved

Data automatically broken in pieces
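A minimal sketch of hash-based placement, and of the data movement listed under Cons, might look like the following. The hash function, the modulo placement rule, and the names hashKey, shardFor, and countMoved are all assumptions made for illustration; Informix's actual hash and redistribution scheme is not described in these slides.

```javascript
// Simple deterministic string hash (illustrative only).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h;
}

// Place a key on one of shardCount shards.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// Count keys whose shard changes when the number of shards changes.
function countMoved(keys, oldCount, newCount) {
  return keys.filter((k) => shardFor(k, oldCount) !== shardFor(k, newCount)).length;
}

// 1000 hypothetical gambler IDs: "1" .. "1000".
const gamblerIds = Array.from({ length: 1000 }, (_, i) => String(i + 1));

const perShard = [0, 0, 0];
for (const id of gamblerIds) perShard[shardFor(id, 3)] += 1;

const moved = countMoved(gamblerIds, 3, 4);
console.log("rows per shard with 3 shards:", perShard);
console.log("rows that change shard when a 4th is added:", moved);
```

No layout planning is needed because the hash decides placement, but with a naive modulo rule like this one a large share of rows changes shard when a node is added.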

Scaling Out with Hash Sharding - Insert
(Diagram: three shards, each with shard key HASH( gambler_ID ); the client inserts a row with Gambler_ID = "11".)

1. Insert row sent to your local shard
2. Local shard determines which shard(s) to forward the insert to
3. Execute the operation


Informix NoSQL Cluster Architecture Overview
(Diagram: three shards with shard keys state = "CA", state = "OR", and state = "WA". The extracted text does not show which caption belongs to which shard; the three captions read:)
– Two independent copies of the data, but three servers to share the workload (two servers share the same disks). Read/Write activity supported on all servers.
– Two independent copies of the data and two servers to share the workload. Read/Write activity supported on all servers.
– Three independent copies of the data, but four servers to share the workload (two servers share the same disk). Read/Write activity supported on all servers.


INFORMIX NOSQL
DEEP DIVE INTO
IMPLEMENTATION

Focus Areas.
Applications need an easier way to persist objects.
− Object-relational layers like Hibernate have performance & functional issues.
− Support Mongo API and its ecosystem

Flexible Schema: Changing schema in RDBMS is a significant operation,
especially in production.
− Need exclusive access to table & application downtime
− Fast changing schema & sparse data has issues with fixed schema.
-- Implement JSON (BSON) type, data store & indexing

Scale Out: While RDBMS has solutions for cluster and MPP, features
enabling application development and deployment in cloud+MPP
architecture are limited.
− Use Informix flexible grid & replication to shard tables;
− Enhance query processing for distributed queries

Client Applications
New Wire Protocol Listener supports existing MongoDB drivers
Connect to MongoDB or Informix with same application!

(Diagram: a MongoDB native client application, a MongoDB web browser application, and the MongoDB shell use the MongoDB driver to reach either MongoDB or the IBM NoSQL Wire Protocol Listener, which connects through the JDBC driver to the Informix NoSQL Cluster.)

IBM Wire Protocol Listener logic shared with DB2 and Extreme Scale


Informix for SQL and NoSQL Applications
(Diagram: NoSQL apps connect through IBM Wire Listeners over JDBC connections, and SQL apps and tools connect through SQL drivers (ODBC/JDBC), to Informix Dynamic Server shards 1…n. Each shard performs query processing, including distributed queries, over JSON collections, tables, indexes (IDXs), and logs. Enterprise Replication + Flexible Grid (ER + Grid) links the shards.)

Informix for SQL and NoSQL Applications
(Diagram: a Mongo application sends NoSQL/BSON requests to the IBM Wire Listener, which issues SQL + BSON over JDBC connections; SQL apps and tools send SQL through SQL drivers (ODBC/JDBC). Both reach Informix Dynamic Server shards 1…n, each with query processing over JSON collections, tables, indexes (IDXs), and logs. Distributed queries carry SELECT, INSERT, UPDATE, and DELETE between shards, which are linked by ER + Grid.)

NoSQL Feature

Informix Implementation

1. Flexible Schema

Use BSON and JSON data type. Complete row is stored in a single column; BSON,
JSON are multi-rep types and can store up to 2GB.

2. Accessing KV pairs
within JSON.

Translate the Mongo queries into SQL expressions to access key-value pairs.
Informix has added expressions to extract specific KV pairs.
SELECT bson_new(data, "{id:1, name:1, addr:1}") FROM tab;

3. Indexing

Support standard B-tree index via functional indexing.
CREATE INDEX itid ON t(bson_value_int(data, "{id:1}"));
CREATE INDEX itnamestate ON t(bson_value_lvarchar(data, "{name:1}"),
bson_value_lvarchar(data, "{city:1}"));
Informix also supports indexing BSON keys with different data types.

4. Sharding

Supports range & hash based sharding. Informix has built-in technology for
replication. Create identical tables in multiple nodes. Add metadata for partitioning
the data across the nodes based on range or hash expressions.

5. SELECT

Limited support now. Mongo API helps by disallowing joins. The translated query on
a single table is transformed into federated UNION ALL query; includes shard
elimination.
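The UNION ALL rewrite with shard elimination can be sketched as string construction. Everything here is an assumption for illustration: the shard list, the database name salesdb, the server names, and the function buildShardedSelect are hypothetical, and the filter is shown on a plain state column for readability. The generated text uses the database@server:table form for remote tables; the real listener and query processor are not shown in these slides.

```javascript
// Hypothetical shard map: which server holds which shard key value.
const shards = [
  { server: "shard_ca", state: "CA" },
  { server: "shard_wa", state: "WA" },
  { server: "shard_or", state: "OR" },
];

// Build one SELECT per shard that can hold matching rows, joined by UNION ALL.
// When the query filters on the shard key, all other shards are eliminated.
function buildShardedSelect(database, table, shardList, stateFilter) {
  const targets = stateFilter === undefined
    ? shardList
    : shardList.filter((s) => s.state === stateFilter);
  const where = stateFilter === undefined ? "" : ` WHERE state = '${stateFilter}'`;
  return targets
    .map((s) => `SELECT * FROM ${database}@${s.server}:${table}${where}`)
    .join(" UNION ALL ");
}

console.log(buildShardedSelect("salesdb", "cars", shards));
console.log(buildShardedSelect("salesdb", "cars", shards, "OR"));
```

Without a filter on the shard key the query fans out to all three shards; with state = 'OR' only one branch remains.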

NoSQL Feature


6. Updates (single node)

INSERT: Simple direct insert.
DELETE: DELETE statement with WHERE bson_extract() > ?; or bson_value..() > ?
UPDATE: bson_update(bson, "update expr") will return a new bson after applying the
bson expression. Simple updates to non-sharded tables will be a direct UPDATE
statement in a single transaction. UPDATE tab SET bsoncol = bson_update(bsoncol,
"expr") WHERE ...

7. Updates (sharded env)

INSERT – All rows are inserted to LOCAL shard, replication threads will read logs
and replicate to the right shard and delete from local node (log only inserts
underway).
DELETE – Do the local delete and replicate the “delete statements” to target node in
the background
UPDATE – Slow update via select-delete-insert.

8. Transaction

Each statement is a distinct transaction in a single node environment.
The data and the operation is replicated via enterprise replication.

9. Isolation levels.

NoSQL session can use any of Informix isolation levels.
-- Examples from Mongo applications.

10. Locking

Application control only on the node they’re connected to. Standard
Informix locking semantics will apply when data is replicated and applied
on the target shard.
-- Verify.. What’s the default locking level?
-- Any option to change it? Just ONCONFIG??

Feature

11. Hybrid access: From MongoAPI to relational tables.

Mongo APIs can simply access Informix tables, views, virtual tables as if they’re
JSON collections. db.foo.find(); foo can be a JSON collection, relational table, a
view or a virtual table on top of timeseries or WebSphere MQ.

12. Hybrid access: From SQL to JSON data

1. Directly get binary BSON or cast to JSON to get in textual form.
2. Use expressions to extract specific key-value pairs.
3. To be done: NoSQL collections will only have one BSON object in the table.
We can “imply” the expressions when the SQL refers to a column.
SELECT t.c1, t.c2 from t;
SELECT bson_extract(t.data, "{c1:1}"), bson_extract(t.data, "{c2:1}") from t;

So, that’s twelve steps for NoSQL Anonymous!

1. Flexible Schema
Clients exchange BSON documents with the server both for queries and data.
Thus, BSON becomes a fundamental data type.
The explicit key-value pairs within the JSON/BSON document will be roughly
equivalent to columns in relational tables.
However, there are differences!
− The type of the KV data encoded within BSON is determined by the client
− Server is unaware of data type of each KV pair at table definition time.
− No guarantees that data type for each key will remain consistent in the
collection.
− The keys in the BSON document can be arbitrary;
− While customers exploit flexible schema, they’re unlikely to create a single
collection and dump everything under the sun into that collection.
− Due to the limitations of Mongo/NoSQL API, customers typically denormalize
the tables (customer will have customer+customer addr + customer
demographics/etc) to avoid joins.

1. Flexible Schema – Informix Implementation
• Informix has a new data type BSON to store the data.
• Informix also has a JSON data type to convert between binary and
text form.
• BSON and JSON are abstract data types (like spatial, etc).
• BSON and JSON are multi-representational types.
• Objects up to 4K are stored in data pages.
• Larger objects (up to 2GB) are stored out of row, in BLOBs.
• MongoDB limits objects to 16MB.
• This is all seamless and transparent to applications.

1. Flexible Schema – Informix Implementation
CREATE TABLE customer (data BSON);
• BSON is the binary representation of JSON.
• It has length and types of the key-value pairs in JSON.
• MongoDB drivers send and receive in BSON form.

2. Accessing KV pairs within JSON.
• We’ll have a number of extract expressions/functions
•Expressions returning base type
bson_value_bigint(bsoncol, "key");
bson_value_lvarchar(bsoncol, "key.key2");
bson_value_date(bsoncol, "key.key2.key3");
bson_value_timestamp(bsoncol, "key");
bson_value_double(bsoncol, "key");
bson_value_boolean(bsoncol, "key");
bson_value_array(bsoncol, "key");
bson_keys_exist(bsoncol, "key");
bson_value_document(bsoncol, "key");
bson_value_binary(bsoncol, "key");
bson_value_objectid(bsoncol, "key");
•Expression returning BSON subset. Used for bson indices.
bson_extract(bsoncol, "projection specification")
•Expressions to project out of SELECT statement.
bson_new(bsoncol, "{key1:1, key2:1, key3:1}");
bson_new(bsoncol, "{key5:0}");
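The dotted-key lookup and the inclusion/exclusion projection above can be pictured with plain Python dictionaries. This is only a sketch of the semantics, not the Informix implementation: the helper names `value_at` and `project` and the sample document are made up for illustration, and Mongo's default handling of `_id` is ignored.

```python
from typing import Any


def value_at(doc: dict, dotted_key: str) -> Any:
    """Follow a dotted path such as 'key.key2'; return None when a step is missing."""
    current: Any = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def project(doc: dict, spec: dict) -> dict:
    """Apply a top-level projection: {k: 1} keeps listed keys, {k: 0} drops them."""
    included = {k for k, flag in spec.items() if flag}
    excluded = {k for k, flag in spec.items() if not flag}
    if included:
        return {k: v for k, v in doc.items() if k in included}
    return {k: v for k, v in doc.items() if k not in excluded}


# Hypothetical sample document, not taken from the slides.
customer = {
    "name": "Pronto",
    "num": 1748,
    "address": {"city": "Sydney", "geo": {"zip": "2000"}},
}

zip_code = value_at(customer, "address.geo.zip")
missing = value_at(customer, "address.phone")
only_name_num = project(customer, {"name": 1, "num": 1})
without_address = project(customer, {"address": 0})
```

A missing key yields `None` here, which mirrors the later statement that references to non-existent key-value pairs return NULL.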

2. Accessing KV pairs within JSON.
Mongo: db.customers.find();
SQL: SELECT SKIP ? data::bson FROM customers

Mongo: db.customers.find({},{num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{ "num" : 1.0 , "name" : 1.0}')::bson FROM customers

Mongo: db.customers.find({},{_id:0,num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{ "_id" : 0.0 , "num" : 1.0 , "name" : 1.0}')::bson FROM customers

Mongo: db.customers.find({status:"A"})
SQL: SELECT SKIP ? data FROM customers WHERE bson_extract(data, 'status') = "A"

Mongo: db.customers.find({status:"A"},{_id:0,num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{ "_id" : 0.0 , "num" : 1.0 , "name" : 1.0}')::bson FROM customers WHERE bson_extract(data, 'status') = "A"
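The find-to-SELECT mapping above can be sketched as a small string builder. This is a rough illustration under stated assumptions, not the listener's code: `find_to_sql` is a hypothetical name, only equality filters on top-level keys are handled, filter values are bound as `?` parameters, and key names are not sanitized.

```python
import json


def find_to_sql(collection, query=None, projection=None):
    """Build a SELECT in the style shown above for db.<collection>.find(query, projection)."""
    if projection:
        # The slides show projection flags as doubles (1.0 / 0.0).
        spec = json.dumps({key: float(flag) for key, flag in projection.items()})
        select_expr = "bson_new(data, '{}')::bson".format(spec)
    else:
        select_expr = "data::bson"

    sql = "SELECT SKIP ? {} FROM {}".format(select_expr, collection)
    params = []
    if query:
        predicates = []
        for key, value in query.items():
            predicates.append("bson_extract(data, '{}') = ?".format(key))
            params.append(value)
        sql += " WHERE " + " AND ".join(predicates)
    return sql, params


plain_sql, plain_params = find_to_sql("customers")
filtered_sql, filtered_params = find_to_sql(
    "customers", {"status": "A"}, {"_id": 0, "num": 1, "name": 1}
)
```

Binding the filter value as a parameter follows the `bson_extract(data, 'x') = ?` form that appears in the indexing slides.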

3. Indexing
•Supports B-Tree indexes on any key-value pairs.
•Indices could be on simple basic type (int, decimal) or BSON
•Indices could be created on BSON and use BSON type comparison
•Listener translates ensureIndex() to CREATE INDEX
•Listener translates dropIndex() to DROP INDEX

Mongo: db.customers.ensureIndex({orderDate:1})
SQL: CREATE INDEX IF NOT EXISTS w_x_1 ON w (bson_extract(data,'x')
ASC) using bson (ns='{ "name" : "newdb.w.$x_1"}', idx='{ "ns" :
"newdb.w" , "key" : {"x" : [ 1.0 , "$extract"]} , "name" : "x_1"
, "index" : "w_x_1"}') EXTENT SIZE 64 NEXT SIZE 64

Mongo: db.customers.ensureIndex({orderDate:1, zip:-1})
SQL: CREATE INDEX IF NOT EXISTS v_c1_1_c2__1 ON v (bson_extract(data,'c1') ASC,
bson_extract(data,'c2') DESC) using bson (ns='{ "name" :
"newdb.v.$c1_1_c2__1"}', idx='{ "ns" : "newdb.v" , "key" : { "c1" : [ 1.0
, "$extract"] , "c2" : [ -1.0 , "$extract"]} , "name" : "c1_1_c2__1" ,
"index" : "v_c1_1_c2__1"}') EXTENT SIZE 64 NEXT SIZE 64

Mongo: db.customers.ensureIndex({orderDate:1}, {unique:true})
SQL: CREATE UNIQUE INDEX IF NOT EXISTS v_c1_1_c2__1 ON v (bson_extract(data,'c1') ASC,
bson_extract(data,'c2') DESC) using bson (ns='{ "name" : "newdb.v.$c1_1_c2__1"}',
idx='{ "ns" : "newdb.v" , "key" : { "c1" : [ 1.0 , "$extract"] , "c2" : [ -1.0 ,
"$extract"]} , "name" :"c1_1_c2__1" , "unique" : true , "index" : "v_c1_1_c2__1"}')
EXTENT SIZE 64 NEXT SIZE 64
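The index names in the generated statements (x_1, c1_1_c2__1, w_x_1, v_c1_1_c2__1) appear to follow a simple pattern: key and direction joined by underscores, with the minus sign of a descending key turned into an underscore, then prefixed with the table name. The sketch below reproduces that pattern as inferred from the examples only; the function names are hypothetical and the `ns`/`idx` metadata and extent sizes are deliberately left out.

```python
def index_names(collection, keys):
    """Return (mongo_style_name, sql_index_name) inferred from the slide examples."""
    parts = ["{}_{}".format(field, direction) for field, direction in keys.items()]
    name = "_".join(parts).replace("-", "_")  # -1 becomes _1, giving "c2__1"
    return name, "{}_{}".format(collection, name)


def create_index_sql(collection, keys, unique=False):
    """Build a trimmed CREATE INDEX statement (metadata and extent sizes omitted)."""
    _, sql_name = index_names(collection, keys)
    columns = ", ".join(
        "bson_extract(data,'{}') {}".format(field, "ASC" if direction > 0 else "DESC")
        for field, direction in keys.items()
    )
    kind = "UNIQUE INDEX" if unique else "INDEX"
    return "CREATE {} IF NOT EXISTS {} ON {} ({}) USING bson".format(
        kind, sql_name, collection, columns
    )


single = index_names("w", {"x": 1})
compound = index_names("v", {"c1": 1, "c2": -1})
statement = create_index_sql("w", {"x": 1})
```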

3. Indexing
db.w.find({x:1,z:44},{x:1,y:1,z:1})
Translates to:
SELECT bson_new( data, '{ "x" : 1.0 , "y" : 1.0 , "z" : 1.0}')::bson
FROM w
WHERE ( bson_extract(data, 'x') = '{ "x" : 1.0 }'::json::bson ) AND
( bson_extract(data, 'z') = '{ "z" : 44.0 }'::json::bson)
Estimated Cost: 2
Estimated # of Rows Returned: 1
1) keshav.w: SEQUENTIAL SCAN
Filters: (informix.equal(informix.bson_extract(keshav.w.data ,'z' ),UDT )
AND informix.equal(informix.bson_extract(keshav.w.data ,'x' ),UDT))

3. Indexing
•Functional Index is built on bson expressions
CREATE INDEX IF NOT EXISTS w_x_1 ON w (bson_extract(data,'x') ASC)
using bson (ns='{ "name" : "newdb.w.$x_1"}',
idx='{ "ns" : "newdb.w" , "key" : {"x" : [ 1.0 , "$extract"]} , "name"
: "x_1" , "index" : "w_x_1"}')
EXTENT SIZE 64 NEXT SIZE 64

•Listener is aware of the available index and therefore
generates the right predicates.
db.w.find({x:1});
gets translated to
SELECT SKIP ? data FROM w WHERE bson_extract(data, 'x') = ?

3. Indexing
db.w.find({x:5,z:5}, {x:1,y:1,z:1})
Translates to:
SELECT bson_new( data, '{ "x" : 1.0 , "y" : 1.0 , "z" : 1.0}')::bson
FROM w
WHERE ( bson_extract(data, 'x') = '{ "x" : 5.0 }'::json::bson )
AND ( bson_extract(data, 'z') = '{ "z" : 5.0 }'::json::bson )
Estimated Cost: 11
1) keshav.w: INDEX PATH
Filters: informix.equal(informix.bson_extract(keshav.w.data ,'z' ), UDT )
(1) Index Name: keshav.w_x_1
Index Keys: informix.bson_extract(data,'x') (Serial, fragments: ALL)
Lower Index Filter: informix.equal(informix.bson_extract(keshav.w.data ,'x' ),UDT )

4. MongoDB SHARDING (roughly)
Shard a single table by range or hashing.
Mongos will direct the INSERT to target shard.
Mongos tries to eliminate shards for update, delete, selects as well.
FIND (SELECT) can happen on ONLY a SINGLE table.
It also works as coordinator for multi-node ops.
Once a row is inserted to a shard, it remains there despite any key
update.
No transactional support on multi-node updates.
−

Each document update stands on its own.

4. Informix Sharding
[Diagram: four app servers, each with its own Mongo driver, connect through four listeners over JDBC to six Informix servers (Informix/1 through Informix/6). Each server holds one shard of the Customer, Sales, and Location tables (Customer/1 through Customer/6, Sales/1 through Sales/6, Location/1 through Location/6). The servers are linked by Enterprise Replication + Flexible Grid.]

4. Informix sharding + High Availability
[Diagram: four app servers, each with its own Mongo driver, connect through four listeners over JDBC to six shards. Each shard (Informix/1 through Informix/6) consists of a primary, an SDS/HDR secondary, and an RSS secondary. The shards are linked by Enterprise Replication + Flexible Grid.]

4. Sharding – Informix Implementation
•Shard a single table by range or hashing.
cdr define shard myshard mydb:usr1.mytab
--type=delete --key="bson_get(bsoncol, 'STATE')" --strategy=expression
--versionCol=version
servA "in ('TX', 'OK')"
servB "in ('NY','NJ')"
servC "in ('AL','KS')"
servD remainder
cdr define shard myshard mydb:usr1.mytab
--type=delete --key=state --strategy=hash --versionCol=version
servA servB servC servD
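The two strategies above, expression and hash, amount to different ways of mapping a shard key to a server. The following sketch shows that mapping using the server names and state lists from the commands above. The hash function (SHA-256 modulo the server count) is an assumption made for the example; the slides do not say how Informix hashes the key.

```python
import hashlib

# Expression strategy: explicit value lists per server, plus a remainder server.
EXPRESSION_RULES = {
    "servA": {"TX", "OK"},
    "servB": {"NY", "NJ"},
    "servC": {"AL", "KS"},
}
REMAINDER_SERVER = "servD"

# Hash strategy: the key is spread over all listed servers.
HASH_SERVERS = ["servA", "servB", "servC", "servD"]


def route_by_expression(state):
    """Return the server whose value list contains the state, else the remainder."""
    for server, states in EXPRESSION_RULES.items():
        if state in states:
            return server
    return REMAINDER_SERVER


def route_by_hash(key):
    """Return a server chosen by a stable hash of the key (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(HASH_SERVERS)
    return HASH_SERVERS[bucket]


texas_target = route_by_expression("TX")
california_target = route_by_expression("CA")  # not listed, so it falls to the remainder
hashed_target = route_by_hash("TX")
```

With the expression strategy the placement is readable and lets the coordinator eliminate nodes from a predicate; with the hash strategy placement is even but opaque.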

<<Mongo syntax>>

4. Sharding – Informix Implementation

•Sharding is transparent to application.
•Each CRUD statement will only touch a single table.
•Limitation of MongoDB…Makes it easier for Informix.
•Lack of joins is a big limitation for SQL applications.
•Lacks transactional support for distributed update.

4. Informix Sharding
Identical table is created on each node and meta data is replicated
on each node.
Schema based replication is the foundation for our sharding.
CRUD operations can go to any node.
−

We’ll use replication and other techniques to reflect the data in the
target node, eventually.

−

Right now, replication is asynchronous.

−

Informix has synchronous replication, but it is not used for sharding now.

5. SELECT on sharded tables
•The query can be submitted to any of the nodes via the listener.
•That node acts as the “coordinator” for the distributed query.
•It also does the node elimination based on the query predicate.
•After that, the query is transformed to UNION ALL query
SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson
FROM customers@rsys1:db
WHERE bson_extract(data, 'name') = "A" or bson_extract(data, 'name') = "X"

is transformed into:
SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson
WHERE bson_extract(data, 'name') = "A" or bson_extract(data, 'name') = "X"
UNION ALL
SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson
WHERE bson_extract(data, 'name') = "A" or bson_extract(data, 'name') = "X"
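The coordinator steps above (node elimination, then a UNION ALL over the remaining nodes) can be sketched as follows. Everything here is illustrative: the shard map, the node names rsys2 and rsys3, and the helper names are assumptions, and the `table@node:db` form is copied from the slide rather than checked against Informix syntax.

```python
def nodes_for(shard_map, wanted_values):
    """Keep nodes whose shard values overlap the predicate; None marks the remainder node."""
    wanted = set(wanted_values)
    covered = set()
    selected = []
    for node, values in shard_map.items():
        if values is None:
            continue
        covered |= values
        if values & wanted:
            selected.append(node)
    # Values not claimed by any explicit shard can only live on the remainder node.
    if wanted - covered:
        selected.extend(node for node, values in shard_map.items() if values is None)
    return selected


def union_all(select_template, nodes):
    """Repeat the per-node SELECT for every remaining node and glue with UNION ALL."""
    return "\nUNION ALL\n".join(select_template.format(node=node) for node in nodes)


# Hypothetical placement of 'name' values on three nodes.
SHARD_MAP = {"rsys1": {"A", "B"}, "rsys2": {"X"}, "rsys3": None}
TEMPLATE = (
    "SELECT data FROM customers@{node}:db "
    "WHERE bson_extract(data, 'name') = ? OR bson_extract(data, 'name') = ?"
)

remaining = nodes_for(SHARD_MAP, ["A", "X"])
distributed_sql = union_all(TEMPLATE, remaining)
```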

6. INSERT: Single node

•If necessary, automatically create database &
table (collection) on the application INSERT
•Collections: CREATE TABLE t(a GUID, d
BSON);
•GUID column is needed to ensure unique row across
the SHARDs. Also used as PK for replication.
•Client application inserts JSON, client API converts
this into BSON, generates _id (object id) if necessary &
sends to server over JDBC/ODBC.
•Server saves the data into this table as BSON, with
an automatically generated GUID

6. DELETE: Single node
•Mongo remove operations are translated to SQL DELETE
•Will always remove all the qualifying rows.
•WHERE clause translation is same as SELECT

6. UPDATE: Single node

•Simple set, increment, decrement updates are
handled directly by the server.
Mongo: db.w.update({x: {$gt:55}}, {$set:{z:9595}});
SQL: UPDATE w SET data = bson_update(data, “{$set:{z:9595}}”)
WHERE bson_extract(data, 'x') > “{x:55}”::bson ?

•bson_update is a built-in expression updating a
BSON document with a given set of operations.
•Complex updates are handled via select batch,
delete, insert.
•Always updates all the rows or no rows, under a
transaction.
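What a simple update does to each qualifying document can be shown on plain dictionaries. This is a sketch of the semantics only, not of bson_update itself: `apply_update` and `update_many` are hypothetical helpers, and only `$set` and `$inc` (with a negative value for decrement) are covered.

```python
import copy


def apply_update(doc, ops):
    """Return a copy of doc with $set and $inc operations applied."""
    updated = copy.deepcopy(doc)
    for field, value in ops.get("$set", {}).items():
        updated[field] = value
    for field, delta in ops.get("$inc", {}).items():
        updated[field] = updated.get(field, 0) + delta
    return updated


def update_many(docs, predicate, ops):
    """Apply ops to every document that satisfies the predicate; leave the rest alone."""
    return [apply_update(doc, ops) if predicate(doc) else doc for doc in docs]


# Mirrors db.w.update({x: {$gt: 55}}, {$set: {z: 9595}}) on made-up rows.
rows = [{"x": 10, "z": 1}, {"x": 60, "z": 2}]
result = update_many(rows, lambda doc: doc.get("x", 0) > 55, {"$set": {"z": 9595}})
decremented = apply_update({"n": 1}, {"$inc": {"n": -1}})
```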

7. INSERT: shard implementation

•If the table is sharded:
•Insert the data into local table as usual.
•Replication threads in the background will evaluate
the log record to see which rows will have moved.
•Replication thread will move the necessary row to
target and delete from the local table.
•Work underway to avoid the insert into local table.
•For each inserted row ending up in non-local shard,
simply generate the logical log and avoid insert into
local table & indices.

7. DELETE : shard implementation
•Application delete could delete rows from any of the
shards in the table.
•DELETE will come in as shard_delete() procedure.
–Execute procedure shard_delete(tabname, delete_stmt);
–This procedure will issue delete locally.
–It will then INSERT the delete statement into a “special”
shard delete table (single for all tables).
–Enterprise replication will propagate the delete to
applicable target systems.

7. UPDATE: shard implementation

•When the update has to be done on multiple
nodes:
•Client application does UPDATE and CLIENT API
converts that into three operations: SELECT,
DELETE & INSERT.
•GUID column is needed to ensure unique row across the
SHARD
•Client application inserts JSON, client API converts this
into BSON & sends to server.

8. Transactions (single node)
•Mongo does not have the notion of transactions.
•Each document update is atomic, but not the app statement
•For the first release of Informix-NoSQL
•By default, JDBC listener simply uses AUTO COMMIT option
•Each server operation (INSERT, UPDATE, DELETE, SELECT) will be
automatically committed after each operation.
•No locks are held across multiple operations.
•However, customers can create multi-statement transaction
via $sql and issuing begin work, commit work, rollback work

8. Transactions (sharded environment)
•In sharded environment, mongo runs database via two different
instances: mongos and mongod.
•Mongos simply redirects operations to relevant mongod.
•No statement level transactional support.
•Informix
•Informix does not have the 2-layer architecture
•The Informix server that the application is connected to becomes the
transaction coordinator
•Informix does have the 2-phase commit transaction support,
but it is unused for NoSQL, for now.
•SELECT statement goes thru distributed query infrastructure
•INSERT,UPDATE DELETE goes thru enterprise replication.

9. ISOLATION levels

•Default isolation level is DIRTY READ.
•Change this directly or via sysdbopen()
•You can also use the USELASTCOMMITTED variable in
ONCONFIG.
•If you’re using procedures for executing multi-statement transactions, you can set it within your
procedure.

10. LOCKING

•Page level locking is the default.
•You can change it to ROW level locks easily.
•ALTER TABLE jc MODIFY lock mode (row)
• DEF_TABLE_LOCKMODE onconfig variable.
• SET LOCK MODE can be set via sysdbopen()
•Each statement is executed with auto-commit and locking
semantics will apply there.

Hybrid Access between relational & JSON Collections

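[Diagram: relational tables are reached through the SQL API (standard ODBC, JDBC, .NET, OData, etc.; language SQL), and JSON collections through the MongoDB API (NoSQL; Mongo APIs for Java, Javascript, C++, C#, ...). Two question marks raise the question of cross access between the two sides.]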

Benefits of Hybrid Power
Access consistent data from its source
Avoid ETL, continuous data sync and conflicts.
Exploit the power of SQL, MongoAPI seamlessly
Exploit the power of RDBMS technologies in MongoAPI:
−

Informix Warehouse accelerator,

−

Cost based Optimizer

−

R-tree indices for spatial, Lucene text indexes, and more.

Access all your data thru any interface: MongoAPI or SQL.
Store data in one place and efficiently transform and use them on
demand.
Existing SQL based tools and APIs can access new data in JSON

Why do you need hybrid access?

Data model should not restrict data access.

[Diagram: the SQL API (.NET, OData, etc.; language SQL) and the MongoDB API (NoSQL) both reach relational tables and JSON collections. SQL reaches JSON collections through direct SQL access, dynamic views, and row types.]


Hybrid Power
MongoAPI to relational data

Hybrid access: From MongoAPI to relational tables.
You want to develop an application with MongoAPI, but
1. You already have relational tables with data.
2. You have views on relational data
3. You need to join tables
4. You need queries with complex expressions. E.g. OLAP window functions.
5. You need to get results from a stored procedure.
6. You need to exploit Informix stored procedure
7. You need federated access to other data
8. You have timeseries data.

How to treat relational data as JSON store.
Relational data (relations or resultset) can be treated as structured JSON
documents; each column name-value becomes a key-value pair.
SELECT partner, pnum, country from partners;
partner    pnum  Country
Pronto     1748  Australia
Kazer      1746  USA
Diester    1472  Spain
Consultix  1742  France

{partner: "Pronto", pnum:"1748", Country: "Australia"}
{partner: "Kazer", pnum:"1746", Country: "USA"}
{partner: "Diester", pnum:"1472", Country: "Spain"}
{partner: "Consultix", pnum:"1742", Country: "France"}

Listener translates the query and the data object between relational and
JSON/BSON form.
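The row-to-document mapping is mechanical: pair each column name with the value in that position. A minimal sketch using the partners rows from the slide (the helper name `rows_to_documents` is made up, and pnum is kept as an integer here, which is an assumption about the column type):

```python
import json


def rows_to_documents(columns, rows):
    """Turn each result-set row into a document keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


columns = ["partner", "pnum", "country"]
rows = [
    ("Pronto", 1748, "Australia"),
    ("Kazer", 1746, "USA"),
    ("Diester", 1472, "Spain"),
    ("Consultix", 1742, "France"),
]

documents = rows_to_documents(columns, rows)
first_as_json = json.dumps(documents[0])
```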

[Diagram: a Mongo application exchanges JSON with the IBM Wire Listener, which uses JDBC connections to Informix Dynamic Server. The server holds JSON collections (customer) and relational tables (partners) with their tables, indexes, and logs, and supports distributed queries.]

Access JSON:
db.customer.find({state:"MO"})
SELECT bson_new(bson, '{}') FROM customer
WHERE bson_value_lvarchar(bson,'state') = "MO"

Access Relational:
db.partners.find({state:"CA"})
SELECT * FROM partners WHERE state = "CA"


Accessing data in relational tables.
Create table partners(pnum int, name varchar(32), country
varchar(32));
db.partners.find();
SELECT * from partners;
db.partners.find({name:"Pronto"});
SELECT * FROM partners WHERE name = "Pronto";
db.partners.find({name:"Pronto"}, {pnum:1, country:1});
SELECT pnum, country FROM partners WHERE name = "Pronto";
db.partners.find({name:"Pronto"}, {pnum:1, country:1}).limit(10);
SELECT LIMIT 10 pnum, country FROM partners WHERE name = "Pronto";
db.partners.find({name:"Pronto"}, {pnum:1, country:1}).sort({b:1})
SELECT pnum, country FROM partners WHERE name = "Pronto" ORDER BY b ASC
db.partners.find({name:"Pronto"}, {pnum:1, country:1}).sort({b:-1})
SELECT pnum, country FROM partners WHERE name = "Pronto" ORDER BY b DESC
db.t.find({a:{$gt:1}}, {a:1, b:1}).sort({b:-1})
SELECT SKIP ? a, b FROM t WHERE a > 1.0 ORDER BY b DESC

Accessing data in relational tables.
db.partners.save({pnum:1632, name:"EuroTop", Country: "Belgium"});
INSERT into partners(pnum, name, country) values(1632, "EuroTop",
"Belgium");
db.partners.delete({name:"Artics"});
DELETE FROM PARTNERS WHERE name = "Artics";
db.partners.update({country:"Holland"},{$set:{country:"Netherland"}}
, {multi: true});
UPDATE partners SET country = "Netherland" WHERE country =
"Holland";
db.partners.drop();
DROP TABLE partners;

Views and JOINS
A dynamically created relation created from one or more database
tables or procedures.
create table pcontact(pnum int, name varchar(32), phone varchar(32));
insert into pcontact values(1748, "Joe Smith", "61-123-4821");
insert into pcontact values(1746, "John Kelley", "1-729-284-2893");
insert into pcontact values(1472, "Ken Garcia", "34-829-2842");
insert into pcontact values(1742, "Adam Roy", "33-380-3892");
create view partnerphone(pname, pcontact, pphone) as select a.name, b.name, b.phone
FROM pcontact b left outer join partners a on (a.pnum = b.pnum);

db.partnerphone.find();
{ "pname" : "Pronto", "pcontact" : "Joe Smith", "pphone" : "61-123-4821" }
{ "pname" : "Kazer", "pcontact" : "John Kelley", "pphone" : "1-729-284-2893" }
{ "pname" : "Diester", "pcontact" : "Ken Garcia", "pphone" : "34-98-829-2842" }
{ "pname" : "Consultix", "pcontact" : "Adam Roy", "pphone" : "33-82-380-3892" }

db.partnerphone.find({pname:"Pronto"})
{ "pname" : "Pronto", "pcontact" : "Joe Smith", "pphone" : "61-123-4821" }

complex expressions. E.g. OLAP window functions
create view contactreport(pname, pcontact, totalcontacts) as
select b.name, a.name,
count(a.name) over(partition by b.pnum)
from pcontact a left outer join partners b on (a.pnum = b.pnum);

db.contactreport.find({pname:"Pronto"})
{ "pname" : "Pronto", "pcontact" : "Joel Garner", "totalcontacts" : 2 }
{ "pname" : "Pronto", "pcontact" : "Joe Smith", "totalcontacts" : 2 }

Seamless federated access
1. create database newdb2;
2. create synonym oldcontactreport for
newdb:contactreport;

> use newdb2
> db.oldcontactreport.find({pname:"Pronto"})
{ "pname" : "Pronto", "pcontact" : "Joel Garner", "totalcontacts" : 2 }
{ "pname" : "Pronto", "pcontact" : "Joe Smith", "totalcontacts" : 2 }
SELECT data FROM oldcontactreport WHERE
bson_extract(data, 'pname') = "Pronto";

• create synonym oldcontactreport for
custdb@nydb:contactreport;

Get results from a stored procedure.
create function "keshav".p6() returns int, varchar(32);
define x int; define y varchar(32);
foreach cursor for select tabid, tabname into x,y from systables
return x,y with resume;
end foreach;
end function;
create view "keshav".v6 (c1,c2) as
select x0.c1 ,x0.c2 from table(function p6())x0(c1,c2);

db.v6.find().limit(5)
{ "c1" : 1, "c2" : "systables" }
{ "c1" : 2, "c2" : "syscolumns" }
{ "c1" : 3, "c2" : "sysindices" }
{ "c1" : 4, "c2" : "systabauth" }
{ "c1" : 5, "c2" : "syscolauth" }

Access Timeseries data
create table daily_stocks
( stock_id integer,
stock_name lvarchar,
stock_data timeseries(stock_bar)
);
-- Create virtual relational table on top (view)
EXECUTE PROCEDURE
TSCreateVirtualTab('daily_stocks_virt',
'daily_stocks', 'calendar(daycal),origin(2011-01-03
00:00:00.00000)' );
create table daily_stocks_virt
( stock_id integer,
stock_name lvarchar,
timestamp datetime year to fraction(5),
high smallfloat,

db.daily_stocks_virt.find()
{ "stock_id" : 901, "stock_name" : "IBM", "timestamp" : ISODate("2011-01-03T06:00:00Z"), "high" : 356, "low" : 310, "final" : 340, "vol" : 999 }
{ "stock_id" : 902, "stock_name" : "HPQ", "timestamp" : ISODate("2011-01-03T06:0
{ "stock_id" : 902, "stock_name" : "HPQ", "timestamp" : ISODate("2011-01-04T06:0

db.daily_stocks_virt.find({stock_name:"IBM"})

db.daily_stocks_virt.find({stock_name:"IBM"}).sort({final:-1})

You want to develop an application with MongoAPI, but
1. You already have relational tables with data.
2. You have views on relational data
3. You need to join tables
4. You need queries with complex expressions. E.g. OLAP window functions.
5. You need to get results from a stored procedure.
6. You need to exploit Informix stored procedure
7. You need federated access to other data
8. You have timeseries data.

Put all of these together!

Hybrid access: From SQL API to JSON Collections.
[Diagram: SQL applications use JDBC connections to reach both JSON collections (customer) and relational tables (partners) in the same server, alongside distributed queries, tables, indexes, and logs.]

Access JSON:
SELECT bson_new(bson, '{customer:1}')
FROM customer WHERE
bson_value_lvarchar(bson,'state') = "MO"

Access Relational:
SELECT * FROM partners WHERE state = "CA";


Join JSON collections
JSON Collection V:
{ "_id" : ObjectId("526a1bb1e1819d4d98a5fa4b"), "c1" : 1, "c2" : 2 }
JSON Collection w:
{ "_id" : ObjectId("526b005cfb9d36501fe31605"), "x" : 255, "y" : 265, "z" : 395}
{ "_id" : ObjectId("52716cdcf278706cb7bf9955"), "x" : 255, "y" : 265, "z" : 395}
{ "_id" : ObjectId("5271c749fa29acaeba572302"), "x" : 1, "y" : 2, "z" : 3 }

SELECT c1, c2, x,y,z
FROM V, W
WHERE V.c1 = W.x;

-- Wrap around the built-in expressions
SELECT bson_value_int(jc1.data, 'x'),
bson_value_lvarchar(jc1.data, 'y'),
bson_value_int(jc1.data, 'z') ,
bson_value_int(jc2.data, 'c1'),
bson_value_lvarchar(jc2.data, 'c2')
FROM w jc1, v jc2
WHERE bson_value_int(jc1.data, 'x') =
bson_value_int(jc2.data, 'c1');

-- Create a view to make the access simple.
create view vwjc(jc1x, jc1y, jc1z, jc2c1, jc2c2) as
SELECT bson_value_int(jc1.data, 'x'),
bson_value_lvarchar(jc1.data, 'y'),
bson_value_int(jc1.data, 'z') ,
bson_value_int(jc2.data, 'c1'),
bson_value_lvarchar(jc2.data, 'c2')
FROM w jc1, v jc2
WHERE bson_value_int(jc1.data, 'x') =
bson_value_int(jc2.data, 'c1');

db.vwjc.find();
db.vwjc.find({jc1x:100});
db.vwjc.find({jc1x:{$gte:10}}).sort({jc1y:1})
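The join that the vwjc view performs (w.x = v.c1) can be traced by hand on the sample documents above. The sketch below does the same equi-join over lists of dictionaries. It is an illustration of the result shape only: the function name is made up, the `_id` fields are dropped, and values are left as stored rather than cast to lvarchar as the view does for y and c2.

```python
def join_w_v(w_docs, v_docs):
    """Equi-join w and v on w.x = v.c1, returning rows shaped like the vwjc view."""
    v_by_c1 = {}
    for v_doc in v_docs:
        if "c1" in v_doc:
            v_by_c1.setdefault(v_doc["c1"], []).append(v_doc)

    joined = []
    for w_doc in w_docs:
        # A document without "x" looks up None and never matches, like a NULL join key.
        for v_doc in v_by_c1.get(w_doc.get("x"), []):
            joined.append(
                {
                    "jc1x": w_doc.get("x"),
                    "jc1y": w_doc.get("y"),
                    "jc1z": w_doc.get("z"),
                    "jc2c1": v_doc.get("c1"),
                    "jc2c2": v_doc.get("c2"),
                }
            )
    return joined


collection_v = [{"c1": 1, "c2": 2}]
collection_w = [
    {"x": 255, "y": 265, "z": 395},
    {"x": 255, "y": 265, "z": 395},
    {"x": 1, "y": 2, "z": 3},
]

view_rows = join_w_v(collection_w, collection_v)
```

Only the third document in w has x equal to a c1 value in v, so the view yields a single row for this data.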

You want to perform complex analytics on JSON data
BI Tools like Cognos, Tableau generate SQL on data sources.
Option 1: Do ETL
Need to expose JSON data as views so it’s seen as a database
object.
−

We use implicit casting to convert to compatible types

−

The references to non-existent key-value pair returns NULL

Create any combination of views
−

A view per JSON collection

−

Multiple views per JSON collection

−

Views joining JSON collections, relational tables and views.

Use these database objects to create reports, graphs, etc.

Analytics
[Diagram: SQL and BI applications connect over ODBC and JDBC to tables and views built on JSON collections (Customer, Orders) and relational tables (partners, Inventory, CRM), together with Informix Warehouse Accelerator.]



Ability for MongoDB API to Access All Data Models

[Diagram: a node.js application uses the MongoDB drivers to reach traditional SQL, NoSQL - JSON, TimeSeries, and MQ Series data.]

All three drivers support all data models
[Diagram: native client applications, web browsers, and mobile clients connect through the Informix SQLI drivers, IBM DRDA drivers, or MongoDB drivers; each driver reaches traditional SQL, NoSQL - JSON, TimeSeries, and MQ Series data.]

Benefits of Hybrid Power
Access consistent data from its source
Avoid ETL, continuous data sync and conflicts.
Exploit the power of SQL, MongoAPI seamlessly
Exploit the power of RDBMS technologies via MongoAPI:

− Informix Warehouse accelerator,
− Cost based Optimizer & power of SQL
− R-tree indices for spatial, Lucene text indexes, and more.
Access all your data thru any interface: MongoAPI & SQL
Store data in one place and efficiently transform and use
them on demand.
Existing SQL based tools and APIs can access new data
in JSON

Simple SQL statements which Operate on the BSON

CREATE TABLE mycollection (data BSON);
INSERT INTO mycollection (data) VALUES ( '{"name":"John"}'::JSON );
CREATE INDEX myindx ON mycollection(bson_extract(data,"name"))
USING BSON;
SELECT data::JSON FROM mycollection;


The SQL to Create a Collection
Formal definition of a collection used by the Wire Listener

CREATE COLLECTION TABLE mycollection
(
id char(128),
modcnt integer,
data bson,
flags integer
);


New Built-in BSON Expressions

bson_value_array(lvarchar doc, lvarchar key) returns lvarchar as BSON array
bson_value_bigint(lvarchar doc, lvarchar key) returns bigint
bson_value_binary(lvarchar doc, lvarchar key) returns lvarchar as BSON binary
bson_value_boolean(lvarchar doc, lvarchar key) returns boolean
bson_value_code(lvarchar doc, lvarchar key) returns lvarchar as string
bson_value_date(lvarchar doc, lvarchar key) returns datetime
bson_value_document(lvarchar doc, lvarchar key) returns lvarchar as BSON object
bson_value_double(lvarchar doc, lvarchar key) returns float
bson_value_int(lvarchar doc, lvarchar key) returns bigint
bson_value_lvarchar(lvarchar doc, lvarchar key) returns lvarchar as string
bson_value_objectid(lvarchar doc, lvarchar key) returns lvarchar as string
bson_value_timestamp(lvarchar doc, lvarchar key) returns datetime
bson_keys_exist(lvarchar doc, lvarchar key(s)) returns boolean
bson_new(lvarchar doc, lvarchar projection_list) returns lvarchar as BSON binary
bson_type(lvarchar doc, lvarchar key) returns integer
bson_size(lvarchar doc, lvarchar key) returns integer
bson_extract(lvarchar doc, lvarchar) returns lvarchar as BSON binary

Combining Tables and Collections
SELECT customer.*,
bson_value_lvarchar(data,"customer_code") AS customer_code
FROM mycollection, customer
WHERE bson_value_lvarchar(data,"fname") = customer.fname
AND bson_value_lvarchar(data,"lname") = customer.lname
AND customer_num < 102;

Query Optimizer
QUERY: (OPTIMIZATION TIMESTAMP: 07-21-2013 23:15:31)
Estimated Cost: 6
1) miller3.customer: INDEX PATH
(1) Index Name: miller3. 100_1
Index Keys: customer_num
(Serial, fragments: ALL)
Upper Index Filter: miller3.customer.customer_num < 102
2) miller3.mycollection: INDEX PATH
Filters: informix.equal(BSON_VALUE_LVARCHAR (miller3.mycollection.data , 'fname' ) ,miller3.customer.fname )
(1) Index Name: miller3.mycollection_ix1
Index Keys: :informix.bson_value_lvarchar(data,'lname')
(Serial, fragments: ALL)
Lower Index Filter: equal(BSON_VALUE_LVARCHAR (miller3.mycollection.data , 'lname' ) ,miller3.customer.lname )
NESTED LOOP JOIN



Understanding Informix BSON Indexes
Indexes are created on BSON data and support
– Primary Key (enforced across all nodes)
– Unique Indexes (enforced at a single node level)
– Composite Indexes ( 8 parts )
– Arrays
{
"fname":"Sadler",
"lname":"Sadler",
"company":"Friends LLC",
"age":21,
"count":27,
"phone": [ "408-789-1234", "408-111-4779" ]
}

create index fnameix1 on customer(bson_value_lvarchar(bson,"fname")) using bson;
create index lnameix2 on customer(bson_value_lvarchar(bson,"lname")) using bson;
create index phoneix3 on customer(bson_value_lvarchar(bson,"phone")) using bson;



Understanding Informix BSON Indexes
create index fnameix1 on customer(bson_value(bson,"fname")) using bson;
create index lnameix2 on customer(bson_value(bson,"lname")) using bson;
create index phoneix3 on customer(bson_value(bson,"phone")) using bson;

select * from customer where bson_value_lvarchar(bson,"fname") = "Ludwig";
-- use fnameix1

select * from customer where bson_value_lvarchar (bson,"lname") = "Sadler";
-- use lnameix2

select * from customer where bson_value_lvarchar(bson,"phone") = "408-789-8091";
-- use phoneix3

select * from customer where bson_value_lvarchar(bson,"phone") = "415-822-1289"
OR bson_value_lvarchar(bson,"phone") = "408-789-8091";
-- use phoneix3

select * from customer where bson_value_lvarchar(bson,"company") = "Los Altos Sports";
-- no index, use sequential scan


Informix 12.1

Convert Traditional SQL Tables to JSON

Mongo API
–Leverage the numerous open
source drivers for Mongo
– Extend the power of the mongo API to
execute on
• Traditional SQL tables
• TimeSeries/Sensor Data




Not just a Hybrid Database

Hybrid databases only solve half of the problem
Hybrid applications are here to stay

Scaling Out – Sharding Data
• Shard by either hash or expression
• When inserting data it is automatically moved to the correct node
• Each node in the environment holds a portion of the data
• Queries automatically aggregate data from the required node(s)

[Diagram: three nodes with shard keys state = "CA", state = "OR", and state = "WA".]

Graphical Administration showing a Sharded System
Graphical administration of the shards keeps things simple
Monitor the entire environment from a single graphical console

Simply Administration - OAT Manages JSON Objects


Simplify the “up and running” Experience
Install and setup a typical database instance with only 3 questions:
−

Where to place the Product?

−

Where to place the Data?

−

How many users do you anticipate?

Newly installed instance adapts to the resources on the
computer
Description
Auto tuning of CPU VPS
Auto Table Placement
Auto Buffer pool tuning
Auto Physical Log extension
Auto Logical Log Add
Auto Read Ahead

SIMPLE POWER –
INFORMIX HYBRID
DATABASE CAPABILITIES

Relational and non-relational data in one system
NoSQL/MongoDB Apps can access Informix Relational Tables
Distributed Queries
Multi-statement Transactions
Enterprise Proven Reliability
Enterprise Ready Security
Enterprise Level Performance

Informix provides the capability to leverage
the abilities of both relational DBMS and document store systems.
MongoDB does not. It is a document store system lacking key
abilities like transaction durability.

Round Peg Square Hole

The DBA and/or programmer no longer has to decide up front whether it
is better to use a system entirely of JSON documents or a system
only of SQL tables. Instead, they have a single database in which the
programmer decides whether a JSON document or a SQL table
is optimal for the data and usage.

Informix Specific Advantages with Mongo Drivers
Traditional SQL tables and JSON collections co-existing in the
same database
Using the MongoDB client drivers Query, insert, update, delete
−

JSON collections

−

Traditional SQL tables

−

Timeseries data

Join SQL tables to JSON collections utilizing indexes
Execute business logic in stored procedures
Provide a view of JSON collections as a SQL table
−

Allows existing SQL tools to access JSON data

Enterprise-level functionality

Scalability
Better performance on multi-core, multi-session scenarios
−

Architecture has finer grain locking – not entire database

−

Better concurrency because fewer resources are locked

Document Compression
−

60% to 90% observed

Bigger documents – 2GB maximum size
−

MongoDB caps at 16MB

Informix has decades of optimization on single node solution



Some NoSQL Use Cases - Mostly Interactive Web/Mobile
Online/Mobile Gaming
− Leaderboard (high score table) management
− Dynamic placement of visual elements
− Game object management
− Persisting game/user state information
− Persisting user generated data (e.g. drawings)

Display Advertising on Web Sites
− Ad Serving: match content with profile and present
− Real-time bidding: match cookie profile with ad inventory, obtain bids, and present ad

Dynamic Content Management and Publishing (News & Media)
− Store content from distributed authors, with fast retrieval and placement
− Manage changing layouts and user generated content

E-commerce/Social Commerce
– Storing frequently changing product catalogs

Social Networking/Online Communities

Communications
– Device provisioning

Logging/message passing
– Drop Copy service in Financial Services (streaming copies of trade execution messages into, for example, a risk or back office system)

Dual Capability

Hybrid databases are a way of the past
Hybrid applications are here to stay

Client Applications
[Diagram: native client applications, web browsers, and mobile clients reach Informix NoSQL through the Informix SQLI drivers, IBM DRDA drivers, and MongoDB drivers. All three drivers support all data models.]

Ability for All Clients to Access All Data Models

[Diagram: the Informix SQLI drivers, IBM DRDA drivers, and MongoDB drivers each reach traditional SQL, NoSQL - JSON, TimeSeries, and MQ Series data.]

Ability for All Clients to Access All Data Models

[Diagram: a node.js application uses the MongoDB drivers to reach traditional SQL, NoSQL - JSON, TimeSeries, and MQ Series data.]

BUILDING A REAL LIFE
APPLICATION

IOD Demo and Beyond
A fun exciting demo application that attendees
can interact with that makes people take notice
of IBM technologies
A social networking photo application utilizing
smart phones/tablets that allows the end user to
take photos, tag and add photos to an Informix
NoSQL hybrid database
Random photos can be displayed at sessions,
booths or on attendee smart devices

IOD Attendee Photo Application
Allow conference attendees to take and share photos!

Technology Highlights
• Create a hybrid application using NoSQL, traditional SQL, and timeseries in a mobile web application
– Utilize JSON collections, SQL tables and timeseries together
– Utilize IBM Dojo Mobile tools to build a mobile application
– Leverage new mongo client side drivers for fast application development
• Demonstrate sharding with over 100 nodes
• Cloud based solution on top of Amazon Cloud
• Provide real-time analytics on all forms of data
– Leverage existing popular analytic front-end IBM Cognos
– Utilize an in-memory columnar database accelerator to provide real-time trending analytics on data

Mobile Device Application Architecture

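[Diagram: the IOD Photo App (upload screen, built with IBM Dojo Mobile, showing sample tags "Informix" and "Van Gogh") talks to the photo application on an Apache web server, which goes through the Informix JSON Listener to Informix, where the photo collection and the user table are stored.]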

Application Architecture (Big Picture)
[Diagram: the Apache web server hosting the IOD application talks to the Informix JSON Listener, which talks to Informix (Photo collection); sharding commands distribute the collection to nodes on Amazon Cloud]

Node1: Aa-Am
Node2: An-Az
Node3: Ba-Bf
Node4: Bg-Bp
...
Node101

“Dalmatian” shard farm – 101 sharded nodes on Amazon Cloud
Photo collection sharded across these nodes
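
The node labels in the diagram suggest that each node owns a range of key prefixes. The sketch below illustrates that idea only: the server does the real routing internally, `routeByRange` and `$shardRanges` are hypothetical names, and the fourth range is assumed to belong to Node4.

```php
<?php
// Hypothetical range table copied from the node labels in the diagram.
// Each entry: [node name, first two-letter prefix, last two-letter prefix].
$shardRanges = [
    ['Node1', 'Aa', 'Am'],
    ['Node2', 'An', 'Az'],
    ['Node3', 'Ba', 'Bf'],
    ['Node4', 'Bg', 'Bp'],
];

/**
 * Return the node whose prefix range covers the key, or null if none does.
 */
function routeByRange(array $ranges, string $key): ?string {
    // Normalise the first two characters to the "Xx" form used in the labels.
    $prefix = ucfirst(strtolower(substr($key, 0, 2)));
    foreach ($ranges as [$node, $low, $high]) {
        if (strcmp($prefix, $low) >= 0 && strcmp($prefix, $high) <= 0) {
            return $node;
        }
    }
    return null; // key falls outside every listed range
}
```

With this table, a key such as "Alice" falls in Aa-Am and "Bill" falls in Bg-Bp; keys outside the four listed ranges return null because the remaining nodes are not enumerated on the slide.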

Photo Application Schema
NoSQL Collections: activity_photos, activity_data
TimeSeries: timeseries(photo_like)

Photo Application NoSQL Collections

activity_photos

activity_data


Application Considerations

Photo meta-data varies from camera to camera
A Picture and all its meta data are stored in-document
Pictures are stored in a NoSQL collection
Pre-processing on the phone ensures only reasonably sized photos are sent over the network.
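
As a rough illustration of storing a picture and its metadata in one document, the sketch below builds such a document as a PHP array. The demo's actual upload code is not shown in the slides, so `buildPhotoDocument` is a hypothetical helper and "ExamplePhone" is a made-up camera; the `name` and `img_data` keys and the data-URI prefix follow the live JSON sample.

```php
<?php
/**
 * Build one photo document: whatever EXIF fields the camera supplied,
 * plus the file name and the image itself as a base64 data URI.
 */
function buildPhotoDocument(array $exif, string $name, string $jpegBytes): array {
    $doc = $exif;                       // no fixed schema: keep every field as-is
    $doc['name'] = $name;
    $doc['img_data'] = 'data:image/jpeg;base64,' . base64_encode($jpegBytes);
    return $doc;
}

// Two cameras reporting different metadata give differently shaped documents.
$nikon = buildPhotoDocument(
    ['Make' => 'NIKON CORPORATION', 'Model' => 'NIKON D60', 'FNumber' => '7.1'],
    'DSC_0078.JPG',
    "\xFF\xD8\xFF\xE0"                  // placeholder bytes, not a real image
);
$phone = buildPhotoDocument(['Make' => 'ExamplePhone'], 'IMG_0001.JPG', "\xFF\xD8\xFF\xE0");

echo json_encode(array_keys($nikon)), "\n";
echo json_encode(array_keys($phone)), "\n";
```

Both arrays could be inserted into the same collection even though their key sets differ, which is the point of keeping the metadata in-document.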

Example of Live JSON Photo Data

{"_id":ObjectId("526157c8112c2fe70cc06a75"),
 "Make":"NIKON CORPORATION","Model":"NIKON D60","Orientation":"1",
 "XResolution":"300","YResolution":"300","ResolutionUnit":"2",
 "Software":"Ver.1.00 ","DateTime":"2013:05:15 19:46:36",
 "YCbCrPositioning":"2","ExifIFDPointer":"216","ExposureTime":"0.005",
 "FNumber":"7.1","ExposureProgram":"Not defined","ISOSpeedRatings":"100",
 "Contrast":"Normal","Saturation":"Normal","Sharpness":"Normal",
 "SubjectDistanceRange":"Unknown","name":"DSC_0078.JPG",
 "img_data":"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDABcQ

Sharding

NoSQL “Dalmatian” cloud deployment
Sharding by hash
Heterogeneous Cloud deployment with sharding
– Processor Type and Number: Intel and ARM processors
– Operating System Types and Versions: Linux, Windows, AIX, Apple, Solaris, HP
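
To make "sharding by hash" concrete, here is a minimal sketch of mapping a shard key to one of N nodes. It is not the hash Informix uses: `shardForKey` is a hypothetical function and MD5 is an arbitrary choice made only because it ships with PHP.

```php
<?php
/**
 * Map a shard key to a node index in the range [0, $nodeCount).
 * The first 7 hex digits of the MD5 digest (28 bits) always fit in a PHP int.
 */
function shardForKey(string $key, int $nodeCount): int {
    $bucket = hexdec(substr(md5($key), 0, 7));
    return $bucket % $nodeCount;
}

// The same key always lands on the same node of a 101-node farm.
$node = shardForKey('DSC_0078.JPG', 101);
echo "photo goes to node index $node\n";
```

Because the node is derived from the key alone, any client or server can compute where a document lives without a lookup table; the trade-off, compared with range-based placement, is that range scans over the key touch every node.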

Basic PHP Programming Overview Information
List of NoSQL collection names and SQL table names
Function to set the active database and return the collection

private $conn;
private $dbname = "photo_demo";
private $photoCollectionName = "photos";
private $contactsCollectionName = "contacts";
private $sqlCollectionName = 'system.sql';
private $userTableName = "users";
private $tagsTableName = "tags";
private $likesTableName = "likes";

private $photoQueryProjection = array("_id" => 1, "tags" => 1,
    "user_id" => 1, "img_data" => 1);

/**
 * Get collection by name
 * @param string $collectionName
 * @return MongoCollection
 */
private function getCollection($collectionName) {
    return $this->conn->selectDB($this->dbname)->selectCollection($collectionName);
}

Insert Example

Information is placed in the contacts collection

Insert Data into a Collection
Very simple to insert JSON data into a collection using the MongoAPIs

/**
 * Insert user's contact information into the contacts collection.
 */
public function insertContact($json) {
    if (!is_array($json)) {
        return "Contact info not in JSON format.";
    }
    try {
        // Use the collection name property declared with the other names
        $result = $this->getCollection($this->contactsCollectionName)->insert($json);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
    } catch (MongoException $e) {
        return $e->getMessage();
    }
    return "ok";
}

Retrieve Collection Information
Very simple to retrieve data from a collection using the MongoAPIs
Data is returned as a JSON document

/**
 * Get contact info for admin panel
 */
public function adminContacts() {
    $contactsCollection = $this->getCollection($this->contactsCollectionName);
    $cursor = $contactsCollection->find();   // no filter: return every contact
    $results = $this->getQueryResults($cursor);
    return $results;
}

Delete a Photo and its Information
Deleting from SQL tables and NoSQL collections works exactly the same way

/**
 * Delete photo
 */
public function deletePhoto($id) {
    try {
        // First delete from likes and tags tables
        $query = array('photo_id' => $id['_id']);
        $result = $this->getCollection($this->likesTableName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
        $result = $this->getCollection($this->tagsTableName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
        // Then delete the photo from the collection
        $query = array('_id' => new MongoId($id['_id']));
        $result = $this->getCollection($this->photoCollectionName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
    } catch (MongoException $e) {
        return $e->getMessage();
    }
    return "ok";
}

Executing a Stored Procedure in MongoAPI

/**
 * Get the user_id for a particular user name (email address).
 *
 * Calls a stored procedure that will insert into the users table if the
 * user does not exist yet and returns the user_id.
 *
 * @param string $username
 * @return int $user_id
 */
public function getUserId($username) {
    $username = trim($username);
    try {
        $sql = "EXECUTE FUNCTION getUserID('" . $username . "')";
        $result = $this->getCollection($this->sqlCollectionName)->
            findOne(array('$sql' => $sql));
        if (isset($result['errmsg'])) {
            return "ERROR. " . $result['errmsg'];
        }
        return $result['user_id'];
    } catch (MongoException $e) {
        return "ERROR. " . $e->getMessage();
    }
}
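
Because the statement is assembled by string concatenation, a user name containing a single quote would break the SQL or allow injection. The sketch below shows one minimal precaution, doubling embedded single quotes before building the statement. `buildGetUserIdSql` is a hypothetical helper that is not part of the demo code, and it is not a complete defence; validating that the value is a well-formed email address would still be worthwhile.

```php
<?php
/**
 * Build the EXECUTE FUNCTION statement with the user name quoted as a
 * SQL string literal (embedded single quotes are doubled).
 */
function buildGetUserIdSql(string $username): string {
    $escaped = str_replace("'", "''", trim($username));
    return "EXECUTE FUNCTION getUserID('" . $escaped . "')";
}

echo buildGetUserIdSql(" o'brien@example.com "), "\n";
```

The returned string could then be passed as the `$sql` value in the `findOne` call against the `system.sql` collection.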

Real Time Analytics
Customer Issues
– Several different models of data (SQL, NoSQL, TimeSeries/Sensor)
– NoSQL is not strong at building relations between collections
– The most valuable analytics combine the results of all data models
– Most prominent analytic systems are written using standard SQL
– ETL & YAS (Yet Another System)

Solution
Provide a mapping of the required data in SQL form
– Enables common tools like Cognos

Analytics on a Hybrid Database
[Diagram: Informix holds the Photo collection and the User table; both are reachable through SQL and through the MongoAPI]
Photo Application: SQL Mapping of NoSQL PHOTO Collection

activity_photos

activity_data


Mapping A Collection To A SQL Table

CREATE VIEW photo_metadata (gpslatitude, gpslongitude,
    make, model, orientation, datetimeoriginal,
    exposuretime, fnumber, isospeedratings,
    pixelxdimension, pixelydimension)
AS SELECT BSON_VALUE_LVARCHAR(x0.data, 'GPSLatitude'),
          BSON_VALUE_LVARCHAR(x0.data, 'GPSLongitude'),
          BSON_VALUE_LVARCHAR(x0.data, 'Make'),
          BSON_VALUE_LVARCHAR(x0.data, 'Model'),
          BSON_VALUE_LVARCHAR(x0.data, 'Orientation'),
          BSON_VALUE_LVARCHAR(x0.data, 'DateTimeOriginal'),
          BSON_VALUE_LVARCHAR(x0.data, 'ExposureTime'),
          BSON_VALUE_LVARCHAR(x0.data, 'FNumber'),
          BSON_VALUE_LVARCHAR(x0.data, 'ISOSpeedRatings'),
          BSON_VALUE_LVARCHAR(x0.data, 'PixelXDimension'),
          BSON_VALUE_LVARCHAR(x0.data, 'PixelYDimension')
FROM photos x0;
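
Since the view follows a regular pattern (one BSON_VALUE_LVARCHAR call per field, with the lower-cased field name as the column), the statement can be generated from a list of field names. This is only a sketch: `buildBsonView` is a hypothetical helper, and it handles top-level string fields only.

```php
<?php
/**
 * Generate a CREATE VIEW statement exposing top-level BSON fields as columns.
 * Column names are the lower-cased field names.
 */
function buildBsonView(string $view, string $collection, array $fields): string {
    $columns = [];
    $selects = [];
    foreach ($fields as $field) {
        // Reject anything that is not a plain identifier before embedding it in SQL.
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $field)) {
            throw new InvalidArgumentException("Unsupported field name: $field");
        }
        $columns[] = strtolower($field);
        $selects[] = "BSON_VALUE_LVARCHAR(x0.data, '$field')";
    }
    return "CREATE VIEW $view (" . implode(', ', $columns) . ")\n"
         . "AS SELECT " . implode(",\n          ", $selects) . "\n"
         . "FROM $collection x0;";
}

echo buildBsonView('photo_metadata', 'photos', ['Make', 'Model', 'FNumber']), "\n";
```

Generating the mapping this way keeps the view in step with the set of metadata fields the analytics front end needs, without hand-editing the column list.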

IWA Integration

What is trending at IOD?
Set up to work on frequently updated snapshots of data to perform analytic queries on the photo meta-data
Since BSON/JSON data cannot be directly compressed and searched, the snapshot process selectively extracts data from the documents for copying to IWA
Use Genero to show the power of IWA
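
The selective extraction step can be pictured as flattening each document into a fixed set of columns. The sketch below shows that idea in plain PHP; it is not the actual IWA snapshot mechanism, `extractSnapshotRows` is a hypothetical helper, and the second sample document is made up.

```php
<?php
/**
 * Flatten documents into rows containing only the chosen fields.
 * Fields a document lacks become null so every row has the same columns.
 */
function extractSnapshotRows(array $documents, array $fields): array {
    $rows = [];
    foreach ($documents as $doc) {
        $row = [];
        foreach ($fields as $field) {
            $row[$field] = $doc[$field] ?? null;
        }
        $rows[] = $row;
    }
    return $rows;
}

$docs = [
    ['Make' => 'NIKON CORPORATION', 'Model' => 'NIKON D60',
     'img_data' => 'data:image/jpeg;base64,...'],
    ['Make' => 'ExamplePhone'],   // hypothetical document with less metadata
];
$rows = extractSnapshotRows($docs, ['Make', 'Model']);
echo json_encode($rows), "\n";
```

Large fields such as `img_data` are simply left out of the field list, so only the small, queryable metadata reaches the analytic copy.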

Mobile Device Application Architecture
[Diagram: Informix holds the Photo collection and the User table; both are reachable through SQL and through the MongoAPI]

Configuring Informix on Amazon Cloud Is Simple

• Instantiate the Amazon image
• Set up the storage
• Install the product
• Start the system
• Configure sharding
