2. Please Note
IBM’s statements regarding its plans, directions, and intent are subject to change
or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general
product direction and should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a
commitment, promise, or legal obligation to deliver any material, code or
functionality. Information about potential future products may not be incorporated
into any contract. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance
that any user will experience will vary depending upon many factors, including
considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results
similar to those stated here.
3. Agenda
Internet of Things & Big Data
BigInsights
Big SQL 3.0
• Architecture
• Performance
• Best practices
4. Systems of Insight from Data
Emanating from the Internet of Things (IoT)
6. Where is Big Data coming from?
• 2+ billion people on the Web by end of 2011
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009… 200M by 2014
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
7. Now, they're installing smart meters.
A major gas and electric utility has 10 million meters. They used to read the meters
once a month. With smart meters, the meters are read once an hour, and soon every
15 minutes.
10 million smart meters read every 15 minutes = 350 billion transactions a year
(10 million meters × 96 reads per day × 365 days ≈ 350 billion).
8. The Big Data Conundrum
[Chart: data AVAILABLE to an organization vs. data an organization can PROCESS]
The percentage of available data an enterprise can analyze is decreasing.
This means enterprises are getting “more naive” over time.
9. Big Data is All Data and All Paradigms
• Transactional & Application Data: volume, structured, throughput
• Machine Data: velocity, structured, ingestion
• Social Data: variety, unstructured, veracity
• Enterprise Content: variety, unstructured, volume
12. InfoSphere BigInsights builds on open source Hadoop capabilities for
enterprise-class deployments
[Diagram: IBM Big Data Platform. Analytic applications (BI/reporting, exploration/
visualization, functional and industry apps, predictive and content analytics) sit on
top of platform services (systems management, application development,
visualization & discovery, accelerators), stream computing, data warehousing, and
information integration & governance.]
InfoSphere BigInsights combines open source Hadoop components with enterprise
capabilities:
• Administration & Security
• Workload Optimization
• Connectors
• Advanced Engines
• Visualization & Exploration
• Development Tools
• Big SQL
Business benefits
– Quicker time-to-value due to IBM technology and support
– Reduced operational risk
– Enhanced business knowledge with a flexible analytical platform
– Leverages and complements existing software
14. Common Hadoop core in all Hadoop distributions

Component | BigInsights 3.0 | HortonWorks HDP 2.0 | MapR 3.1 | Pivotal HD 1.1 | Cloudera CDH5
Hadoop    | 2.2    | 2.2    | 1.0.3   | 2.0.5 * | 2.3
HBase     | 0.96.0 | 0.96.0 | 0.94.13 | 0.94.8  | 0.96.1
Hive      | 0.12.0 | 0.12   | 0.11    | 0.11.0  | 0.12.0
Pig       | 0.12.0 | 0.12   | 0.11.0  | 0.10.1  | 0.12.0
Zookeeper | 3.4.5  | 3.4.5  | 3.4.5   | 3.4.5   | 3.4.5
Oozie     | 4.0.0  | 4.0.0  | 3.3.2   | 3.3.2   | 4.0.0
Avro      | 1.7.5  | X      | X       | X       | 1.7.5
Flume     | 1.4.0  | 1.4.0  | 1.4.0   | 1.3.1   | 1.4.0
Sqoop     | 1.4.4  | 1.4.4  | 1.4.4   | 1.4.2   | 1.4.4

Current as of April 27, 2014
15. What is Big SQL 3.0?
Comprehensive SQL functionality
• IBM SQL/PL support, including…
• Stored procedures (SQL bodied and external)
• Functions (SQL bodied and external)
• IBM Data Server JDBC and ODBC drivers (see the connection sketch below)
Leverages advanced IBM SQL compiler/runtime
• High performance native (C++) runtime
Replaces Map/Reduce
• Advanced message-passing runtime
• Data flows between nodes without persisting intermediate results
• Continuously running daemons
• Advanced workload management keeps resource usage constrained
• Low latency, high throughput…
[Diagram: a SQL-based application connects through the IBM data server client to
the Big SQL engine in InfoSphere BigInsights; the SQL MPP runtime reads data
sources in CSV, Seq, Parquet, RC, ORC, Avro, JSON, and custom formats]
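Since Big SQL uses the IBM Data Server client, any JDBC application can connect to it with the standard DB2 JDBC driver. A minimal sketch; the hostname, port, credentials, and table name are illustrative assumptions, not defaults to rely on:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BigSqlQuery {
    public static void main(String[] args) throws Exception {
        // IBM Data Server (DB2) JDBC driver class
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        // Hostname, port, and database name are illustrative
        String url = "jdbc:db2://bigsql-head.example.com:51000/bigsql";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM SALES_TBL")) {
            while (rs.next()) {
                System.out.println("rows: " + rs.getLong(1));
            }
        }
    }
}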
16. Big SQL 3.0 – Architecture
Head (coordinator) node
• Listens for JDBC/ODBC connections
• Compiles and optimizes the query
• Coordinates the execution of the query
Big SQL worker processes reside on compute nodes (some or all)
Worker nodes stream data between each other as needed
Workers can spill large data sets to local disk if needed
• Allows Big SQL to work with data sets larger than available memory
[Diagram: management nodes host the Big SQL coordinator, the Hive metastore, the
NameNode, and the JobTracker; each compute node runs a TaskTracker, a DataNode,
and a Big SQL worker, all on top of GPFS/HDFS]
17. Big SQL 3.0 works with Hadoop
All data is Hadoop data
• In files in HDFS
• SEQ, RC, delimited, Parquet …
Never need to copy data to a proprietary representation
All data is cataloged in the Hive metastore
• It is the Hadoop catalog
• It is flexible and extensible
All Hadoop data is in a Hadoop filesystem
• HDFS or GPFS-FPO
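Since the Hive metastore is the catalog and the data stays in HDFS files, creating a Big SQL table is just DDL over an existing storage format. A hedged sketch issued over JDBC; the connection details, schema, and column layout are illustrative, and the CREATE HADOOP TABLE spelling should be checked against your Big SQL release:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateHadoopTable {
    public static void main(String[] args) throws Exception {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        String url = "jdbc:db2://bigsql-head.example.com:51000/bigsql"; // illustrative
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement()) {
            // The table definition is recorded in the Hive metastore;
            // the rows remain ordinary delimited files in HDFS.
            stmt.execute(
                "CREATE HADOOP TABLE sales (" +
                "  id INT, item VARCHAR(40), price DECIMAL(10,2))" +
                " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','" +
                " STORED AS TEXTFILE");
        }
    }
}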
18. Big SQL 3.0 – Architecture (cont.)
Big SQL's runtime execution engine is all native code
For common table formats a native I/O engine is utilized
• e.g. delimited, RC, SEQ, Parquet, …
For all others, a Java I/O engine is used
• Maximizes compatibility with existing tables
• Allows for custom file formats and SerDes
All Big SQL built-in functions are native code
Customer-built UDxs can be developed in C++ or Java (see the sketch below)
• Existing Big SQL UDFs can be used with a slight change in how they are registered
[Diagram: each Big SQL worker contains the native I/O engine, the Java I/O engine
(with SerDe and I/O format support), the runtime, and both native and Java UDFs,
and runs alongside the TaskTracker and DataNode on each compute node]
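As a sketch of the Java UDx path: a Java UDF is essentially a public static method the engine calls per row. The class name, method, and the registration statement shown in comments are illustrative assumptions, not the exact registration procedure:

// Hypothetical Big SQL Java UDF: a public static method invoked per row.
// The registration would be something like:
//   CREATE FUNCTION C_TO_F(DOUBLE) RETURNS DOUBLE
//   LANGUAGE JAVA ... EXTERNAL NAME 'TemperatureUdfs!cToF'
public class TemperatureUdfs {
    public static double cToF(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }
}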
19. Big SQL 3.0 – Enterprise security
Users may be authenticated via
• Operating system
• Lightweight directory access protocol (LDAP)
• Kerberos
User authorization mechanisms include
• Full GRANT/REVOKE based security
• Group and role based hierarchical security
• Object level, column level, or row level (fine-grained) access controls
Auditing
• You may define audit policies and track user activity
Transport layer security (TLS)
• Protects the integrity and confidentiality of data between the client and Big SQL
20. Row Based Access Control - 4 easy steps
Data
SELECT * FROM BRANCH_TBL
EMP_NO FIRST_NAME BRANCH_NAME
------ ---------- -----------
1 Steve Branch_B
2 Chris Branch_A
3 Paula Branch_A
4 Craig Branch_B
5 Pete Branch_A
6 Stephanie Branch_B
7 Julie Branch_B
8 Chrissie Branch_A

1) Create and grant access and roles *
CREATE ROLE BRANCH_A_ROLE
GRANT ROLE BRANCH_A_ROLE TO USER newton
GRANT SELECT ON BRANCH_TBL TO USER newton

2) Create permissions *
CREATE PERMISSION BRANCH_A_ACCESS ON BRANCH_TBL
FOR ROWS WHERE (VERIFY_ROLE_FOR_USER(SESSION_USER, 'BRANCH_A_ROLE') = 1
AND BRANCH_TBL.BRANCH_NAME = 'Branch_A')
ENFORCED FOR ALL ACCESS
ENABLE

3) Enable access control *
ALTER TABLE BRANCH_TBL ACTIVATE ROW ACCESS CONTROL

4) Select as the Branch_A user
CONNECT TO TESTDB USER newton
SELECT * FROM BRANCH_TBL
EMP_NO FIRST_NAME BRANCH_NAME
------ ---------- -----------
2 Chris Branch_A
3 Paula Branch_A
5 Pete Branch_A
8 Chrissie Branch_A
4 record(s) selected.

* Note: Steps 1, 2, and 3 are done by a user with SECADM authority.
21. Column Based Access Control
Data
SELECT * FROM SAL_TBL
EMP_NO FIRST_NAME SALARY
------ ---------- -------
1 Steve 250000
2 Chris 200000
3 Paula 1000000

1) Create and grant access and roles *
CREATE ROLE MANAGER
CREATE ROLE EMPLOYEE
GRANT SELECT ON SAL_TBL TO USER socrates
GRANT SELECT ON SAL_TBL TO USER newton
GRANT ROLE MANAGER TO USER socrates
GRANT ROLE EMPLOYEE TO USER newton

2) Create permissions *
CREATE MASK SALARY_MASK ON SAL_TBL FOR
COLUMN SALARY RETURN
CASE WHEN VERIFY_ROLE_FOR_USER(SESSION_USER, 'MANAGER') = 1
THEN SALARY
ELSE 0.00
END
ENABLE

3) Enable access control *
ALTER TABLE SAL_TBL ACTIVATE COLUMN ACCESS CONTROL

4a) Select as an EMPLOYEE
CONNECT TO TESTDB USER newton
SELECT * FROM SAL_TBL
EMP_NO FIRST_NAME SALARY
------ ---------- -------
1 Steve 0
2 Chris 0
3 Paula 0
3 record(s) selected.

4b) Select as a MANAGER
CONNECT TO TESTDB USER socrates
SELECT * FROM SAL_TBL
EMP_NO FIRST_NAME SALARY
------ ---------- -------
1 Steve 250000
2 Chris 200000
3 Paula 1000000
3 record(s) selected.

* Note: Steps 1, 2, and 3 are done by a user with SECADM authority.
22. Big SQL 3.0 – Other enterprise features
Federation (see the sketch below)
• Join between your Hadoop data and other external relational platforms
• Optimizer determines the most efficient execution path
Open integration across Business Analytic Tools
• IBM Optim Data Studio performance tool portfolio
• Superior enablement for IBM Software – e.g. Cognos
• Enhanced support by 3rd party software – e.g. Microstrategy
Mixed workload cluster management
• Capacity sharing with the rest of the cluster
– Specify % CPU and % memory to dedicate to Big SQL 3.0
• SQL based workload management
• Integration with Platform Symphony to manage mixed cluster workloads
Support for standard development tools
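A sketch of what a federated query can look like, assuming an administrator has already made a remote relational table visible to Big SQL (ORA_CUSTOMERS below stands in for such a nickname); the connection details, table names, and join are illustrative:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FederatedJoin {
    public static void main(String[] args) throws Exception {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        String url = "jdbc:db2://bigsql-head.example.com:51000/bigsql"; // illustrative
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement();
             // WEB_LOGS is a Hadoop table; ORA_CUSTOMERS stands in for a
             // nickname over a remote relational table. The optimizer
             // determines the most efficient execution path.
             ResultSet rs = stmt.executeQuery(
                 "SELECT c.name, COUNT(*) AS hits" +
                 " FROM WEB_LOGS w JOIN ORA_CUSTOMERS c ON w.cust_id = c.id" +
                 " GROUP BY c.name")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + ": " + rs.getLong(2));
            }
        }
    }
}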
23. Workload Management
(1) Create service classes
create service class BIGDATAWORK
create service class HIGHPRIWORK under BIGDATAWORK
create service class LOWPRIWORK under BIGDATAWORK

(2) Identify workloads and associate them with service classes
create workload SALES_WL CURRENT client_appname('SalesSys')
service class HIGHPRIWORK
create workload ITEMCOUNT_WL CURRENT client_appname('InventorySys')
service class LOWPRIWORK

(3a) Avoid thrashing by queueing low priority work
create threshold LOW_CONCURRENT for service class
LOWPRIWORK under BIGDATAWORK activities enforcement
database enable when concurrentdbcoordactivities > 5 and
queued activities unbounded continue

(3b) Stop high priority jobs if the SLA cannot be met
create threshold HIGH_CONCURRENT for service class
HIGHPRIWORK under BIGDATAWORK activities
enforcement database enable when concurrentdbcoordactivities
> 30 and queued activities > 0 stop execution

(4a) Stop very long running jobs
create threshold LOWPRI_WL_TIMEOUT for service
class LOWPRIWORK under BIGDATAWORK activities
enforcement database enable when activitytotaltime >
30 minutes stop execution

(4b) Stop jobs that return too many rows
create threshold TOO_MANY_ROWS_RETURNED for
service class HIGHPRIWORK under BIGDATAWORK
enforcement database when sqlrowsreturned > 30 stop
execution

(5) Collect data for long running jobs
create threshold LONGRUNINVENTORYACTIVITIES
for service class LOWPRIWORK activities enforcement
database when activitytotaltime > 15 minutes collect
activity data with details continue

(6) Report on system activity
create event monitor BIGDATAMONACT for
activities write to table
24. Using existing standard SQL tools: Eclipse
• Using existing SQL tooling against BigData
• Same setup as for existing SQL sources!
• Support for “standard” authentication!
25. Using existing standard SQL tools: SQuirrel SQL
• Using existing SQL tooling against BigData
• Support for authentication (not supported for Hive,
BUT supported by Big SQL!)
26. Using BigSheets in BigInsights: data discovery
•Discovery and analytics in a spreadsheet-like environment.
27. Big SQL 3.0 – Performance
Query rewrites
• Exhaustive query rewrite capabilities
• Leverages additional metadata such as constraints and nullability
Optimization
• Statistics and heuristic driven query optimization
• Query optimizer based upon decades of IBM RDBMS experience
Tools and metrics
• Highly detailed explain plans and query diagnostic tools
• Extensive number of available performance metrics
SELECT ITEM_DESC, SUM(QUANTITY_SOLD), AVG(PRICE), AVG(COST)
FROM PERIOD, DAILY_SALES, PRODUCT, STORE
WHERE
PERIOD.PERKEY = DAILY_SALES.PERKEY AND
PRODUCT.PRODKEY = DAILY_SALES.PRODKEY AND
STORE.STOREKEY = DAILY_SALES.STOREKEY AND
CALENDAR_DATE BETWEEN '01/01/2012' AND '04/28/2012' AND
STORE_NUMBER = '03' AND
CATEGORY = 72
GROUP BY ITEM_DESC
[Diagram: query transformation applies ~150 query transformations; access plan
generation then weighs hundreds or thousands of access plan options (nested-loop,
hash-join, and zigzag join orders over Store, Product, Period, and Daily Sales)
before compiling the chosen plan into a multi-threaded access section]
28. Statistics are key to performance
Table statistics:
• Cardinality (count)
• Number of Files
• Total File Size
Column statistics (this applies to column group stats also):
• Minimum value
• Maximum value
• Cardinality (non-nulls)
• Distribution (Number of Distinct Values)
• Number of null values
• Average Length of the column value (for string columns)
• Histogram
• Most frequent values (MFV)
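Statistics are typically gathered with an ANALYZE TABLE statement. A minimal sketch issued over JDBC; the connection details, schema, and exact option spelling are assumptions to verify against your Big SQL version:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GatherStats {
    public static void main(String[] args) throws Exception {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        String url = "jdbc:db2://bigsql-head.example.com:51000/bigsql"; // illustrative
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement()) {
            // Collect table and column statistics so the optimizer can
            // estimate cardinalities, distributions, and value ranges.
            stmt.execute(
                "ANALYZE TABLE MYSCHEMA.SALES COMPUTE STATISTICS FOR ALL COLUMNS");
        }
    }
}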
29. Performance, Benchmarking, Benchmarketing
Performance matters to customers
Benchmarking appeals to engineers to drive product innovation
Benchmarketing is used to convey performance in a memorable
and appealing way
SQL over Hadoop is in the “Wild West” of Benchmarketing
• 100x claims! Compared to what? Conforming to what rules?
The TPC (Transaction Processing Performance Council) is the
grand-daddy of all multi-vendor SQL-oriented organizations
• Formed in August, 1988
• TPC-H and TPC-DS are the most relevant to SQL over Hadoop
– R/W nature of workload not suitable for HDFS
Big Data Benchmarking Community (BDBC) formed
30. Power and Performance of Standard SQL
Everyone loves performance numbers, but that's not the whole story
• How much work do you have to do to achieve those numbers?
A portion of our internal performance numbers is based upon read-only
versions of TPC benchmarks
Big SQL is capable of executing
• All 22 TPC-H queries without modification
• All 99 TPC-DS queries without modification
Original query (TPC-H Q21):
SELECT s_name, count(*) AS numwait
FROM supplier, lineitem l1, orders, nation
WHERE s_suppkey = l1.l_suppkey
AND o_orderkey = l1.l_orderkey
AND o_orderstatus = 'F'
AND l1.l_receiptdate > l1.l_commitdate
AND EXISTS (
  SELECT *
  FROM lineitem l2
  WHERE l2.l_orderkey = l1.l_orderkey
  AND l2.l_suppkey <> l1.l_suppkey)
AND NOT EXISTS (
  SELECT *
  FROM lineitem l3
  WHERE l3.l_orderkey = l1.l_orderkey
  AND l3.l_suppkey <> l1.l_suppkey
  AND l3.l_receiptdate > l3.l_commitdate)
AND s_nationkey = n_nationkey
AND n_name = ':1'
GROUP BY s_name
ORDER BY numwait desc, s_name

Re-written for Hive:
SELECT s_name, count(1) AS numwait
FROM
 (SELECT s_name FROM
  (SELECT s_name, t2.l_orderkey, l_suppkey,
          count_suppkey, max_suppkey
   FROM
    (SELECT l_orderkey,
            count(distinct l_suppkey) as count_suppkey,
            max(l_suppkey) as max_suppkey
     FROM lineitem
     WHERE l_receiptdate > l_commitdate
     GROUP BY l_orderkey) t2
   RIGHT OUTER JOIN
    (SELECT s_name, l_orderkey, l_suppkey
     FROM
      (SELECT s_name, t1.l_orderkey, l_suppkey,
              count_suppkey, max_suppkey
       FROM
        (SELECT l_orderkey,
                count(distinct l_suppkey) as count_suppkey,
                max(l_suppkey) as max_suppkey
         FROM lineitem
         GROUP BY l_orderkey) t1
       JOIN
        (SELECT s_name, l_orderkey, l_suppkey
         FROM orders o
         JOIN
          (SELECT s_name, l_orderkey, l_suppkey
           FROM nation n
           JOIN supplier s
             ON s.s_nationkey = n.n_nationkey
            AND n.n_name = 'INDONESIA'
           JOIN lineitem l
             ON s.s_suppkey = l.l_suppkey
           WHERE l.l_receiptdate > l.l_commitdate) l1
         ON o.o_orderkey = l1.l_orderkey
         AND o.o_orderstatus = 'F') l2
       ON l2.l_orderkey = t1.l_orderkey) a
     WHERE (count_suppkey > 1)
        OR ((count_suppkey = 1) AND (l_suppkey <> max_suppkey))) l3
   ON l3.l_orderkey = t2.l_orderkey) b
  WHERE (count_suppkey is null)
     OR ((count_suppkey = 1) AND (l_suppkey = max_suppkey))) c
GROUP BY s_name
ORDER BY numwait DESC, s_name
31. Comparing Big SQL and Hive 0.12 for Ad-Hoc Queries
*Based on IBM internal tests comparing IBM InfoSphere BigInsights 3.0 Big SQL with Hive 0.12 executing the "1TB Classic
BI Workload" in a controlled laboratory environment. The 1TB Classic BI Workload is a workload derived from the TPC-H
Benchmark Standard, running at 1TB scale factor. It is materially equivalent with the exception that no update functions are
performed. TPC Benchmark and TPC-H are trademarks of the Transaction Processing Performance Council (TPC).
Configuration: cluster of 9 System x3650HD servers, each with 64GB RAM and 9x2TB HDDs, running Red Hat Linux 6.3.
Results may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in
a production environment. Results as of April 22, 2014.
Big SQL is up to 41x faster than Hive 0.12
32. Comparing Big SQL and Hive 0.12 for Decision Support Queries
* Based on IBM internal tests comparing IBM InfoSphere BigInsights 3.0 Big SQL with Hive 0.12 executing the "1TB Modern BI
Workload" in a controlled laboratory environment. The 1TB Modern BI Workload is a workload derived from the TPC-DS Benchmark
Standard, running at 1TB scale factor. It is materially equivalent with the exception that no updates are performed, and only 43 out of
99 queries are executed. The test measured sequential query execution of all 43 queries for which Hive syntax was publicly
available. TPC Benchmark and TPC-DS are trademarks of the Transaction Processing Performance Council (TPC).
Configuration: cluster of 9 System x3650HD servers, each with 64GB RAM and 9x2TB HDDs, running Red Hat Linux 6.3. Results
may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in a production
environment. Results as of April 22, 2014.
Big SQL is 10x faster than Hive 0.12 (total elapsed time)
33. How many times Faster is Big SQL than Hive 0.12?
* Based on IBM internal tests comparing IBM InfoSphere BigInsights 3.0 Big SQL with Hive 0.12 executing the "1TB Modern BI
Workload" in a controlled laboratory environment. The 1TB Modern BI Workload is a workload derived from the TPC-DS Benchmark
Standard, running at 1TB scale factor. It is materially equivalent with the exception that no updates are performed, and only 43 out of
99 queries are executed. The test measured sequential query execution of all 43 queries for which Hive syntax was publicly
available. TPC Benchmark and TPC-DS are trademarks of the Transaction Processing Performance Council (TPC).
Configuration: cluster of 9 System x3650HD servers, each with 64GB RAM and 9x2TB HDDs, running Red Hat Linux 6.3. Results
may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in a production
environment. Results as of April 22, 2014.
[Chart: queries sorted by speedup ratio (worst to best)]
Max speedup of 74x
Avg speedup of 20x
34. Big SQL 3.0 Best Practices
Ensure you have a homogeneous and balanced cluster
• Utilize IBM reference architecture
Choose an optimized file format (if possible)
• ORC or Parquet
Choose appropriate data types
• Use the smallest and most precise datatype available
Define informational constraints (see the sketch below)
• Primary key, foreign key, check constraints
Ensure you have good statistics
• Current and comprehensive
Use the full power of SQL available to you
• Don’t constrain yourself to Hive syntax/capability
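A hedged sketch of the informational-constraints best practice: declaring a primary key that the engine does not enforce but the optimizer may exploit. The schema and table names are illustrative and the clause spelling follows DB2 convention; check it against the Big SQL documentation:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AddInfoConstraint {
    public static void main(String[] args) throws Exception {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        String url = "jdbc:db2://bigsql-head.example.com:51000/bigsql"; // illustrative
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement()) {
            // Informational constraint: not checked by the engine, but the
            // optimizer can use it for join planning and query rewrites.
            stmt.execute(
                "ALTER TABLE MYSCHEMA.SALES" +
                " ADD CONSTRAINT SALES_PK PRIMARY KEY (SALE_ID)" +
                " NOT ENFORCED ENABLE QUERY OPTIMIZATION");
        }
    }
}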
35. BigInsights Big SQL 3.0: Summary
Big SQL provides rich, robust, standards-based SQL support for data
stored in BigInsights
• Uses IBM common client ODBC/JDBC drivers
Big SQL fully integrates with SQL applications and tools
• Existing queries run with no or few modifications*
• Existing JDBC and ODBC compliant tools can be leveraged
Big SQL provides faster and more reliable performance
• Big SQL uses more efficient access paths to the data
• Queries processed by Big SQL no longer need to use MapReduce
• Big SQL is optimized to more efficiently move data over the network
Big SQL provides enterprise-grade data management
• Security, auditing, workload management …
37. We Value Your Feedback
Don’t forget to submit your Impact session and speaker
feedback! Your feedback is very important to us – we use it to
continually improve the conference.
Use the Conference Mobile App or the online Agenda Builder to
quickly submit your survey
• Navigate to “Surveys” to see a view of surveys for sessions
you’ve attended
41. What is Hadoop?
Hadoop is not a single piece of software; you can't install "hadoop"
It is an ecosystem of software components that work together
• Hadoop Core (API's)
• HDFS (File system)
• MapReduce (Data processing framework)
• Hive (SQL access)
• HBase (NoSQL database)
• Sqoop (Data movement)
• Oozie (Job workflow)
• … There is a LOT of "Hadoop" software
However, there is one common component they all build on: HDFS…
42. HDFS configuration (shared-nothing cluster)
[Diagram: one NameNode (NN) and many DataNodes (DN), each DataNode with its
own local disks]
NN = NameNode, which manages all the metadata
DN = DataNode, which reads/writes the file data
43. HDFS
Driving principles
• Files are stored across the entire cluster
• Programs are brought to the data, not the data to the program
Distributed file system (DFS) stores blocks across the whole cluster
• Blocks of a single file are distributed across the cluster
• A given block is typically replicated for resiliency
• Just like a regular file system, the contents of a file are up to the application
[Diagram: a logical file is divided into blocks 1-4; the blocks are distributed across
the cluster, with each block replicated on multiple nodes]
44. Hadoop I/O
Hadoop (HDFS) doesn't dictate file content/structure
• It is just a filesystem!
• It provides standard APIs to list directories, open files, delete files, etc. (see the
sketch below)
• In particular it allows your task to ask "where does each block live?"
Hadoop provides a framework for creating "splittable" data sources
• A data source is typically file(s), but not necessarily
• A large input is "split" into pieces, each piece to be processed in parallel
• Each split indicates the host(s) on which that split can be found
• For files, a split typically refers to an HDFS block, but not necessarily
[Diagram: a logical file is divided into splits; an application instance runs on the
cluster for each split, and their outputs are combined into results]
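A minimal sketch of those filesystem APIs, using the standard org.apache.hadoop.fs classes to list a directory and ask where each block of each file lives; the path /data/logs is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WhereAreMyBlocks {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml etc.
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus file : fs.listStatus(new Path("/data/logs"))) {
            // Ask the NameNode which hosts hold each block of the file
            BlockLocation[] blocks =
                fs.getFileBlockLocations(file, 0, file.getLen());
            for (BlockLocation b : blocks) {
                System.out.println(file.getPath() + " offset " + b.getOffset()
                    + " on hosts " + String.join(",", b.getHosts()));
            }
        }
    }
}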
45. InputFormat
This splitting process is encapsulated in the InputFormat interface
• Hadoop has a large library of InputFormats for various purposes
• You can create and provide your own as well
An InputFormat does the following
• Is configured with a set of name/value pair properties
• When configured, you can ask it for a list of InputSplits
– Each input split has…
– A list of hosts on which the data for the split is recommended to be processed (optional)
– A size in bytes (optional)
• Given an InputSplit, an InputFormat can produce a RecordReader
A RecordReader does the following
• Acts as an input stream to read the contents of the split
• Produces a stream of records
• There is no fixed definition of a record – it depends upon the input type
Let's look at an example of an InputFormat…
46. InputFormat example - TextInputFormat
Purpose
• Reads input file(s) line by line, each read produces one line of text
Configuration
• Configured with the names of one or more (HDFS) files to process
Splits
• Each split it produces represents a single HDFS block of a file
RecordReader
• When opened, finds the first newline of the block it is to read
• Each read produces the next available line of text in the block
• May read into the next block of text to ensure the last line is fully read
– Even if the block is physically located on another host!
[Diagram: a logical text file is divided into splits; a reader per split produces records
(lines of text)]
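A sketch of driving TextInputFormat by hand through the Hadoop 2.x mapreduce API, printing each split's size and preferred hosts; the input path is illustrative:

import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class ShowSplits {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        // Configure the InputFormat with the file(s) to process
        TextInputFormat.addInputPath(job, new Path("/data/logs"));
        // Ask the InputFormat for its splits (typically one per HDFS block)
        List<InputSplit> splits = new TextInputFormat().getSplits(job);
        for (InputSplit split : splits) {
            System.out.println(split.getLength() + " bytes on hosts "
                + String.join(",", split.getLocations()));
        }
    }
}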
47. Hadoop MapReduce
MapReduce is a way of writing parallel processing programs
Built around InputFormats (and OutputFormats)
Programs are written in two pieces: Map and Reduce
Programs are submitted to the MapReduce job scheduler: the JobTracker
• The JobTracker asks the InputFormat for its splits
• For each split, it tries to schedule the processing on a host on which the split lives
• Hosts are chosen based upon available processing resources
The program is shipped to a host and given a split to process
Output of the program is written back to HDFS
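The next slides walk through the map, shuffle, and reduce phases; as a concrete reference point, here is a minimal sketch of the classic word count program written against the Hadoop 2.x mapreduce API (input and output paths are illustrative):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Mapper: parse each line and emit <word, 1> pairs
    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            for (String tok : line.toString().split("\\s+")) {
                if (!tok.isEmpty()) { word.set(tok); ctx.write(word, ONE); }
            }
        }
    }

    // Reducer: sum all the values received for each word
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            ctx.write(word, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/in"));    // illustrative
        FileOutputFormat.setOutputPath(job, new Path("/data/out")); // illustrative
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}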
48. MapReduce - Mappers
Mappers
• Small program (typically), distributed across the cluster, local to data
• Handed a portion of the input data (called a split)
• Each mapper parses, filters, and/or transforms its input
• Produces grouped <key,value> pairs
[Diagram: Map phase. The logical input file is divided into splits 1-4; each mapper
processes one split and sorts its output by key; the sorted outputs are later copied
and merged for the reducers, whose logical output files are written to the DFS]
49. MapReduce – The Shuffle
The shuffle is transparently orchestrated by MapReduce
The output of each mapper is locally grouped together by key
One node is chosen to process data for each unique key
[Diagram: Shuffle. Each mapper's locally sorted, key-grouped output is copied to
the node chosen for each unique key and merged there before the reduce step; the
reducers write the logical output files to the DFS]
50. MapReduce – Reduce Phase
Reducers
• Small programs (typically) that aggregate all of the values for the key
that they are responsible for
• Each reducer writes output to its own file
[Diagram: Reduce phase. Each reducer merges its incoming values, aggregates
them by key, and writes its own logical output file to the DFS]
51. Joins in MapReduce
Hadoop is used to group data together at the same reducer based upon the join
key
• Mappers read blocks from each “table” in the join
• The <key> is the value of the join key, the <value> is the record to be joined
• Reducer receives a mix of records from each table with the same join key
• Reducers produce the results of the join
[Diagram: mappers read blocks of the employees and depts tables and emit
<dept_id, record> pairs; one reducer per department (dept 1, dept 2, dept 3)
receives the mixed records and produces the joined results]

select e.fname, e.lname, d.dept_name
from employees e, depts d
where e.salary > 30000
and d.dept_id = e.dept_id
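A sketch of the reduce-side join described above: each mapper tags its records with the table of origin and keys them by dept_id, and the reducer joins the mixed records it receives for each department. The comma-delimited layouts and paths are illustrative assumptions; the salary filter is applied map-side:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ReduceSideJoin {
    // employees: emp_id,fname,lname,salary,dept_id -> key on dept_id, tag "E"
    public static class EmpMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable k, Text v, Context ctx)
                throws IOException, InterruptedException {
            String[] f = v.toString().split(",");
            if (Double.parseDouble(f[3]) > 30000) {          // e.salary > 30000
                ctx.write(new Text(f[4]), new Text("E," + f[1] + "," + f[2]));
            }
        }
    }
    // depts: dept_id,dept_name -> key on dept_id, tag "D"
    public static class DeptMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable k, Text v, Context ctx)
                throws IOException, InterruptedException {
            String[] f = v.toString().split(",");
            ctx.write(new Text(f[0]), new Text("D," + f[1]));
        }
    }
    // Each reducer sees all tagged records for one dept_id and joins them
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text deptId, Iterable<Text> vals, Context ctx)
                throws IOException, InterruptedException {
            String deptName = null;
            List<String> emps = new ArrayList<>();
            for (Text v : vals) {
                String s = v.toString();
                if (s.startsWith("D,")) deptName = s.substring(2);
                else emps.add(s.substring(2));
            }
            if (deptName != null) {
                for (String e : emps) ctx.write(new Text(e), new Text(deptName));
            }
        }
    }
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reduce-side join");
        job.setJarByClass(ReduceSideJoin.class);
        MultipleInputs.addInputPath(job, new Path("/data/employees"),
                TextInputFormat.class, EmpMapper.class);
        MultipleInputs.addInputPath(job, new Path("/data/depts"),
                TextInputFormat.class, DeptMapper.class);
        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path("/data/join_out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}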
52. Joins in MapReduce (cont.)
For N-way joins involving different join keys, multiple jobs are used
[Diagram: job 1 joins employees with depts on dept_id and writes temp files; job 2
reads the temp files and emp_phones, keyed by emp_id, with one reducer per
emp_id producing the final results]

select e.fname, e.lname, d.dept_name, p.phone_type, p.phone_number
from employees e, depts d, emp_phones p
where e.salary > 30000
and d.dept_id = e.dept_id
and p.emp_id = e.emp_id