                     Batch Processing   Interactive Analysis             Stream Processing
Query runtime        Minutes to hours   Milliseconds to minutes          Never-ending
Data volume          TBs to PBs         GBs to PBs                       Continuous stream
Programming model    MapReduce          Queries                          DAG
Users                Developers         Analysts and developers          Developers
Originating project  Google MapReduce   Google Dremel                    Twitter Storm
Open source project  Hadoop / Spark     Drill / Shark / Impala / HBase   Storm / Apache S4 / Kafka
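The difference between the batch and stream columns can be illustrated with a small word count. This is a minimal sketch in plain Python, not Hadoop or Storm code: the function names (map_phase, reduce_phase, batch_word_count, streaming_word_count) and the sample records are assumptions made up for illustration. The batch version sees the whole dataset before it produces one result; the streaming version emits an updated result after every record and would keep going for as long as records arrive.

```python
from collections import Counter
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one record."""
    return [(word.lower(), 1) for word in record.split()]


def reduce_phase(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def batch_word_count(records):
    """Batch: the full dataset is available before processing starts."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return reduce_phase(mapped)


def streaming_word_count(stream):
    """Stream: yield an updated snapshot after each incoming record."""
    totals = Counter()
    for record in stream:
        for word in record.split():
            totals[word.lower()] += 1
        yield dict(totals)


# Hypothetical sample data, standing in for a much larger input.
records = ["storm kafka storm", "hadoop spark", "kafka"]

batch_result = batch_word_count(records)
stream_snapshots = list(streaming_word_count(iter(records)))
```

Once the stream has delivered the same records, its last snapshot matches the batch result; the intermediate snapshots are what a stream processor can act on while data is still arriving.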
How do I optimize my fleet based on weather and traffic patterns?
What’s the social sentiment for my brand or products?
How do I better predict future outcomes?
GAIN COMPETITIVE ADVANTAGE BY MOVING FIRST AND FAST IN YOUR INDUSTRY
Web app optimization
Smart meter monitoring
Equipment monitoring
Advertising analysis
Life sciences research
Fraud detection
Healthcare outcomes
Weather forecasting
Natural resource exploration
Social network analysis
Churn analysis
Traffic flow optimization
IT infrastructure optimization
Legal discovery
persistent | distributed
• In memory
• Efficient at random reads/writes
• Distributed, large-scale data store
• Utilizes Hadoop for persistence
• Both HBase and Hadoop are distributed
An object contained within a user database
Defines the scheme for the federation
Represents the database being sharded
Database that houses the federation object
System-managed SQL databases
Contain parts, or “slices”, of the data
Orders_federation
CREATE FEDERATION fed_name(fed_key_label fed_key_type distribution_type)
The key used for data distribution
int, bigint, guid, varbinary
Represents a single instance of a federation key.
All rows in all federated tables with the same federation key value.
PK=5 PK=25 PK=35
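How a federation key decides where a row lives can be sketched in a few lines. This is an illustrative model in plain Python, not the database's own routing logic: it assumes a range-based distribution, and the split points (20 and 30), the table names, and the helpers member_for and place_rows are all hypothetical. The key values 5, 25, and 35 are the ones shown above. Rows from every federated table that share a key value form one atomic unit and land in the same member.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical range boundaries (an assumption for this sketch):
# member 0 holds keys below 20, member 1 holds 20..29, member 2 holds 30 and up.
SPLIT_POINTS = [20, 30]


def member_for(fed_key):
    """Return the index of the federation member whose range covers fed_key."""
    return bisect_right(SPLIT_POINTS, fed_key)


def place_rows(rows):
    """Group rows by member, then by federation key value (the atomic unit)."""
    members = defaultdict(lambda: defaultdict(list))
    for table, fed_key, payload in rows:
        members[member_for(fed_key)][fed_key].append((table, payload))
    return members


# Hypothetical rows from two federated tables: (table, federation key, data).
rows = [
    ("orders", 5, "order A"),
    ("order_items", 5, "item A1"),
    ("orders", 25, "order B"),
    ("orders", 35, "order C"),
    ("order_items", 35, "item C1"),
]

placement = place_rows(rows)
```

Here placement[0][5] holds both the orders row and the order_items row for key 5, which is the point of an atomic unit: everything with the same federation key value stays together, so it can be queried in one member without crossing databases.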