Big Data World Forum (BDWF, http://www.bigdatawf.com/) is designed for data-driven decision makers, managers, and data practitioners who are shaping the future of big data.
Turning Big Data Challenges into Big Opportunities
Informatica for Big Data
Wei Zheng
Director, Product Management
2
A Little About Us – Informatica
The #1 Independent Leader in Data Integration
• Founded: 1993
• 2010 Revenue: $650 million
• 5-year Average Growth Rate: 20% per year
• Employees: 2,125+
• Partners: 400+
  • Major SI, ISV, OEM and On-Demand Leaders
• Customers: 4,280+
  • 84 of Fortune 100
  • 87%+ of Dow Jones
  • Government Organizations in 20 countries
• # 1 in Customer Loyalty Rankings (5 Years in a Row)
[Chart: revenue by year, 2005 to 2010; axis from $150 to $650]
3
The Informatica Platform
Proven Value: Comprehensive, Unified, Open & Economical
Comprehensive
• Supporting the complete data integration lifecycle
• Enabling any data integration project
• Delivering at any latency from years to microseconds
Unified
• Maximizing productivity with self-service for business & IT
• Adaptive data services to deliver data as a service from any projects
• Consistent interfaces across all processing platforms
Open
• Accessing transaction or interaction data from any source including Hadoop
• Mitigating risk by working with what you have now and in the future
• Open to any domain, architectural styles
Economical
• Low Total Cost of Ownership (TCO) - leveraging
• Fast Return on Investment (ROI)
• Flexible deployment scaled to your business needs
4
Big Data on Executive Agenda
MARCH 22, 2011, 11:30 A.M. ET.
Report to the President: Every Federal Agency Needs a 'Big Data'
Strategy
February 4, 2011
How Vendors Are Lowering Big Data
Barriers
March 26, 2011
A Model for the Big Data Era
Data-centric architecture is becoming
fashionable again
Today’s Leaders Are Racing to Uncover New Value and
Opportunities for Competitive Insights and Improved Operations
5
Defining Big Data
Definition: Big data is the confluence of the three trends consisting of
Big Transaction Data, Big Interaction Data and Big Data Processing
BIG TRANSACTION DATA
• Online Transaction Processing (OLTP)
• Online Analytical Processing (OLAP) & DW Appliances
BIG INTERACTION DATA
• Social Media Data
• Other Interaction Data
  • Call detail records, image, click stream data
  • Scientific, genomic
  • Machine/Device
BIG DATA INTEGRATION
BIG DATA PROCESSING
6
Value of Big Data Integration
Unleash the full business potential of Big
Data to empower the data-centric
enterprise
7
Informatica 9.1: Harnessing the Power of Big Transaction and Interaction Data
• Big Data Integration: For All Data
• Authoritative and Trustworthy Data: For All Purposes
• Self Service: For All Users
• Adaptive Data Services: For All Projects
8
Big Data Integration
Gain business value from Big Data
Big Data Processing
• Extend Enterprise Environment with Hadoop
• Hadoop – Web processing, text mining, fraud/risk analytics, image processing
• Hadoop – sandbox, staging, archive
Big Interaction Data
• Social media
• Other Interaction data
  • Clickstream
  • Scientific/genomic
  • Sensor – machine/device
  • Mobile, call detail records (CDR)
  • Image files, texts
Big Transaction Data
• Large scale processing – OLTP, OLAP
• New type of DW appliances
Your information management environment
Enabling Solutions
• Near-universal connectivity to Big Transaction Data
• Connectivity to Big Interaction Data including social data
• Connectivity to Hadoop
10
Big Transaction Data
Maximize availability and performance of big transaction data
• Universal Access: All data including OLTP, OLAP and DW appliances
• Uncover new areas for growth & efficiency: Reliable, complete information; No data discarded
• Better Actions & Operations: Greater confidence; Continuous innovation
Near-Universal Connectivity to Big Transaction Data: Database, Warehouse, Appliances
11
Big Interaction Data
Achieve a complete view with social and interaction data
Turn insights on relationships, influences and behaviors into opportunities
Connectivity to Big Interaction Data including social data
• How connected is she?
• What influence does she have with her family and friends?
• What will she do with this merchandise?
• Any additional services?
Data sources: Databases; Call Detail Records, Image Files, RFIDs; External Data Providers; Applications
Informatica MDM: Customer, Product, …
12
Big Data Processing
Connectivity to Hadoop and Future Integration
Sentiment Analysis, Predictive Analytics, Fraud Detection, Portfolio & Risk Analysis, Smart Devices
Hadoop Cluster
Connectivity for Hadoop (HDFS), 9.1 HF1 – June 2011
• Load data to Hadoop from any source
• Extract data from Hadoop to any target
Future: Phase 2, Graphical IDE for Hadoop Development
• Codeless & metadata driven development
• Prepare & integrate data on Hadoop
• Complete push down optimization
• Metadata lineage
Sources and targets: Weblogs, Mobile Data, Sensor Data; Databases, Data Warehouses; Semi-structured, Unstructured; Cloud Applications, Enterprise Applications; Social Data
13
Multi-Style MDM
Single platform for all architectural styles and data domains
Universal MDM
• Multi-domain: Customer Master, Product Master, Chart of Accounts, Location Master, …
• Multi-style: Registry, Analytic, Co-existence, Transactional, …
• Multi-deployment: Hub MDM, Federated MDM, MDM in the Cloud, MDM as a Service, …
• Multi-use: Data Integration, Data Quality, Data Services, …
15
Reusable Data Quality Policies
A Single Platform For Trusted Data
Business/IT Collaboration/Data Governance
Role-based, Unified, Process-driven
One Platform for Data Integration, Data Quality and MDM
Business Data Stewards Architects Developers
One MDM, One Data Quality, One Data Integration
Re-usable/Consistent Data Quality Rules
16
Self Service Data Integration
[Diagram labels: DI Analyst; DI Developer; Batch ETL; Data Warehouse; SQL or Web Service; BI Report]
“Informatica’s self-service data integration doubles productivity by eliminating manual steps and empowering analysts to do more on their own. Analysts can define and validate source-to-target specifications in an intuitive browser-based tool without a data architect or DBA. On top of that, once the analyst creates the source-to-target specification, the mapping logic is automatically generated for a developer to deploy to production.”
Sean Hickey, Manager Data Integration, T-Mobile
18
Self Service
Point-of-Use Data and Context for Business Users
[Screenshot labels: SFDC Account with Data Controls; Expand Hierarchy; Account Hierarchy Data Control]
19
Self Service
Pre-Built Application Accelerators to Jumpstart Projects
Tag content to augment project metadata
Develop mappings based on business entities rather than individual tables
Jumpstart mapping
• Improved analyst & developer productivity and collaboration
• Save project time and cut costs
20
Multi-Protocol Data Provisioning
Reusable Data Services
• Easily reuse DI logic/LDOs (logical data objects) for any mode/protocol
• Metadata-driven, visual, graphical environment (i.e., no-code)
• Execution & optimization separate from design-time
• No re-development & re-building of LDOs
• Reduce duplication of DI development & maintenance
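The reuse idea in the bullets above can be illustrated with a minimal Python sketch. It is not Informatica's API: `LogicalDataObject`, `as_batch`, `as_sql_result`, and `as_web_service` are hypothetical names chosen for illustration, and the in-memory source stands in for a real system. The point is only that the integration logic is defined once and then provisioned through several access modes without being rebuilt.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class LogicalDataObject:
    """Hypothetical reusable unit: one fetch step plus one transform step."""
    name: str
    fetch: Callable[[], List[Row]]
    transform: Callable[[Row], Row]

    def rows(self) -> List[Row]:
        # The integration logic lives here and nowhere else.
        return [self.transform(row) for row in self.fetch()]


def as_batch(ldo: LogicalDataObject) -> List[Row]:
    """Batch-style access: return all transformed rows."""
    return ldo.rows()


def as_sql_result(ldo: LogicalDataObject, columns: List[str]) -> List[Tuple]:
    """SQL-style access: project the named columns as tuples."""
    return [tuple(row[c] for c in columns) for row in ldo.rows()]


def as_web_service(ldo: LogicalDataObject) -> str:
    """Web-service-style access: serialize the same rows as JSON."""
    return json.dumps({"object": ldo.name, "rows": ldo.rows()}, sort_keys=True)


# Illustrative object backed by an in-memory source (an assumption).
customers = LogicalDataObject(
    name="customer",
    fetch=lambda: [{"id": 1, "name": "ada"}],
    transform=lambda row: {**row, "name": str(row["name"]).title()},
)

print(as_batch(customers))
print(as_sql_result(customers, ["id", "name"]))
print(as_web_service(customers))
```

All three access functions call the same `rows()` method, so a change to the transform is picked up by every protocol at once.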
22
Integrated Data Quality
Apply Data Quality Rules at Point of Access, Dynamically
• Provision data quality rules via data services – Integrated Data Quality Read or Write
• Use library of templates & data quality rules
• Auto generate data quality transformations
• In real-time – no pre or post-processing/staging
• Enforce data quality rules at point of access
[Diagram label: Data Quality v9.x.x]
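As a rough illustration of enforcing rules at the point of access, the Python sketch below applies a small library of reusable rules while records are being read, with no separate staging pass. The rule functions and `read_with_rules` are hypothetical names for this example only and do not represent the Informatica product interface.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
Rule = Callable[[Record], Record]


def trim_whitespace(record: Record) -> Record:
    """Reusable rule: strip leading and trailing spaces from every field."""
    return {key: value.strip() for key, value in record.items()}


def uppercase_country(record: Record) -> Record:
    """Reusable rule: normalize the country code to upper case."""
    fixed = dict(record)
    if "country" in fixed:
        fixed["country"] = fixed["country"].upper()
    return fixed


def read_with_rules(source: Iterable[Record], rules: List[Rule]) -> Iterator[Record]:
    """Apply each rule in order as records are read, one record at a time."""
    for record in source:
        for rule in rules:
            record = rule(record)
        yield record


# Illustrative input rows (made up for the example).
raw_rows = [
    {"name": "  Ada ", "country": "gb"},
    {"name": "Lin", "country": " us "},
]

clean_rows = list(read_with_rules(raw_rows, [trim_whitespace, uppercase_country]))
print(clean_rows)
```

Because `read_with_rules` is a generator, each record is cleaned as it is requested, which mirrors the "no pre or post-processing/staging" bullet in spirit.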
23
Informatica for Big Data Integration
Business Imperatives
• Deliver Analytical Insight (Big Data Warehousing & Operational BI): Deliver 5x faster & direct access to customer, risk, claims data in variety of sources – DW, 16 legacy, 30000 data marts, 10M claims via data feeds at 1/3 of the cost
• Increase Business Agility (Ultra messaging): 25% savings in data center footprint ($1M+), reduce latency by 83 percent to 340 microseconds, enabling a 580 percent increase in throughput over 1B transactions per day and growing
• Improve Business Processes (Big Data Services): Saved millions annually by improving trucking operations and empowering business with Hadoop-based free-form questions using sensor, mobile and geospatial data
• Improve Efficiency & Reduce Costs (Big Data Archiving): Rationalized application portfolio and saved $1 million with 6 month payback. Reduced age of data by 87% for service monitoring & pattern identification of large scale data
• Mergers, Acquisitions & Divestitures (Big Data Consolidation): Unite operations across 200 brands over 100+ countries through migration of business data from five systems to one
• Acquire & Retain Customers (Real-Time Customer View): Increased monthly slot revenues by 4% while expanding target customer segments from 40 to 160 across 500 sources in real-time with social and machine data
• Outsource Non-core Functions (Social/Big Data Synchronization): Deliver cloud access to 177+ million businesses worldwide and 53 million contacts. D&B 360 app updates with LinkedIn and Twitter
• Governance, Risk, Compliance (Complex Event Processing): Turned human review into automated alerts in seconds for maritime security – through geospatial and video tracking
• Increase Partner Network Efficiency (Big Data Collection & Aggregation): Reduce Time to Market by 90% by On-Boarding New Data Sources Faster and enabling a wide variety of Data Formats
24
Business Benefits of Informatica 9.1 for Big Data
• Big Data Integration to gain business value from Big Data
• Authoritative and Trustworthy Data to increase business insight and consistency by delivering trusted data for all purposes
• Self-Service to empower all users to obtain relevant information while IT remains in control
• Adaptive Data Services to deliver relevant data adapted to the business needs of all projects
25