Oracle Big Data Governance Webcast Charts
- 1. Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
Oracle Data Integration and Governance For Big Data
Jeff Pollock
Vice President, Oracle Data Integration & Governance
Madhu Raviendran Nair
Marketing Director, Oracle Data Integration & Governance
Data Governance for the Big Data Reservoir
- 2. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Get Fast Answers to New Questions
Create a Data Reservoir
Predict More, More Accurately
Accelerate Data-Driven Action
Big Data Reservoir Drives Big Results
Business Drivers for Big Data Initiatives
Oracle Big Data Governance 2
- 3. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle For Big Data Reservoir
Oracle Data Integration Provides the Architectural Components
Staging
Detail
Fast load
Fast load
Data
Replication
Data
Synchronization
Hadoop Data
Transformation
HiveQL–Pig/Oozie-Spark
Sources
Data Reservoir
Sources
Oracle Data Integrator
Oracle Data Integrator
GG to Flume
GG to Kafka
GG to Hive
Oracle GoldenGate
- 4. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
But What About Data Governance?
https://blogs.oracle.com/bigdata/entry/big_data_and_analytic_top
- 5. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
…to manage Risk/Compliance
Records retention
Rediscovery
Litigation support
Data access management
Information security and protection
Minimize corporate liability through proper governance of data
…to drive Business Value
Metadata discovery
Metadata & glossary cataloging
Data profiling
Data cleansing lifecycle
Data remediation
Maximize opportunity by ensuring trusted data is easily available for data driven business processes
The Data Governance Opportunity with Big Data
Solving business and IT data challenges
- 6. Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
Big Data Governance Myths
Do the same principles apply for Big Data and Traditional Data Governance?
Perception
1. Data Governance has reduced significance in Big Data
2. Data Reservoirs should always contain only raw data in full fidelity
3. Big Data and Hadoop architectures are black boxes
Reality
1. Big Data without governance and quality is just Big Bad Data
2. Data Reservoirs contain all data: raw, formatted and enriched.
3. If you use the data (you will!), you need to govern its lifecycle.
- 7. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Data Governance and the Data Reservoir
Oracle Big Data Governance
- 8. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Big Data Governance 8
The Big Data Governance Problem
1 – How do we clean up the data lake?
2 – How do we keep the data reservoir clean?
- 9. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Data Governance is Not Easy, there is No Silver Bullet!
Data Governance
Metadata Management
Business Glossary
Data Profiling
Data Cleansing
Data Archiving
Data Privacy
PEOPLE
PROCESS
TECHNOLOGY
…people and process first, …tools and capabilities next, …and there is no magic!
“…the overall impact of poor-quality data on the whole dataset remains the same. In addition, much of the data that organizations use in a big data context comes from outside, or is of unknown structure and origin. This means that the likelihood of data quality issues is even higher than before. So data quality is actually more important in the world of big data.”
– Ted Friedman, Gartner
http://www.gartner.com/newsroom/id/2854917
- 10. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Governance for the Data Reservoir Right Now
Data Governance
Metadata Management
Business Glossary
Data Profiling
Data Cleansing
Data Archiving
Data Privacy
Oracle Enterprise Metadata Management
• Metadata Management – horizontal and semantic data lineage for all big data sources
• Business Glossary – simple tools to catalog, link and collaborate on business terms
Oracle Enterprise Data Quality
Oracle Enterprise Data Quality delivers a complete, best-of-breed and business friendly approach to data cleansing resulting in trustworthy data for applications and to improve business reliability.
• Profiling – simple-to-use data health check that can work with sample sets of all data
• Cleansing – validate, match and de-duplicate data records from any business application
Oracle Big Data SQL
Extends Oracle SQL to Hadoop and NoSQL and the security of Oracle Database to all your data. It also includes a unique Smart Scan service that minimizes data movement and maximizes performance.
• Data Privacy – leverage the Oracle DB security model on data that physically resides in Hadoop
• Archiving – seamlessly locate aged data in queryable data tables physically located in Hadoop
- 11. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Enterprise Metadata Management (OEMM)
• Metadata Management – horizontal and semantic data lineage for all big data sources
• Business Glossary – simple tools to catalog, link and collaborate on business terms
Business Data Catalog
Report to Source Lineage
Impact Analysis
Audit, Versioning & Diff Reports
Social/Collaboration Features
Annotations and Tagging
Comprehensive Harvesting
3rd Party BI Metadata
3rd Party ETL Metadata
3rd Party DB Metadata
3rd Party Modeling Tools
Big Data Metadata
Metadata Standards
- 12. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Value of Enterprise Metadata Management
ETL
BI Dashboards
App
ETL
ETL
How was this sales figure calculated?
What will happen if I change this table?
What reports use the mainframe data?
Sys Admin
Executive
BI Developer
Where did this data come from?
Application User
Which reports use this customer data?
CDC
Data Reservoir
Data Steward
Can I trust the sources of this customer data?
ETL Developer
Solves significant pain points for a wide variety of business consumers and technical staff
I want to design an experiment to measure the success of a signup page. What data do I have?
Data Scientist
GG
- 13. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Metadata Management Use Cases
Oracle Big Data Governance
My dashboard does not match this report…why?
Where did this data come from?
Where can I find the data I need for analytics?
Which ETL mappings or BI Reports will be affected by my column change?
What systems does the data flow through?
- 14. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Simple Screens for both Business and IT User Profiles
Comprehensive Data Lineage for IT
Simple to Navigate All Metadata
Business / IT Collaboration
Search Driven Business Access
- 15. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Vertical Lineage
Links a business friendly set of terms to the IT metadata and operational assets
Capture Business Glossary, Taxonomy, Ontology, Conceptual Models
Horizontal Column Level
Links the data fields from Business Intelligence Dashboards or Reports back to the Source Columns
Schemas, BI View Layers, ETL Transformations, Calculations, etc.
Oracle Big Data Governance
Vertical Lineage
Horizontal Lineage
“NE_SALES”
“SALES”
“NAME”
“ACCT_NAME”
“NORTH”
“AGG_TOTAL”
BI Fields to Source Columns
“FNAME|LNAME”
“Customer”
Biz Terms to IT
Two Crucial Styles of Metadata Management
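The two lineage styles on this slide can be pictured as a small directed graph. The sketch below is only an illustration: the field names are taken from the slide, but the edges between them, the glossary mapping, and every function name are assumptions made for the example, not OEMM's actual model or API.

```python
from collections import defaultdict

# Hypothetical horizontal lineage edges (source column -> derived field).
# Names come from the slide; the specific edges are assumed for illustration.
HORIZONTAL_EDGES = [
    ("FNAME|LNAME", "ACCT_NAME"),
    ("ACCT_NAME", "NAME"),
    ("SALES", "NE_SALES"),
    ("NORTH", "NE_SALES"),
    ("NE_SALES", "AGG_TOTAL"),
]

# Hypothetical vertical lineage: business glossary term -> IT metadata assets.
GLOSSARY = {"Customer": ["FNAME|LNAME", "ACCT_NAME"]}


def build_graph(edges):
    """Index the edges in both directions for downstream and upstream walks."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return downstream, upstream


def reachable(graph, start):
    """Return every node reachable from start, excluding start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(start)
    return seen


DOWNSTREAM, UPSTREAM = build_graph(HORIZONTAL_EDGES)


def impact_of(column):
    """Impact analysis: which fields are affected if this column changes?"""
    return sorted(reachable(DOWNSTREAM, column))


def sources_of(field):
    """Horizontal lineage: which columns feed this report field?"""
    return sorted(reachable(UPSTREAM, field))


def assets_for_term(term):
    """Vertical lineage: a business term's assets plus everything derived from them."""
    assets = set(GLOSSARY.get(term, []))
    for asset in list(assets):
        assets |= reachable(DOWNSTREAM, asset)
    return sorted(assets)
```

With these assumed edges, `sources_of("AGG_TOTAL")` answers "where did this data come from?" and `impact_of("SALES")` answers "what will happen if I change this column?".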
- 16. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Actionable Events
Event Engine
Data Reservoir
Data Factory
Enterprise Information Store
Reporting
Discovery Lab
Actionable
Information
Actionable Insights
Data Streams
Execution
Innovation
Discovery Output
Events & Data
Data Flow View – Data Factory and Metadata Management
Structured Enterprise
Data
Other
Data
Metadata Management and Business Glossary
- 17. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Comprehensive Data Integration & Governance Capabilities
Dynamic Data Movement
– Low impact capture, stage in Hadoop
– Continuous data availability
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance Foundation
Oracle Data Integrator (Transformation)
Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate)
Fast Load
Oracle GoldenGate (Movement)
Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator (Federation)
GoldenGate Veridata (Online Data Verification)
ELT Processing on Hadoop or SQL
Continuous Availability
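The "Verify data consistency" item on this slide can be illustrated with a generic technique: fingerprint each row on both sides and compare by key. This is a minimal sketch of that general idea, not a description of how GoldenGate Veridata works internally; the function names and the row-as-dictionary representation are assumptions made for the example.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's values in sorted column order so column ordering does not matter."""
    canonical = "|".join(f"{key}={row[key]!r}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_tables(source_rows, target_rows, key):
    """Compare two row sets by primary key and report where they diverge."""
    src = {row[key]: row_fingerprint(row) for row in source_rows}
    tgt = {row[key]: row_fingerprint(row) for row in target_rows}
    shared = src.keys() & tgt.keys()
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in shared if src[k] != tgt[k]),
    }
```

Comparing fingerprints rather than full rows keeps the amount of data exchanged between the two sides small, which is the usual reason for this design.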
- 18. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Enterprise Data Quality
• Profiling – simple-to-use data health check that can work with sample sets of all data
• Cleansing – validate, match and de-duplicate data records from any business application
Profile
Standardize
Match
Govern
Unified Workbench
Market-leading business usability for all types of data
Unparalleled time-to-value, rapid deployments
High performance engine operates in real-time or batch
Out-of-the-box global knowledge base for worldwide coverage
Foundation for comprehensive data governance program
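The Profile, Standardize and Match steps named on this slide can be sketched in a few lines. The code below is a simplified illustration using only the Python standard library; the standardization rules, the exact-match strategy and all function names are assumptions for the example and do not represent Oracle Enterprise Data Quality's processors, which typically also support fuzzy matching.

```python
from collections import Counter


def profile(records, field):
    """Profile: basic health check of one field across a sample of records."""
    values = [r.get(field) for r in records]
    filled = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(filled),
        "distinct": len(set(filled)),
        "most_common": Counter(filled).most_common(1),
    }


def standardize(record):
    """Standardize: trim, collapse whitespace and lowercase text fields (hypothetical rules)."""
    return {
        k: " ".join(v.split()).lower() if isinstance(v, str) else v
        for k, v in record.items()
    }


def deduplicate(records, match_fields):
    """Match: keep the first record for each standardized combination of match fields."""
    seen, unique = set(), []
    for rec in map(standardize, records):
        key = tuple(rec.get(f) for f in match_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Profiling before standardization shows how many apparent distinct values are really formatting variants; running `profile` again on the output of `deduplicate` gives a simple before-and-after quality measure.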
- 19. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Big Data Governance Lifecycle Tools
Operational Data Flows
Business Sources
Quality KPIs
Case Management
Governance Cockpit for Data Stewards & Stakeholders
Exception Review
Metadata Management
Business Glossary
Design Time
- 20. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Enterprise-Wide Governance Board
Top US Payroll Provider
Oracle Enterprise Data Quality for Governance on 100M records per month
- 21. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Privacy and Deep Data Access with Oracle Big Data SQL
SELECT w.sess_id, c.name
FROM web_logs w, customers c
WHERE w.source_country = 'Brazil'
AND w.cust_id = c.customer_id;
Relevant SQL runs on BDA nodes
Tens of gigabytes of data
Only columns and rows needed to answer query are returned
Hadoop Cluster
B B B
Big Data SQL
Oracle Database
WEB_LOGS CUSTOMERS
• Data Privacy – leverage the Oracle DB security model on data that physically resides in Hadoop
• Archiving – seamlessly locate aged data in queryable data tables physically located in Hadoop
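The idea that "only columns and rows needed to answer query are returned" can be shown with a conceptual sketch of filter and projection pushdown applied to the query on this slide. This is not Oracle's Smart Scan implementation; the function names and the in-memory row dictionaries are assumptions made purely to illustrate the data flow.

```python
def scan_with_pushdown(rows, predicate, columns):
    """Filter and project where the data lives; yield only what the query needs."""
    for row in rows:
        if predicate(row):
            yield {col: row[col] for col in columns}


def join_sessions_to_customers(web_logs, customers):
    """Mimic the slide's query: Brazil sessions joined to customer names."""
    # Storage side: keep only Brazil rows and the two columns the query uses.
    reduced = scan_with_pushdown(
        web_logs,
        predicate=lambda r: r["source_country"] == "Brazil",
        columns=("sess_id", "cust_id"),
    )
    # Database side: inner join of the reduced stream to CUSTOMERS on the key.
    names = {c["customer_id"]: c["name"] for c in customers}
    return [
        (r["sess_id"], names[r["cust_id"]])
        for r in reduced
        if r["cust_id"] in names
    ]
```

Because the scan discards non-matching rows and unused columns before anything is sent onward, the join step only ever sees the small slice of WEB_LOGS that the query actually needs.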
- 22. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Integration & Governance Better
Dynamic Data Movement
No ETL Engine
Most Heterogeneous
vs.
Batch Data Movement & Weak CDC Tools
ETL Engine H/W Alongside Hadoop
Proprietary Vendor Lock-in, Incomplete Metadata
vs.
vs.
Oracle Big Data Governance
Oracle Data Integration Governance vs. “Other Guys”
Business Friendly
Governance Tools
Wide & Current 3rd Party Support
Comprehensive Platform
vs.
Mix and Match of 6+ Legacy Tools
Inflexible Metadata Models & Frameworks
Incomplete Governance Features
vs.
vs.
Data Governance
Metadata Management
Business Glossary
Data Profiling
Data Cleansing
Data Archiving
Data Privacy
- 23. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Most Heterogeneous, Deep 3rd Party Support
Hadoop HBase
Hadoop Hive/Flume
HP Enscribe
HP NonStop
HP Neoview
Hypersonic SQL
IBM DB2 iSeries
IBM DB2 UDB
IBM DB2 z Series
IBM Informix
IBM Netezza
JMS / MQ
Microsoft Access
Microsoft SQL Server
MySQL
Pivotal Greenplum
PostgreSQL
Salesforce.com
SAP BW / BI
SAP ERP / ECC
SAS
SQL/MP
SQL/MX
Sybase ASE
Sybase IQ
Teradata
Adaptive
Altova
Apache HCatalog
Apache Hive/HQL
Borland
CA ERwin
Cloudera Impala
COBOL Copybook
DataStax
Embarcadero
EMC ProActivity
GentleWare
Google BigQuery
Grandite
Hadapt Hive
Hortonworks Hive
IBM Cognos
IBM DB2
IBM DataStage
IBM Discovery
IBM Federation Server
IBM Lotus Notes
IBM Netezza
IBM Rational Rose
IBM Rational Architect
Informatica Metadata Mgr.
Informatica PowerCenter
CoSORT
ISO SQL Standard (DDL)
MapR Hadoop Hive
MicroFocus
Microsoft Access
Microsoft Office Excel
Microsoft Visio
Microsoft SQL Server
Microsoft SSIS
Microsoft Visual Studio
MicroStrategy
Magic Draw
OMG CWM Standard
OMG UML Standard
Oracle BI Answers
Oracle BI Enterprise Edition
Oracle BI Server
Oracle DAC
Oracle Data Integrator
Oracle Data Modeler
Oracle Database
Oracle Designer
Oracle Hyperion Applications
Oracle Hyperion Essbase
Oracle Warehouse Builder
Pivotal Greenplum
PostgreSQL
QlikView
SAP BO Crystal Reports
SAP BO Designer
SAP BO Desktop Intelligence
SAP BO Repository
SAP BO Data Integrator
SAP BO Data Steward
SAP Master Data Management
SAP Sybase PowerDesigner
SAP Sybase ASE Database
SAS Data Integration Studio
SAS BI Server
SAS Information Map
SAS Metadata Management
SAS OLAP Server
Select
Sparx Architect
Syncsort
Tableau
Talend
Teradata
Tigris
Visible
W3C DTD & XSD Schema
Operational Integration (Movement / Transformation)
Metadata Harvesting (Glossary, Lineage & Impact Analysis)
Oracle Database
Oracle Exadata
Oracle Big Data Appliance
Oracle TimesTen
Oracle OLAP
Oracle Business Intelligence
Oracle BI Applications
Oracle E-Business Suite
Oracle JD Edwards Enterprise One
Oracle JD Edwards World
Oracle Fusion Applications
Oracle Governance Risk and Compliance
Oracle Fusion AIA
Oracle Retail Applications
Oracle Agile BI/ DW
Oracle Agile PLM for Process
Oracle iFlex FlexCUBE
Oracle iFlex Mantas
Oracle Hyperion Applications
Oracle PeopleSoft
Oracle Siebel CRM / OnDemand
Oracle Communications
Oracle WebLogic Server
Oracle Coherence Data Grid
Oracle SOA Suite
Oracle Enterprise Service Bus
+ open APIs and standards based meta-model
- 24. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
…to manage Risk/Compliance
Records retention
Rediscovery
Litigation support
Data access management
Information security and protection
Minimize corporate liability through proper governance of data
…to drive Business Value
Metadata discovery
Metadata & glossary cataloging
Data profiling
Data cleansing lifecycle
Data remediation
Maximize opportunity by ensuring trusted data is easily available for data driven business processes
The Data Governance Opportunity with Big Data
Solving business and IT data challenges
- 25. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Simplifies Big Data Integration & Governance
Comprehensive Big Data Integration and Data Governance Platform
Appliance w/Hadoop Cluster
Analytic Tools
DI Tools and Connectors
Heterogeneous & Best of Breed
Differentiated and powerful DI capabilities for Teradata, Netezza, Microsoft, DB2, Sybase...
Faster Time to Value
Flexible configurations
OOTB performance with DI
Unified Mgmt – EM Plug-ins for Appliance and DI Tools
Single Support Contact – Hardware/Software/Networking and ASR
- 26. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Join the Community
#ODI12c #GoldenGate12c #OEDQ #OEMM
Connect with Oracle on Social Media
OR connect via the web
Oracle Data Integration blog: blogs.oracle.com/dataintegration
Oracle Data Integration Home Page: oracle.com/goto/dataintegration