USE BIG DATA TECHNOLOGIES TO
MODERNIZE YOUR ENTERPRISE
DATA WAREHOUSE
                  Most organizations’ enterprise data warehouses were built with online transaction
                  processing (OLTP)-centric technologies and architectures that are 15-20 years old.
                  Over the years, more data has been bolted on to these systems, and the query
                  load being driven by both traditional and mobile business intelligence products has
                  increased exponentially, resulting in brittle, over-burdened, costly data warehouses
                  that can take hours to return results. They don’t meet the growing data appetite of
                  the business, and don’t answer the questions needed to run the business at the
                  required levels of granularity, or at the necessary speed. Yet too much has been
                  invested in them to simply throw them out.

                  Big Data market dynamics have resulted in the creation of new technologies,
                  products, and approaches that can be used to modernize these stodgy, inflexible
                  data warehouses, and make them more responsive to the business—without
                  throwing out what is already in place. This paper describes four tactics that can be
                  implemented quickly, using an organization’s existing skill sets, and that can
                  rapidly show a return on investment.

EMC PERSPECTIVE

TACTIC #1: ACCELERATE YOUR DATA WAREHOUSE WITH MPP-BASED ARCHITECTURES
BENEFITS
Leverage more detailed, more robust dimensional data
 •   Seasonality to forecast retail sales and energy consumption
 •   Localization to pinpoint lending or fraud exposure
 •   Hyper-dimensionality for digital media attribution or health care treatment analysis

Massively Parallel Processing (MPP)-based databases provide a cost-effective, scale-out
data warehouse environment that allows organizations to leverage Moore’s Law [1] on
performance-to-cost ratio improvements in x86 processors. MPP databases provide a
non-intrusive analytical platform/data warehouse for data discovery and exploratory
work on massive amounts of data. Built on inexpensive commodity clusters, MPP
databases can extend, complement, or replace parts of your existing data warehouse,
managing massive volumes of detailed data, while providing agile query, reporting,
dashboards, and analytics (see Figure 1).

MPP databases, while offering many of the same benefits as your existing data
warehouse, also provide the following advantages:

 •   Extreme scalability on general purpose systems
 •   Automatic parallelization
 •   Ability to load and query like any other database
 •   Scanning and processing of all nodes in parallel
 •   Extreme scalability and optimized I/O
 •   Linear scalability to easily add nodes and storage
 •   Improved query and loading performance
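
The parallel scan-and-combine behavior in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular MPP product is implemented: threads stand in for nodes, rows are distributed by hashing a store id, and the names partition_rows, scan_node, and parallel_total_by_store are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes):
    """Distribute (store_id, amount) rows across nodes by hashing the store id."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[0]) % num_nodes].append(row)
    return nodes


def scan_node(node_rows):
    """Local scan: a node aggregates only the rows it holds."""
    partial = Counter()
    for store_id, amount in node_rows:
        partial[store_id] += amount
    return partial


def parallel_total_by_store(rows, num_nodes=4):
    """Scan every simulated node in parallel, then merge the partial results."""
    nodes = partition_rows(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(scan_node, nodes))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds the partial sums together
    return dict(total)


# Made-up detail rows: (store_id, sale_amount)
sales = [(1, 10), (2, 5), (1, 7), (3, 2), (2, 8), (4, 1)]
totals = parallel_total_by_store(sales)
```

Changing num_nodes changes how the rows are spread out but not the merged totals, which is the property that lets an MPP system add nodes without changing query results.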




                                                      	
  

Figure 1: MPP Data Warehouse Architectures Scale Easily to Speed Results and Process More Data




[1] Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately
every two years. The result is the doubling of computing power at the same cost every 18 to 24 months. http://en.wikipedia.org/wiki/Moore%27s_law

An MPP data warehouse will enable more granular data for query, reporting, and
                               dashboard drill-down and drill-across exploration. Analysis can be performed on
                               detailed data instead of data aggregates.

                               On the analytics side, once a model has been developed and business insights have
                               been gleaned from these data sets, simply migrate the model and/or the insights into
                               the existing data warehouse for integration into the current business intelligence
                               environment. Alternatively, analytic modeling can also be done on the MPP platform,
                               making it part of the production process.



TACTIC #2: STOP MOVING DATA TO THE ANALYTICS; BRING THE ANALYTICS TO THE DATA
BENEFITS
Leverage low-latency (high-velocity) data access
 •   Drive realtime customer acquisition, predictive maintenance, or network
     optimization decisions
 •   Update analytic models on-demand based upon current market or local weather
     conditions

One of the most dramatic developments in Big Data is the advent of in-database
analytics. In-database analytics addresses one of the biggest shortcomings in performing
advanced analytics: the requirement to move large amounts of data around. That
requirement has forced many organizations and data scientists to settle for working
with aggregate tables, because the data transfer issue is so debilitating to the analytic
exploration and discovery process. In-database analytics reverses the process by
moving the analytic algorithms to where the data is stored, accelerating the
development and deployment of models. Eliminating data movement results in
substantial benefits:

 •   Moving a few terabytes can take hours. With in-database analytics, that transfer
     time drops to zero.
 •   Because the movement of data is the most time-consuming activity in logical
     processing time, reducing data movement reduces the processing time by 1/N,
     where N is the number of processing units. Processing time for 1 TB can be
     reduced by a factor of 16 with only a five-processor system, going from 193
     minutes to 12 minutes (see Figure 2).
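
To make the contrast concrete, here is a minimal sketch that computes the same per-sensor average two ways: by pulling every row out of the database, and by letting the database compute it. It uses Python's built-in sqlite3 module purely as a stand-in for an analytic database; the readings table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3

# An in-memory database stands in for the data warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(i % 3, float(i)) for i in range(1, 1001)],
)


def mean_by_moving_data(conn):
    """Move every detail row to the client, then aggregate there."""
    rows = conn.execute("SELECT sensor_id, value FROM readings").fetchall()
    sums, counts = {}, {}
    for sensor_id, value in rows:
        sums[sensor_id] = sums.get(sensor_id, 0.0) + value
        counts[sensor_id] = counts.get(sensor_id, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}, len(rows)


def mean_in_database(conn):
    """Push the aggregation to where the data lives; only results come back."""
    rows = conn.execute(
        "SELECT sensor_id, AVG(value) FROM readings GROUP BY sensor_id"
    ).fetchall()
    return dict(rows), len(rows)


moved, rows_moved = mean_by_moving_data(conn)
in_db, rows_returned = mean_in_database(conn)
```

Both approaches yield the same averages, but the first transfers all 1,000 detail rows while the second returns one row per sensor. At terabyte scale that difference in rows moved is the transfer cost the tactic aims to eliminate.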




                               	
  

                               	
  

Figure 2: In-database Analytics Dramatically Speeds Processing Time

TACTIC #3: USE ALL OF YOUR DATA WITH A NEXT GENERATION OPERATIONAL DATA STORE

BENEFITS
Manage a wide variety of structured and unstructured data sources
 •   Integrate unstructured claims descriptions to reduce fraudulent claims
 •   Leverage mobile data to create realtime promotions
 •   Leverage sensor readings to optimize yield and pricing

The Hadoop Distributed File System (HDFS) provides a powerful yet inexpensive
option for modernizing Operational Data Store (ODS) and Data Staging areas. HDFS
is a cost-effective, large storage system, and Hadoop pairs it with an intrinsic
computing and analytical capability (MapReduce). Built on commodity clusters, HDFS
simplifies the acquisition and storage of diverse data sources, whether structured,
semi-structured (e.g., web logs and sensor feeds), or unstructured (e.g., social media,
image, video, and audio). Once in the Hadoop file system, MapReduce and commercial
Hadoop-based tools are available to prepare the data for loading into an existing data
warehouse. The ability to “define schema on query” versus “define schema on load”
simplifies amassing data from a variety of sources, even if you are not sure when and
how you might use that data later (see Figure 3).

The result is a single platform for feeding both your data warehouse and analytics
environment. This inexpensive, scale-out solution can be used to store ALL of your
data.
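
The difference between "define schema on query" and "define schema on load" can be illustrated with a small sketch. Here a plain Python list stands in for files landed in HDFS, and the record layouts and the land and query helpers are hypothetical. Records are stored exactly as received, and a schema (just a list of field names) is applied only when the data is read.

```python
import json

raw_store = []  # stands in for raw files landed in the file system


def land(record_text):
    """Store the record as-is; no schema is checked at load time."""
    raw_store.append(record_text)


def query(fields):
    """Apply a schema at read time: project the requested fields from each record."""
    for line in raw_store:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # this reader skips content it cannot parse
        if isinstance(record, dict):
            # Fields missing from a record come back as None.
            yield {field: record.get(field) for field in fields}


# Differently shaped sources land side by side without any upfront modeling.
land('{"type": "weblog", "url": "/home", "ms": 120}')
land('{"type": "sensor", "device": "t-17", "temp_c": 21.5}')
land("free-text note that is not JSON")

# Each consumer decides later which fields it cares about.
weblog_view = [r for r in query(["url", "ms"]) if r["url"] is not None]
sensor_view = [r for r in query(["device", "temp_c"]) if r["device"] is not None]
```

Nothing is rejected at load time, including the free-text note, so a future reader with a different parser could still make use of it.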




Figure 3: Use Hadoop as an Operational Data Store and Analyze ALL of the Data

TACTIC #4: LEVERAGE UNSTRUCTURED DATA TO ADD NEW METRICS TO AN ENTERPRISE DATA WAREHOUSE

BENEFITS
Leverage new metrics, dimensions, and dimensional attributes gleaned from
unstructured data sources
 •   Leverage customers’ interests, passions, associations, and affiliations to improve
     micro-segmentation
 •   Add sensor-generated performance data into your manufacturing, supply chain, or
     product predictive maintenance models

An easy way to start building experience with Hadoop and MapReduce is to use these
technologies to create new metrics from an unstructured data source that can be fed
into the enterprise data warehouse. This will provide the ability to leverage data such
as social, mobile, consumer comments, email, doctors’ notes, or claims descriptions
to identify new metrics that are better predictors of performance. Most organizations’
existing data warehouses are treasure troves of key performance indicators and
metrics used to monitor business performance. Use Hadoop and MapReduce to parse
through unstructured data to identify new business performance metrics that can be
integrated into the existing data warehouse (see Figure 4).
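
As a rough illustration of the idea, the sketch below derives a simple metric (negative-term mentions per customer) from free-text comments using a map step and a reduce step. It runs in plain Python rather than on Hadoop, and the term list, the sample comments, and the function names are all assumptions chosen for the example.

```python
import re
from collections import defaultdict

# Hypothetical list of terms treated as signs of a complaint.
NEGATIVE_TERMS = {"late", "broken", "refund", "cancel"}


def map_comment(customer_id, text):
    """Map step: emit (customer_id, 1) for each negative term in the comment."""
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATIVE_TERMS:
            yield customer_id, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Made-up unstructured input: (customer_id, comment text)
comments = [
    ("c1", "Delivery was late and the box arrived broken."),
    ("c2", "Great service, thanks!"),
    ("c1", "I want a refund."),
    ("c3", "Please cancel my order, it is late again."),
]

pairs = [pair for cid, text in comments for pair in map_comment(cid, text)]
complaint_mentions = reduce_counts(pairs)
```

The resulting dictionary is a small structured table keyed by customer, which is the kind of output that could be loaded into the warehouse as a new metric alongside existing customer dimensions.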
Figure 4: Parse Unstructured Data Using Hadoop/MapReduce and Incorporate Results into the Enterprise Data Warehouse

Once these new metrics are in the enterprise data warehouse, they can be used to
enhance existing business intelligence queries, reports, dashboards, and analyses
(see Figure 5).




Figure 5: Integrate Social Media Metrics into the Existing BI Environment

Note: implementing this tactic places companies in a good position as Hadoop
continues its assimilation into the relational database market. Being able to create
metrics and process data on Hadoop, leveraging tools like HBase and Hive that are
evolving quickly, and having BI tools connect directly to HDFS, may make data
warehouse professionals question why they need to move data to a relational
database at all.

MODERNIZE YOUR DATA WAREHOUSE TODAY
In the world of revolutionary, game-changing Big Data developments, data
warehouse modernization may sound like an evolutionary development. However, it is
something that can be executed today, with existing data warehouse skills, and
represents a simple first step toward gleaning immediate business value and
organizational agility from Big Data technologies. Why are you waiting?
EMC CONSULTING
                                    As part of EMC® Corporation, the world’s leading developer and provider of
                                    information infrastructure technology and solutions, EMC Consulting provides
                                    strategic guidance and technology expertise to help organizations exploit information
                                    to its maximum potential. With worldwide expertise across organizations’ businesses,
                                    applications, and infrastructures, as well as deep industry understanding, EMC
                                    Consulting guides and delivers revolutionary thinking to help clients realize their
                                    ambitions in an information economy. EMC Consulting drives execution for its clients,
                                    including more than half of the Global Fortune 500 companies, to transform
                                    information into actionable strategies and tangible business results.




CONTACT US
For more information, visit
www.EMC.com/consulting or
contact your local EMC Consulting
representative.




                                    EMC2, EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the
                                    United States and other countries. © Copyright 2012 EMC Corporation. All rights reserved.
                                    Published in the USA. 08/12 EMC Perspective H10915

                                    EMC believes the information in this document is accurate as of its publication date. The
                                    information is subject to change without notice.

www.EMC.com

More Related Content

What's hot

Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time AnalyticsMohsin Hakim
 
High-Performance Storage for the Evolving Computational Requirements of Energ...
High-Performance Storage for the Evolving Computational Requirements of Energ...High-Performance Storage for the Evolving Computational Requirements of Energ...
High-Performance Storage for the Evolving Computational Requirements of Energ...Hitachi Vantara
 
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...Hitachi Vantara
 
Infrastructure Considerations for Analytical Workloads
Infrastructure Considerations for Analytical WorkloadsInfrastructure Considerations for Analytical Workloads
Infrastructure Considerations for Analytical WorkloadsCognizant
 
Pervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityPervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityCloudera, Inc.
 
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
2013  International Conference on Knowledge, Innovation and Enterprise Presen...2013  International Conference on Knowledge, Innovation and Enterprise Presen...
2013 International Conference on Knowledge, Innovation and Enterprise Presen...oj08
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsDavid Portnoy
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Denodo
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time AnalyticsMohsin Hakim
 
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...Dataconomy Media
 
Cisco and Greenplum Partner to Deliver High-Performance Hadoop Reference ...
Cisco and Greenplum  Partner to Deliver  High-Performance  Hadoop Reference  ...Cisco and Greenplum  Partner to Deliver  High-Performance  Hadoop Reference  ...
Cisco and Greenplum Partner to Deliver High-Performance Hadoop Reference ...EMC
 
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionCisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionAppfluent Technology
 
Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data WarehouseCaserta
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training reportSarvesh Meena
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabatinabati
 

What's hot (20)

Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
 
High-Performance Storage for the Evolving Computational Requirements of Energ...
High-Performance Storage for the Evolving Computational Requirements of Energ...High-Performance Storage for the Evolving Computational Requirements of Energ...
High-Performance Storage for the Evolving Computational Requirements of Energ...
 
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...
Meet the Data Processing Workflow Challenges of Oil and Gas Exploration with ...
 
Infrastructure Considerations for Analytical Workloads
Infrastructure Considerations for Analytical WorkloadsInfrastructure Considerations for Analytical Workloads
Infrastructure Considerations for Analytical Workloads
 
Pervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityPervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricity
 
1 ieee98
1 ieee981 ieee98
1 ieee98
 
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
2013  International Conference on Knowledge, Innovation and Enterprise Presen...2013  International Conference on Knowledge, Innovation and Enterprise Presen...
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse Platforms
 
Ch03
Ch03Ch03
Ch03
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
 
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
 
Hadoop & Data Warehouse
Hadoop & Data Warehouse Hadoop & Data Warehouse
Hadoop & Data Warehouse
 
Cisco and Greenplum Partner to Deliver High-Performance Hadoop Reference ...
Cisco and Greenplum  Partner to Deliver  High-Performance  Hadoop Reference  ...Cisco and Greenplum  Partner to Deliver  High-Performance  Hadoop Reference  ...
Cisco and Greenplum Partner to Deliver High-Performance Hadoop Reference ...
 
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionCisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
 
Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data Warehouse
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training report
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabati
 
Big data analytics - hadoop
Big data analytics - hadoopBig data analytics - hadoop
Big data analytics - hadoop
 

Similar to Use Big Data Technologies to Modernize Your Enterprise Data Warehouse

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Precisely
 
Migration services (DB2 to Teradata)
Migration services (DB2  to Teradata)Migration services (DB2  to Teradata)
Migration services (DB2 to Teradata)ModakAnalytics
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundationshktripathy
 
Analyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast DataAnalyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast DataEMC
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsJane Roberts
 
Redefining Data Analytics Through Search
Redefining Data Analytics Through SearchRedefining Data Analytics Through Search
Redefining Data Analytics Through SearchConnexica
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
Dataintensive
DataintensiveDataintensive
Dataintensivesulfath
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Denodo
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptalmaraniabwmalk
 
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End UsersFrom Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End UsersDenodo
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...DATAVERSITY
 

Similar to Use Big Data Technologies to Modernize Your Enterprise Data Warehouse (20)

Big Data
Big DataBig Data
Big Data
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Migration services (DB2 to Teradata)
Migration services (DB2  to Teradata)Migration services (DB2  to Teradata)
Migration services (DB2 to Teradata)
 
Data mining notes
Data mining notesData mining notes
Data mining notes
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundations
 
Analyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast DataAnalyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast Data
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
 
Redefining Data Analytics Through Search
Redefining Data Analytics Through SearchRedefining Data Analytics Through Search
Redefining Data Analytics Through Search
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
Dataintensive
DataintensiveDataintensive
Dataintensive
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.ppt
 
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End UsersFrom Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
 
Big data and oracle
Big data and oracleBig data and oracle
Big data and oracle
 

More from EMC

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDEMC
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO




Use Big Data Technologies to Modernize Your Enterprise Data Warehouse

  • 1. EMC PERSPECTIVE: USE BIG DATA TECHNOLOGIES TO MODERNIZE YOUR ENTERPRISE DATA WAREHOUSE
    Most organizations’ enterprise data warehouses were built with online transaction processing (OLTP)-centric technologies and architectures that are 15-20 years old. Over the years, more data has been bolted on to these systems, and the query load being driven by both traditional and mobile business intelligence products has increased exponentially, resulting in brittle, over-burdened, costly data warehouses that can take hours to return results. They don’t meet the growing data appetite of the business, and don’t answer the questions needed to run the business at the required levels of granularity, or at the necessary speed. Yet too much has been invested in them to simply throw them out.
    Big Data market dynamics have resulted in the creation of new technologies, products, and approaches that can be used to modernize these stodgy, inflexible data warehouses and make them more responsive to the business, without throwing out what is already in place. This paper describes four tactics that can be implemented quickly, using an organization’s existing skill sets, and that can rapidly show a return on investment.
  • 2. TACTIC #1: ACCELERATE YOUR DATA WAREHOUSE WITH MPP-BASED ARCHITECTURES
    Massively Parallel Processing (MPP)-based databases provide a cost-effective, scale-out data warehouse environment that allows organizations to leverage Moore’s Law¹ on performance-to-cost ratio improvements in x86 processors. MPP databases provide a non-intrusive analytical platform/data warehouse for data discovery and exploratory work on massive amounts of data. Built on inexpensive commodity clusters, MPP databases can extend, complement, or replace parts of your existing data warehouse, managing massive volumes of detailed data, while providing agile query, reporting, dashboards, and analytics (see Figure 1).
    MPP databases, while offering many of the same benefits as your existing data warehouse, also provide the following advantages:
      • Extreme scalability on general purpose systems
      • Automatic parallelization
      • Ability to load and query like any other database
      • Scanning and processing of all nodes in parallel
      • Extreme scalability and optimized I/O
      • Linear scalability to easily add nodes and storage
      • Improved query and loading performance
    BENEFITS: Leverage more detailed, more robust dimensional data
      • Seasonality to forecast retail sales and energy consumption
      • Localization to pinpoint lending or fraud exposure
      • Hyper-dimensionality for digital media attribution or health care treatment analysis
    Figure 1: MPP Data Warehouse Architectures Scale Easily to Speed Results and Process More Data
    ¹ Moore's law is the observation that over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years. The result is the doubling of computing power at the same cost every 18 to 24 months. http://en.wikipedia.org/wiki/Moore%27s_law
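The "scanning and processing of all nodes in parallel" advantage can be illustrated with a toy scatter-gather aggregation. This is only a minimal sketch, not how any particular MPP product is implemented: the row layout, the `store_id` distribution key, and every function name are assumptions made up for the illustration, and threads stand in for the separate nodes of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, n_segments):
    """Spread rows across segments by a distribution key (hypothetical: store_id)."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        segments[row["store_id"] % n_segments].append(row)
    return segments


def scan_segment(segment):
    """Each segment scans only its own rows and returns a partial aggregate."""
    partial = Counter()
    for row in segment:
        partial[row["store_id"]] += row["sales"]
    return partial


def parallel_total_sales(rows, n_segments=4):
    """Scatter the scan to all segments, then gather and merge the partial results."""
    segments = partition_rows(rows, n_segments)
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        partials = list(pool.map(scan_segment, segments))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return dict(total)


if __name__ == "__main__":
    demo_rows = [
        {"store_id": 1, "sales": 10},
        {"store_id": 2, "sales": 5},
        {"store_id": 1, "sales": 7},
    ]
    print(parallel_total_sales(demo_rows))
```

In this sketch, adding capacity means raising `n_segments`, so each segment scans a smaller share of the rows; that is the intuition behind the scale-out claims above, under the stated assumptions.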
  • 3. An MPP data warehouse will enable more granular data for query, reporting, and dashboard drill-down and drill-across exploration. Analysis can be performed on detailed data instead of data aggregates. On the analytics side, once a model has been developed and business insights have been gleaned from these data sets, simply migrate the model and/or the insights into the existing data warehouse for integration into the current business intelligence environment. Alternatively, analytic modeling can also be done on the MPP platform, making it part of the production process.
    TACTIC #2: STOP MOVING DATA TO THE ANALYTICS; BRING THE ANALYTICS TO THE DATA
    One of the most dramatic developments in Big Data is the advent of in-database analytics. In-database analytics addresses one of the biggest shortcomings in performing advanced analytics: the requirement to move large amounts of data around. That requirement has forced many organizations and data scientists to settle for working with aggregate tables, because the data transfer issue is so debilitating to the analytic exploration and discovery process. In-database analytics reverses the process by moving the analytic algorithms to where the data is stored, accelerating the development and deployment of modeling. Elimination of data movement results in substantial benefits:
      • Moving a few terabytes can take hours. With in-database analytics, it drops to zero.
      • Because the movement of data is the most time-consuming activity in logical processing time, reducing data movement reduces the processing time by 1/N, where N is the number of processing units. Processing time for 1 TB can be reduced by a factor of 16 with only a five-processor system, going from 193 minutes to 12 minutes (see Figure 2).
    BENEFITS: Leverage low-latency (high-velocity) data access
      • Drive real-time customer acquisition, predictive maintenance, or network optimization decisions
      • Update analytic models on demand based upon current market or local weather conditions
    Figure 2: In-database Analytics Dramatically Speeds Processing Time
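The difference between moving data to the analytics and bringing the analytics to the data can be sketched with Python's standard `sqlite3` module standing in for the warehouse. The `readings` table, its columns, and the function names are hypothetical; the sketch only contrasts how many rows cross the database boundary in each approach and makes no claim about the timings quoted above.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory table of (sensor_id, value) rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [(i % 3, float(i)) for i in range(9)],
    )
    return conn


def average_client_side(conn):
    """Move every row out of the database, then aggregate in the client."""
    rows = conn.execute("SELECT sensor_id, value FROM readings").fetchall()
    sums = {}
    for sensor_id, value in rows:
        total, count = sums.get(sensor_id, (0.0, 0))
        sums[sensor_id] = (total + value, count + 1)
    averages = {key: total / count for key, (total, count) in sums.items()}
    return averages, len(rows)  # second value: rows transferred


def average_in_database(conn):
    """Run the aggregation where the data lives; only the results are transferred."""
    rows = conn.execute(
        "SELECT sensor_id, AVG(value) FROM readings GROUP BY sensor_id"
    ).fetchall()
    return dict(rows), len(rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print(average_client_side(conn))
    print(average_in_database(conn))
```

Both functions produce the same averages, but the in-database version transfers one row per sensor rather than one row per reading, a gap that widens as the table grows.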
  • 4. TACTIC #3: USE ALL OF YOUR DATA WITH A NEXT GENERATION OPERATIONAL DATA STORE
    The Hadoop Distributed File System (HDFS) provides a powerful yet inexpensive option for modernizing Operational Data Store (ODS) and Data Staging areas. HDFS is a cost-effective, large storage system with an intrinsic computing and analytical capability (MapReduce). Built on commodity clusters, HDFS simplifies the acquisition and storage of diverse data sources, whether structured, semi-structured (e.g., web logs and sensor feeds), or unstructured (e.g., social media, image, video, and audio). Once in the Hadoop file system, MapReduce and commercial Hadoop-based tools are available to prepare the data for loading into an existing data warehouse. The ability to “define schema on query” versus “define schema on load” simplifies amassing data from a variety of sources, even if you are not sure when and how you might use that data later (see Figure 3).
    The result is a single platform for feeding both your data warehouse and analytics environment. This inexpensive, scale-out solution can be used to store ALL of your data.
    BENEFITS: Manage a wide variety of structured and unstructured data sources
      • Integrate unstructured claims descriptions to reduce fraudulent claims
      • Leverage mobile data to create real-time promotions
      • Leverage sensor readings to optimize yield and pricing
    Figure 3: Use Hadoop as an Operational Data Store and Analyze ALL of the Data
    TACTIC #4: LEVERAGE UNSTRUCTURED DATA TO ADD NEW METRICS TO AN ENTERPRISE DATA WAREHOUSE
    An easy way to start building experience with Hadoop and MapReduce is to use these technologies to create new metrics from an unstructured data source that can be fed into the enterprise data warehouse. This will provide the ability to leverage data such as social, mobile, consumer comments, email, doctors’ notes, or claims descriptions to identify new metrics that are better predictors of performance. Most organizations’ existing data warehouses are treasure troves of key performance indicators and metrics used to monitor business performance. Use Hadoop and MapReduce to parse through unstructured data to identify new business performance metrics that can be integrated into the existing data warehouse (see Figure 4).
    BENEFITS: Leverage new metrics, dimensions, and dimensional attributes gleaned from unstructured data sources
      • Leverage customers’ interests, passions, associations, and affiliations to improve micro-segmentation
      • Add sensor-generated performance data into your manufacturing, supply chain, or product predictive maintenance models
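As a rough illustration of parsing unstructured text into a new metric, the map, shuffle, and reduce phases can be mimicked in plain Python. This is a sketch of the MapReduce pattern only, not Hadoop code: the term list, the `(customer_id, text)` record layout, and all function names are assumptions invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary that signals a complaint in free-text comments.
NEGATIVE_TERMS = {"late", "broken", "refund"}


def map_comment(record):
    """Map phase: emit (customer_id, 1) for each complaint term found in the text."""
    customer_id, text = record
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATIVE_TERMS:
            yield customer_id, 1


def shuffle(pairs):
    """Shuffle phase: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the values for one key."""
    return key, sum(values)


def complaint_mentions(records):
    """Derive a per-customer 'complaint mentions' metric from raw comments."""
    pairs = (pair for record in records for pair in map_comment(record))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    comments = [
        ("c1", "Package arrived late and broken"),
        ("c2", "Great service"),
        ("c1", "I want a refund"),
    ]
    print(complaint_mentions(comments))
```

The resulting per-customer counts are the kind of derived metric that could then be loaded into a warehouse table alongside existing key performance indicators.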
  • 5. Figure 4: Parse Unstructured Data Using Hadoop/MapReduce and Incorporate Results into the Enterprise Data Warehouse
    Once these new metrics are in the enterprise data warehouse, they can be used to enhance existing business intelligence queries, reports, dashboards, and analyses (see Figure 5).
    Figure 5: Integrate Social Media Metrics into the Existing BI Environment
    Note: implementing this tactic places companies in a good position as Hadoop continues its assimilation into the relational database market. Being able to create metrics and process data on Hadoop, leveraging tools like HBase and Hive that are evolving quickly, and having BI tools connect directly to HDFS, may make data warehouse professionals question why they need to move data to a relational database at all.
    MODERNIZE YOUR DATA WAREHOUSE TODAY
    In the world of revolutionary, game-changing Big Data developments, data warehouse modernization may sound like an evolutionary development. However, it is something that can be executed today, with existing data warehouse skills, and represents a simple first step toward gleaning immediate business value and organizational agility from Big Data technologies. Why are you waiting?
  • 6. EMC CONSULTING
    As part of EMC® Corporation, the world’s leading developer and provider of information infrastructure technology and solutions, EMC Consulting provides strategic guidance and technology expertise to help organizations exploit information to its maximum potential. With worldwide expertise across organizations’ businesses, applications, and infrastructures, as well as deep industry understanding, EMC Consulting guides and delivers revolutionary thinking to help clients realize their ambitions in an information economy. EMC Consulting drives execution for its clients, including more than half of the Global Fortune 500 companies, to transform information into actionable strategies and tangible business results.
    CONTACT US
    For more information, visit www.EMC.com/consulting or contact your local EMC Consulting representative.
    EMC2, EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. © Copyright 2012 EMC Corporation. All rights reserved. Published in the USA. 08/12 EMC Perspective H10915
    EMC believes the information in this document is accurate as of its publication date. The information is subject to change without notice.
    www.EMC.com