SlideShare une entreprise Scribd logo
1  sur  12
Télécharger pour lire hors ligne
Modern Data Integration
www.diyotta.com
Whitepaper
Page | 2 www.diyotta.com
Table of contents
Preface(by Jonathan Wu)...................................................................................................................................3
The Pardigm Shift ........................................................................................................................................ 4
The Shift in Data.................................................................................................................................................5
The Shift in Complexity......................................................................................................................................6
New Challenges Require New Approaches ....................................................................................................6
Big Data Changes Everything............................................................................................................................7
Five Principles of Modern Data Integration............................................................................................... 7
Five Capabilities of Modern Data Integration............................................................................................ 9
Modern Data Integration in Action................................................................................................................10
Looking Beyond Modernization ............................................................................................................... 11
Page | 3 www.diyotta.com
Preface by Jonathan Wu
The competitive environment of business continues to accelerate due to technology. Those organizations
that are able to utilize information to analyze activities and trends, and create new insights are leading their
industries. Business leaders rely on fact based decision-making and information analysis for their
competitive advantage. Innovations in technology ranging from automation of manual activities to the
interconnectivity of devices for the Internet of Things are generating overwhelming amounts of data.
While Big Data technologies have addressed the storage and access of vast amounts of data, legacy data
integration approaches have not evolved to keep up. The process of moving data from source to target
through a data integration application server causes Big Data bottlenecks and simply does not work. Many
IT professionals are forced to abandon existing data integration applications for manually creating code
scripts. While custom code can solve the throughput problem, it creates a whole new set of problems
associated with maintenance, documentation, and knowledge transference.
Sanjay Vyas and his team have challenged conventional thinking about data integration and have created
the principles of modern data integration that are needed today’s Big Data environment.
Challenging All Conventions
Jonathan has over thirty years of experience with Business Intelligence, Data Warehousing, Data Integration,
and Information Management solutions. He has held various executive leadership positions with several leading
Business Intelligence companies.
Find Jonathan on LinkedIn at https://www.linkedin.com/in/wujonathan
Page | 4 www.diyotta.com
The Paradigm Shift
There is a major paradigm shift taking place in the data industry. Everyday we see massive amounts of new
data being created. Unlike ever before, this data is moving fluidly in many directions; from on- premise to
the cloud, cloud to on premise, and people are accessing data from all over the place. The old paradigms of
data integration have become unsuitable in this new data landscape. It is time for a paradigm shift. It’s time
to make the move to modern data integration.
Think about where we’ve come in the last twenty years. When data integration was just emerging on the
scene in the mid-to-late nineties, the sources for data were limited. Most of the data came from
mainframes, operational systems that were built on relational databases, and data service providers. The
targets were primarily data warehouses and data marts, with an occasional loop back to operational
systems for closed loop business intelligence. It is no surprise that most of the prevailing integration tools
we use today were created in a vastly different world with very different demands.
Most data integration software was built to run data through ETL servers. It worked well at the time for
several reasons: there wasn’t that much data—1TB was considered a large amount of data at the time;
most data was structured, and the turnaround time for that data was monthly. Even back then, daily
loads became a problem for most companies. Because of the limitations of the early tools, much of the
work was hand-coded, without documentation, and no central management. If these legacy data
integration tools had grave challenges back then, consider how much more obsolete they are in today’s
big data world.
Page | 5 www.diyotta.com
The Shift in Data
We are already well into the age of big data with more data than ever before, “From 2013 to 2020, the
digital universe will grow by a factor of 10 – from 4.4 trillion gigabytes to 44 trillion. It more than doubles
every two years.”1
New data and new data types are emerging every day with very limited structure. In fact, 80% of all
enterprise data is unstructured or semi-structured.2
Data is also coming from many different places. There are a growing number of mobile devices and
applications producing data; 20 billion devices are already connected to the Internet and by 2020, this
number will grow by 50% to 30 billion connected devices.3
Data in SaaS applications and the cloud will also continue to grow at record pace. In 2013, about 20% of the
data in the digital universe was “touched” by the cloud; either stored, perhaps temporarily, or processed in
some way. By 2020, that percentage will double to 40%.4
Data is pouring out of thousands of new API’s that are created by apps and the Internet of Things. There
are billions of sensors and devices creating trillions of time stamped events and the number of sensors will
soon number in the trillions. “Fed by sensors soon to number in the trillions, working with intelligent
systems in the billions, and involving millions of applications, the Internet of Things will drive new
consumer and business behavior that will demand increasingly intelligent industry solutions…”5
1
EMC Digital Universe Study, 2014 – Executive Summary
(http://www.emc.com/leadership/digitaluniverse/2014iview/executive-summary.htm)
2
According to IDC, Forrester and Gartner (http://www.eweek.com/storage/slideshows/managing-massive-
unstructured-data-troves-10-best-practices#sthash.KAbEigHX.dpuf)
3
Ibid.
4
EMC Digital Universe Study, 2014 – Executive Summary
(http://www.emc.com/leadership/digitaluniverse/2014iview/executive-summary.htm)
5
EMC Digital Universe Study, 2014 – Internet of Things
(http://www.emc.com/leadership/digitaluniverse/2014iview/internet-of-things.htm)
Page | 6 www.diyotta.com
The Shift in Complexity
To further complicate things, the number of platforms and targets where data is being landed has also
grown to include many new open source and purpose-built platforms. In addition, there has been a rapid
expansion in users. The mix of data users now includes all classes of data users and analysts, plus
applications, machines, mobile users, and massive communities of data-savvy, end users. What used to be
hundreds and thousands of users has now grown to hundreds of thousands of users, and in some cases,
millions of users.
Complexity continues to grow with data that used to be stored on premise, now also being stored in the
cloud and in SaaS environments. The users that were formerly confined to data on premise are also now
accessing data in the cloud and SaaS data stores. Data that used to flow in one direction now flows in all
directions. We now live in a matrix of data where the complexity of sources and targets continues to grow
incrementally and there is no end.
New Challenges Require New Approaches
As the saying goes, “You cannot put new wine in old wineskins.” If you do, the wineskins burst. That’s
exactly what is happening with companies who are trying to use legacy in ETL (Extract, Transform, and
Load) technology to process and provision data in today’s interwoven world of data.
An ETL server becomes a huge bottleneck; flowing everything through a single server image creates
massive overuse of networks. Many of the legacy integration tools have been re-engineered to handle new
kinds of data, but there are only so many add-ons you can try to attach to old technology. Multiple
connectors and modules bring even more complexity into an already complex problem. Trying to bring an
“old world” tool into this new landscape requires massive amounts of hand-coding for the growing mix of
data types, sources, and API’s. Thus, code management continues to be a nightmare for those making the
move to big data.
Meanwhile, lack of documentation and limited reusability means that the work has to be redone every time
new data is added or new platforms are implemented. Ultimately, this amounts to more money spent,
more time and resources wasted, and less effort spent on bottom line efforts and pure innovation. Now
more than ever before, new challenges require new approaches.
Page | 7 www.diyotta.com
Big Data Changes Everything
This story wouldn’t be complete without factoring in the impact of big data and Hadoop. Hadoop adoption
is skyrocketing, driven by three main drivers:
 It solves technology challenges like scalability, performance, and maintainability.
 It empowers the business to forge new ground in both discovery and predictive analytics.
 It is affordable enough to allow companies to keep all of their data and discover the value later.
While some still question the longevity of the Hadoop explosion, at this point, the game is over. Hadoop is
here to stay and gaining more ground every day. The pace at which we are moving is no longer linear, but
exponential; so now is the time to embrace all that we can do with big data.
Hadoop must also live within the existing corporate and internet environment. However, it is not the be-all
and end-all of data platforms. There are many things that it does well, but there are also many things that
are done better on other kinds of platforms—like analytic platforms. For this reason, Gartner has come up
with the Logical Data Warehouse and the Context-Aware Data Warehouse. Essentially, multiple platforms
must live and work together.
Even though Hadoop may end up being a big player in big data, today it is one of many players. As a result,
some of the greatest challenges involve moving data into and out of Hadoop. Hadoop is both a landing
spot for massive data coming in and a provisioning platform for processed data coming out.
Five Principles of Modern Data Integration
We’ve established that we live in a new era of data. It’s not just big data, it is modern data with new types,
sources, volumes, and locations of data that we have never before experienced. We also live an era with
new solutions emerging to meet the most pressing data integration challenges.
Modern data is more complex and more distributed than anything we have encountered in the past, and if
we are going to take advantage of it, then we must connect emerging data with modern data integration.
It’s time for a shift in paradigm so that we can spend more energy gaining insight from our data rather
than fumbling around with it.
Page | 8 www.diyotta.com
There are five key principles of modern data integration that unlock unprecedented new insight from the
matrix of data that surrounds us. Working together, they take advantage of the emergence of new data
and new platforms, rather than fighting against the rising tide.
 Take the processing to where the data lives.
There is too much data to move every time you need blend or transform it. It makes more sense to
place agents near where the data lives and process it locally. Carefully coordinated instructions can
be sent to the agent, and the work is done on the host platform before any data is moved. By
taking the processing to where the data lives, you eliminate the bottleneck of the ETL server and
decrease the movement of data across the network.
 Fully leverage all platforms based on what they were designed to do well.
You’ve already invested in powerful databases and you are making new investments in modern
data platforms. Every platform has a set of workloads that it handles well and a set of built-in
functions. Modern Data Integration allows you to call those native functions locally, process them
within powerful platforms, and distribute the workload in the most efficient way possible. By fully
leveraging existing platforms, you increase the performance of all of your data blending and
enrichment while minimizing data movement.
 Move data point-to-point to avoid single server bottlenecks.
Since data is moving in all directions, it is vital to establish processes that move data point-to-point,
rather than through a server. This new approach gives you the ability to move data at the right
time. There are times when you want to move all of your data; but most of the time, it makes more
sense to process it in its native environment, and then move the result set. By moving data point-
to-point and eliminating server bottlenecks, you get massive network savings and increase the
speed at which you transfer data.
 Manage all of the business rules and data logic centrally.
Operating dispersed integration applications creates chaos in today’s changing data ecosystem.
New data and modern data platforms must be managed centrally with all business rules and data
logic in a single design studio. You can only do this in an architecture where a central design studio
is completely separate from local processing agents using native functions. Managing all of your
business rules and data logic centrally gives you complete transparency, accessible lineage, and
maximum reuse. You’ll never have to waste your time redesigning flows.
 Make changes using existing rules and logic.
With all management handled centrally, you are also able to keep all the existing business rules
and data logic templates in the metadata repository. As a result, when changes need to be made
with new data, new platforms or even migrations from one platform to another; you are able to
make those changes quickly, using the existing rules and logic.
Page | 9 www.diyotta.com
Five Capabilities of Modern Data Integration
When you make the move to modern data integration, you will eliminate many of the challenges brought
on by legacy technology. In addition, the new paradigm brings with it five new capabilities, well beyond
anything you have experienced with old technology:
 Design once and use many times.
Reuse is not a new concept, but the level of reuse possible after modernization will be significantly
increased. For instance, if you are moving from Oracle to Netezza today, and in six months you
want to move from Netezza to Hadoop, you already have 80% of all of that business rules and data
logic stored for reuse.
 Gain complete knowledge of your data.
In modern data integration, much of the metadata is generated based on what you read out of the
source platform. You’ll capture the remainder of the metadata as you go through the process of
designing your blending and enriching. Ultimately, you will know everything you need to know
about your data and have it all stored in one central location.
 Manage complex data environments without complexity.
By storing your logic centrally, deploying agents locally, and sending out instructions as needed,
you remove the complexity of data flowing in all directions through many different servers. As your
environment become more complex and lives in hybrid environments, you’ll be able to deploy
multiple agents to wherever the data lives while operating all of your data from one central
location, in one consistent manner.
Page | 10 www.diyotta.com
 Optimize your actions.
When you are moving data from one platform to another, you need to be looking at the complete
set of transformations that need to happen and assign them to the highest performing platform.
This allows you to optimize performance across your entire data matrix. For example, you can send
your analytic workloads to an analytic platform and your text processing to Hadoop.
 Adapt quickly.
All businesses appreciate action. Having the ability to change, extend and migrate existing business
logic and data flow designs into future is, perhaps, the most valuable aspect of modernization. A
typical business logic change request often meets the response, “That’s nice. Let’s wait six months.”
One of the great things about modern data integration is being able to respond quickly to those
business needs.
Modern Data Integration in Action
We recently worked with a large international bank with a footprint across the globe—with 25 million
customers in 55 countries, 86K employees, 22 billion in revenue, 7.3 billion in net worth—and they are
projected to grow. They needed to improve customer insight by expanding their data warehouse
capabilities. They were tired of the high cost of their old traditional warehouses and the inflexibility in
those platforms to deal with large volumes of historical data, different variety of data and ability to get
intraday data from various parts of the organization including mortgage, to real estate, securities and line
of credit . So, they made a decision to switch to Hadoop platform.
As we came in, the entire proof of value was done in two days. We were able to do the entire design
process for the migration in under two days, all data and definitions were moved into Hadoop in under a
week, , and they are now moving over 200 tables with 10GB of data daily delivering valuable insights to
their business users for further analysis
Ultimately, this was a migration project they thought was going to take six months, but we got it done in
just three weeks. They’re using 100% of their existing resources and they were able to lower costs by more
than 50% compared to the competitive offerings of the ETL approach. Beyond that, they are able to reuse
80% of what they did in this migration for future migrations. So, if they want to move to Spark in six
months, they will be able to reuse all that they’ve done.
Looking Beyond Modernization
Think about what lies ahead on this road to unleashing new paradigms. The emerging data landscape is
only accelerating. The transition to stream data on-demand, in real-time will soon become the norm in
business. Businesses looking at monthly reports will look at daily reports; daily reports will become hourly
reports and even minutes and seconds.
As an executive it will be essential to leverage real-time data, to know exactly what is happening in your
business right now. You need to know anything that might damage or benefit your business immediately.
Where is fraud happening in the financial sector? How are your customers are pleased
Page | 11 www.diyotta.com
or even dissatisfied? How are buying patterns affecting your business? Processing all that data in real-time
will only grow in necessity. You need the quickest time to value outcome, transparency in your data, and
adaptability to meet these needs.
Technology may be changing, but the requirements are not. You need to be able to design your data
movement, blending and transformation with complete transparency and unlimited reuse. You must be
able to accelerate the execution of ongoing data enrichment and analysis by deploying agents that deliver
instructions and optimize actions. It is essential that you adapt your environments to meet the ever-
changing business and technical demands; migrating data from platform to platform should be effortless
in these changing times.
Now that you know the values, how will modern data integration influence the most pressing projects in
your enterprise? We’ve built a tool that encompasses all five principles and capabilities of modern data
integration. Diyotta helps you leverage what is happening across all of your platforms, turning big data into
an information hub. We are the industry’s first data provisioning platform, purpose-built for big data—
hybrid system and the cloud—by the industries top data integration experts. Only we offer a design-once,
multiple use architecture for rapid change of both data and platforms. We’ve built a tool that encompasses
all five principles and capabilities of modern data integration.
Let’s get started by arranging a deep dive of our offerings, providing access to data and platforms to
further demonstrate value, or even installing Diyotta to evaluate and guide the internal use of the product.
Take control of your data with Diyotta.
Page | 12 www.diyotta.com
Diyotta, Inc.3700 Arco CorporateDrive,Suite #410, Charlotte, NC 28273. +1-704-817-4646,
+1-888-365-4230, +1-877-813-1846
© 2015 diyotta.com , All Rights Reserved

Contenu connexe

Tendances

Age Friendly Economy - Introduction to Big Data
Age Friendly Economy - Introduction to Big DataAge Friendly Economy - Introduction to Big Data
Age Friendly Economy - Introduction to Big DataAgeFriendlyEconomy
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceDavid Feinleib
 
Analysis on big data concepts and applications
Analysis on big data concepts and applicationsAnalysis on big data concepts and applications
Analysis on big data concepts and applicationsIJARIIT
 
A next generation introduction to data science and its potential to change bu...
A next generation introduction to data science and its potential to change bu...A next generation introduction to data science and its potential to change bu...
A next generation introduction to data science and its potential to change bu...InnoTech
 
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...Taniya Fansupkar
 
IDC Report Extracting-Value-from-Chaos
IDC Report Extracting-Value-from-ChaosIDC Report Extracting-Value-from-Chaos
IDC Report Extracting-Value-from-ChaosLarry Levine
 
Big data and digital ecosystem mark skilton jan 2014 v1
Big data and digital ecosystem mark skilton jan 2014 v1Big data and digital ecosystem mark skilton jan 2014 v1
Big data and digital ecosystem mark skilton jan 2014 v1Mark Skilton
 
Open Innovation - Winter 2014 - Socrata, Inc.
Open Innovation - Winter 2014 - Socrata, Inc.Open Innovation - Winter 2014 - Socrata, Inc.
Open Innovation - Winter 2014 - Socrata, Inc.Socrata
 
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolr
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolrO'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolr
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolrVasu S
 
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Stuart Blair
 
Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Stefano A Gazziano
 

Tendances (19)

Age Friendly Economy - Introduction to Big Data
Age Friendly Economy - Introduction to Big DataAge Friendly Economy - Introduction to Big Data
Age Friendly Economy - Introduction to Big Data
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 Conference
 
Big Data Trends
Big Data TrendsBig Data Trends
Big Data Trends
 
Analysis on big data concepts and applications
Analysis on big data concepts and applicationsAnalysis on big data concepts and applications
Analysis on big data concepts and applications
 
A next generation introduction to data science and its potential to change bu...
A next generation introduction to data science and its potential to change bu...A next generation introduction to data science and its potential to change bu...
A next generation introduction to data science and its potential to change bu...
 
Big data survey
Big data surveyBig data survey
Big data survey
 
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...
Big data document (basic concepts,3vs,Bigdata vs Smalldata,importance,storage...
 
Benefits of big data
Benefits of big dataBenefits of big data
Benefits of big data
 
The state of the Big Data market
The state of the Big Data marketThe state of the Big Data market
The state of the Big Data market
 
Leveraging IOT and Latest Technologies
Leveraging IOT and Latest TechnologiesLeveraging IOT and Latest Technologies
Leveraging IOT and Latest Technologies
 
IDC Report Extracting-Value-from-Chaos
IDC Report Extracting-Value-from-ChaosIDC Report Extracting-Value-from-Chaos
IDC Report Extracting-Value-from-Chaos
 
Internet_of_Things_CEO_ magazine
Internet_of_Things_CEO_ magazineInternet_of_Things_CEO_ magazine
Internet_of_Things_CEO_ magazine
 
Big data and digital ecosystem mark skilton jan 2014 v1
Big data and digital ecosystem mark skilton jan 2014 v1Big data and digital ecosystem mark skilton jan 2014 v1
Big data and digital ecosystem mark skilton jan 2014 v1
 
The 25 Predictions About The Future Of Big Data
The 25 Predictions About The Future Of Big DataThe 25 Predictions About The Future Of Big Data
The 25 Predictions About The Future Of Big Data
 
Open Innovation - Winter 2014 - Socrata, Inc.
Open Innovation - Winter 2014 - Socrata, Inc.Open Innovation - Winter 2014 - Socrata, Inc.
Open Innovation - Winter 2014 - Socrata, Inc.
 
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolr
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolrO'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolr
O'Reilly eBook: Creating a Data-Driven Enterprise in Media | eubolr
 
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
 
Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2
 
The M2M platform for a connected world
The M2M platform for a connected worldThe M2M platform for a connected world
The M2M platform for a connected world
 

Similaire à Modern data integration | Diyotta

A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...
A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...
A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...IBM India Smarter Computing
 
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content Management
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content ManagementQtility Whitepaper - Integration Innovation 2015 - Trends In Content Management
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content Managementclarkems
 
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...Dana Gardner
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big DataSonovate
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Oomph! Recruitment
 
big data Big Things
big data Big Thingsbig data Big Things
big data Big Thingspateelhs
 
The Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleThe Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleVasu S
 
Process oriented architecture for digital transformation 2015
Process oriented architecture for digital transformation   2015Process oriented architecture for digital transformation   2015
Process oriented architecture for digital transformation 2015Vinay Mummigatti
 
Attaining IoT Value: How To Move from Connecting Things to Capturing Insights
Attaining IoT Value: How To Move from Connecting Things to Capturing InsightsAttaining IoT Value: How To Move from Connecting Things to Capturing Insights
Attaining IoT Value: How To Move from Connecting Things to Capturing InsightsSustainable Brands
 
Big Data
Big DataBig Data
Big DataBBDO
 
141900791 big-data
141900791 big-data141900791 big-data
141900791 big-dataglittaz
 
Big Data idea implementation in organizations: potential, roadblocks
Big Data idea implementation in organizations: potential, roadblocksBig Data idea implementation in organizations: potential, roadblocks
Big Data idea implementation in organizations: potential, roadblocksIRJET Journal
 
Introduction to big data – convergences.
Introduction to big data – convergences.Introduction to big data – convergences.
Introduction to big data – convergences.saranya270513
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data CentersGina Buck
 
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdf
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdfUnlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdf
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdfJaninePimentel4
 
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...From Automation System to Hyperconvergence - The Top Data Center Trends in Re...
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...Comarch_Services
 

Similaire à Modern data integration | Diyotta (20)

A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...
A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...
A Special Report on Infrastructure Futures: Keeping Pace in the Era of Big Da...
 
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content Management
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content ManagementQtility Whitepaper - Integration Innovation 2015 - Trends In Content Management
Qtility Whitepaper - Integration Innovation 2015 - Trends In Content Management
 
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...
Platform 3.0 Ripe to Give Standard Access to Advanced Intelligence and Automa...
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big Data
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
 
big data Big Things
big data Big Thingsbig data Big Things
big data Big Things
 
The Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleThe Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | Qubole
 
Process oriented architecture for digital transformation 2015
Process oriented architecture for digital transformation   2015Process oriented architecture for digital transformation   2015
Process oriented architecture for digital transformation 2015
 
Attaining IoT Value: How To Move from Connecting Things to Capturing Insights
Attaining IoT Value: How To Move from Connecting Things to Capturing InsightsAttaining IoT Value: How To Move from Connecting Things to Capturing Insights
Attaining IoT Value: How To Move from Connecting Things to Capturing Insights
 
Big Data
Big DataBig Data
Big Data
 
141900791 big-data
141900791 big-data141900791 big-data
141900791 big-data
 
Transforming Big Data into business value
Transforming Big Data into business valueTransforming Big Data into business value
Transforming Big Data into business value
 
Big Data idea implementation in organizations: potential, roadblocks
Big Data idea implementation in organizations: potential, roadblocksBig Data idea implementation in organizations: potential, roadblocks
Big Data idea implementation in organizations: potential, roadblocks
 
Introduction to big data – convergences.
Introduction to big data – convergences.Introduction to big data – convergences.
Introduction to big data – convergences.
 
08258937bigdata en la industria 4.0
08258937bigdata en la industria 4.008258937bigdata en la industria 4.0
08258937bigdata en la industria 4.0
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data Centers
 
Big Data - CRM's Promise Land
Big Data - CRM's Promise LandBig Data - CRM's Promise Land
Big Data - CRM's Promise Land
 
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdf
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdfUnlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdf
Unlocking-Business-Value-Through-Industrial-Data-Management-whitepaper.pdf
 
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...From Automation System to Hyperconvergence - The Top Data Center Trends in Re...
From Automation System to Hyperconvergence - The Top Data Center Trends in Re...
 
130214 copy
130214   copy130214   copy
130214 copy
 

Dernier

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 

Dernier (20)

E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 

Modern data integration | Diyotta

  • 2. Page | 2 www.diyotta.com Table of contents Preface(by Jonathan Wu)...................................................................................................................................3 The Pardigm Shift ........................................................................................................................................ 4 The Shift in Data.................................................................................................................................................5 The Shift in Complexity......................................................................................................................................6 New Challenges Require New Approaches ....................................................................................................6 Big Data Changes Everything............................................................................................................................7 Five Principles of Modern Data Integration............................................................................................... 7 Five Capabilities of Modern Data Integration............................................................................................ 9 Modern Data Integration in Action................................................................................................................10 Looking Beyond Modernization ............................................................................................................... 11
  • 3. Page | 3 www.diyotta.com Preface by Jonathan Wu The competitive environment of business continues to accelerate due to technology. Those organizations that are able to utilize information to analyze activities and trends, and create new insights are leading their industries. Business leaders rely on fact based decision-making and information analysis for their competitive advantage. Innovations in technology ranging from automation of manual activities to the interconnectivity of devices for the Internet of Things are generating overwhelming amounts of data. While Big Data technologies have addressed the storage and access of vast amounts of data, legacy data integration approaches have not evolved to keep up. The process of moving data from source to target through a data integration application server causes Big Data bottlenecks and simply does not work. Many IT professionals are forced to abandon existing data integration applications for manually creating code scripts. While custom code can solve the throughput problem, it creates a whole new set of problems associated with maintenance, documentation, and knowledge transference. Sanjay Vyas and his team have challenged conventional thinking about data integration and have created the principles of modern data integration that are needed today’s Big Data environment. Challenging All Conventions Jonathan has over thirty years of experience with Business Intelligence, Data Warehousing, Data Integration, and Information Management solutions. He has held various executive leadership positions with several leading Business Intelligence companies. Find Jonathan on LinkedIn at https://www.linkedin.com/in/wujonathan
  • 4. Page | 4 www.diyotta.com The Paradigm Shift There is a major paradigm shift taking place in the data industry. Everyday we see massive amounts of new data being created. Unlike ever before, this data is moving fluidly in many directions; from on- premise to the cloud, cloud to on premise, and people are accessing data from all over the place. The old paradigms of data integration have become unsuitable in this new data landscape. It is time for a paradigm shift. It’s time to make the move to modern data integration. Think about where we’ve come in the last twenty years. When data integration was just emerging on the scene in the mid-to-late nineties, the sources for data were limited. Most of the data came from mainframes, operational systems that were built on relational databases, and data service providers. The targets were primarily data warehouses and data marts, with an occasional loop back to operational systems for closed loop business intelligence. It is no surprise that most of the prevailing integration tools we use today were created in a vastly different world with very different demands. Most data integration software was built to run data through ETL servers. It worked well at the time for several reasons: there wasn’t that much data—1TB was considered a large amount of data at the time; most data was structured, and the turnaround time for that data was monthly. Even back then, daily loads became a problem for most companies. Because of the limitations of the early tools, much of the work was hand-coded, without documentation, and no central management. If these legacy data integration tools had grave challenges back then, consider how much more obsolete they are in today’s big data world.
  • 5. Page | 5 www.diyotta.com The Shift in Data We are already well into the age of big data with more data than ever before, “From 2013 to 2020, the digital universe will grow by a factor of 10 – from 4.4 trillion gigabytes to 44 trillion. It more than doubles every two years.”1 New data and new data types are emerging every day with very limited structure. In fact, 80% of all enterprise data is unstructured or semi-structured.2 Data is also coming from many different places. There are a growing number of mobile devices and applications producing data; 20 billion devices are already connected to the Internet and by 2020, this number will grow by 50% to 30 billion connected devices.3 Data in SaaS applications and the cloud will also continue to grow at record pace. In 2013, about 20% of the data in the digital universe was “touched” by the cloud; either stored, perhaps temporarily, or processed in some way. By 2020, that percentage will double to 40%.4 Data is pouring out of thousands of new API’s that are created by apps and the Internet of Things. There are billions of sensors and devices creating trillions of time stamped events and the number of sensors will soon number in the trillions. “Fed by sensors soon to number in the trillions, working with intelligent systems in the billions, and involving millions of applications, the Internet of Things will drive new consumer and business behavior that will demand increasingly intelligent industry solutions…”5 1 EMC Digital Universe Study, 2014 – Executive Summary (http://www.emc.com/leadership/digitaluniverse/2014iview/executive-summary.htm) 2 According to IDC, Forrester and Gartner (http://www.eweek.com/storage/slideshows/managing-massive- unstructured-data-troves-10-best-practices#sthash.KAbEigHX.dpuf) 3 Ibid. 4 EMC Digital Universe Study, 2014 – Executive Summary (http://www.emc.com/leadership/digitaluniverse/2014iview/executive-summary.htm) 5 EMC Digital Universe Study, 2014 – Internet of Things (http://www.emc.com/leadership/digitaluniverse/2014iview/internet-of-things.htm)
  • 6. Page | 6 www.diyotta.com The Shift in Complexity To further complicate things, the number of platforms and targets where data is being landed has also grown to include many new open source and purpose-built platforms. In addition, there has been a rapid expansion in users. The mix of data users now includes all classes of data users and analysts, plus applications, machines, mobile users, and massive communities of data-savvy, end users. What used to be hundreds and thousands of users has now grown to hundreds of thousands of users, and in some cases, millions of users. Complexity continues to grow with data that used to be stored on premise, now also being stored in the cloud and in SaaS environments. The users that were formerly confined to data on premise are also now accessing data in the cloud and SaaS data stores. Data that used to flow in one direction now flows in all directions. We now live in a matrix of data where the complexity of sources and targets continues to grow incrementally and there is no end. New Challenges Require New Approaches As the saying goes, “You cannot put new wine in old wineskins.” If you do, the wineskins burst. That’s exactly what is happening with companies who are trying to use legacy in ETL (Extract, Transform, and Load) technology to process and provision data in today’s interwoven world of data. An ETL server becomes a huge bottleneck; flowing everything through a single server image creates massive overuse of networks. Many of the legacy integration tools have been re-engineered to handle new kinds of data, but there are only so many add-ons you can try to attach to old technology. Multiple connectors and modules bring even more complexity into an already complex problem. Trying to bring an “old world” tool into this new landscape requires massive amounts of hand-coding for the growing mix of data types, sources, and API’s. Thus, code management continues to be a nightmare for those making the move to big data. Meanwhile, lack of documentation and limited reusability means that the work has to be redone every time new data is added or new platforms are implemented. Ultimately, this amounts to more money spent, more time and resources wasted, and less effort spent on bottom line efforts and pure innovation. Now more than ever before, new challenges require new approaches.
  • 7. Page | 7 www.diyotta.com Big Data Changes Everything This story wouldn’t be complete without factoring in the impact of big data and Hadoop. Hadoop adoption is skyrocketing, driven by three main drivers:  It solves technology challenges like scalability, performance, and maintainability.  It empowers the business to forge new ground in both discovery and predictive analytics.  It is affordable enough to allow companies to keep all of their data and discover the value later. While some still question the longevity of the Hadoop explosion, at this point, the game is over. Hadoop is here to stay and gaining more ground every day. The pace at which we are moving is no longer linear, but exponential; so now is the time to embrace all that we can do with big data. Hadoop must also live within the existing corporate and internet environment. However, it is not the be-all and end-all of data platforms. There are many things that it does well, but there are also many things that are done better on other kinds of platforms—like analytic platforms. For this reason, Gartner has come up with the Logical Data Warehouse and the Context-Aware Data Warehouse. Essentially, multiple platforms must live and work together. Even though Hadoop may end up being a big player in big data, today it is one of many players. As a result, some of the greatest challenges involve moving data into and out of Hadoop. Hadoop is both a landing spot for massive data coming in and a provisioning platform for processed data coming out. Five Principles of Modern Data Integration We’ve established that we live in a new era of data. It’s not just big data, it is modern data with new types, sources, volumes, and locations of data that we have never before experienced. We also live an era with new solutions emerging to meet the most pressing data integration challenges. Modern data is more complex and more distributed than anything we have encountered in the past, and if we are going to take advantage of it, then we must connect emerging data with modern data integration. It’s time for a shift in paradigm so that we can spend more energy gaining insight from our data rather than fumbling around with it.
  • 8. Page | 8 www.diyotta.com There are five key principles of modern data integration that unlock unprecedented new insight from the matrix of data that surrounds us. Working together, they take advantage of the emergence of new data and new platforms, rather than fighting against the rising tide.  Take the processing to where the data lives. There is too much data to move every time you need blend or transform it. It makes more sense to place agents near where the data lives and process it locally. Carefully coordinated instructions can be sent to the agent, and the work is done on the host platform before any data is moved. By taking the processing to where the data lives, you eliminate the bottleneck of the ETL server and decrease the movement of data across the network.  Fully leverage all platforms based on what they were designed to do well. You’ve already invested in powerful databases and you are making new investments in modern data platforms. Every platform has a set of workloads that it handles well and a set of built-in functions. Modern Data Integration allows you to call those native functions locally, process them within powerful platforms, and distribute the workload in the most efficient way possible. By fully leveraging existing platforms, you increase the performance of all of your data blending and enrichment while minimizing data movement.  Move data point-to-point to avoid single server bottlenecks. Since data is moving in all directions, it is vital to establish processes that move data point-to-point, rather than through a server. This new approach gives you the ability to move data at the right time. There are times when you want to move all of your data; but most of the time, it makes more sense to process it in its native environment, and then move the result set. By moving data point- to-point and eliminating server bottlenecks, you get massive network savings and increase the speed at which you transfer data.  Manage all of the business rules and data logic centrally. Operating dispersed integration applications creates chaos in today’s changing data ecosystem. New data and modern data platforms must be managed centrally with all business rules and data logic in a single design studio. You can only do this in an architecture where a central design studio is completely separate from local processing agents using native functions. Managing all of your business rules and data logic centrally gives you complete transparency, accessible lineage, and maximum reuse. You’ll never have to waste your time redesigning flows.  Make changes using existing rules and logic. With all management handled centrally, you are also able to keep all the existing business rules and data logic templates in the metadata repository. As a result, when changes need to be made with new data, new platforms or even migrations from one platform to another; you are able to make those changes quickly, using the existing rules and logic.
  • 9. Page | 9 www.diyotta.com Five Capabilities of Modern Data Integration When you make the move to modern data integration, you will eliminate many of the challenges brought on by legacy technology. In addition, the new paradigm brings with it five new capabilities, well beyond anything you have experienced with old technology:  Design once and use many times. Reuse is not a new concept, but the level of reuse possible after modernization will be significantly increased. For instance, if you are moving from Oracle to Netezza today, and in six months you want to move from Netezza to Hadoop, you already have 80% of all of that business rules and data logic stored for reuse.  Gain complete knowledge of your data. In modern data integration, much of the metadata is generated based on what you read out of the source platform. You’ll capture the remainder of the metadata as you go through the process of designing your blending and enriching. Ultimately, you will know everything you need to know about your data and have it all stored in one central location.  Manage complex data environments without complexity. By storing your logic centrally, deploying agents locally, and sending out instructions as needed, you remove the complexity of data flowing in all directions through many different servers. As your environment become more complex and lives in hybrid environments, you’ll be able to deploy multiple agents to wherever the data lives while operating all of your data from one central location, in one consistent manner.
  • 10. Page | 10 www.diyotta.com  Optimize your actions. When you are moving data from one platform to another, you need to be looking at the complete set of transformations that need to happen and assign them to the highest performing platform. This allows you to optimize performance across your entire data matrix. For example, you can send your analytic workloads to an analytic platform and your text processing to Hadoop.  Adapt quickly. All businesses appreciate action. Having the ability to change, extend and migrate existing business logic and data flow designs into future is, perhaps, the most valuable aspect of modernization. A typical business logic change request often meets the response, “That’s nice. Let’s wait six months.” One of the great things about modern data integration is being able to respond quickly to those business needs. Modern Data Integration in Action We recently worked with a large international bank with a footprint across the globe—with 25 million customers in 55 countries, 86K employees, 22 billion in revenue, 7.3 billion in net worth—and they are projected to grow. They needed to improve customer insight by expanding their data warehouse capabilities. They were tired of the high cost of their old traditional warehouses and the inflexibility in those platforms to deal with large volumes of historical data, different variety of data and ability to get intraday data from various parts of the organization including mortgage, to real estate, securities and line of credit . So, they made a decision to switch to Hadoop platform. As we came in, the entire proof of value was done in two days. We were able to do the entire design process for the migration in under two days, all data and definitions were moved into Hadoop in under a week, , and they are now moving over 200 tables with 10GB of data daily delivering valuable insights to their business users for further analysis Ultimately, this was a migration project they thought was going to take six months, but we got it done in just three weeks. They’re using 100% of their existing resources and they were able to lower costs by more than 50% compared to the competitive offerings of the ETL approach. Beyond that, they are able to reuse 80% of what they did in this migration for future migrations. So, if they want to move to Spark in six months, they will be able to reuse all that they’ve done. Looking Beyond Modernization Think about what lies ahead on this road to unleashing new paradigms. The emerging data landscape is only accelerating. The transition to stream data on-demand, in real-time will soon become the norm in business. Businesses looking at monthly reports will look at daily reports; daily reports will become hourly reports and even minutes and seconds. As an executive it will be essential to leverage real-time data, to know exactly what is happening in your business right now. You need to know anything that might damage or benefit your business immediately. Where is fraud happening in the financial sector? How are your customers are pleased
  • 11. Page | 11 www.diyotta.com or even dissatisfied? How are buying patterns affecting your business? Processing all that data in real-time will only grow in necessity. You need the quickest time to value outcome, transparency in your data, and adaptability to meet these needs. Technology may be changing, but the requirements are not. You need to be able to design your data movement, blending and transformation with complete transparency and unlimited reuse. You must be able to accelerate the execution of ongoing data enrichment and analysis by deploying agents that deliver instructions and optimize actions. It is essential that you adapt your environments to meet the ever- changing business and technical demands; migrating data from platform to platform should be effortless in these changing times. Now that you know the values, how will modern data integration influence the most pressing projects in your enterprise? We’ve built a tool that encompasses all five principles and capabilities of modern data integration. Diyotta helps you leverage what is happening across all of your platforms, turning big data into an information hub. We are the industry’s first data provisioning platform, purpose-built for big data— hybrid system and the cloud—by the industries top data integration experts. Only we offer a design-once, multiple use architecture for rapid change of both data and platforms. We’ve built a tool that encompasses all five principles and capabilities of modern data integration. Let’s get started by arranging a deep dive of our offerings, providing access to data and platforms to further demonstrate value, or even installing Diyotta to evaluate and guide the internal use of the product. Take control of your data with Diyotta.
  • 12. Page | 12 www.diyotta.com Diyotta, Inc.3700 Arco CorporateDrive,Suite #410, Charlotte, NC 28273. +1-704-817-4646, +1-888-365-4230, +1-877-813-1846 © 2015 diyotta.com , All Rights Reserved