Watch the on-demand recording here:
https://event.on24.com/wcc/r/1632072/803744C924E8BFD688BD117C6B4B949B
Evolution of Big Data and the Role of Analytics | Hybrid Data Management
IBM, Driving the future Hybrid Data Warehouse with IBM Integrated Analytics System.
The hybrid Enterprise Data Warehouse of the Future – Do More with your Data
The Enterprise Data Warehouse (EDW) has traditionally been the foundation for data storage. So how do you leverage current investments while remaining relevant and competitive? It is important for your organization to continue to evolve, accelerate development/deployment times, deliver high performance and provide a cloud-ready platform.
Thank you for joining us today. In this webinar we will begin by talking about current data challenges, and how Data Science and Machine Learning are driving the need for advanced analytic decisioning. We will also discuss the need for a Hybrid data strategy, and how the Enterprise Data Warehouse of the future remains an integral part of that strategy. Finally we will introduce you to the IBM Integrated Analytics System, our next generation data warehouse appliance for advancing analytics, Machine Learning and Data Science.
The volume, velocity and variety of data are growing at a rapid pace, challenging many of today’s organizations. The article “Data is Eating the World” predicts that by the year 2025, 163 Trillion Gigabytes of data will be created.
For most organizations this rapid growth is seen as a challenge. And while this is definitely true, it can also be seen as an opportunity. By harnessing and analyzing this data, companies are able to drive analytics, data science and machine learning that improve customer interactions, streamline processes and improve operations.
Artificial Intelligence (AI), Internet of Things (IoT), cloud, mobile and other new technologies are driving the need for real-time and near-real-time analytic decisioning. Today’s data environments are not just a vital piece of IT infrastructure, but a key component of corporate strategy. Organizations are realizing that better insights can improve customer interactions, streamline processes and improve operations.
Data is Eating the World: 163 Trillion Gigabytes Will be Created in 2025
With the volume, velocity and variety of data growing at a rapid pace, many businesses are uncovering lucrative opportunities. It is well known that businesses thrive when they uncover trends and patterns, and make richer data-driven decisions, no matter where the data resides or how it is structured.
Consider the facts: ……
Having a Hybrid Data Management strategy enables your enterprise architect leaders to build the right foundation for their data. They gain actionable insights around customer behavior and market opportunities to grow market share, reduce costs, and deliver superior customer service.
Every organization wants to become a digital business. But before they become a digital business, they have to be insight driven, and before they are insight driven they have to be data driven.
Data Driven: How do these projects start? It begins with one project in an organization where a team comes together and they are relentless about solving a perceived notion. They are motivated by factual data. They succeed and put everyone else to shame. It requires a cultural change to be intentional with data-driven decisions. It’s about breaking down silos in the business to get the data (from IT, from finance, from HR, etc.). Most questions in this phase are just about understanding “what” happened. Some go into “why” it happened.
Insight Driven: This is where organizations know what’s happening and why it’s happening, but they want to get predictive and answer what will happen next. They want to optimize their outcomes. They want to automate decisions. The foundation for this stage is AI, Machine Learning, and Deep Learning. It’s the basis of a data science business. They are looking for answers to become more competitive and begin to disrupt.
Digital Transformation: This is business model transformation. It’s where organizations move from a one-time, perpetual charge to an as-a-service model and selling outcomes: things like availability, uptime, and revenue share.
Most organizations think they are in the digital transformation stage, but in reality most of them are still in the data driven stage.
So what’s needed in each of these stages for success?
So let’s begin by talking about what your company is trying to solve.
Innovation – For most organizations becoming more innovative is key to remaining competitive. A first step to doing this is to enable your data scientists/analysts, line of business owners and developers to deliver more intelligent insights from their data with embedded machine learning and analytics.
New Data Types – Historically, companies have used data that is highly structured. New forms of semi-structured and unstructured data such as streaming audio, video, click stream, and social media are changing the status quo.
Flexibility – Run analytics on data across multiple locations for quick insights, letting you put data where it's needed. Provide portability: you have the flexibility to switch cloud platforms or databases, as there are more and more choices when storing, accessing and analyzing data.
Efficiency – Save on storage investments with in-memory analytics and deliver data and analytics quickly with high-performance workload processing. Save DBAs time by moving data between on-premises and cloud seamlessly at 200-300 GB/s. Democratize access to data. Deliver data and insights where they are needed so that developers and data workers are empowered to find, access, trust and gain insights from their data.
Enterprise Strong – Data is everywhere in your organization, siloed across regions, lines of business, etc. Addressing data sprawl and scalability is key to your organization’s growth.
Portability – Finally, accessing your data where it resides is key in
In this next section we will discuss hybrid data management and the important considerations in creating a solid corporate strategy.
1. Collect data. Example: The head of claims at an insurance company needs to reduce labor costs of claims while improving the customer experience. The first step is gathering and ingesting all that data: pictures taken from smartphones at the crash, incident details, claim history, etc.
2. Organize and Protect the data. Processes must be in place to ensure all data is protected and in compliance with current regulations so that only authorized people see just the necessary information to perform their task. The data must be clean, trusted and easily accessible: a prerequisite for processing and extracting value.
3. Deliver Value. In the insurance company example, machine learning image recognition algorithms used on pictures of car accidents combined with analysis of all related claim data can help automate claims processing without bias for optimal outcomes.
At IBM we have organized our portfolio to address these three stages.
· Hybrid Data Management is designed to help gather and ingest ALL relevant data with no limit on volume, variety or velocity. Clients can choose any style of database or data warehouse, best-of-breed and open source software and leverage their existing skill set. It enables data to be viewed as a unified and easily accessible asset.
· Unified Governance and Integration helps satisfy all aspects of integrating and governing data, from compliance, e-discovery, data retention and archiving, data masking / obfuscation, to securely organizing that information so it can be used in tools like our own data science business and analytics tools, or any third-party tools.
· Data Science and Business Analytics is the only complete stack, across the entire analytics lifecycle, that enables clients to apply collaborative data science no matter the skill level, support all data, no matter what it is or where it is, and deploy advanced data science to where the data lives.
At IBM we believe in setting expectations upfront. In doing so we feel it is important to show what our strategy is and what it is not.
IBM® is committed to delivering SQL commonality, on database platforms implementing the Common SQL Engine, in a way that is common and portable and supports the ANSI/ISO SQL standards. Since products are configured and optimized for select workloads, some products with the Common SQL Engine provide greater focus on OLTP applications, while others are fine-tuned for delivering operational analytics, or supporting big data open analytics environments.
IBM Db2, Db2 Warehouse, Db2 Hosted, Db2 on Cloud, IBM Integrated Analytics System (NEW), and IBM Db2 Big SQL are all designed with the Common SQL Engine. Since the Common SQL Engine supports data federation, other databases (non-IBM and open source databases) can also plug into the engine for SQL processing. To make things even easier, IBM Data Server Manager provides administration, alerting, monitoring, federation, and SQL execution support across the Common SQL Engine platforms.
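To make the idea of SQL commonality a little more concrete, here is a minimal sketch of what "write the query once, run it wherever the data lives" can look like from application code. It is an illustration only: Python's built-in sqlite3 module stands in for the real platforms, and the `claims` table, the deployment names and the `run_everywhere` helper are hypothetical and not part of any IBM product. With Db2 family members you would substitute connections from a Db2 driver, and the point is that the SQL text itself would not need to change.

```python
import sqlite3

# One standard SQL statement, written once and reused on every deployment.
PORTABLE_QUERY = """
SELECT region, SUM(amount) AS total
FROM claims
GROUP BY region
ORDER BY region
"""


def make_stand_in(rows):
    """Create an in-memory database that stands in for one deployment (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE claims (region VARCHAR(20), amount INTEGER)")
    conn.executemany("INSERT INTO claims (region, amount) VALUES (?, ?)", rows)
    return conn


def run_everywhere(connections, sql):
    """Run the same SQL text, unchanged, against every connection."""
    results = {}
    for name, conn in connections.items():
        cursor = conn.cursor()
        cursor.execute(sql)
        results[name] = cursor.fetchall()
    return results


# Hypothetical deployments with hypothetical data.
connections = {
    "on_premises": make_stand_in([("east", 100), ("east", 250), ("west", 75)]),
    "cloud": make_stand_in([("east", 40)]),
}

results = run_everywhere(connections, PORTABLE_QUERY)
for name, rows in results.items():
    print(name, rows)
```

The design choice worth noticing is that the query lives in one place and the list of deployments lives in another, so moving a workload means changing a connection, not rewriting SQL.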
The Enterprise Data Warehouse (EDW) has traditionally been the foundation for enterprise data storage. As the volume, velocity and variety of data continue to evolve, so should the data warehouse. It is important that it continue to evolve, providing high performance, accelerating the time to development/deployment and providing a cloud-ready platform. So I hope to answer the question of how you leverage your current investments while staying relevant and competitive.
The IBM Integrated Analytics System is all that IBM PureData Systems and Netezza are and much more. It is a revolution in how we provide you analytics. It’s a unified data science platform. Everything you need to connect your data scientists with data and provide them with the right tools is in this solution. We can talk about a few different facets to the solution:
Common SQL Engine – for you, this is about workload portability and skill sharing across public and private cloud
Data science tooling, built in – IBM Data Science Experience is included so data scientists can collaboratively analyze data, or they can use their own tools like Jupyter Notebooks
Ease of use – one of the core elements of the solution: reliable (to ensure the system is available to run the analytics), elastic and flexible to grow with your requirements, all of which simplifies management and reduces the resources it requires
Hybrid data management – supporting the broadest array of data types and workload deployment options so that the data scientists are not limited in what data is available to them
In-place analytics – runs analytics where the data resides, which reduces processing and increases performance. This is done on the Apache Spark processing engine
Machine Learning – new types of workloads that your data scientists need to accelerate decision making, bringing new opportunities to the business
Performance – as an optimized single solution (links with “ease of use” above) it’s easy to deploy and manage while still providing the high levels of performance you need
IBM has a history of innovating and evolving our data warehouse appliance. As the volume, velocity and variety of data change, IBM has responded.
So let’s look at the actual hardware configuration in each rack and the details: the Power System server components, the FlashSystem storage and the networking switch. IAS is a fully integrated hardware and software system, offering you convenience rather than the time and cost of building it out. The system is delivered configured and performance optimized for the purpose of letting you run your analytics faster.
This chart shows the specific system configurations that are available to the client. Start with filling 1/3 of a rack and then expand to 2/3 or a full rack. Multiple racks can be configured to be a system as well. Each system has a single part number and one serial number.
7 Compute Nodes in 1 rack containing:
· IBM Power 8 S822L 24 core server 3.02GHz
· 512 GB of RAM (each node)
· 2x 600GB SAS HDD
· Red Hat® Linux OS
Up to 3 Flash Arrays in 1 rack containing:
· IBM FlashSystem 900
· Dual Flash controllers
· Micro Latency Flash modules
· 2-Dimensional RAID5 and hot swappable spares for high availability
2x Mellanox 10G Ethernet switches:
· 48x10G ports
· 12x40/50G ports
· Dual switches form resilient network
IBM SAN64B 32G Fibre Channel SAN:
· 16Gb FC Switch
· 48x 32Gb/s SFP+ ports
¹ Assume up to 4x compression to calculate user data capacity, that is, pre-load uncompressed user data.
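As a rough illustration of that footnote, the arithmetic can be sketched as follows. The capacity figures in the loop are hypothetical placeholders, not the values from the chart, and the function name is an assumption made for this example. Because 4x is stated as an upper bound, the result should be read as a best case rather than a guarantee.

```python
def user_data_capacity_tb(usable_flash_tb: float, compression_ratio: float = 4.0) -> float:
    """Estimate how much pre-load (uncompressed) user data fits in the given flash capacity.

    compression_ratio defaults to 4.0, the "up to 4x" upper bound from the footnote.
    """
    if usable_flash_tb < 0:
        raise ValueError("usable_flash_tb must not be negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return usable_flash_tb * compression_ratio


# Hypothetical usable flash capacities per configuration (placeholders only).
hypothetical_configs = [("1/3 rack", 10.0), ("2/3 rack", 20.0), ("full rack", 30.0)]

for label, usable_tb in hypothetical_configs:
    best_case = user_data_capacity_tb(usable_tb)        # at the 4x upper bound
    cautious = user_data_capacity_tb(usable_tb, 2.0)    # a more conservative ratio
    print(f"{label}: up to {best_case:.0f} TB user data (about {cautious:.0f} TB at 2x)")
```

Actual compression depends on the data, so it is worth sizing against a ratio measured on a sample of your own tables.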
Integrated Analytics System Console
But performance is only part of what sets this offering apart. You need to ensure that the analytics you run are always available to your users and the organization. These are workloads that must be available and must hit your service level agreements. This is why we designed IAS to have no single point of failure, with redundancies and fault tolerance. We’ve selected the most reliable hardware components in the form of the Power System and FlashSystem for server and storage respectively. And of course, we provide the monitoring with the IBM Data Server Manager, used across the family of IBM hybrid data management offerings.
One question we always get is about scalability and expansion options. When you think about expansion on the IBM Integrated Analytics System, it’s important to think about it in two ways. The first is the actual hardware. The IBM Integrated Analytics System offers in-place expansion that is non-disruptive. So after you order your system, if you need to add more compute and storage, it’s done without disruption to your system as you scale out.
The other aspect of expansion is the cloud-readiness of the IBM Common SQL Engine. Workloads you have on the system can be seamlessly moved to the cloud based on your requirements. You have the option to put workloads where you need them for a greater level of flexibility to run your infrastructure.
A video of this demonstration is available at https://www.youtube.com/watch?v=XTzEc00jx_E
NEED: “TV has evolved into a multi-channel, multi-stream business, and cable networks need to get smarter about how they market to and connect with audiences across all of those streams. Relying on traditional ratings data and third-party analytics providers is going to be a losing strategy: you need to take ownership of your data, and use it to get a richer picture of who your viewers are, what they want, and how you can keep their attention in an increasingly crowded entertainment marketplace.”
CHALLENGE: “The challenge is that there is just so much information available: hundreds of billions of rows of data from industry data providers such as Nielsen and comScore, from channels such as AMC’s TV Everywhere live web streaming and video on demand service, from retail partners such as iTunes and Amazon, and from third-party online video services such as Netflix and Hulu.”
RESULTS: Many of the results delivered by this new analytics capability demonstrate a real transformation in the way AMC operates. For example, the company’s business intelligence department has been able to create sophisticated statistical models that help the company refine its marketing strategies and make smarter decisions about how intensively it should promote each show.
With deeper insight into viewership, AMC’s direct marketing campaigns are also much more successful. In one recent example, intelligent segmentation and lookalike modeling helped the company target new and existing viewers so effectively that AMC video on demand transactions were higher than would be expected otherwise.
This newfound ability to reach out to new viewers based on their individual needs and preferences is not just valuable for AMC—it also has huge potential value for the company’s advertising partners. AMC is currently working on providing access to its rich data-sets and analytics tools as a service for advertisers, helping them fine-tune their campaigns to appeal to ever-larger audiences across both linear and digital channels.