This document provides an overview of big data and how it relates to different industries. It discusses how big data is leading to an environment of change and how the status quo of data management is facing complications. It then outlines Cloudera's vision for next-generation data management through keystones like business intelligence, advanced analytics, and applications. Cloudera positions itself as poised for innovation by bringing applications to data. The document argues Cloudera complements the existing ecosystem and promotes stepwise progression from operational efficiency to competitive advantage through capabilities like deep business intelligence, schema-less querying of all data types, and data consolidation.
4. Reality of Big Data
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, customer service
FINANCIAL SERVICES: Risk & portfolio analysis; New products
EDUCATION & RESEARCH: Experiment sensor analysis
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
LIFE SCIENCES: Clinical trials; Genomics
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
OIL & GAS: Drilling exploration sensor analysis
RETAIL: Consumer sentiment; Optimized marketing
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
UTILITIES: Smart Meter analysis for network capacity
LAW ENFORCEMENT & DEFENSE: Threat analysis - social media monitoring, photo analysis
7. Disconnected to Priorities
Business priority | Technology priority
#1 Increasing enterprise growth | Analytics and business intelligence
#2 Delivering operational results | Mobile technologies
#3 Reducing enterprise costs | Cloud computing (SaaS, IaaS, PaaS)
#4 Attracting and retaining new customers | Collaboration technologies (workflow)
#5 Improving IT applications and infrastructure | Legacy modernization
#6 Creating new products & services (innovation) | IT management
#7 Improving efficiency | CRM
#8 Attracting and retaining the workforce | Virtualization
#9 Implementing analytics and big data | Security
#10 Expanding into new markets & geographies | ERP Applications
10. Keystones to Initiatives
Business Intelligence | Advanced Analytics | Applications
Innovation and Advantage: Ask bigger questions in the pursuit of discovering something incredible
Operational Efficiency: Perform existing workloads faster, cheaper, better
Data Processing | Data Storage
11. Poised for Innovation
2006-2012: Bringing Compute to Data
2013-???: Bringing Applications to Data
13. Stepwise Progression
[Diagram: steps progressing from Operational Efficiency (IT) to Competitive Advantage (Business). Labels shown: ETL Acceleration; EDW Optimization; Historical Compliance; Any Data Type; Visibility; Ability of Schema; Not Only SQL; Deep BI; Data Consolidation Hub]
“When you look out over today’s business landscape, you see a growing array of systems and tools to tackle Big Data. But there is also a lot of confusion about what the path forward is, where to look for answers, and where to start within your organization. You might question whether these tools and applications are appropriate for your needs. And even more so, are your business objectives even related to this thing called Big Data?
Today, we will show you why Big Data matters to you and your business and how we are making Hadoop ready to take on the challenges within the enterprise.
My request is that, over the course of today, you consider the starting points, the entry points, within your business that might benefit from Hadoop, and we will show you why and how, in no uncertain terms, you can get on track to using platforms like Hadoop to address your immediate and future data management challenges.”
Big Data
A lot has been written; Often technical in tone; Not always a good description; Big Data is “bigger” than that
Related to business issues like:
Not getting it done in time? SLAs for critical business processing; Doing the same reporting for at least 10 years, but now with more data
Trouble making the technical investment? Unclear or inadequate return on investments; Diminishing returns on scale-out of the same DW architecture
Knowingly working with constrained knowledge in decision making? What to keep, archive, trash: decisions more frequent, with more severe consequences; New compliance and new business needs for access to more data; changing economics, changing standard operating procedures
Facing escalating costs of changes? Time, resources, skills, and capital needed to enact change for the business; Changes to status quo reports now take weeks and months; business is unhappy with results and delays
In fact, Big Data takes on lots of questions and forms
Pervasive; Every industry, vertical, market has its version of Big Data problems and answers
This slide; Example of the breadth and depth of Big Data
Real and substantial and measurable
If Big Data is so widespread and applicable, why the confusion?
Been too limited in scope of definition
3 V’s; Too literal a definition; What lies behind these technical characteristics? Not representative of every case; Tail that wags the dog?
Big Data myths
Not only petabytes; Started there with web properties
Not only social streams; 1TB/day intake more commonplace
Not only mashups of different data structures; More commonplace, Social and other unstructured forms?
Not only “shiny objects”; Promise of predictive analytics
Big Data; Simply just data
“The difference, though, is not in the data itself – it’s the changed relationship between you (and people) and data.”
Our perspective, however, shows disconnects in business
Gartner CIO priorities; Business == Predictive analytics only (see #9); Narrow focus, too narrow; Technology == BI only (see #1); Also too narrow
Here today to show how to bridge these gaps
Show the core functions of Hadoop; Delivering operational efficiency and meeting SLAs; Reducing enterprise costs and increasing data ROI; Building enterprise growth by dropping barriers to sharing, breaking data silos
In short, here today to show how Hadoop
Makes work easier, faster, cheaper; How to find the competitive edge
Issues with current data management
Structure
Data no longer fits neatly into rows and columns; Yet, tremendous potential; Huge uptick in this kind of data; “Unstructured” really means valuable, yet inaccessible
Struggle with rate of change to structures; Root of change management SLAs; Schemas as backbone, not designed for change; Considerable energy, time to change
Storage
Not an issue of housing data, but accessibility of data; Tape is inexpensive, yet hugely inaccessible; Extremely expensive to make readily valuable
DW; Accessible, yet cost prohibitive, structured data only
SAN; Appealing, yet only storage, no compute; Still have to move data to get value from it
Sampling/windowing; Alternative; Working with constrained data, not ideal
Network
Network is slowest of all (storage/disk, network, compute); Any data movement == penalty; Reason for SAN/NAS inferiority; Reason for ETL shift to ELT
Segmentation/division
Data and system segmentation; Work in isolation to gain local efficiencies
Worldview of one LOB == particular schema, fidelity of data; Meshing multiple schemas really hard
Segment for compute needs; Segment to reduce resource contention, ensure performance; Leads to separate systems to accommodate
Separate systems == separate data sets; High cost to synchronize, manage separation; And growing
“How then are you to reduce enterprise costs yet increase enterprise value while promoting operational efficiency when the technology drivers behind each business priority are diametrically opposed to one another?”
“Hadoop is like a data warehouse, but it can store more data and more kinds of data and perform more flexible analyses.”
Here is how Hadoop fits in the DM ecosystem
Open source economics; Runs on a variety of systems (industry-standard to engineered systems); Achieve 1-2 orders of magnitude improvement in economics
Distributed, high-throughput computing; Flexible storage; Scale and fault-tolerance
Not the panacea, however
Complement to IT investments; Optimize your workloads across the entire ecosystem; “So everything works better.”
Can use today! Build incrementally
If you are:
Having difficulties with processing deadlines and SLAs
Throwing away good data for analysis
Hitting the upper limits of infrastructure
Needing to connect more people and more data
Seeking the change agent for your business
“Today, we will show you how Hadoop can help.”
Any and all of these scenarios are keystones
Progress with Hadoop has benefits along the way, not just at the end goal
“Each step on this journey can have immediate and meaningful impact to your business and its objectives while propelling you further towards your strategic goals.”
Keystones establish stable bases
Platform for future initiatives; Subsequent projects can build on momentum; Collected knowledge is compounded
“Our belief is this – this next generation data ecosystem, empowered with all of your collected wisdom and information, positions you and your business to take full advantage of not only all of your data, but also all of the upcoming advances and innovations built into and on top of the platform. These evolutions are and will be substantial and represent a significant competitive edge for your business.”
Proven with the history of Hadoop
Distributed, fault-tolerant storage and batch; Familiar query languages; Low-latency databases; Machine learning libraries; Real-time query processing
“If you build upon this platform, you and your business will benefit from the development momentum. So you have to ask yourself, where is the platform heading next – storage, batch, query, then what?”
That’s the really exciting part; Poised for next major transformation with Hadoop
Initial focus of Hadoop; 2006-2012; Bring your compute to your data; Schema-on-Read; Scale-out architectures
Next phase just beginning; 2013-???; Bring your application to your data; Building on Hadoop foundations
Where to next?
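The Schema-on-Read idea named in these notes can be illustrated with a minimal sketch. It assumes records are stored exactly as they arrive and that each reader supplies its own schema when it reads them. The sample records, field names, and the `read_with_schema` helper are hypothetical illustrations, not part of Hadoop or any Cloudera API.

```python
import json

# Hypothetical raw records, kept as they arrived: no schema is enforced on write.
RAW_LINES = [
    '{"user": "ann", "amount": "12.50", "ts": "2013-01-05"}',
    '{"user": "bob", "amount": "7", "channel": "web"}',
    '{"user": "cy"}',
]


def read_with_schema(lines, schema):
    """Project each raw record onto `schema` (field name -> converter) at read time.

    Fields missing from a record come back as None; extra fields are ignored.
    """
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            row[field] = convert(value) if value is not None else None
        yield row


# Two readers apply different schemas to the same stored data.
billing_schema = {"user": str, "amount": float}
marketing_schema = {"user": str, "channel": str}

billing_rows = list(read_with_schema(RAW_LINES, billing_schema))
marketing_rows = list(read_with_schema(RAW_LINES, marketing_schema))
```

The point of the sketch is that adding a new field or a new consumer changes only the schema passed in at read time; the stored data does not have to be reloaded or restructured first.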
Still grounded; Still here in reality, today
Hadoop is a complement to IT infrastructure; Will always be part of the greater toolset for you
Our goal
Mature the offering; Unify management; Simplify integration
Why?
Not a point solution; Hadoop is a platform; “The platform for building businesses that can meet and exceed the challenges of the fundamental changes enveloping the marketplace”
For that to happen; Hadoop as platform; Need to mature and become “enterprise ready”
One company’s adoption
Example of progression; We believe it represents a natural, sustaining, repeatable, profitable path
Starts with singular initiative
Resolve immediate, tactical need; Blossom into new projects only ever imagined; Just getting started
EDW at capacity: ETL processes consume 7 days; takes 5 weeks to make historical data available for analysis
Performance issues in business critical apps; little room for discovery, analytics, ROI from opportunities; Spending 44% of its resources on operational functions and 42% on ETL processing, leaving only 11% for analytics and discovery of ROI from new opportunities
Cloudera Enterprise offloads data storage, processing & some analytics from EDW; EDW can focus on operational functions & analytics; Saves millions by optimizing existing DW for analytics & reducing data storage costs by 99%
Now stabilized systems
New projects ready to go, build on initial projects
Example has shown us
“We should focus on the steps, but be mindful of the journey.”
Discussed our vision for Hadoop
Platform for Big Data initiatives; Value in the end state; Value in the process and steps
Start small and start focused
Know that Hadoop will grow with you; Hadoop can fit to your ambitions and priorities
Focus on the topics and starting points covered today
Integrate; How you can streamline and accelerate your existing data integration processes
Collect; How you can keep more of your data active for fuller analysis and better decisions
Connect; How you can put more of your data together to foster sharing and build value
Manage; How you can make Big Data easier to use and to supervise within your organization
Our recommendation
Identify with one of the upcoming sessions; Start there
“A straightforward and focused task, yet you should have the confidence that when the time comes to take that next step towards a broader objective, to face the challenges posed by the macro events affecting all the business world, Hadoop will grow as you grow, and Cloudera will be your partner in these endeavors. That’s the promise of Hadoop, and that’s our promise to you.”