
Building a Modern Analytic Database with Cloudera 5.8

1,974 views


Analytic workloads, and the ability to determine “what happened,” are among the most common use cases across enterprises today, helping you understand and adapt to changing trends. However, most businesses today can see only a piece of the story. Analytics are limited by the amount of data that can be stored and ultimately accessed; it is time-intensive to bring in new datasets or fit unstructured data into rigid schemas; and user access is constrained to a select few who must already know the questions they’re trying to answer.
It’s no surprise that big data is disrupting this modus operandi for analytics. A modern, Hadoop-based platform is designed to help businesses break free of these analytic limitations, providing a new kind of adaptive, high-performance analytic database. The recent release of Cloudera 5.8 continues to advance Cloudera Enterprise as the foundation for these analytic workloads.
Join Justin Erickson, Senior Director of Product Management at Cloudera, and Andy Frey, Chief Technology Officer at Marketing Associates, as they discuss:
- What technology is needed to build a modern analytic database with Hadoop
- What’s new with Cloudera 5.8
- How to align your teams around agile analytics
- Real-world success from Marketing Associates
- What’s next for Cloudera Enterprise’s Analytic Database

Published in: Software

Building a Modern Analytic Database with Cloudera 5.8

  1. © Cloudera, Inc. All rights reserved.
     Building a Modern Analytic Database with Cloudera 5.8
     Justin Erickson | Sr Director of Product | Cloudera
     Andy Frey | CIO | Marketing Associates
  2. Agenda
     • Building a Modern Analytic Database with Hadoop
     • Key Use Cases Enabled
     • What’s New with Cloudera 5.8
     • Marketing Associates Customer Case Study
     • What’s Next?
  3. Common Application Patterns (Operational Efficiency, New Business Value)
     Platform layers: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate
     • Data Engineering & Science: Process data, develop & serve predictive models
     • Analytic Database: ELT, reporting, exploratory business intelligence
     • Operational Database: Build data-driven applications to deliver real-time insights
  4. Analytic Database
     • Self-Service BI & Data: More data of all types is being tapped for analytics, across environments
     • Real-Time Analysis: Open up new possibilities for real-time insights as data changes
     • Converged Workloads: BI & analytics are critical but only tell part of the story. Get more value by sharing data across workloads
  5. Key Use Cases
     • EDW Optimization: Use your EDW more efficiently by offloading workloads to Hadoop
     • Data Preparation: Fast, flexible ETL over large data volumes, so data is always ready for your business
     • Self-Service BI & Exploration: Fastest time-to-insights with a modern analytic database designed with Hadoop’s flexibility and agility
  6. Cloudera’s Analytic Database Solution
     Platform layers: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate
     • Navigator Optimizer: Identify, offload, & optimize workloads to Hadoop
     • Hue: Intelligent SQL editor
     • Navigator: Audit, lineage, encryption, key management, & policy lifecycles
     • BI Partners: Integration with the leading BI tools
     • Impala: Interactive query engine for BI & SQL analytics
     • Hive-on-Spark: Large-scale ETL & batch processing engine
  7. ETL & Data Preparation
     • Flexible & Scalable
       • Process larger data volumes, of any type
     • Fastest Data Processing
       • Distributed processing and best-of-breed technologies for the fastest performance
     • Minimize Data Movement
       • Prepared data immediately available for analytics with shared storage and metadata
  8. Self-Service BI & Exploratory Analytics
     • Self-Service Data Agility
       • No rigid data modeling encumbrances for agile acquisition
       • Iteratively analyze and flexibly model
     • Self-Service Exploratory Analytics
       • Interactive responses for iterative exploration
       • Confidently handle all BI and SQL users
     • Cost-Effective Scalability with Users/Data
       • Easily add nodes to handle more data and users
       • Leverage the full potential of available data
     • Productively Use Existing Tools and Skills
       • Integration with all leading BI tools & compatible analytic SQL language
       • Metadata and lineage for easy data discovery
       • Intelligent SQL editor for greater developer productivity
  9. Optimize the Enterprise Data Warehouse
     • Decrease Storage Costs
       • Focus on high-value reporting data in the EDW
     • Keep More/All Data Online
       • Unlimited scale keeps data accessible and out of archive
     • Improve Performance
       • Eliminate contention and meet SLAs for routine reporting
     • Get New Insights
       • Enable ad hoc and exploratory analytics
     [Chart: Siemens’ TCO Assessment (cost/TB)]
  10. What’s New in Cloudera 5.8
  11. Advancements with Cloudera 5.8
      Impala
      • Cloud-Native:
        • Read/write directly from Amazon S3
      • Performance:
        • >10x faster performance on secure clusters
      Hue
      • Data Discovery:
        • Preview, tag, search, pin tables in browser
      • Query Design Assistance:
        • Autocomplete of tables, columns, syntax
        • Efficient troubleshooting
      • Collaboration & Sharing:
        • Save & share queries with peers
        • Set permissions directly on results
      Navigator Optimizer
      • Now GA!
      • Ease offloading path to Hadoop
      • Active Data Optimization to enable peak performance for Hive and Impala
  12. Self-Service Data Discovery & BI at Marketing Associates
      Andy Frey
  13. About Me – Andy Frey
      From Assembler to Ajax, Modem to Mobile, and Mainframe to Cloud, Andy Frey developed his deep knowledge as a technologist and CIO at leading national corporations such as GAB Robins, Compuware, J. Walter Thompson, Coolfire, and now Marketing Associates, providing Fortune 100 corporations with technologically advanced enterprise solutions.
  14. Introducing Magnify and Marketing Associates
      • Magnify Analytic Solutions (a wholly-owned division of Detroit, Michigan-based Marketing Associates serving primarily Fortune 100 clients) uses technology-driven data analysis to offer clients a range of informed business services that increase profitability through its four lines of service: business intelligence, digital intelligence, credit risk management, and marketing analytics.
      • Established in 1967, Marketing Associates is a full-service, technology-enabled marketing services company headquartered in Detroit, Michigan, with offices in Wilmington, Delaware, and Charlotte, North Carolina. MA offers private and public cloud hosting, custom web development, and data transformation among its IT-based services.
      • Offering Cloudera Hadoop IaaS and experienced Data Scientists
  15. Different Challenges for Different Clients
      The B2C Challenge
      • Previously used expensive RDBMS systems to deliver B2C marketing contests and product giveaways, up to 150 in a year.
      • Huge spikes in web event data posed challenges: 200,000 hits in the first minute for popular brands’ campaigns.
      • The cost to license for the biggest spike made projects unprofitable.
      • Also needed to monitor and manipulate massive amounts of data in real time. The RDBMS could not respond adequately during massive data intake while a campaign was running: “When has a campaign reached its limit? Has total supply of product been allocated?”
  16. Different Challenges for Different Clients
      The CRM Challenge
      • Another project for a large client involved managing a repository of customer data from multiple sources. The magnitude was vast, the data was multi-structured, and new sources were being added on a regular basis.
      • Initially executed using 4 relational databases; query times slowed and costs soared.
      • Difficulty merging unstructured data from multiple sources using a traditional RDBMS.
      • Deployment of a prominent SQL RDBMS was estimated at $5 million (approx. 150 terabytes).
  17. Evaluation & Decision
      Key criteria for modern analytic database:
      • Handle huge spikes in web event data.
      • Manage and manipulate massive data volumes in real time.
      • Scalability and performance.
      • Ability to transfer skills from the current SQL-based programming team.
      • Reduce costs.
  18. Evaluation & Decision
      Key criteria for modern analytic database:
      • Handle huge spikes in web event data.
      • Manage and manipulate massive data volumes in real time.
      • Scalability and performance.
      • Ability to transfer skills from the current SQL-based programming team.
      • Reduce costs.
      Considered various offerings:
      • Considered SQL Server (discarded due to cost).
      • Knew Hadoop could be the solution and started looking at commercial implementations.
      • Considered non-commercial & shortlisted two Hadoop vendors: Cloudera & Hortonworks.
      • Determined non-commercial too risky, too burdensome – left it to the experts.
  19. Evaluation & Decision
      Key criteria for modern analytic database:
      • Handle huge spikes in web event data.
      • Manage and manipulate massive data volumes in real time.
      • Scalability and performance.
      • Ability to transfer skills from the current SQL-based programming team.
      • Reduce costs.
      Considered various offerings:
      • Considered SQL Server (discarded due to cost).
      • Knew Hadoop could be the solution and started looking at commercial implementations.
      • Considered non-commercial & shortlisted two Hadoop vendors: Cloudera & Hortonworks.
      • Determined non-commercial too risky, too burdensome – left it to the experts.
      • Launched June 2014
      • Why Hadoop: Cost, Tech Requirements, Data Size
      • Why Cloudera: Most mature solution with better overall enterprise toolset; Cloudera Team Decision
  20. Solution
      • Hadoop Platform:
        • Cloudera Enterprise
      • Hadoop Components:
        • Apache Flume, Apache Sqoop, Apache Hive, MapReduce, Apache Impala (incubating), Hue, Cloudera Manager
      • Third-Party BI & Analytic Tools:
        • D3.js, SAS, Tableau, R, Angoss
      • Security Tools:
        • Kerberos, Apache Sentry, Cloudera Navigator
  21. Solution: Self-Service Data Discovery & BI
      • Self-service data discovery capabilities allow us to eliminate the need to distribute multiple Excel reports, instead allowing our clients to interact directly with Hadoop.
      • Security is enhanced because the need to distribute Excel reports via email went away.
      • Using Tableau to run Impala queries produces real-time reporting, a significant value add and convenience for our clients.
      • Offers scalability and flexibility to accommodate diverse and growing client demands.
      • Allows us to scale our web event product giveaways.
      • Accommodates the addition of new data sources.
      • Easily add nodes to avoid potential performance bottlenecks.
  22. Why We Chose Cloudera
      • Cloudera Manager became a major differentiator.
        • Made cluster management easy
        • User friendly = reduced learning curve
      • Chose Impala for its real-time query performance.
      • Proven Cloudera innovation and zeal to maintain an enterprise class solution by offering new tools and functions while maintaining/supporting the Apache project.
      • Cloudera appeared to be the prominent choice of large Hadoop installs in the Fortune 500. Best IT Analyst rating.
      • Impressed with Cloudera team before purchase.
  23. Benefits & Impact
      Why we are glad we chose Cloudera
      • All-inclusive Cloudera Enterprise costs less than the required relational database licenses alone: over 90% cost reduction.
        • Other benefits: cheaper hardware and easier to manage.
      • Cloudera Navigator provides a single interface to locate and classify data, audit who is accessing what data, and protect the data with centralized key management.
        • Critical tool when handling PII and other sensitive data. Comprehensive audit trail allows for easy monitoring of PII data access.
        • Allows us to satisfy strict security compliance regulations with ease.
      • Cloudera Professional Services are knowledgeable, responsive, and help establish best practices for our internal development team. They helped us get it right the second time. Any time we had a crisis they were there to help.
  24. Lessons Learned
      • First used non-Cloudera consulting: Big mistake – design incorrect for data collected. Work with Cloudera Professional Services to design it right the first time.
      • Start small, if you can, and grow the solution.
        • Don’t need a big capital investment upfront
        • Get value out of a small cluster (e.g. 3 nodes) and expand as needed.
        • Install services to meet your current needs. Install additional services as your data needs change.
      • Look at all Cloudera solutions, learn them, and use them.
      • Training: Be generous, conduct in phases to keep new skills relevant as you build and deploy.
      • What’s next?
        • Prebuilt analytical models as a platform.
        • Evaluate Navigator Optimizer to improve query performance and identify best candidates for legacy application migration
  25. What’s Next for Cloudera’s Analytic Database?
  26. Analytic Database Roadmap
      Faster, richer, more expressive SQL
      • Hive-on-Spark GA
      • Insert, update, delete via Kudu
      • Performance improvements
      • Nested JSON
      Improved multitenancy
      • Fewer OOM errors
      • Graceful node decommission
      • Admission control enhancements
      • Improved YARN integration
      Better SQL workbench
      • Higher Hue concurrency
      • SQL editor usability improvements
      • Intelligent recommendations of tables, joins & more for Hue users
      • Exposing tags & lineage through the Hue query experience
      Deeper integration with BI tools
      • Joint workload optimizations
      • Support for nested types and s
      • Data discovery functionality injected into the BI experience
      Workload optimization
      • Multi-platform workload profiling
      • Recommendation of in-line materialized views
      Confidential – Do not Redistribute
  27. Next Steps
      • Download Cloudera 5.8
        • cloudera.com/downloads
      • Release Notes
        • cloudera.com/documentation/enterprise/release-notes/topics/rg_release_notes.html
      • Learn more about Navigator Optimizer and BI in the Cloud
        • Register for Parts 2 & 3 of the Webinar Series!
        • cloudera.com/about-cloudera/events/webinars/5-8-webinar-series.html
  28. Questions?
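
Slide 15 poses two questions that must be answered while a giveaway is taking a burst of traffic: has the campaign reached its limit, and has the total supply of product been allocated? The following is a minimal, single-process sketch of that bookkeeping only, not the deck's actual implementation, which is not shown. The `Campaign` class, the `process_hits` helper, and the supply of 50,000 units are hypothetical; the 200,000-hit burst is the figure quoted on the slide.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical giveaway campaign with a fixed supply of product."""
    name: str
    total_supply: int
    allocated: int = 0

    def try_allocate(self) -> bool:
        """Allocate one unit if any supply remains; report whether it succeeded."""
        if self.allocated >= self.total_supply:
            return False
        self.allocated += 1
        return True

    @property
    def limit_reached(self) -> bool:
        """True once every unit of supply has been allocated."""
        return self.allocated >= self.total_supply


def process_hits(campaign: Campaign, hits: int) -> int:
    """Run a burst of web hits through the campaign; return the units awarded."""
    awarded = 0
    for _ in range(hits):
        if campaign.try_allocate():
            awarded += 1
    return awarded


# Assumed supply of 50,000 units; 200,000 hits is the first-minute spike from slide 15.
campaign = Campaign(name="example-giveaway", total_supply=50_000)
awarded = process_hits(campaign, hits=200_000)
print(f"awarded={awarded} limit_reached={campaign.limit_reached}")
```

In a real deployment the counter would have to be shared across many web servers and queried while data is still arriving, which is the part the slide says the RDBMS could not keep up with.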
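
The cost figures on slides 16 and 23 can be put on a per-terabyte footing with simple arithmetic. This sketch assumes the $5 million estimate and the roughly 150 TB volume from slide 16 describe the same deployment. Applying the "over 90%" reduction from slide 23 to that estimate is purely illustrative: the deck does not say the two figures refer to the same project. The function names are hypothetical.

```python
def cost_per_terabyte(total_cost_usd: float, capacity_tb: float) -> float:
    """Return the cost in USD per terabyte for a deployment estimate."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return total_cost_usd / capacity_tb


def cost_after_reduction(baseline_usd: float, reduction_percent: float) -> float:
    """Return the cost remaining after a percentage reduction from the baseline."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return baseline_usd * (100 - reduction_percent) / 100


# Figures quoted on slide 16: about $5 million for about 150 terabytes.
rdbms_estimate_usd = 5_000_000
capacity_tb = 150

rdbms_per_tb = cost_per_terabyte(rdbms_estimate_usd, capacity_tb)

# Assumption: treat "over 90% cost reduction" (slide 23) as exactly 90% of this
# estimate, giving an upper bound on the remaining cost under that assumption.
upper_bound_usd = cost_after_reduction(rdbms_estimate_usd, 90)

print(f"RDBMS estimate: ${rdbms_per_tb:,.2f} per TB")
print(f"Illustrative upper bound after 90% reduction: ${upper_bound_usd:,.2f}")
```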
