Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Series: Gartner

You will also learn how to understand the key challenges of deploying a Hadoop cluster in production, manage the entire Hadoop lifecycle from a single management console, and deliver integrated management of the entire cluster to maximize IT and business agility.

  1. MODERNIZING YOUR IT INFRASTRUCTURE WITH HADOOP. Merv Adrian, VP Research, Gartner. Charles Zedlewski, VP Product, Cloudera.
  2. "Big Data" Crystallizes Extreme Information Management Challenges. "Big Data" refers to high volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision-making. They require extreme information management: the concept that your current information infrastructure must be intentionally managed along 12 complementary dimensions to meet the challenges of the 21st century Information Age. [Diagram labels: Perishability, Fidelity, Validation, Linking, Classification, Contracts, Technology, Pervasive Use, Velocity, Volume, Variety, Complexity]
  3. Where Does Big Data Come From? [Diagram labels: Email, Transactions, Enterprise, Partner, Employee, "Dark Data", Customer, Supplier, Contracts, Observations, Public, Commercial, Credit, Weather, Sensors, Population, Social Media, Economic, Network, Sentiment] Correlations and patterns from disparate, linked data sources yield the greatest insights and transformative opportunities.
  4. The Big Data Challenge: Putting Together the Pieces Quickly and Efficiently. Through 2015: • 85% of Fortune 500 organizations will be unable to exploit big data for competitive advantage. • Business analytics needs will drive 70% of investments in the expansion and modernization of information infrastructure. [Diagram labels: INERTIA, RISKS, SKILLS; INFRASTRUCTURE, LEADERSHIP, ANALYTICS, INVESTMENT, ORGANIZATION, ARCHITECTURE]
  5. Interest in "Big Data" Is Rising Rapidly, Though "Hadoop" Remains Steady. [Chart: client searches for "Big Data" and "Hadoop" on gartner.com over 12 months ending June 12; vertical axis from 0 to 500; series: Hadoop, Big Data]
  6. Hadoop's A Good Idea, But Confusion Has Slowed Commercial Adoption • IT organizations are confused, but want to make a decision • The standards steward, the Apache Software Foundation, distributes some, but not all, "projects" in Hadoop distributions • Many distributions exist, and they differ in their components and their "openness" • Choosing the right distributions should be driven by business analytics needs
  7. Maturity is Growing and Will Spur Adoption As Marketing Ramps Up • Wave of new releases, beginning with Apache Hadoop 1.0 and 2.0 (alpha), has crystallized the platform • Current versions are beginning to address early problems with availability, security, performance • Distribution vendors add the features and support that enterprises require • Rapidly evolving ecosystem includes differentiated distributions, data integration, business intelligence, and services vendors
  8. Recommendations • Don't surrender to hype-ocracy. This is early, maturing technology. • Deploy when you have a clear use case, not "to play with the technology" • Use a commercially supported distribution when you move to production; until then, experiment "in the cloud" or on existing hardware with "free" downloads • Choose the distribution based on business need; leverage specific, supported projects • Skills will be key: define "data science" needs and hire/train for them • Manage your big data; don't abdicate. Practice Extreme Information Management.
  9. CLOUDERA: THE STANDARD FOR APACHE HADOOP IN THE ENTERPRISE. CHARLES ZEDLEWSKI, VP PRODUCT
  10. 1 HIGH AVAILABILITY: THERE'S NO DOWNTIME. YOUR DATA IS ALWAYS AVAILABLE FOR DECISIONS. 2 GRANULAR SECURITY: PROCESS AND CONTROL SENSITIVE DATA WITH CONFIDENCE. 3 ROBUST MANAGEMENT: ACHIEVE OPTIMAL PERFORMANCE VIA CENTRALIZED ADMINISTRATION. 4 SCALABLE AND EXTENSIBLE: ADAPTS TO YOUR WORKLOAD AND GROWS WITH THE BUSINESS. 5 CERTIFIED AND COMPATIBLE: EXTEND AND LEVERAGE EXISTING INFRASTRUCTURE INVESTMENTS. 6 GLOBAL SUPPORT AND SERVICES: ACHIEVE SLAs AND ADHERE TO EXISTING IT POLICIES.
  11. Use cases by vertical, each with its industry term for (1) ADVANCED ANALYTICS and (2) DATA PROCESSING. Web: Social Network Analysis; Clickstream Sessionization. Media: Content Optimization; Engagement. Telco: Network Analytics; Mediation. Retail: Loyalty & Promotions Analysis; Data Factory. Financial: Fraud Analysis; Trade Reconciliation. Federal: Entity Analysis; SIGINT. Bioinformatics: Sequencing Analysis; Genome Mapping.
  12. CDH4, Cloudera's Distribution Including Apache Hadoop (CDH): STORAGE, COMPUTATION, ACCESS, INTEGRATION. Big Data storage, processing and analytics platform based on Apache Hadoop, 100% open source. Cloudera Enterprise 4.0. Cloudera Manager: DEPLOYMENT, CONFIGURATION, MONITORING, DIAGNOSTICS & REPORTING. End-to-end management application for the deployment and operation of CDH. Production Support: ISSUE RESOLUTION, ESCALATION PROCESSES, KNOWLEDGE BASE, OPTIMIZATION. Our dedicated team of experts on call to help you meet your Service Level Agreements (SLAs). Cloudera University: Equipping the Big Data workforce, 12,000+ trained. Partner Ecosystem: 250+ partners across hardware, software, platforms and services. Professional Services: Use case discovery, pilots, process & team development.
  13. All the industry leaders integrate with CDH. CDH4: STORAGE, COMPUTATION, ACCESS, INTEGRATION. Big Data storage, processing and analytics platform based on Apache Hadoop, 100% open source. [Partner categories: BI / Analytics, Data Integration, Database, OS / Cloud / Sys Mgmt, Hardware]
  14. END-TO-END MANAGEMENT APPLICATION FOR APACHE HADOOP. 1 DEPLOY: INSTALL, CONFIGURE AND START YOUR CLUSTER IN 3 SIMPLE STEPS. 2 CONFIGURE & MANAGE: ENSURE OPTIMAL SETTINGS FOR ALL HOSTS AND SERVICES. 3 MONITOR, DIAGNOSE & REPORT: FIND AND FIX PROBLEMS QUICKLY, VIEW CURRENT AND HISTORICAL ACTIVITY AND RESOURCE USAGE.
  15. REQUIRED SKILLS • LINUX ADMIN OR DBA BACKGROUND • JAVA KNOWLEDGE • NETWORKING KNOWLEDGE. RESPONSIBILITIES • KEEP IMPORTANT WORKLOADS WITHIN SLA • INSTALL, CONFIGURE AND UPGRADE HADOOP • MONITOR SYSTEM HEALTH & PERFORMANCE • PLAN FOR THE FUTURE
  16. CLOUDERA MANAGER DEMO
  17. Q&A
  18. THANK YOU! REGISTER NOW FOR THE REMAINING 'POWER OF HADOOP' WEBINARS: REALIZING THE PROMISE OF BIG DATA WITH HADOOP (FORRESTER AND CLOUDERA, THURSDAY, JULY 26, 11AM PST); WHAT THE HADOOP: WHY YOUR BUSINESS CAN'T AFFORD TO IGNORE THE POWER OF HADOOP (GIGAOM AND CLOUDERA, WEDNESDAY, AUGUST 29, 10AM PST); THE BUSINESS ADVANTAGE OF HADOOP: LESSONS FROM THE FIELD (451 RESEARCH AND CLOUDERA, WEDNESDAY, SEPTEMBER 26, 10AM PST)

Editor's Notes

  • The 21st century CIO will effectively have three major information management issues coming together in an almost coordinated, simultaneous "perfect storm." "Big data" is a first taste of the extreme information challenges that will become increasingly difficult to address. The concept is one of data volumes so large that, even as storage technology and network infrastructures increase in capacity, the data simply grows at a faster rate. In addition to volume, information assets will exhibit irregular rates of change, governance rules will become increasingly fluid, new asset types will continue to emerge and the desire of analysts at all levels to use that data will increase exponentially. All of this in combination begins to define the extreme information management environment, and any one of 12 different dimensions can overcome your existing systems, let alone two or more of them in combination. The combination of consumerization and mobility has only begun to explore the many different types of information assets that will be introduced as well as the many information "create" or "write" situations that will occur. This is only one obvious example of the proliferation of information sources that will highlight the importance of understanding how information assets are linked together and their inherent reliability. Comprehensive, complete and deliberate architectures are the only hope for creating standards or guidance to deal with the massive "news to noise" ratios that are pending and the wide diversity of information assets and the resulting use cases. Action Item: Determine if the organization will develop in-house architectural expertise to lead the enterprise strategy based on hiring and training, or if it will follow a strategy of out-sourcing for this expertise.
  • By far, Hadoop is the technology getting the most attention in the market for analytic use cases involving big data at rest. Its early usage in Web 2.0 companies was largely designed for batch processing of large volumes of data, data mining-style, so it's not surprising that this should be so. And while there is more to Hadoop than this, client inquiries fielded by the information management team reinforce our position that it is the most typical catalyst for Hadoop consideration.
  • While its acquisition costs are lower, using open-source Apache Hadoop entails bigger risks than traditional commercial database environments: it is less mature, it is fragmented, and Apache lacks a commercial support organization. Other open-source offerings have a similar profile, but typical Hadoop implementations consist of more independent "moving parts" and version numbers than most. Enterprises can construct and maintain a Hadoop solution themselves if they have the time, resources and expertise to do so. Preferably, they will choose a commercial Apache Hadoop distribution whose component projects are pre-integrated and may be backed by the vendor's support. More mature distributions provide scripts to install all the pieces, often with a graphic script that allows the user to choose the pieces appropriate for their specific needs, hardware and network setup, etc. Numerous vendors offer Hadoop distributions with pre-integrated components, but the vendors have varying levels of credibility and experience, and offer different combinations of projects at different release stages. Data management leaders can easily go wrong. However, commercial distributions include different projects along with the core Hadoop projects, and no commercial distributions include or support all available projects. Distributions also have varying release levels of the included projects and update them at different rates. Thus, data management leaders run the risk of choosing a Hadoop solution that does not meet enterprise needs.
  • Choose one of two options, depending on your needs and circumstances: build a custom stack, or use a distribution from a major Apache Hadoop provider. In the latter case, subscribing to support is recommended.
  • Choose your approach (custom or distribution) based on your team's skills, whether this is a tactical experiment or a strategic initiative, and on how well available distributions fit your use case.
  • Balance the enterprise's longer-term needs with immediate pressures to deliver, and consider projects that may be needed for future initiatives.
  • Evaluate the distribution vendor as a whole if you plan to run Apache Hadoop for the long term. Look at financial viability, support capabilities, partnerships and future technology plans. Above all, talk to reference customers; treat distributions as you would development tools or workbenches.
