The document discusses Cisco's Hadoop as a service offering on their Intercloud platform. Some key points:
- Cisco provides managed Hadoop, including Cloudera's distribution, on optimized instances with local storage and object storage. This offers a scalable, reliable, and secure environment for Hadoop workloads.
- Use cases discussed include predictive maintenance using IoT data and analyzing customer journeys across multiple channels.
- A pilot test showed Cisco's platform could process over 100 million records from production data across various Hadoop jobs.
- Cisco also discusses its data virtualization product, CiscoDV, which can integrate data across on-premises sources and cloud sources on Cisco and AWS.
2. Hadoop in the Cisco
Cloud
Kartik Kanakasbesan
kartikka@cisco.com
3. • Introduction
• What is Cisco Cloud services?
• Cisco’s Big Data as a service
• Why Hadoop in the Cloud?
• Use Cases for Hadoop in the Cloud
• Customer Experiences so far
• Going forward
Agenda
4. The Intercloud
The globally connected network of clouds
Enterprise
Private
Clouds
Public
Clouds
Intercloud
Alliance
Intercloud
Services
INTERCLOUD
Intercloud
Providers
CIMK v1.0
5. The Intercloud
Customer Value
INTERCLOUD CONTROL
Across services, in
every location
COMPLIANCE
Manage risk locally
and globally
CHOICE
Cloud the way
you need it
Public
Clouds
Enterprise
Private
Clouds
Intercloud
Alliance
Intercloud
Services
Intercloud
Providers
with…
and…
Intercloud
Services
6. Worldwide leader in
cloud building, cloud
services and
managed services
Extensive global
partner network
Global Scale
Workload mobility
Open Standards
Customer Value
Cisco
Cisco & OpenStack – Delivering value
430 Companies and
growing
17000+ individual
members
2655 Cumulative
contributions
OpenStack
7. Cisco Intercloud Services
OpenStack and
standards based
cloud
OPEN
STANDARDS
Self-service
infrastructure
enabling
application
lifecycle
GLOBAL
SCALE
PUBLIC
CLOUD
PLATFORM APIS
Workload mobility
with control and
compliance
Empowering
developers
and cloud-
scale
applications
RAPID
INNOVATION
Best-of-breed
of Cisco’s
products and
best practices
8. Cisco Intercloud Services: Target Customers
• Hybrid workloads
requiring common
network and
security policies
Enterprise
• Value-added
services with
NGN/NFV
• Federation
capabilities
Network-Based
Service Providers
Developers
• SaaS, Network-
centric workloads
• IOT/IOE, SP Video,
Collaboration, and
Mobility workloads
9. IoT World Forum Reference Model
Levels (Center at the top, Edge at the bottom)
Level 7: Collaboration & Processes (Involving People & Business Processes)
Level 6: Application (Reporting, Analytics, Control)
Level 5: Data Abstraction (Aggregation & Access)
Level 4: Data Accumulation (Storage)
Level 3: Edge Computing (Analysis & Transformation)
Level 2: Connectivity (Communication & Processing Units)
Level 1: Physical Devices & Controllers (The “Things” in IoT)
Sensors, Devices, Machines, Intelligent Edge Nodes of all types
11. Cisco Intercloud services – platform
Cloud-Centric Networking, Security, Policy
Core Cloud Services Analytics Building Blocks
Application
Enablement Tools
Enterprise/Hybrid Services
Cisco Micro Services
Collaboration SP Video Analytics
Inter-Region Virtual
Private Backbone
Automated Private
VPN Connectivity
Network-optimized
Workload Placement
Managed Public + Private Cloud
Marketplace for Cisco, Third Party ISV, Enterprise Applications and Services
Third Party Open
Source Tools
Security Network/Device Management IOE
Global SP Backbone
Dashboard
Basic Cloud
Resource
Monitoring
Network
Performance
Metrics
Deployment/
Management Tools,
PaaS
VNF Library Orchestration, Auto-Scaling
Federated Network
& Security Policies
Compute Storage Database Virtual Network LB VPN Hadoop
Service Chaining
Data Virtualization
Intercloud Fabric support for
heterogeneous environments
Data Ingest
App-Level Sovereignty, Privacy Policies
12. Market place
3rd party
algorithms etc.
OpenStack APIs, SQL, REST
Data Ingestion as a
service
Hadoop as a
service
Machine Learning
as a service
Data Warehouse
services
Data Virtualization
services
Other….
Vision for Cisco Cloud Provided Data Service*
IOE/IOT Applications
Proactive
Maintenance
Manufacturing
Apps
Machine as a
service
Oil and Gas
Service Provider Analytics Apps
Network
Diagnostic
Service
Provider
Analytics
Customer
Loyalty
Analytics
Feature
Analytics
Collaboration Analytics Apps
Telepresence
Analytics
Collaboration
Analytics
Social Analytics
Sentiment
Analysis
Others Applications
Marketing Apps
Availability
Analytics
Demand
Planning
Sentiment
Analysis
Deliver an
integrated and
managed
environment of
these primitives
Deliver
analytics
applications to
customers
(Hybrid, on-premises, or
Software as a service)
DataSources
Remove the
burden of
managing the
infrastructure
Allow organizations
and lines of
business to focus
on Market
Opportunities and
develop Analytics
Applications
CIS Provided Data Services
*Subject to change based on market feedback
13. Cisco’s Big Data as a service*
Hadoop as a service (aaS)
Data Ingestion -aaS
Visualization -aaS
Machine learning -aaS
Data Virtualization- aaS
Analytics -aaS
• All these services
need
• Provisioning
• Monitoring
• Scaling
• Consumption model
• Integrated
• Individually
• Cisco Branded service
• Minimal Flexibility
on Vendor choice
Provisioned on
Big Optimized instances
• Local Storage
• Object and Block Storage
*Subject to change based on market feedback
14. Cisco’s Hadoop as a service*
Hadoop
Reliable,
Secure &
Monitored
HaaS
Cisco’s Hadoop as a service
• Provides market leading Cloudera’s
Hadoop distribution
• Flexibility to deploy Hadoop optimized
templates for Streaming and Batch
processing
• Data ingest with Apache Kafka
• Support Apache Spark Stack
• Core, SQL, MLlib, & GraphX
• Running on YARN
• Secure access to Hadoop APIs
• Integrate with on-premises Hadoop
distributions (if needed)
OpenStack
*Subject to change based on market feedback
16. Why does Hadoop in the Cloud make sense?
• Reducing barriers in adopting Hadoop
• Cloud and Hadoop provide the perfect
solution to “test” the Hadoop waters
• Help customers build IoE/IoT applications
faster with Cisco’s solutions
• Run your Hadoop dev/test workloads in
the cloud and provision them on premises
• Leverage Cisco’s Networking capabilities
as a differentiator for your capabilities
• Provide consistent policies on the cloud
just like on-premises
• Provide a scalable, reliable, and secure
environment
17. • $16B by 2020*
• Projected to grow at a 70.8% CAGR
• More than 20 players in the market
• Highly fragmented: Amazon, Azure, IBM, Google, Rackspace, and many more
• North America is the leading market
• Europe is further behind
• AP markets are maturing fast
Hadoop as a service(HaaS) Market size & forecast
*Source:GigaOM
18. Use Case: Preventative Maintenance
Data ingestion Data Processing
In-memory
database
Hadoop
In-memory
querying
Real-time
query with
low latency
1000’s of robots streaming
messages (structured &
unstructured data)
Lambda architecture in the Cloud for
IoT/IoE (elastically scalable and secure)
Data
aggregation at
the plant floor
done by a
Cisco UCS
box
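The Lambda architecture named on this slide pairs a batch layer (Hadoop) with a low-latency speed layer (the in-memory database) and merges the two at query time. The following is a minimal Python sketch under stated assumptions: the robot IDs, message shape, and function names are hypothetical, and plain in-process counters stand in for Hadoop and the in-memory store.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute message counts per robot from the full history."""
    return Counter(msg["robot_id"] for msg in master_dataset)

class SpeedLayer:
    """Speed layer: incrementally count messages not yet covered by a batch run."""
    def __init__(self):
        self.realtime_view = Counter()

    def on_message(self, msg):
        self.realtime_view[msg["robot_id"]] += 1

def query(robot_id, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[robot_id] + speed.realtime_view[robot_id]

# Hypothetical data: history already processed by the batch layer ...
history = [{"robot_id": "r1"}, {"robot_id": "r1"}, {"robot_id": "r2"}]
batch = batch_view(history)

# ... and messages that arrived after the last batch run.
speed = SpeedLayer()
for msg in [{"robot_id": "r1"}, {"robot_id": "r3"}]:
    speed.on_message(msg)

r1_total = query("r1", batch, speed)  # batch count plus real-time count
r3_total = query("r3", batch, speed)  # seen only by the speed layer so far
```

In a real deployment the real-time view would be discarded and rebuilt each time a batch run catches up; the sketch omits that step.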
19. Use Case: Omni-Channel Customer Journeys
Server
Logs
Social
& Chat
Mobile
Event
Streams
Call
Center
S/W
Download
Open Trouble
Ticket
Assign
Engineer
Update
Trouble Ticket
Close Trouble
Ticket
Resolve
Trouble Ticket
Read Support
Documents
View Design
Documents
View Tech
Documents
New
Registration
Bug Search FAQs
Contract
Details
Product
Details
Device
Coverage
Interaction Touch points
Channels
Journey
Case Resolution
Software Upgrade
The customers’ interaction with Cisco across multiple touch points to get the desired business
outcome.
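One way to picture the journey view on this slide is to group events from every channel by customer and order them in time. The sketch below is illustrative only: the event tuples, customer IDs, timestamps, and the pairing of touch points with channels are hypothetical, not taken from the pilot data.

```python
from collections import defaultdict

# Hypothetical events: (customer_id, timestamp, channel, touch_point)
events = [
    ("c1", 3, "Call Center", "Open Trouble Ticket"),
    ("c1", 1, "Server Logs", "Bug Search"),
    ("c2", 2, "Mobile", "S/W Download"),
    ("c1", 5, "Call Center", "Close Trouble Ticket"),
    ("c1", 2, "Server Logs", "Read Support Documents"),
]

def build_journeys(events):
    """Group events by customer and order each group by timestamp."""
    by_customer = defaultdict(list)
    for customer_id, ts, channel, touch_point in events:
        by_customer[customer_id].append((ts, channel, touch_point))
    return {
        cid: [(channel, tp) for ts, channel, tp in sorted(items)]
        for cid, items in by_customer.items()
    }

journeys = build_journeys(events)
```

Here `journeys["c1"]` reads as a "Case Resolution" style journey: two self-service touch points followed by a ticket being opened and closed.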
20. Pilot Test Data
• Test performed on one day’s production data
• Total no. of records processed – 110,852,667
• Total data ingest size – 32GB/day
• Total no. of M/R jobs in the data pipeline – 17
• Two test cycles
• Cycle 1: Heterogeneous CCS nodes (vCPUs,
storage, memory)
• Cycle 2: Homogeneous CCS nodes
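The pilot figures above imply some rough per-record and per-second rates. The arithmetic below is a back-of-the-envelope sketch that assumes decimal gigabytes and an ingest spread evenly over 24 hours; neither assumption is stated on the slide.

```python
records = 110_852_667          # records processed (from the slide)
ingest_bytes = 32 * 10**9      # 32 GB/day, assuming decimal gigabytes
seconds_per_day = 24 * 60 * 60

# Average record size and average rates, assuming an even spread over the day.
avg_record_bytes = ingest_bytes / records
records_per_second = records / seconds_per_day
mb_per_second = ingest_bytes / seconds_per_day / 10**6
```

Under these assumptions the averages come to roughly 290 bytes per record and about 1,280 records per second; real traffic is likely burstier than the even spread suggests.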
21. AWS to CIS Migration – Success Criteria
Successful synthesis of customer interaction data
Successful automation of the end-to-end data process pipeline
Build behavioral insight services
Access to data and services via data discovery and visualization tools
Meet the performance, scale and platform stability requirements
Successful deployment of CiscoDV on CIS
Connect HDFS and Hive DS with CiscoDV via Hive and Impala
Build and expose insight services for consumption by limited users
22. AWS and CIS Data Node Sizing Comparison
Hadoop Cluster for Batch and Query Analytics
AWS Sizing
Node Service | AWS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | m3.2xlarge | 8 | 30 | 2x80 GB | 30 | Each Hadoop data node has 1500 GB of EBS available for HDFS storage
CCS Sizing
Node Service | CCS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | GP-2XLarge | 8 | 32 | 50 | 35 | Each Hadoop data node has 1500 GB of volume storage available for HDFS storage
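The per-node figures in the sizing comparison can be rolled up into cluster totals. This is a small sketch, assuming the "Mem" column is in GB and that every data node carries the full 1500 GB of HDFS storage; the function and variable names are hypothetical.

```python
def cluster_totals(data_nodes, vcpus_per_node, mem_gb_per_node, hdfs_gb_per_node):
    """Multiply per-node resources by the node count."""
    return {
        "vcpus": data_nodes * vcpus_per_node,
        "mem_gb": data_nodes * mem_gb_per_node,      # assumes Mem is in GB
        "hdfs_gb": data_nodes * hdfs_gb_per_node,
    }

# Per-node values taken from the sizing comparison.
aws_totals = cluster_totals(data_nodes=30, vcpus_per_node=8,
                            mem_gb_per_node=30, hdfs_gb_per_node=1500)
ccs_totals = cluster_totals(data_nodes=35, vcpus_per_node=8,
                            mem_gb_per_node=32, hdfs_gb_per_node=1500)

# Difference between the two clusters, resource by resource.
delta = {key: ccs_totals[key] - aws_totals[key] for key in aws_totals}
```

Read this way, the CCS cluster in the comparison is somewhat larger than the AWS one on every axis, which is worth keeping in mind when comparing test results between the two.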
25. Discover data beyond the enterprise: Virtual integration that combines
traditional enterprise data, Big Data stores on CIS and AWS, cloud data from
SaaS providers, and Cisco Customers and Partners
Seamless interoperability offers easy access to data across distributed data
sources in the intercloud analytics platform
Universal data governance maximizes enforcement of data security rules
Analytics Data Hubs: Deployment flexibility to build hybrid/virtual sandboxes
that enable nimble data discovery and rapid data analytics to support multiple
LOBs
In addition to Hadoop: Cisco Data virtualization
26. CiscoDV on Intercloud Analytics Platform (CIS)
Scenario 1
CIS Cisco DV to Cisco
Enterprise Data Store
Scenario 2
CIS CiscoDV to Impala and
Hive on CIS Intercloud
Analytics Platform
Scenario 3
CIS Cisco DV to Hive on AWS
Big Data Cluster
28. Cisco’s Hadoop service is available only to select customers before
General Availability
Part of the broader Cisco Big Data as a service play
Let us know the kind of tools you use for
Visualization
Machine Learning
How can we address your Data challenges together?
Going forward
The Intercloud is a globally connected network of clouds. Built by Cisco and our partners, the Intercloud gives our customers a nearly unlimited choice of cloud infrastructure and applications with the compliance and control they need to connect to the cloud with confidence.
The Intercloud includes Enterprise Private Clouds, made Intercloud-ready with Intercloud Fabric and ACI, and Intercloud Providers: Cisco Powered service providers who are adding Cisco Intercloud technologies (Cisco Intercloud Fabric (ICF), Application-Centric Infrastructure (ACI), OpenStack, and the Cisco ONE Enterprise Cloud Suite) to their cloud services to create Intercloud Services. Cisco and the other Intercloud Alliance Partners deliver rigidly standardized services from a common infrastructure.
Cisco and our partners enable Choice with:
Flexible consumption and deployment models
Cloud automation integrated at the customer’s pace
Hundreds of trusted providers and partners
Workload placement regardless of hypervisor
Thousands of ready-to-consume, proven services and integrated applications
Choice helps customers to:
Improve their strategic allocation of IT budget
Enhance their ability to better align IT and business requirements
Accelerate their time to market to capture new revenue opportunities
Expand their markets
Cisco and our partners enable Compliance by:
Providing security solutions and services to protect users, data, and workloads; enable visibility, secure connections, and advanced threat protection
Enabling secure workload portability and placement across private and public clouds
Offering cloud services from data centers located around the world with local and regional hosting
Providing validated architectures and an ecosystem of proven Intercloud Providers and Resellers
Compliance helps customers to:
Manage their exposure to risk from:
Network and data security threats
Integration of private cloud resources into public clouds
Industry and government compliance regulations
Maintaining multiple, point-to-point business and financial relationships with cloud providers
Cisco and our partners enable Control by:
Unifying service and capacity management
Controlling placement and secure portability of workloads
Assuring application performance with application-centric policies that “follow” the workload
Leveraging service catalogs to allow IT to easily broker services for the business
Leveraging integrated application data from across multiple clouds
Control helps customers to:
Lower costs through operational efficiencies
Enable IT to assume the role of service broker to better partner with the business
Minimize risk with flexible capacity management, consistent security policies, control of sensitive information, and multi-vendor support
Deliver a consistent user experience
Why should you consider Cisco OpenStack Private Cloud?
With a track record of running large-scale ops and early private clouds at companies like Ticketmaster and Yahoo, the founders of Metacloud, now a part of Cisco, have world-class OpenStack and operations engineering experience going back to 2011.
Combined with Cisco’s leadership in cloud and managed services, and with our extensive partner network and intercloud focus, we deliver a better customer experience -- a public cloud experience for developers and operations teams, delivered as a service to infrastructure teams.
Remove middle
Key Points:
Devices have the potential to generate much more data than people.
This introduces the need to filter and aggregate data as close to the edge of the network as possible.
As a result, there will be a new category of “middleware” (perhaps we should call it “edgeware”) to collect, normalize, filter, aggregate, and provide the data from the devices and their controllers to the applications.
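The collect, normalize, filter, and aggregate steps described in these notes can be sketched in a few lines. The example below is a hedged illustration, not Cisco's edge software: the reading format, device names, unit handling, and plausibility thresholds are all assumptions made for the sketch.

```python
from statistics import mean

def normalize(reading):
    """Normalize a raw reading to a common shape; Fahrenheit becomes Celsius."""
    value = reading["value"]
    if reading.get("unit") == "F":
        value = (value - 32) * 5 / 9
    return {"device": reading["device"], "celsius": value}

def aggregate_at_edge(raw_readings, low=-40.0, high=150.0):
    """Collect, normalize, filter out-of-range values, and aggregate per device."""
    per_device = {}
    for raw in raw_readings:
        r = normalize(raw)
        if low <= r["celsius"] <= high:  # drop implausible values at the edge
            per_device.setdefault(r["device"], []).append(r["celsius"])
    return {
        dev: {"count": len(vals), "mean_celsius": mean(vals)}
        for dev, vals in per_device.items()
    }

# Hypothetical readings collected from device controllers.
raw = [
    {"device": "robot-1", "value": 20.0, "unit": "C"},
    {"device": "robot-1", "value": 86.0, "unit": "F"},    # 30 degrees Celsius
    {"device": "robot-1", "value": 9999.0, "unit": "C"},  # dropped by the filter
    {"device": "robot-2", "value": 40.0, "unit": "C"},
]
summary = aggregate_at_edge(raw)
```

Only the per-device summary would travel upstream to the applications, which is the point of doing this work close to the edge.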
Add builds
Q3
Integrated & Multi-tenant Platform as a Service
Hadoop & Data ingestion
Machine learning
Data Warehouse
Composite as a service
Elastic environment, managed, stable, and secure
Allow teams to focus on
Application and algorithm development
No need to manage the Big Data infrastructure