2015
Cloudman: Cluster Management for Big Data in the Cloud
Swati Singhi
December 3, 2015
#GHCI15
Challenges of On-premise Big Data Infra
▪ Fixed, pre-provisioned capacity
▪ Variable and unpredictable workloads
▪ Does not scale well
▪ Expensive
▪ Needs an on-site IT team
Big Data in the Cloud
Cloud offers salvation...
▪ Stretches with the workload
▪ Pay-as-you-go
...but brings its own challenges
▪ Moving data to the cloud
▪ Security/Privacy
Qubole as Big Data Service
▪Enables Big Data on the cloud
▪Enterprise ready deployments
▪On major public clouds
▪Simple and Fast
Cloudman
▪Qubole’s Cluster management software
▪Launches half a million nodes per month
▪Works across AWS, GCE and Azure
▪Provides higher level APIs
Cloudman Goals
▪Automated cluster provisioning
▪Configure Big Data Stack
▪Manage cluster lifecycle
▪Highly optimized cost of compute
Layers of Big Data as a Service
[Figure: layer diagram; labels: UI, SDK, API, Cloudman]
Architecture
Challenges
▪ Autoscale based on workload
▪ Abstractions to address differences in the behavior of each cloud provider
Examples
− Image creation and registration
− Configuring clusters
Autoscaling Clusters
▪ Launched automatically when needed
▪ Expands automatically if the load is high
▪ Terminates a cluster that has no running jobs
▪ Removes nodes at the billing boundary
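The "remove nodes at billing boundary" rule can be illustrated with a minimal Python sketch. It is not Cloudman's implementation: the hourly billing period, the five-minute removal window, and the function names are all assumptions made for illustration.

```python
BILLING_PERIOD_S = 3600   # assumption: nodes are billed per started hour
REMOVAL_WINDOW_S = 300    # assumption: only act in the last 5 minutes of a period


def near_billing_boundary(uptime_s: int) -> bool:
    """True when the node is in the final window of its current billing period."""
    remaining = BILLING_PERIOD_S - (uptime_s % BILLING_PERIOD_S)
    return remaining <= REMOVAL_WINDOW_S


def nodes_to_remove(nodes, surplus: int):
    """Pick up to `surplus` idle nodes that are about to enter a new billing period.

    `nodes` is a list of (node_id, uptime_s, is_idle) tuples.
    """
    candidates = [
        node_id
        for node_id, uptime_s, is_idle in nodes
        if is_idle and near_billing_boundary(uptime_s)
    ]
    return candidates[:max(surplus, 0)]
```

Waiting for the boundary means a node that has already been paid for stays available until just before the next charge would apply.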
Cloudman: AutoScaling
insert overwrite table dest
select … from ads join campaigns on … group by …;
[Figure: autoscaling flow; labels: Map Tasks, Reduce Tasks, Demand, Supply, Progress, Master, Slaves, Job Tracker, Cloudman]
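The demand and supply labels suggest comparing outstanding work with available capacity. The Python sketch below is a simplified illustration under stated assumptions: demand is counted as pending tasks, supply as task slots on running nodes, and task progress is ignored. The function and parameter names are hypothetical, not Cloudman's.

```python
import math


def nodes_to_add(pending_tasks: int, running_nodes: int,
                 slots_per_node: int, max_nodes: int) -> int:
    """Return how many nodes to launch so that slots (supply) cover pending tasks (demand)."""
    if slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")
    supply = running_nodes * slots_per_node
    shortfall = max(pending_tasks - supply, 0)
    wanted = math.ceil(shortfall / slots_per_node)
    # Never grow beyond the configured cluster maximum.
    return min(wanted, max(max_nodes - running_nodes, 0))
```

For example, with 25 pending tasks, 5 running nodes, and 4 slots per node, the shortfall is 5 tasks, so two more nodes would be requested.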
Image creation and registration
Image registration in AWS vs. Azure
Image creation and registration
▪ Image creation
▪ Public images in AWS
▪ Not well supported in Azure
▪ Images copied to the user’s account in Azure
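A provider abstraction can hide this difference behind a common interface. The Python sketch below is illustrative only: the class and method names are hypothetical, and the two implementations model the behaviors described (a shared public image versus an image copied into the user's account) rather than calling any real cloud API.

```python
from abc import ABC, abstractmethod
from typing import List


class CloudProvider(ABC):
    """Hypothetical interface; method names are illustrative, not Cloudman's."""

    @abstractmethod
    def register_image(self, image_name: str, account: str) -> str:
        """Make a machine image usable from `account`; return an image reference."""

    @abstractmethod
    def launch_nodes(self, image_ref: str, count: int) -> List[str]:
        """Start `count` nodes from the image; return their node IDs."""


class PublicImageProvider(CloudProvider):
    """Models a cloud where one shared public image can be referenced directly."""

    def register_image(self, image_name, account):
        return f"public/{image_name}"

    def launch_nodes(self, image_ref, count):
        return [f"{image_ref}#node-{i}" for i in range(count)]


class CopiedImageProvider(CloudProvider):
    """Models a cloud where the image must first be copied into the user's account."""

    def __init__(self):
        self.copies = set()

    def register_image(self, image_name, account):
        ref = f"{account}/{image_name}"
        self.copies.add(ref)  # stand-in for the real copy operation
        return ref

    def launch_nodes(self, image_ref, count):
        if image_ref not in self.copies:
            raise ValueError(f"image {image_ref} has not been copied yet")
        return [f"{image_ref}#node-{i}" for i in range(count)]


def start_cluster(provider: CloudProvider, image_name: str, account: str, size: int):
    """Provider-independent entry point: register the image, then launch nodes."""
    image_ref = provider.register_image(image_name, account)
    return provider.launch_nodes(image_ref, size)
```

Callers use only `start_cluster`, so the per-cloud registration step stays inside each provider class.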
Cluster Configuration
▪ Configure credentials
− Storage and compute keys
▪ Configure the big data stack
− Start the appropriate software, for example JobTracker and NameNode on the master, and TaskTracker and DataNode on the slaves
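The role-to-daemon mapping named on this slide can be written down as a small lookup. This Python sketch is only an illustration: the daemon names come from the slide, while the dictionary layout and function names are assumptions.

```python
# Daemons per node role, as listed on the slide.
ROLE_DAEMONS = {
    "master": ["JobTracker", "NameNode"],
    "slave": ["TaskTracker", "DataNode"],
}


def daemons_for(role: str):
    """Return the daemons to start on a node with the given role."""
    try:
        return list(ROLE_DAEMONS[role])
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None


def startup_plan(cluster: dict) -> dict:
    """Map each node name to the daemons it should start.

    `cluster` maps node name -> role ("master" or "slave").
    """
    return {node: daemons_for(role) for node, role in cluster.items()}
```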
Optimizing cost of compute in Cloud
▪ Utilize ephemeral compute instances to lower cost
− AWS Spot Instances
− GCE Preemptible VMs
▪ Challenges
− Data loss
− Big data job failures
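One way to reason about this trade-off is to split a cluster into stable and ephemeral nodes and compare hourly cost. The Python sketch below is a hypothetical illustration: the fraction-based split and the function names are assumptions, and any prices passed in are placeholders rather than real AWS or GCE rates.

```python
def split_cluster(total_nodes: int, ephemeral_fraction: float):
    """Return (stable_nodes, ephemeral_nodes) for a cluster of `total_nodes`."""
    if total_nodes < 0:
        raise ValueError("total_nodes must be non-negative")
    if not 0.0 <= ephemeral_fraction <= 1.0:
        raise ValueError("ephemeral_fraction must be between 0 and 1")
    ephemeral = int(total_nodes * ephemeral_fraction)
    return total_nodes - ephemeral, ephemeral


def hourly_cost(stable: int, ephemeral: int,
                stable_price: float, ephemeral_price: float) -> float:
    """Blended hourly cost given per-node hourly prices (placeholder values)."""
    return stable * stable_price + ephemeral * ephemeral_price
```

Keeping some stable nodes bounds how much of the cluster can disappear at once, which relates to the data loss and job failure challenges listed above.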
Demo
Key Takeaways
▪ Highly efficient cluster management system
▪ Proven at scale in production
▪ Works on multiple clouds
Got Feedback?
Rate and review the session on our mobile app – Convene
For all details visit: http://ghcindia.anitaborg.org
Appendix
Architecture
▪ QDS has a user interface, Python and Java SDKs, and APIs that allow users to talk to QDS and analyze data sets without knowing about cluster management.
▪ A QDS user can submit primitive commands to logical clusters.
▪ The middleware layer communicates with the cloud orchestration layer, called Cloudman.
▪ Cloudman is responsible for spinning up clusters in the relevant cloud.
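The flow from command submission to cluster start-up can be sketched as follows. This Python example is a toy model under stated assumptions: `Orchestrator` and `Middleware` are hypothetical stand-ins for the layers described above, not the QDS SDK or Cloudman's actual interfaces.

```python
class Orchestrator:
    """Hypothetical stand-in for the orchestration layer (Cloudman's role)."""

    def __init__(self):
        self.running = set()   # labels of logical clusters that are currently up
        self.launches = 0      # how many times a cluster had to be started

    def ensure_running(self, label: str) -> None:
        """Start the cluster behind `label` only if it is not already up."""
        if label not in self.running:
            self.running.add(label)   # stand-in for provisioning real nodes
            self.launches += 1


class Middleware:
    """Hypothetical middleware: accepts commands addressed to a logical cluster."""

    def __init__(self, orchestrator: Orchestrator):
        self.orchestrator = orchestrator
        self.submitted = []

    def submit(self, label: str, command: str) -> None:
        """Bring the cluster up on demand, then record the command for it."""
        self.orchestrator.ensure_running(label)
        self.submitted.append((label, command))
```

Submitting two commands to the same label triggers a single launch, which mirrors clusters being started only when a command needs them.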
Image creation and registration
▪ One such example is image creation and registration
▪ Procedure
− Pre-create a machine image with all the software to be deployed baked into it
− Start the cluster machines using this as the underlying image
− This saves the time of deploying the software on the nodes after they are up
▪ This process is very different on each cloud provider
Cluster Configuration
▪ Another operation that had to be implemented differently for each cloud
▪ Startup scripts are used to programmatically customize virtual machine instances
▪ AWS and Google Cloud supported this
▪ Azure did not support automatic execution of this script at VM boot time on CentOS VMs
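A bootstrap planner can branch on whether a provider runs a startup script at boot. The Python sketch below is hypothetical: the support flags restate the slide, but the slide does not say how the Azure gap was closed, so the fallback steps (launch, wait, run the script explicitly) are an assumption for illustration.

```python
# Boot-time startup script support as described on the slide.
BOOT_SCRIPT_SUPPORT = {"aws": True, "gce": True, "azure": False}


def bootstrap_plan(provider: str, script: str) -> list:
    """Return the ordered steps needed to get `script` executed on a new VM."""
    if provider not in BOOT_SCRIPT_SUPPORT:
        raise ValueError(f"unknown provider: {provider!r}")
    if BOOT_SCRIPT_SUPPORT[provider]:
        return [f"launch VM with startup script {script}"]
    # Assumed fallback when the provider does not run the script at boot.
    return [
        "launch VM",
        "wait for VM to become reachable",
        f"run {script} explicitly",
    ]
```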
Autoscaling Clusters
▪ Hadoop clusters in QDS come up automatically when applications that require them are launched.
▪ If the load on the cluster is high, the cluster automatically expands.
▪ Cloudman automatically launches additional nodes, which eventually join the running cluster and pick up part of the workload.
