SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
1
Multi Tenant HBase
Introducing Use cases and available solutions
Bhupendra Kumar Jain
bhupendra.jain@huawei.com
2
$ whoami
Senior System Architect @ Huawei India
Tech Lead for Huawei HBase, Zookeeper & Hadoop
~13 years of experience in Big Data and BI Domain
3
Agenda
Multi Tenancy
What it is and Why
Use cases
Goals & Challenges
Achieving Multi Tenancy
Single HBase cluster
Tenant specific cluster
HBase on YARN using Apache Slider
Future Work
4
Agenda
Multi Tenancy
What it is and Why
Use cases
Goals & Challenges
Achieving Multi Tenancy
Single HBase cluster
Tenant specific cluster
HBase on YARN using Apache Slider
Future Work
5
Tenants
Tenant can be thought as
 a single role, a single user, administrator, super user, …
 a group of users (a department in an organization …)
 Share some common objective
 Work on the same set of data
 Hierarchy within tenant is possible
 developers, analysts, data scientists etc.
6
Hadoop/HBase
Backup
Disaster
Recovery
Replication
Real time monitoring
/ CEP
Offline Data Analysis /
Data Mining
Network Optimization /
Billing Dept
Call / Mobile Data Records
Network probe data
Backup
Use Cases
7
Real time monitoring / CEP
 faster data processing
 high priority over other request
 mostly access latest data ( get / short scan )
Use Cases
Offline Data Analysis /
Data Mining
 hourly / daily / monthly data processing
 multiple longer scans
 medium priority
Network Optimization/
Billing Department
 require service only for shorter duration
 need High load and Scan performance
 resource required is dynamic based on data load
Admin
 high priority over all other requests
 security, data access control
 user management
8
Business Goals
For each tenant / tenant group
- Store the data
- Allow to access the data
- Allow to process the data
- Do everything in fully Secure manner
- Guarantee a certain portion of cluster
- Data Governance
9
Challenges
 Security
 Data should not be accessible by other tenant groups
 Within tenant group, different level of data access
 Performance
 Isolate tenants workload – no impact on other tenant workload
 Resource guarantee
 Cost efficiency
 Effective resource utilization
 Maintenance effort
 Support tenant specific configurations
 Different version needs
 Monitoring workload
 Dynamic scaling / resizing
 Priority
 Resource fairness
 Admin / Super user requests
10
Agenda
Multi Tenancy
What it is and Why
Use cases
Goals & Challenges
Achieving Multi Tenancy
Single HBase cluster
Tenant specific cluster
HBase on YARN using Apache Slider
Future Work
11
Single HBase cluster for all
 Namespaces, Quota (HBASE-8015, HBASE-8410)
A namespace is a logical grouping of tables analogous to a database in relation database systems.
 Each tenant has their own workspace to deal with.
 Permissions at Namespace level.
 Quota to restrict the Number of tables and regions.
 HDFS Space Quota for Namespace / table
 RPC Throttling (HBASE-11598)
Provides a mechanism to control
- The Number of request /time for a user / table
- The bandwidth consumed by a user / per table
 Can throttle the analytical user workload
 Un-Throttle the real time user workload
12
 Multiple RPC Queues (HBASE-10993, HBASE-10994)
- Different RPC Queues for Replication, Read / Write request, Meta request
- Give priority to Get request over Long running scan request
 longer running scans from analytical tenant gets lower priority
 Real time tenant’s Get request serves better.
 Admin / Super user request gets higher priority.
 Security
- Authentication
- Authorization: ACLs at all levels (Namespace, table, cf, cq, cell)
 tenant have access to only his required data.
 restrictive permissions for shared data. (admin-write, others-read)
Single HBase cluster for all (contd …)
13
 Region Sever Grouping (HBASE-6721)
- Logical grouping of RS.
- One RS group handles regions of
one or more similar tenant.
- Namespace /Table level group binding
 Complete tenant workload Isolation
 Better Qos for each tenant
 Configure group as per tenant work load
 FileSystem Quotas (HBASE-16961)
- track how HBase namespaces and tables use filesystem space
- impose limitations via centralized policies (disable table if violation …)
- includes space used by archive, snapshot etc. (future …)
 Better File system space management at HBase level
 Not limited to any file system
Single HBase cluster for all (contd …)
14
Is this sufficient ??
 Security - Yes
 Resource Isolation
 Disk space: Space Quota
 Disk I/O: Not available
 CPU: Yes with Region grouping
 Memory: Yes with Region grouping
 Network I/O : Yes with RPC throttling
 Workload isolation ? – Yes with RS Group
 Too many tenants and regions ? – Single Master may not handle
 Tenant specific configurations ? – Yes with RS Group
 Need for different versions ? – Not possible
 Short running services ? – Not possible
 Priority – User based priority is not available
 Effective resource utilization ? – Not completely
Single HBase cluster for all (contd …)
15
Agenda
Multi Tenancy
What it is and Why
Use cases
Goals & Challenges
Achieving Multi Tenancy
Single HBase cluster
Tenant specific cluster
HBase on YARN using Apache Slider
Future Work
16
Tenant have their own cluster
Tenant X Tenant Y Tenant Z
Let each tenant have their own cluster for HBase and share HDFS, YARN, ZK.
 Complete Isolation of workload
 Tenant specific configurations, Versions
 Short running services
 Resource Isolation
17
Is this sufficient ??
 Maintenance ? – Very High (Need to manage many cluster now)
 Effective resource utilization ? – No, Free resources not used by others
 Tenant workload isolation – Possible
 Isolate the HBase service from other running services ? -- Custom solution (using
cgroup or others )
 Cost ? – Very high
 Tenant specific configurations, Versions – Easy
 Shared data ? – Need to replicate
Tenant have their own cluster
18
HBase on YARN using Apache Slider
Slider is a YARN application to deploy non-YARN-enabled applications in a YARN cluster
Slider client: Communicates with YARN and the Slider AM via remote procedure calls and/or REST requests
Slider AM: Application master to manage containers( actual application)
Slider Agent: Communicate with AM and mange one container instance.
Component : HBase application component. [ HM,RS etc.]
https://slider.incubator.apache.org/
http://events.linuxfoundation.org/sites/events/files/slides/apachecon-slider-2015.pdf
19
HBase on YARN using Apache Slider
Feature Remarks
Ease of Deployment install, start, stop, configure, rolling upgrade
Container allocation and placement anti-affinity feature ( do not place containers together in same node) - Work in progress.
Dynamic Resizing Increase / Decrease containers at run time
Data Locality Best effort to restarts failed RS in same node
Resource Allocation and Control Same capability as YARN.
Fault Tolerance
Slider AM Failure recovery
Container (HM/RS) failure recovery
Maintenance
Supports Log aggregation
Provides the API to monitor the running containers status
Integration RPC / REST API (Work in progress)
20
Agenda
Multi Tenancy
What it is and Why
Use cases
Goals & Challenges
Achieving Multi Tenancy
Single HBase cluster
Tenant specific cluster
HBase on YARN using Apache Slider
Future Work
21
Summary
Need Single Cluster Tenant specific cluster using slider
Security Yes Yes
Resource Isolation Yes Yes
Effective resource utilization Partial Better comparatively
Cost Optimal High comparatively
Tenant specific conf / version Partial Yes
Ease of Maintenance Easy Requires High
Priority Partial Partial
Dynamic resizing Not Easy Easy
Short running services No Yes
- Mix of both the solution may be needed based on the use cases
22
Future Work
 Quota Sharing : Share the quota with other tenant if possible
 Effective Resource utilization: Share the Region server group with other tenant
 HMaster Federation : For large number of regions, Distribute heavy workload from Master to other nodes
 Region Labels for Heterogenous cluster : Tenant can choose to assign regions to particular Region Server
with label ( i.e. SSD, HighMemory ..)
 System table partitioning: Region group level System Table partitions
 Disk I/O control : File system level throttling for disk I/O control ( for large compaction …)
 Separate Block cache for each tenant : Each tenant have their own share for block cache
 RPC priority based on tenant: Priority user’s request always can get high priority then others.
23
mail to: bhupendra.jain@huawei.com
Thank You!

Contenu connexe

Plus de HBaseCon

hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comHBaseCon
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0HBaseCon
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 
HBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon
 
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon
 
HBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at XiaomiHBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at XiaomiHBaseCon
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon
 
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon
 

Plus de HBaseCon (20)

hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecture
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMi
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBase
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 
HBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environment
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
 
HBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at XiaomiHBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at Xiaomi
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraph
 
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
 

Dernier

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Dernier (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

hbaseconasia2017: HBase Multi tenancy use cases and various solution

  • 1. 1 Multi Tenant HBase Introducing Use cases and available solutions Bhupendra Kumar Jain bhupendra.jain@huawei.com
  • 2. 2 $ whoami Senior System Architect @ Huawei India Tech Lead for Huawei HBase, Zookeeper & Hadoop ~13 years of experience in Big Data and BI Domain
  • 3. 3 Agenda Multi Tenancy What it is and Why Use cases Goals & Challenges Achieving Multi Tenancy Single HBase cluster Tenant specific cluster HBase on YARN using Apache Slider Future Work
  • 4. 4 Agenda Multi Tenancy What it is and Why Use cases Goals & Challenges Achieving Multi Tenancy Single HBase cluster Tenant specific cluster HBase on YARN using Apache Slider Future Work
  • 5. 5 Tenants Tenant can be thought as  a single role, a single user, administrator, super user, …  a group of users (a department in an organization …)  Share some common objective  Work on the same set of data  Hierarchy within tenant is possible  developers, analysts, data scientists etc.
  • 6. 6 Hadoop/HBase Backup Disaster Recovery Replication Real time monitoring / CEP Offline Data Analysis / Data Mining Network Optimization / Billing Dept Call / Mobile Data Records Network probe data Backup Use Cases
  • 7. 7 Real time monitoring / CEP  faster data processing  high priority over other request  mostly access latest data ( get / short scan ) Use Cases Offline Data Analysis / Data Mining  hourly / daily / monthly data processing  multiple longer scans  medium priority Network Optimization/ Billing Department  require service only for shorter duration  need High load and Scan performance  resource required is dynamic based on data load Admin  high priority over all other requests  security, data access control  user management
  • 8. 8 Business Goals For each tenant / tenant group - Store the data - Allow to access the data - Allow to process the data - Do everything in fully Secure manner - Guarantee a certain portion of cluster - Data Governance
  • 9. 9 Challenges  Security  Data should not be accessible by other tenant groups  Within tenant group, different level of data access  Performance  Isolate tenants workload – no impact on other tenant workload  Resource guarantee  Cost efficiency  Effective resource utilization  Maintenance effort  Support tenant specific configurations  Different version needs  Monitoring workload  Dynamic scaling / resizing  Priority  Resource fairness  Admin / Super user requests
  • 10. 10 Agenda Multi Tenancy What it is and Why Use cases Goals & Challenges Achieving Multi Tenancy Single HBase cluster Tenant specific cluster HBase on YARN using Apache Slider Future Work
  • 11. 11 Single HBase cluster for all  Namespaces, Quota (HBASE-8015, HBASE-8410) A namespace is a logical grouping of tables analogous to a database in relation database systems.  Each tenant has their own workspace to deal with.  Permissions at Namespace level.  Quota to restrict the Number of tables and regions.  HDFS Space Quota for Namespace / table  RPC Throttling (HBASE-11598) Provides a mechanism to control - The Number of request /time for a user / table - The bandwidth consumed by a user / per table  Can throttle the analytical user workload  Un-Throttle the real time user workload
  • 12. 12  Multiple RPC Queues (HBASE-10993, HBASE-10994) - Different RPC Queues for Replication, Read / Write request, Meta request - Give priority to Get request over Long running scan request  longer running scans from analytical tenant gets lower priority  Real time tenant’s Get request serves better.  Admin / Super user request gets higher priority.  Security - Authentication - Authorization: ACLs at all levels (Namespace, table, cf, cq, cell)  tenant have access to only his required data.  restrictive permissions for shared data. (admin-write, others-read) Single HBase cluster for all (contd …)
  • 13. 13  Region Sever Grouping (HBASE-6721) - Logical grouping of RS. - One RS group handles regions of one or more similar tenant. - Namespace /Table level group binding  Complete tenant workload Isolation  Better Qos for each tenant  Configure group as per tenant work load  FileSystem Quotas (HBASE-16961) - track how HBase namespaces and tables use filesystem space - impose limitations via centralized policies (disable table if violation …) - includes space used by archive, snapshot etc. (future …)  Better File system space management at HBase level  Not limited to any file system Single HBase cluster for all (contd …)
  • 14. 14 Is this sufficient ??  Security - Yes  Resource Isolation  Disk space: Space Quota  Disk I/O: Not available  CPU: Yes with Region grouping  Memory: Yes with Region grouping  Network I/O : Yes with RPC throttling  Workload isolation ? – Yes with RS Group  Too many tenants and regions ? – Single Master may not handle  Tenant specific configurations ? – Yes with RS Group  Need for different versions ? – Not possible  Short running services ? – Not possible  Priority – User based priority is not available  Effective resource utilization ? – Not completely Single HBase cluster for all (contd …)
  • 15. 15 Agenda Multi Tenancy What it is and Why Use cases Goals & Challenges Achieving Multi Tenancy Single HBase cluster Tenant specific cluster HBase on YARN using Apache Slider Future Work
  • 16. 16 Tenant have their own cluster Tenant X Tenant Y Tenant Z Let each tenant have their own cluster for HBase and share HDFS, YARN, ZK.  Complete Isolation of workload  Tenant specific configurations, Versions  Short running services  Resource Isolation
  • 17. 17 Is this sufficient ??  Maintenance ? – Very High (Need to manage many cluster now)  Effective resource utilization ? – No, Free resources not used by others  Tenant workload isolation – Possible  Isolate the HBase service from other running services ? -- Custom solution (using cgroup or others )  Cost ? – Very high  Tenant specific configurations, Versions – Easy  Shared data ? – Need to replicate Tenant have their own cluster
  • 18. 18 HBase on YARN using Apache Slider Slider is a YARN application to deploy non-YARN-enabled applications in a YARN cluster Slider client: Communicates with YARN and the Slider AM via remote procedure calls and/or REST requests Slider AM: Application master to manage containers( actual application) Slider Agent: Communicate with AM and mange one container instance. Component : HBase application component. [ HM,RS etc.] https://slider.incubator.apache.org/ http://events.linuxfoundation.org/sites/events/files/slides/apachecon-slider-2015.pdf
  • 19. 19 HBase on YARN using Apache Slider Feature Remarks Ease of Deployment install, start, stop, configure, rolling upgrade Container allocation and placement anti-affinity feature ( do not place containers together in same node) - Work in progress. Dynamic Resizing Increase / Decrease containers at run time Data Locality Best effort to restarts failed RS in same node Resource Allocation and Control Same capability as YARN. Fault Tolerance Slider AM Failure recovery Container (HM/RS) failure recovery Maintenance Supports Log aggregation Provides the API to monitor the running containers status Integration RPC / REST API (Work in progress)
  • 20. 20 Agenda Multi Tenancy What it is and Why Use cases Goals & Challenges Achieving Multi Tenancy Single HBase cluster Tenant specific cluster HBase on YARN using Apache Slider Future Work
  • 21. 21 Summary Need Single Cluster Tenant specific cluster using slider Security Yes Yes Resource Isolation Yes Yes Effective resource utilization Partial Better comparatively Cost Optimal High comparatively Tenant specific conf / version Partial Yes Ease of Maintenance Easy Requires High Priority Partial Partial Dynamic resizing Not Easy Easy Short running services No Yes - Mix of both the solution may be needed based on the use cases
  • 22. 22 Future Work  Quota Sharing : Share the quota with other tenant if possible  Effective Resource utilization: Share the Region server group with other tenant  HMaster Federation : For large number of regions, Distribute heavy workload from Master to other nodes  Region Labels for Heterogenous cluster : Tenant can choose to assign regions to particular Region Server with label ( i.e. SSD, HighMemory ..)  System table partitioning: Region group level System Table partitions  Disk I/O control : File system level throttling for disk I/O control ( for large compaction …)  Separate Block cache for each tenant : Each tenant have their own share for block cache  RPC priority based on tenant: Priority user’s request always can get high priority then others.