SlideShare une entreprise Scribd logo
1  sur  11
Configuring and
Deploying mongoDB
Sharded Cluster in 30
minutes
Prepared by Sudheer Kondla
Solutions Architect
 The presentation is based on VMs created using VMWare’s ESXi/Vsphere and RHEL/Oracle
Linux.
 This presentation is based on setting up mongoDB sharded cluster
 with 6 VMs in the MongoDB cluster (RedHat Linux).
 Each VM consists of 2 vCPUs and 4 GB of RAM
 Each VM is created with 80 GB of disk space. No special mounts/ file systems are used.
 Linux version used: 6.5
Sharded Cluster VIRTUAL MACHINES
VM Configuration
Installing mongdb
 The following steps guide you through installing mongodb software on Linux.
 set yum repository and download packages using yum package installer.
 [root@bigdata1 ~]# cd /etc/yum.repos.d/
 [root@bigdata1 yum.repos.d]# cat mongodb.repo
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
 Before you run “yum install”, be sure to check internet is working.
 [root@bigdata6 yum.repos.d]# yum install -y mongodb-org-2.6.1 mongodb-org-server-2.6.1
mongodb-org-shell-2.6.1 mongodb-org-mongos-2.6.1 mongodb-org-tools-2.6.1
Setting up mongodb
 Make sure to create /data/configdb and /data/db directories on each servers
 The above directories should be owned by mongod user and mongod group.
 Change ownership to mongod
 As a root user run “chown mongod:mongod /data/configdb” and “chown mongod:mongod
/data/db”
 Without above directories mongod process will not start
 When you start mongo daemon process it will create “/data/configdb/mongod.lock” file
 You can also start mongod process with service option as root. For example “service mongod
start”
 You can also configure mongo daemon to start at system boot.
MongoDB Sharding Architecture
 Sharding is a method for storing data across multiple machines. MongoDB uses sharding
to support deployments with very large data sets and high throughput operations.
 Database systems with large data sets and high throughput applications can challenge
the capacity of a single server. High query rates can exhaust the CPU capacity of the
server. Larger data sets exceed the storage capacity of a single machine. Finally,
working set sizes larger than the system’s RAM stress the I/O capacity of disk drives.
 Sharding addresses the challenge of scaling to support high throughput and large data
sets:
 Sharding reduces the number of operations each shard handles. Each shard processes fewer
operations as the cluster grows. As a result, a cluster can increase capacity and throughput
horizontally.
 For example, to insert data, the application only needs to access the shard responsible for
that record.
 Sharding reduces the amount of data that each server needs to store. Each shard stores less
data as the cluster grows.
 For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard
might hold only 256GB of data. If there are 40 shards, then each shard might hold only 25GB
of data.
Sharding Architecture
Deploying a sharded cluster
 Start the Config Server Database Instances.
 The config server processes are mongod instances that store the cluster’s metadata.
 Designate a mongod as a config server using the --configsvr option.
e.g: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 –v
 Each config server stores a complete copy of the cluster’s metadata.
 Production deployments require three config server instances, each of them running on different servers to ensure
high availability and data safety.
 All members of a sharded cluster must be able to connect to all other members of a sharded cluster.
 Make sure to setup proper network connectivity, firewall rules.
 Start the three config server instances. For example
 Node1: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 -v
 Node2: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.132 –v
 Node3: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.133 -v
 Start the mongos Instances:
 The mongos instances are lightweight and do not require data directories.
 You can run a mongos instance on a system that runs other cluster components, such as on an application server or a
server running a mongod process.
 By default, a mongos instance runs on port 27017. Specify the hostnames of the three config servers, either in the
configuration file or as command line parameters.
 For example: mongos --configdb bigdata1:27019,bigdata2:27019,bigdata3:27019
Add Shards to the Cluster
 Enable Sharding for a Database
 Enabling sharding for a database does not redistribute data but make it possible to shard the
collections in that database.
 From a mongo shell, connect to the mongos instance. Issue a command using the following
[hdfs@bigdata3 ~]$ mongo --host bigdata2 --port 27017
MongoDB shell version: 2.6.1
connecting to: bigdata2:27017/test
rs0:PRIMARY>
 sh.addShard("rs0/bigdata1:27017")
 sh.addShard("rs0/bigdata2:27017")
 sh.addShard("rs0/bigdata3:27017")
 sh.addShard("rs0/bigdata4:27017")
 sh.addShard("rs0/bigdata5:27017")
 sh.addShard("rs0/bigdata6:27017")
Enable Sharding for a Collection
 Before you can shard a collection, you must enable sharding for the collection’s database.
 Enabling sharding for a database does not redistribute data but make it possible to shard the
collections in that database.
 Once you enable sharding for a database, MongoDB assigns a primary shard for that database
where MongoDB stores all data before sharding begins.
 Issue the sh.enableSharding() method.
 rs0:PRIMARY> show dbs
 sh.enableSharding(" customer”) OR
 db.runCommand( { enableSharding: “customer” } )
 Enable sharding on a per-collection basis.
 sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
 sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )
 sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )
 sh.shardCollection("events.alerts", { "_id": "hashed" } )

Contenu connexe

Tendances

금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017
금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017
금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017Amazon Web Services Korea
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMongoDB
 
OpenSearch
OpenSearchOpenSearch
OpenSearchhchen1
 
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...Felipe Blini
 
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)Amazon Web Services Korea
 
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018Amazon Web Services Korea
 
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사Amazon Web Services Korea
 
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...Amazon Web Services
 
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon Web Services Korea
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
Azure Storage
Azure StorageAzure Storage
Azure StorageMustafa
 
Understand AWS Pricing
Understand AWS PricingUnderstand AWS Pricing
Understand AWS PricingLynn Langit
 
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음Amazon Web Services Korea
 
Amazon's Simple Storage Service (S3)
Amazon's Simple Storage Service (S3)Amazon's Simple Storage Service (S3)
Amazon's Simple Storage Service (S3)James Gray
 
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅Amazon Web Services Korea
 
AEM Sightly Deep Dive
AEM Sightly Deep DiveAEM Sightly Deep Dive
AEM Sightly Deep DiveGabriel Walt
 

Tendances (20)

금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017
금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017
금융권을 위한 AWS Direct Connect 기반 하이브리드 구성 방법 - AWS Summit Seoul 2017
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
OpenSearch
OpenSearchOpenSearch
OpenSearch
 
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...
Monitoramento de serviços com Zabbix + Grafana + Python - Marcelo Santoto - D...
 
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)
Amazon 인공 지능(AI) 서비스 및 AWS 기반 딥러닝 활용 방법 - 윤석찬 (AWS, 테크에반젤리스트)
 
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018
워크플로우 기반의 AWS 미디어서비스 활용하기::이상오::AWS Summit Seoul 2018
 
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사
[AWS Techshift] 파트너 사업을 준비하기 위해 기억해야 할 5가지 - 양승호, AWS 파트너 사업 개발 담당 이사
 
Intro to AWS: Database Services
Intro to AWS: Database ServicesIntro to AWS: Database Services
Intro to AWS: Database Services
 
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...
Automating your Infrastructure Deployment with AWS CloudFormation and AWS Ops...
 
Elastic serach
Elastic serachElastic serach
Elastic serach
 
Cassandra ppt 1
Cassandra ppt 1Cassandra ppt 1
Cassandra ppt 1
 
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
Azure Storage
Azure StorageAzure Storage
Azure Storage
 
Understand AWS Pricing
Understand AWS PricingUnderstand AWS Pricing
Understand AWS Pricing
 
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음
AWS Finance Symposium_금융권을 위한 hybrid 클라우드 도입 첫걸음
 
Amazon's Simple Storage Service (S3)
Amazon's Simple Storage Service (S3)Amazon's Simple Storage Service (S3)
Amazon's Simple Storage Service (S3)
 
Intro to AWS Lambda
Intro to AWS Lambda Intro to AWS Lambda
Intro to AWS Lambda
 
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅
[AWS Builders] AWS와 함께하는 클라우드 컴퓨팅
 
AEM Sightly Deep Dive
AEM Sightly Deep DiveAEM Sightly Deep Dive
AEM Sightly Deep Dive
 

Similaire à Setting up mongodb sharded cluster in 30 minutes

Setting up mongo replica set
Setting up mongo replica setSetting up mongo replica set
Setting up mongo replica setSudheer Kondla
 
MongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster TutorialMongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster TutorialJason Terpko
 
MongoDB – Sharded cluster tutorial - Percona Europe 2017
MongoDB – Sharded cluster tutorial - Percona Europe 2017MongoDB – Sharded cluster tutorial - Percona Europe 2017
MongoDB – Sharded cluster tutorial - Percona Europe 2017Antonios Giannopoulos
 
Percona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialPercona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialAntonios Giannopoulos
 
Getting started with replica set in MongoDB
Getting started with replica set in MongoDBGetting started with replica set in MongoDB
Getting started with replica set in MongoDBKishor Parkhe
 
Configuring MongoDB HA Replica Set on AWS EC2
Configuring MongoDB HA Replica Set on AWS EC2Configuring MongoDB HA Replica Set on AWS EC2
Configuring MongoDB HA Replica Set on AWS EC2ShepHertz
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and ShardingTharun Srinivasa
 
Percona Cluster with Master_Slave for Disaster Recovery
Percona Cluster with Master_Slave for Disaster RecoveryPercona Cluster with Master_Slave for Disaster Recovery
Percona Cluster with Master_Slave for Disaster RecoveryRam Gautam
 
Mongo db sharding_cluster_installation_guide
Mongo db sharding_cluster_installation_guideMongo db sharding_cluster_installation_guide
Mongo db sharding_cluster_installation_guidePhilip Zhong
 
How to Become Cloud Backup Provider
How to Become Cloud Backup ProviderHow to Become Cloud Backup Provider
How to Become Cloud Backup ProviderCloudian
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Shardinguzzal basak
 
MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for BeginnersEnoch Joshua
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimizationLouis liu
 
How to become cloud backup provider
How to become cloud backup providerHow to become cloud backup provider
How to become cloud backup providerCLOUDIAN KK
 
Building Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleBuilding Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleAlex Thompson
 
MariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxMariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxNeoClova
 

Similaire à Setting up mongodb sharded cluster in 30 minutes (20)

Setting up mongo replica set
Setting up mongo replica setSetting up mongo replica set
Setting up mongo replica set
 
Sharded cluster tutorial
Sharded cluster tutorialSharded cluster tutorial
Sharded cluster tutorial
 
MongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster TutorialMongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster Tutorial
 
MongoDB – Sharded cluster tutorial - Percona Europe 2017
MongoDB – Sharded cluster tutorial - Percona Europe 2017MongoDB – Sharded cluster tutorial - Percona Europe 2017
MongoDB – Sharded cluster tutorial - Percona Europe 2017
 
Percona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialPercona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorial
 
Getting started with replica set in MongoDB
Getting started with replica set in MongoDBGetting started with replica set in MongoDB
Getting started with replica set in MongoDB
 
Configuring MongoDB HA Replica Set on AWS EC2
Configuring MongoDB HA Replica Set on AWS EC2Configuring MongoDB HA Replica Set on AWS EC2
Configuring MongoDB HA Replica Set on AWS EC2
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and Sharding
 
MongoDB Shard Cluster
MongoDB Shard ClusterMongoDB Shard Cluster
MongoDB Shard Cluster
 
Percona Cluster with Master_Slave for Disaster Recovery
Percona Cluster with Master_Slave for Disaster RecoveryPercona Cluster with Master_Slave for Disaster Recovery
Percona Cluster with Master_Slave for Disaster Recovery
 
Mongo db sharding_cluster_installation_guide
Mongo db sharding_cluster_installation_guideMongo db sharding_cluster_installation_guide
Mongo db sharding_cluster_installation_guide
 
How to Become Cloud Backup Provider
How to Become Cloud Backup ProviderHow to Become Cloud Backup Provider
How to Become Cloud Backup Provider
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for Beginners
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimization
 
How to become cloud backup provider
How to become cloud backup providerHow to become cloud backup provider
How to become cloud backup provider
 
MySQL Cluster Basics
MySQL Cluster BasicsMySQL Cluster Basics
MySQL Cluster Basics
 
Building Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleBuilding Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scale
 
Mdb dn 2016_11_ops_mgr
Mdb dn 2016_11_ops_mgrMdb dn 2016_11_ops_mgr
Mdb dn 2016_11_ops_mgr
 
MariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxMariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docx
 

Plus de Sudheer Kondla

MongoDB cluster_on_aws_example
MongoDB cluster_on_aws_exampleMongoDB cluster_on_aws_example
MongoDB cluster_on_aws_exampleSudheer Kondla
 
AWS multi-region DB design and deployment
AWS multi-region DB design and deploymentAWS multi-region DB design and deployment
AWS multi-region DB design and deploymentSudheer Kondla
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architectureSudheer Kondla
 
Digital transformation is not about technology
Digital transformation is not about technologyDigital transformation is not about technology
Digital transformation is not about technologySudheer Kondla
 
Cloudera cluster setup and configuration
Cloudera cluster setup and configurationCloudera cluster setup and configuration
Cloudera cluster setup and configurationSudheer Kondla
 

Plus de Sudheer Kondla (7)

MongoDB cluster_on_aws_example
MongoDB cluster_on_aws_exampleMongoDB cluster_on_aws_example
MongoDB cluster_on_aws_example
 
No sql
No sqlNo sql
No sql
 
AWS multi-region DB design and deployment
AWS multi-region DB design and deploymentAWS multi-region DB design and deployment
AWS multi-region DB design and deployment
 
Aws aurora scaling
Aws aurora scalingAws aurora scaling
Aws aurora scaling
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Digital transformation is not about technology
Digital transformation is not about technologyDigital transformation is not about technology
Digital transformation is not about technology
 
Cloudera cluster setup and configuration
Cloudera cluster setup and configurationCloudera cluster setup and configuration
Cloudera cluster setup and configuration
 

Dernier

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Dernier (20)

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

Setting up mongodb sharded cluster in 30 minutes

  • 1. Configuring and Deploying mongoDB Sharded Cluster in 30 minutes Prepared by Sudheer Kondla Solutions Architect
  • 2.  The presentation is based on VMs created using VMWare’s ESXi/Vsphere and RHEL/Oracle Linux.  This presentation is based on setting up mongoDB sharded cluster  with 6 VMs in the MongoDB cluster (RedHat Linux).  Each VM consists of 2 vCPUs and 4 GB of RAM  Each VM is created with 80 GB of disk space. No special mounts/ file systems are used.  Linux version used: 6.5
  • 5. Installing mongdb  The following steps guide you through installing mongodb software on Linux.  set yum repository and download packages using yum package installer.  [root@bigdata1 ~]# cd /etc/yum.repos.d/  [root@bigdata1 yum.repos.d]# cat mongodb.repo [mongodb] name=MongoDB Repository baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/ gpgcheck=0 enabled=1  Before you run “yum install”, be sure to check internet is working.  [root@bigdata6 yum.repos.d]# yum install -y mongodb-org-2.6.1 mongodb-org-server-2.6.1 mongodb-org-shell-2.6.1 mongodb-org-mongos-2.6.1 mongodb-org-tools-2.6.1
  • 6. Setting up mongodb  Make sure to create /data/configdb and /data/db directories on each servers  The above directories should be owned by mongod user and mongod group.  Change ownership to mongod  As a root user run “chown mongod:mongod /data/configdb” and “chown mongod:mongod /data/db”  Without above directories mongod process will not start  When you start mongo daemon process it will create “/data/configdb/mongod.lock” file  You can also start mongod process with service option as root. For example “service mongod start”  You can also configure mongo daemon to start at system boot.
  • 7. MongoDB Sharding Architecture  Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.  Database systems with large data sets and high throughput applications can challenge the capacity of a single server. High query rates can exhaust the CPU capacity of the server. Larger data sets exceed the storage capacity of a single machine. Finally, working set sizes larger than the system’s RAM stress the I/O capacity of disk drives.  Sharding addresses the challenge of scaling to support high throughput and large data sets:  Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the cluster grows. As a result, a cluster can increase capacity and throughput horizontally.  For example, to insert data, the application only needs to access the shard responsible for that record.  Sharding reduces the amount of data that each server needs to store. Each shard stores less data as the cluster grows.  For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard might hold only 256GB of data. If there are 40 shards, then each shard might hold only 25GB of data.
  • 9. Deploying a sharded cluster  Start the Config Server Database Instances.  The config server processes are mongod instances that store the cluster’s metadata.  Designate a mongod as a config server using the --configsvr option. e.g: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 –v  Each config server stores a complete copy of the cluster’s metadata.  Production deployments require three config server instances, each of them running on different servers to ensure high availability and data safety.  All members of a sharded cluster must be able to connect to all other members of a sharded cluster.  Make sure to setup proper network connectivity, firewall rules.  Start the three config server instances. For example  Node1: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 -v  Node2: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.132 –v  Node3: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.133 -v  Start the mongos Instances:  The mongos instances are lightweight and do not require data directories.  You can run a mongos instance on a system that runs other cluster components, such as on an application server or a server running a mongod process.  By default, a mongos instance runs on port 27017. Specify the hostnames of the three config servers, either in the configuration file or as command line parameters.  For example: mongos --configdb bigdata1:27019,bigdata2:27019,bigdata3:27019
  • 10. Add Shards to the Cluster  Enable Sharding for a Database  Enabling sharding for a database does not redistribute data but make it possible to shard the collections in that database.  From a mongo shell, connect to the mongos instance. Issue a command using the following [hdfs@bigdata3 ~]$ mongo --host bigdata2 --port 27017 MongoDB shell version: 2.6.1 connecting to: bigdata2:27017/test rs0:PRIMARY>  sh.addShard("rs0/bigdata1:27017")  sh.addShard("rs0/bigdata2:27017")  sh.addShard("rs0/bigdata3:27017")  sh.addShard("rs0/bigdata4:27017")  sh.addShard("rs0/bigdata5:27017")  sh.addShard("rs0/bigdata6:27017")
  • 11. Enable Sharding for a Collection  Before you can shard a collection, you must enable sharding for the collection’s database.  Enabling sharding for a database does not redistribute data but make it possible to shard the collections in that database.  Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores all data before sharding begins.  Issue the sh.enableSharding() method.  rs0:PRIMARY> show dbs  sh.enableSharding(" customer”) OR  db.runCommand( { enableSharding: “customer” } )  Enable sharding on a per-collection basis.  sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )  sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )  sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )  sh.shardCollection("events.alerts", { "_id": "hashed" } )