The document discusses the evolution of MongoDB sharding from versions 3.6 to 5.0. It provides an overview of MongoDB sharding architecture including shards, mongos, and config servers. It describes the different types of sharding such as hashed sharding and ranged sharding. It also discusses important sharding concepts like choosing an appropriate shard key and the use of zones. Finally, it outlines major new features introduced in each version related to modifying shard keys, refining shard keys, and resharding collections.
Evolution of MongoDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
1. Evolution of MongoDB Sharding and Its Best Practices
Presented by
Ranjith A
Database Engineer @ Mydbops
Mydbops 11th Webinar
www.mydbops.com info@mydbops.com
3. Mydbops at a Glance
● Founded in 2015, HQ in Bangalore India, 70+ Employees.
● Mydbops focuses on database consulting, with core specialization in MySQL, MongoDB and PostgreSQL administration and support.
● Mydbops was created with the motto of developing a DevOps model for database administration.
● We help organisations scale MySQL, MongoDB and PostgreSQL and implement advanced technologies in MySQL, MongoDB and PostgreSQL.
5. Agenda
● Intro
● Sharding Overview
● Sharding Architecture
● Types of Sharding
● Things to consider before choosing a shard key
● Evolution of sharding from 3.6 to 5.0
● Q/A
7. Sharding Overview
● Sharding is a method for distributing data across multiple machines.
● By using MongoDB sharding, we can handle very large data sets and high-throughput operations.
● Database systems with large data sets or high-throughput applications can challenge the capacity of a single server.
● For example, a read-heavy workload can exhaust the CPU capacity of the server, and working sets larger than the system's RAM stress the I/O capacity of the disk drives.
8. Sharding Overview
We can scale the system using two engineering approaches:
Vertical Scaling | Horizontal Scaling
Achieved with a MongoDB replica set or standalone server | Achieved with MongoDB sharding
The data set lives on a single server | The data set is distributed across multiple servers
Capacity grows by increasing the capacity of a single server | Capacity grows by adding servers as required
There is a limit to how far a single server's capacity can be increased | Servers can easily be added or removed as required
10. Sharding Architecture - (Data Shard)
● MongoDB supports horizontal scaling through MongoDB Sharded Cluster.
● MongoDB shards data at the collection level, distributing the collection data across the shards in the cluster.
● MongoDB Sharded Cluster consists of the following components:
1. Shards
2. Mongos
3. Config server
Shard:
● Each shard contains a subset of the sharded data.
● Each shard can be deployed as a replica set.
11. Sharding Architecture-(Mongos)
Mongos:
● The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
● Applications never connect or communicate directly with the shards.
● The mongos tracks what data is on which shard by caching the metadata from the config servers.
● The mongos uses the metadata to route operations from applications and clients to the mongod instances.
● The mongos receives the responses from the shards, merges the data, and returns the result to the client.
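The routing role described above can be sketched in a few lines. This is only an illustration, not how mongos is implemented: the chunk table, the shard names and the `userId` field are hypothetical, and real chunk metadata comes from the config servers. It shows the difference between a query that carries the shard key (sent to one shard) and a query that does not (sent to every shard and merged).

```javascript
// Hypothetical cached chunk metadata: each chunk covers [min, max) and lives on one shard.
const chunkTable = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// Return the list of shards a query has to visit.
// With the shard key in the query, only the owning shard is targeted.
// Without it, every shard has to be asked (scatter-gather).
function targetShards(query, shardKeyField, chunks) {
  if (!(shardKeyField in query)) {
    return [...new Set(chunks.map((c) => c.shard))];
  }
  const value = query[shardKeyField];
  const owner = chunks.find((c) => value >= c.min && value < c.max);
  return owner ? [owner.shard] : [];
}

console.log(targetShards({ userId: 150 }, "userId", chunkTable)); // one shard
console.log(targetShards({ name: "x" }, "userId", chunkTable)); // all shards
```

Queries that include the shard key stay cheap as the cluster grows, which is one reason the query pattern matters when picking a shard key.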
12. Sharding Architecture-(Config)
Config Server:
● Config servers store metadata and configuration settings for the cluster. As of MongoDB 3.4, config servers must
be deployed as a replica set (CSRS).
● If your cluster has a single config server, then the config server is a single point of failure.
● If the config server is inaccessible, the cluster is not accessible.
● If you cannot recover the data on a config server, the cluster will be inoperable.
● Always use three config servers for production deployments
13. Types of Sharding-(Hashed Sharding)
MongoDB supports two sharding methods for distributing data across sharded clusters.
● Hashed sharding
● Ranged Sharding
Hashed sharding:
14. Hashed sharding
● Hashed sharding involves computing a hash of the shard key field's value. We can use either a single-field hashed index
or a compound hashed index (new in 4.4) as the shard key.
● MongoDB automatically computes the hashes when resolving queries using hashed indexes.
● Hashed sharding provides a more even data distribution across the sharded cluster.
● The field chosen as the hashed shard key should have good cardinality (many unique values).
● The default _id field (ObjectId values) is a good example of a field with good cardinality.
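To see why hashing evens out the distribution, the sketch below spreads 1000 sequential ids over 4 buckets. It is an illustration under stated assumptions: the hash is a stand-in built from Node's `crypto` module (not MongoDB's own hashed-index function), and the modulo bucketing is a simplification of chunks that cover ranges of hash values.

```javascript
const crypto = require("crypto");

// Stand-in hash: first 4 bytes of the MD5 digest of the key's string form.
// Assumption for illustration only; MongoDB uses its own hash function.
function bucketFor(key, bucketCount) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % bucketCount;
}

// Count how many keys land in each bucket.
function distribute(keys, bucketCount) {
  const counts = new Array(bucketCount).fill(0);
  for (const key of keys) counts[bucketFor(key, bucketCount)] += 1;
  return counts;
}

// Sequential ids, similar in spirit to an ever-increasing _id.
const sequentialIds = Array.from({ length: 1000 }, (_, i) => i + 1);
console.log(distribute(sequentialIds, 4));
```

Even though the input keys are strictly increasing, neighbouring keys end up in different buckets, so no single bucket receives all the new inserts.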
16. Hashed sharding
Command to enable Sharding: (Collection Level)
sh.shardCollection( "databasename.collectionname", { "field" : "hashed" } )
● Before sharding a collection, make sure that sharding is enabled on its database and that an index exists on the
fields used as the shard key. If the collection is empty, sh.shardCollection() creates the index when it is missing.
18. Ranged sharding
● In range-based sharding, data is split into contiguous ranges determined by the shard key values.
● Documents with "close" shard key values are likely to be in the same chunk or shard.
● This improves the performance of read queries that target documents within a contiguous range.
● A poorly chosen shard key will affect both read and write performance.
Command to enable Sharding: (Collection Level)
sh.shardCollection( "databasename.collectionname", { "field" : 1 } )
● Before sharding a collection, make sure that sharding is enabled on its database and that an index exists on the
fields used as the shard key.
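The order of the prerequisites can be sketched as a small helper that builds the three command documents involved: enable sharding on the database, create the shard key index, then shard the collection. The helper name `shardingPlan` and the `mydbops.employee` namespace are hypothetical; the helper only builds plain objects and does not talk to a cluster.

```javascript
// Build, in order, the command documents needed to shard a collection.
// Each entry names the database to run against and the command to pass to
// db.runCommand() / db.adminCommand().
function shardingPlan(namespace, shardKey) {
  const dot = namespace.indexOf(".");
  const dbName = namespace.slice(0, dot);
  const collName = namespace.slice(dot + 1);
  // Index name in the usual "<field>_<type>" form, e.g. "employeeid_hashed".
  const indexName = Object.entries(shardKey)
    .map(([field, kind]) => `${field}_${kind}`)
    .join("_");
  return [
    { db: "admin", command: { enableSharding: dbName } },
    {
      db: dbName,
      command: { createIndexes: collName, indexes: [{ key: shardKey, name: indexName }] },
    },
    { db: "admin", command: { shardCollection: namespace, key: shardKey } },
  ];
}

console.log(JSON.stringify(shardingPlan("mydbops.employee", { employeeid: "hashed" }), null, 2));
```

Passing `{ employeeid: 1 }` instead of `{ employeeid: "hashed" }` produces the ranged variant of the same plan.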
19. Zone sharding
● In sharded clusters, a shard can belong to one or more zones, and a zone can span one or more shards.
● A zone represents a group of shards, and one or more ranges of shard key values are associated with each zone.
● Zone ranges are always inclusive of the lower boundary and exclusive of the upper boundary.
● From MongoDB 4.0.2, dropping a collection deletes its associated zone/tag ranges.
● Zone information is stored in the config.shards and config.tags collections.
● Starting from MongoDB 4.4, we can shard a collection and define zones by compound keys, including
mixing a hashed key with non-hashed keys.
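The boundary rule for zone ranges (lower bound inclusive, upper bound exclusive) is easy to get wrong, so here is a minimal sketch of it. The zone names and numeric ranges are hypothetical and the lookup is only an illustration of the rule, not of how the balancer works.

```javascript
// Hypothetical zone ranges on a numeric shard key.
const zoneRanges = [
  { zone: "zoneA", min: 0, max: 100 },
  { zone: "zoneB", min: 100, max: 200 },
];

// A value belongs to a range when min <= value < max.
function zoneFor(value, ranges) {
  const match = ranges.find((r) => value >= r.min && value < r.max);
  return match ? match.zone : null;
}

console.log(zoneFor(99, zoneRanges)); // falls in zoneA
console.log(zoneFor(100, zoneRanges)); // the boundary value belongs to zoneB
console.log(zoneFor(200, zoneRanges)); // outside every range
```

Because the upper bound is exclusive, adjacent ranges can share a boundary value without overlapping.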
21. Zone sharding
Command to create Zone range:
sh.updateZoneKeyRange("dbname.collectionname", { fieldname: "minkey" }, { fieldname: "maxkey" }, "zonename")
22. Sharded key
The shard key determines how evenly the data is distributed among the shards. A good shard key satisfies the points below.
● High Cardinality
● Low Frequency
● Non-Monotonically Changing Values
● Alignment with Query Patterns
23. Sharded key
High Cardinality:
● High cardinality shard key - more chunks and more evenly distributed data
● Low cardinality shard key - fewer chunks and poorly distributed data
● Each unique shard key value can exist on no more than a single chunk at any given time.
Low Frequency:
● Low frequency shard key (no value occurs very often) - more evenly distributed data
● High frequency shard key (a few values occur very often) - poorly distributed data, because the chunks holding those values become a bottleneck
● Shard key cardinality and monotonically changing values also contribute to the distribution of the data.
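Cardinality and frequency can both be measured on a sample of candidate shard key values before sharding. The sketch below is one way to do that, assuming the values are already loaded into an array; the helper name `keyStats` and the sample data are hypothetical. Since each unique shard key value lives in a single chunk, the number of distinct values caps the number of chunks, and the share of the most common value shows how many documents are tied to one chunk.

```javascript
// Summarise a sample of candidate shard key values.
function keyStats(values) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  const maxCount = Math.max(...counts.values());
  return {
    cardinality: counts.size, // distinct values = upper bound on useful chunks
    topValueShare: maxCount / values.length, // fraction of documents tied to one chunk
  };
}

// A country code has few distinct values, and one of them dominates.
console.log(keyStats(["IN", "IN", "IN", "US"]));
// A per-user id has many distinct values, none of them dominant.
console.log(keyStats([101, 102, 103, 104]));
```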
24. Sharded key
Non-Monotonically Changing Shard Keys:
● A shard key whose value increases or decreases monotonically tends to send all new inserts to a single chunk within the cluster.
● Each chunk has its own min and max value.
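The effect of a monotonically increasing key under ranged sharding can be sketched with a fixed set of chunk boundaries. The boundaries, shard names and id values below are hypothetical; the point is only that every new id is larger than the last split point, so all inserts land in the chunk that extends to the maximum value.

```javascript
// Hypothetical ranged chunks, each covering [min, max).
const rangedChunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" },
];

// Count how many inserted keys each shard receives.
function countInsertsPerShard(keys, chunks) {
  const counts = {};
  for (const c of chunks) counts[c.shard] = 0;
  for (const k of keys) {
    const owner = chunks.find((c) => k >= c.min && k < c.max);
    counts[owner.shard] += 1;
  }
  return counts;
}

// New ids keep increasing past the last split point.
const newIds = Array.from({ length: 500 }, (_, i) => 2000 + i);
console.log(countInsertsPerShard(newIds, rangedChunks));
```

All 500 inserts go to the shard holding the last chunk, which is the write hot spot that the bullet above describes.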
25. Evolution of sharding from 3.6 to 5.0
Mongo version | Modify shard key field value | Refine shard key | Change shard key | New variables | New features
--------------|------------------------------|------------------|------------------|---------------|-------------
3.6 | NO | NO | NO | orphanCleanupDelaySecs | Shards must be replica sets; all shard members have chunk metadata
4.2 | YES | NO | NO | sh.setBalancerState(true), sh.setBalancerState(false) | Modify shard key field value, except the immutable _id field
4.4 | YES | YES | NO | Hedged reads | Refinable shard keys; hedged reads; compound shard keys with a hashed field; remove multiple shards at a time; shard key size limit removed
5.0 | YES | YES | YES | reshardCollection | Change the shard key; change the name of a sharded collection
26. Sharding Features in 3.6
● Shards must be replica sets.
● All members of a shard replica set maintain the chunk metadata. This prevents reads from
the secondaries from returning orphaned data.
● After a chunk migration, the migrated chunk is deleted from the source shard once the delay set by orphanCleanupDelaySecs (new in 3.6) has passed.
● orphanCleanupDelaySecs - Default 900 (15 min)
Set orphanCleanupDelaySecs to 1200 seconds (20 min) when starting mongod:
mongod --setParameter orphanCleanupDelaySecs=1200
Or set it at runtime with the setParameter command:
db.adminCommand( { setParameter: 1, orphanCleanupDelaySecs: 1200 } )
27. Sharding Features in 4.2
● We can update a document's shard key value, unless the shard key field is the immutable _id field.
● In earlier versions, a document's shard key value could not be changed.
28. Sharding Features in 4.2
Balancer method changes in 4.2:
● sh.startBalancer() / sh.setBalancerState(true) - also enable auto-splitting for the sharded cluster
● sh.stopBalancer() / sh.setBalancerState(false) - also disable auto-splitting for the sharded cluster
● sh.enableAutoSplit() - enables auto-splitting while the balancer is disabled
29. Sharding Features in 4.4
● Refinable Shard Keys - refine a collection's shard key by adding a suffix field or fields to the existing key
db.employee.createIndex({"employeeid" : 1, "mailid": 1})
db.adminCommand( {refineCollectionShardKey: "mydbops.employee",
key: { "employeeid" : 1, "mailid": 1 }} )
● In 4.4, documents in a sharded collection can be missing the shard key fields.
● In earlier versions, the shard key fields must exist in every document of a sharded collection.
● Hedged reads are supported, to minimize latencies.
● Compound shard keys with a hashed field are supported.
● More than one removeShard operation can be in progress at a time.
● MongoDB removes the 512-byte limit on the shard key size.
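Refining a shard key only adds suffix fields, so the existing key has to stay as the prefix of the new one. The sketch below checks that rule on plain key documents. It assumes the current key in the example above is `{ employeeid: 1 }`, and the helper name `isValidRefinement` is hypothetical; the server performs its own validation when refineCollectionShardKey runs.

```javascript
// Return true when refinedKey keeps every field of currentKey, in the same
// order and with the same type, and adds at least one suffix field.
function isValidRefinement(currentKey, refinedKey) {
  const current = Object.entries(currentKey);
  const refined = Object.entries(refinedKey);
  if (refined.length <= current.length) return false;
  return current.every(
    ([field, kind], i) => refined[i][0] === field && refined[i][1] === kind
  );
}

// Existing key kept as prefix, one suffix field added.
console.log(isValidRefinement({ employeeid: 1 }, { employeeid: 1, mailid: 1 }));
// New field placed in front of the existing key: not a refinement.
console.log(isValidRefinement({ employeeid: 1 }, { mailid: 1, employeeid: 1 }));
```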
30. Sharding Features in 5.0
● We can change the shard key of a collection by using the reshardCollection command.
● We can change the name of a sharded collection by using the renameCollection command.
Things to take care of before resharding:
● The application must tolerate a period of about two seconds during which writes to the collection being resharded are blocked.
● Available storage space should be at least 1.2x the size of the collection that you want to reshard.
● Ensure that I/O utilisation is below 50% and CPU utilisation is below 80%.
● The new shard key cannot have a uniqueness constraint.
● Resharding a collection that has a uniqueness constraint is not supported.
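The checklist above can be turned into a small pre-flight check. This is a sketch under assumptions: the 50% and 80% figures are read as upper limits for I/O and CPU utilisation, the input numbers are expected to come from your own monitoring, and the helper name `reshardingBlockers` and its field names are hypothetical.

```javascript
// Return the list of checklist items that currently block a reshard.
// An empty list means the listed prerequisites are met.
function reshardingBlockers({ collectionBytes, freeBytes, ioUtilPct, cpuUtilPct, newKeyUnique }) {
  const blockers = [];
  if (freeBytes < 1.2 * collectionBytes) blockers.push("storage"); // needs 1.2x the collection size
  if (ioUtilPct >= 50) blockers.push("io"); // assumed limit: below 50%
  if (cpuUtilPct >= 80) blockers.push("cpu"); // assumed limit: below 80%
  if (newKeyUnique) blockers.push("unique"); // unique shard keys are not supported
  return blockers;
}

console.log(
  reshardingBlockers({
    collectionBytes: 100,
    freeBytes: 150,
    ioUtilPct: 30,
    cpuUtilPct: 40,
    newKeyUnique: false,
  })
);
```

The two-second write block is not covered by the check, because whether an application can tolerate it is a judgment call rather than a metric.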
31. Sharding Features in 5.0
The following commands and methods are not supported on the collection while the resharding operation is in progress.
● collMod
● convertToCapped
● createIndexes
● createIndex()
● drop()
● dropIndexes
● dropIndex()
● renameCollection
● renameCollection()
32. Sharding Features in 5.0
The following commands and methods are not supported on the cluster while the resharding operation is in progress.
● addShard
● removeShard
● db.createCollection()
● dropDatabase
Resharding command:
● db.adminCommand({ reshardCollection: "mydbops.client", key: {"cperiod": 1} } )