SlideShare une entreprise Scribd logo
1  sur  14
Télécharger pour lire hors ligne
Data Reduction for Gluster with VDO
October 27th 2017
Presented by:
Jered Floyd
Member of Technical Staff
jered@redhat.com
VDO | PRESENTATION2
What is Virtual Data Optimizer?
• VDO - is a Linux device mapper driver that provides deduplication and compression
• VDO has two kernel modules that work together to provide data efficiency
• The kvdo module - controls block layer and compression
• The uds module - maintains deduplication index and provides advice
• VDO can be used under:
• Block storage
• File storage - (Gluster!)
• Object storage
• Utilities:
• vdo - Manage VDO volumes (create, remove, modify) and view configurations
• vdostats - Provides information on statistic and health
● Open Source and GPL as of Monday! (~165k LOC)
https://github.com/dm-vdo
● Will ship with RHEL 7.5 as an out-of-tree module
● Working with upstream community to address concerns
(Tabs! Spaces! camelCase! under_scores! And some real
stuff too)
● Hope to be upstream in 9-12 months
Open Source and Upstream Plans
4
User Benefits
• Deduplication and compression have been core user requests for Gluster
• VDO increases storage efficiency, reduces costs
• VDO makes high-performance storage more affordable
• Because data reduction is applied at the OS layer, VDO works anywhere the
OS lives
• OS-based data reduction benefits all software layers above
• Less data to store translates to less data to move - device level replication
can be vastly accelerated
5
Deployment
● Works directly with block devices
and filesystems
● Layer other dm modules as desired
(dm-crypt, dm-raid)
● In the cloud or on premises
● Addresses all storage deployment
models
Ceph OSD LIO Local
Gluster,
NFS,
Samba
Filesystem LVM
Filesystem
LVM
Filesystem
LVM
VDO VDOVDOVDO
Block Dev Block Dev Block Dev Block Dev
Local or Networked Storage
Object Block Compute File
kernel
6
Install and Configure VDO
• Clone repo, build and load modules (this should work…)
• Create VDO volume:
• vdo create --name=vdo1 --device=/dev/sdb --vdoLogicalSize=2T
• Create a file system:
• mkfs.xfs -K /dev/mapper/vdo1
• Mount the file system:
• mkdir /mnt/vdo1
• mount -o discard /dev/mapper/vdo1 /mnt/vdo1
• Set up a Gluster volume using filesystem as a brick
Install and Configure VDO
● Example vdostats and df results:
# vdostats
Device 1K-blocks Used Available Use% Space saving%
/dev/mapper/vdo1 860463104 1214924 859248180 0%
99%
# df /mnt/vdo
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vdo1 10735331288 33256 10735298032 1% /mnt/vdo1
8
How Much Can Be Saved?
Redundant Workflow
● Backups
● Virtual Desktops
● Virtual Servers
● Containers
● Shared Home Directories
Compressible Data
● Databases (textual content)
● Messaging
● Monitoring, alerting, tracing
● Systems and application
logging
75% (4X)50% (2X) 66% (3X) 80% (5X) 83% (6X) +
GoodCandidates
Savings Potential
Dependent on data type and workflow
9
Real-world Results
Source: Red Hat/Permabit testing by Dennis Keefe
10
Real(er)-world Results
Source: Red Hat testing by Huamin Chen
● Results from 900 container images: 259 GB reduced to 49 GB
11
Thin Provisioning Deduplication Compression
How It Works
12
● Inline data deduplication, 4 KB granularity
● Inline compression
● Zero-block elimination
● Thin provisioning
● Extend physical storage up to 256 TB (online)
● Extend logical storage up to 4 PB (online)
● Sync and async modes
Key Features
13
VDO Requirements
• OS Requirements
• RHEL 7 (and surely others)
• RAM
• ~500 MB plus 268 MB / TB
physical storage
■ 4 TB physical requires
~1.5 GB RAM
■ 16 TB physical requires
~4.5 GB RAM
• CPU
• x86_64
• Other arch coming soon
• Software Requirements
• Python 2.7+
• libc 2.7+, libz, zlib1g,
PyYAML
14
● Free space monitoring (vdostats, dmevent)
● Enhancements to DHT for better dedupe at scale
● Deduplication enhancements for replication
● Documentation / Best Practices / Automation
● Questions? Complaints? I’m Jered Floyd <jered@redhat.com>
● Join us at vdo-devel@redhat.com:
https://www.redhat.com/mailman/listinfo/vdo-devel
Next Steps, Q&A

Contenu connexe

Tendances

Redpanda and ClickHouse
Redpanda and ClickHouseRedpanda and ClickHouse
Redpanda and ClickHouseAltinity Ltd
 
Open shift 4 infra deep dive
Open shift 4    infra deep diveOpen shift 4    infra deep dive
Open shift 4 infra deep diveWinton Winton
 
In-depth Troubleshooting on NetScaler using Command Line Tools
In-depth Troubleshooting on NetScaler using Command Line ToolsIn-depth Troubleshooting on NetScaler using Command Line Tools
In-depth Troubleshooting on NetScaler using Command Line ToolsDavid McGeough
 
Introduction to Git
Introduction to GitIntroduction to Git
Introduction to Gitatishgoswami
 
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-in
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-inNews And Development Update Of The CloudStack Tungsten Fabric SDN Plug-in
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-inShapeBlue
 
Understanding GIT and Version Control
Understanding GIT and Version ControlUnderstanding GIT and Version Control
Understanding GIT and Version ControlSourabh Sahu
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Brian Brazil
 
Git slides
Git slidesGit slides
Git slidesNanyak S
 
Caching Data in OutSystems: A Tale of Gains Without Pain
Caching Data in OutSystems: A Tale of Gains Without PainCaching Data in OutSystems: A Tale of Gains Without Pain
Caching Data in OutSystems: A Tale of Gains Without PainCatarinaPereira64715
 
MinIO January 2020 Briefing
MinIO January 2020 BriefingMinIO January 2020 Briefing
MinIO January 2020 BriefingJonathan Symonds
 
Seastore: Next Generation Backing Store for Ceph
Seastore: Next Generation Backing Store for CephSeastore: Next Generation Backing Store for Ceph
Seastore: Next Generation Backing Store for CephScyllaDB
 
Advanced Tools and Techniques for Troubleshooting NetScaler Appliances
Advanced Tools and Techniques for Troubleshooting NetScaler AppliancesAdvanced Tools and Techniques for Troubleshooting NetScaler Appliances
Advanced Tools and Techniques for Troubleshooting NetScaler AppliancesDavid McGeough
 
WTF is GitOps and Why You Should Care?
WTF is GitOps and Why You Should Care?WTF is GitOps and Why You Should Care?
WTF is GitOps and Why You Should Care?Weaveworks
 
Introducing GitLab (September 2018)
Introducing GitLab (September 2018)Introducing GitLab (September 2018)
Introducing GitLab (September 2018)Noa Harel
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...ShapeBlue
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache James
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache JamesRoom 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache James
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache JamesVietnam Open Infrastructure User Group
 
Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatAmazon Web Services
 

Tendances (20)

Redpanda and ClickHouse
Redpanda and ClickHouseRedpanda and ClickHouse
Redpanda and ClickHouse
 
Open shift 4 infra deep dive
Open shift 4    infra deep diveOpen shift 4    infra deep dive
Open shift 4 infra deep dive
 
In-depth Troubleshooting on NetScaler using Command Line Tools
In-depth Troubleshooting on NetScaler using Command Line ToolsIn-depth Troubleshooting on NetScaler using Command Line Tools
In-depth Troubleshooting on NetScaler using Command Line Tools
 
Introduction to Git
Introduction to GitIntroduction to Git
Introduction to Git
 
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-in
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-inNews And Development Update Of The CloudStack Tungsten Fabric SDN Plug-in
News And Development Update Of The CloudStack Tungsten Fabric SDN Plug-in
 
Understanding GIT and Version Control
Understanding GIT and Version ControlUnderstanding GIT and Version Control
Understanding GIT and Version Control
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Git slides
Git slidesGit slides
Git slides
 
Caching Data in OutSystems: A Tale of Gains Without Pain
Caching Data in OutSystems: A Tale of Gains Without PainCaching Data in OutSystems: A Tale of Gains Without Pain
Caching Data in OutSystems: A Tale of Gains Without Pain
 
MinIO January 2020 Briefing
MinIO January 2020 BriefingMinIO January 2020 Briefing
MinIO January 2020 Briefing
 
Seastore: Next Generation Backing Store for Ceph
Seastore: Next Generation Backing Store for CephSeastore: Next Generation Backing Store for Ceph
Seastore: Next Generation Backing Store for Ceph
 
Advanced Tools and Techniques for Troubleshooting NetScaler Appliances
Advanced Tools and Techniques for Troubleshooting NetScaler AppliancesAdvanced Tools and Techniques for Troubleshooting NetScaler Appliances
Advanced Tools and Techniques for Troubleshooting NetScaler Appliances
 
WTF is GitOps and Why You Should Care?
WTF is GitOps and Why You Should Care?WTF is GitOps and Why You Should Care?
WTF is GitOps and Why You Should Care?
 
Introducing GitLab (September 2018)
Introducing GitLab (September 2018)Introducing GitLab (September 2018)
Introducing GitLab (September 2018)
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache James
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache JamesRoom 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache James
Room 1 - 1 - Benoit TELLIER - On premise email inbound service with Apache James
 
Gitlab, GitOps & ArgoCD
Gitlab, GitOps & ArgoCDGitlab, GitOps & ArgoCD
Gitlab, GitOps & ArgoCD
 
Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red Hat
 
Zuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne PlatformZuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne Platform
 

Similaire à Data Reduction for Gluster with VDO

The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelYasunori Goto
 
Windows 8 dddd (beekelaar)
Windows 8 dddd (beekelaar)Windows 8 dddd (beekelaar)
Windows 8 dddd (beekelaar)hypervnu
 
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and HadoopIOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and HadoopLeons Petražickis
 
Tuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentTuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentJignesh Shah
 
Implement SQL Server on an Azure VM
Implement SQL Server on an Azure VMImplement SQL Server on an Azure VM
Implement SQL Server on an Azure VMJames Serra
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes vty
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdoseGluster.org
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdoseGluster.org
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresmkorremans
 
Fluentd and docker monitoring
Fluentd and docker monitoringFluentd and docker monitoring
Fluentd and docker monitoringVinay Krishna
 
Docker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los AngelesDocker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los AngelesJérôme Petazzoni
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseFITC
 
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersDavid vonThenen
 
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudBest Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudLeons Petražickis
 
Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned  Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned RightScale
 
Introduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataIntroduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataInfluxData
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterPatrick Quairoli
 
Discussing the difference between docker dontainers and virtual machines
Discussing the difference between docker dontainers and virtual machinesDiscussing the difference between docker dontainers and virtual machines
Discussing the difference between docker dontainers and virtual machinesSteven Grzbielok
 

Similaire à Data Reduction for Gluster with VDO (20)

The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux Kernel
 
Windows 8 dddd (beekelaar)
Windows 8 dddd (beekelaar)Windows 8 dddd (beekelaar)
Windows 8 dddd (beekelaar)
 
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and HadoopIOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
 
Deployment Day Session 1: Introduction to MDT 2012
Deployment Day Session 1: Introduction to MDT 2012Deployment Day Session 1: Introduction to MDT 2012
Deployment Day Session 1: Introduction to MDT 2012
 
Tuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentTuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris Environment
 
Implement SQL Server on an Azure VM
Implement SQL Server on an Azure VMImplement SQL Server on an Azure VM
Implement SQL Server on an Azure VM
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-features
 
Fluentd and docker monitoring
Fluentd and docker monitoringFluentd and docker monitoring
Fluentd and docker monitoring
 
Docker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los AngelesDocker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los Angeles
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL Database
 
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container SchedulersContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers
 
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudBest Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
 
Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned  Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned
 
Introduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataIntroduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxData
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage Cluster
 
Discussing the difference between docker dontainers and virtual machines
Discussing the difference between docker dontainers and virtual machinesDiscussing the difference between docker dontainers and virtual machines
Discussing the difference between docker dontainers and virtual machines
 
EVA_Navigator_Presentation.ppt
EVA_Navigator_Presentation.pptEVA_Navigator_Presentation.ppt
EVA_Navigator_Presentation.ppt
 

Plus de Gluster.org

Automating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas SiravaraAutomating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas SiravaraGluster.org
 
nfusr: a new userspace NFS client based on libnfs - Shreyas Siravara
nfusr: a new userspace NFS client based on libnfs - Shreyas Siravaranfusr: a new userspace NFS client based on libnfs - Shreyas Siravara
nfusr: a new userspace NFS client based on libnfs - Shreyas SiravaraGluster.org
 
Facebook’s upstream approach to GlusterFS - David Hasson
Facebook’s upstream approach to GlusterFS  - David HassonFacebook’s upstream approach to GlusterFS  - David Hasson
Facebook’s upstream approach to GlusterFS - David HassonGluster.org
 
Throttling Traffic at Facebook Scale
Throttling Traffic at Facebook ScaleThrottling Traffic at Facebook Scale
Throttling Traffic at Facebook ScaleGluster.org
 
GlusterFS w/ Tiered XFS
GlusterFS w/ Tiered XFS  GlusterFS w/ Tiered XFS
GlusterFS w/ Tiered XFS Gluster.org
 
Gluster Metrics: why they are crucial for running stable deployments of all s...
Gluster Metrics: why they are crucial for running stable deployments of all s...Gluster Metrics: why they are crucial for running stable deployments of all s...
Gluster Metrics: why they are crucial for running stable deployments of all s...Gluster.org
 
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)Gluster.org
 
Releases: What are contributors responsible for
Releases: What are contributors responsible forReleases: What are contributors responsible for
Releases: What are contributors responsible forGluster.org
 
RIO Distribution: Reconstructing the onion - Shyamsundar Ranganathan
RIO Distribution: Reconstructing the onion - Shyamsundar RanganathanRIO Distribution: Reconstructing the onion - Shyamsundar Ranganathan
RIO Distribution: Reconstructing the onion - Shyamsundar RanganathanGluster.org
 
Gluster and Kubernetes
Gluster and KubernetesGluster and Kubernetes
Gluster and KubernetesGluster.org
 
Native Clients, more the merrier with GFProxy!
Native Clients, more the merrier with GFProxy!Native Clients, more the merrier with GFProxy!
Native Clients, more the merrier with GFProxy!Gluster.org
 
Gluster: a SWOT Analysis
Gluster: a SWOT Analysis Gluster: a SWOT Analysis
Gluster: a SWOT Analysis Gluster.org
 
GlusterD-2.0: What's Happening? - Kaushal Madappa
GlusterD-2.0: What's Happening? - Kaushal MadappaGlusterD-2.0: What's Happening? - Kaushal Madappa
GlusterD-2.0: What's Happening? - Kaushal MadappaGluster.org
 
Scalability and Performance of CNS 3.6
Scalability and Performance of CNS 3.6Scalability and Performance of CNS 3.6
Scalability and Performance of CNS 3.6Gluster.org
 
What Makes Us Fail
What Makes Us FailWhat Makes Us Fail
What Makes Us FailGluster.org
 
Gluster as Native Storage for Containers - past, present and future
Gluster as Native Storage for Containers - past, present and futureGluster as Native Storage for Containers - past, present and future
Gluster as Native Storage for Containers - past, present and futureGluster.org
 
Heketi Functionality into Glusterd2
Heketi Functionality into Glusterd2Heketi Functionality into Glusterd2
Heketi Functionality into Glusterd2Gluster.org
 
Hands On Gluster with Jeff Darcy
Hands On Gluster with Jeff DarcyHands On Gluster with Jeff Darcy
Hands On Gluster with Jeff DarcyGluster.org
 
Architecture of the High Availability Solution for Ganesha and Samba with Kal...
Architecture of the High Availability Solution for Ganesha and Samba with Kal...Architecture of the High Availability Solution for Ganesha and Samba with Kal...
Architecture of the High Availability Solution for Ganesha and Samba with Kal...Gluster.org
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightGluster.org
 

Plus de Gluster.org (20)

Automating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas SiravaraAutomating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas Siravara
 
nfusr: a new userspace NFS client based on libnfs - Shreyas Siravara
nfusr: a new userspace NFS client based on libnfs - Shreyas Siravaranfusr: a new userspace NFS client based on libnfs - Shreyas Siravara
nfusr: a new userspace NFS client based on libnfs - Shreyas Siravara
 
Facebook’s upstream approach to GlusterFS - David Hasson
Facebook’s upstream approach to GlusterFS  - David HassonFacebook’s upstream approach to GlusterFS  - David Hasson
Facebook’s upstream approach to GlusterFS - David Hasson
 
Throttling Traffic at Facebook Scale
Throttling Traffic at Facebook ScaleThrottling Traffic at Facebook Scale
Throttling Traffic at Facebook Scale
 
GlusterFS w/ Tiered XFS
GlusterFS w/ Tiered XFS  GlusterFS w/ Tiered XFS
GlusterFS w/ Tiered XFS
 
Gluster Metrics: why they are crucial for running stable deployments of all s...
Gluster Metrics: why they are crucial for running stable deployments of all s...Gluster Metrics: why they are crucial for running stable deployments of all s...
Gluster Metrics: why they are crucial for running stable deployments of all s...
 
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)
Up and Running with Glusto & Glusto-Tests in 5 Minutes (or less)
 
Releases: What are contributors responsible for
Releases: What are contributors responsible forReleases: What are contributors responsible for
Releases: What are contributors responsible for
 
RIO Distribution: Reconstructing the onion - Shyamsundar Ranganathan
RIO Distribution: Reconstructing the onion - Shyamsundar RanganathanRIO Distribution: Reconstructing the onion - Shyamsundar Ranganathan
RIO Distribution: Reconstructing the onion - Shyamsundar Ranganathan
 
Gluster and Kubernetes
Gluster and KubernetesGluster and Kubernetes
Gluster and Kubernetes
 
Native Clients, more the merrier with GFProxy!
Native Clients, more the merrier with GFProxy!Native Clients, more the merrier with GFProxy!
Native Clients, more the merrier with GFProxy!
 
Gluster: a SWOT Analysis
Gluster: a SWOT Analysis Gluster: a SWOT Analysis
Gluster: a SWOT Analysis
 
GlusterD-2.0: What's Happening? - Kaushal Madappa
GlusterD-2.0: What's Happening? - Kaushal MadappaGlusterD-2.0: What's Happening? - Kaushal Madappa
GlusterD-2.0: What's Happening? - Kaushal Madappa
 
Scalability and Performance of CNS 3.6
Scalability and Performance of CNS 3.6Scalability and Performance of CNS 3.6
Scalability and Performance of CNS 3.6
 
What Makes Us Fail
What Makes Us FailWhat Makes Us Fail
What Makes Us Fail
 
Gluster as Native Storage for Containers - past, present and future
Gluster as Native Storage for Containers - past, present and futureGluster as Native Storage for Containers - past, present and future
Gluster as Native Storage for Containers - past, present and future
 
Heketi Functionality into Glusterd2
Heketi Functionality into Glusterd2Heketi Functionality into Glusterd2
Heketi Functionality into Glusterd2
 
Hands On Gluster with Jeff Darcy
Hands On Gluster with Jeff DarcyHands On Gluster with Jeff Darcy
Hands On Gluster with Jeff Darcy
 
Architecture of the High Availability Solution for Ganesha and Samba with Kal...
Architecture of the High Availability Solution for Ganesha and Samba with Kal...Architecture of the High Availability Solution for Ganesha and Samba with Kal...
Architecture of the High Availability Solution for Ganesha and Samba with Kal...
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
 

Dernier

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Dernier (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Data Reduction for Gluster with VDO

  • 1. Data Reduction for Gluster with VDO October 27th 2017 Presented by: Jered Floyd Member of Technical Staff jered@redhat.com
  • 2. VDO | PRESENTATION2 What is Virtual Data Optimizer? • VDO - is a Linux device mapper driver that provides deduplication and compression • VDO has two kernel modules that work together to provide data efficiency • The kvdo module - controls block layer and compression • The uds module - maintains deduplication index and provides advice • VDO can be used under: • Block storage • File storage - (Gluster!) • Object storage • Utilities: • vdo - Manage VDO volumes (create, remove, modify) and view configurations • vdostats - Provides information on statistic and health
  • 3. ● Open Source and GPL as of Monday! (~165k LOC) https://github.com/dm-vdo ● Will ship with RHEL 7.5 as an out-of-tree module ● Working with upstream community to address concerns (Tabs! Spaces! camelCase! under_scores! And some real stuff too) ● Hope to be upstream in 9-12 months Open Source and Upstream Plans
  • 4. 4 User Benefits • Deduplication and compression have been core user requests for Gluster • VDO increases storage efficiency, reduces costs • VDO makes high-performance storage more affordable • Because data reduction is applied at the OS layer, VDO works anywhere the OS lives • OS-based data reduction benefits all software layers above • Less data to store translates to less data to move - device level replication can be vastly accelerated
  • 5. 5 Deployment ● Works directly with block devices and filesystems ● Layer other dm modules as desired (dm-crypt, dm-raid) ● In the cloud or on premises ● Addresses all storage deployment models Ceph OSD LIO Local Gluster, NFS, Samba Filesystem LVM Filesystem LVM Filesystem LVM VDO VDOVDOVDO Block Dev Block Dev Block Dev Block Dev Local or Networked Storage Object Block Compute File kernel
  • 6. 6 Install and Configure VDO • Clone repo, build and load modules (this should work…) • Create VDO volume: • vdo create --name=vdo1 --device=/dev/sdb --vdoLogicalSize=2T • Create a file system: • mkfs.xfs -K /dev/mapper/vdo1 • Mount the file system: • mkdir /mnt/vdo1 • mount -o discard /dev/mapper/vdo1 /mnt/vdo1 • Set up a Gluster volume using filesystem as a brick
  • 7. Install and Configure VDO ● Example vdostats and df results: # vdostats Device 1K-blocks Used Available Use% Space saving% /dev/mapper/vdo1 860463104 1214924 859248180 0% 99% # df /mnt/vdo Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/vdo1 10735331288 33256 10735298032 1% /mnt/vdo1
  • 8. 8 How Much Can Be Saved? Redundant Workflow ● Backups ● Virtual Desktops ● Virtual Servers ● Containers ● Shared Home Directories Compressible Data ● Databases (textual content) ● Messaging ● Monitoring, alerting, tracing ● Systems and application logging 75% (4X)50% (2X) 66% (3X) 80% (5X) 83% (6X) + GoodCandidates Savings Potential Dependent on data type and workflow
  • 9. 9 Real-world Results Source: Red Hat/Permabit testing by Dennis Keefe
  • 10. 10 Real(er)-world Results Source: Red Hat testing by Huamin Chen ● Results from 900 container images: 259 GB reduced to 49 GB
  • 11. 11 Thin Provisioning Deduplication Compression How It Works
  • 12. 12 ● Inline data deduplication, 4 KB granularity ● Inline compression ● Zero-block elimination ● Thin provisioning ● Extend physical storage up to 256 TB (online) ● Extend logical storage up to 4 PB (online) ● Sync and async modes Key Features
  • 13. 13 VDO Requirements • OS Requirements • RHEL 7 (and surely others) • RAM • ~500 MB plus 268 MB / TB physical storage ■ 4 TB physical requires ~1.5 GB RAM ■ 16 TB physical requires ~4.5 GB RAM • CPU • x86_64 • Other arch coming soon • Software Requirements • Python 2.7+ • libc 2.7+, libz, zlib1g, PyYAML
  • 14. 14 ● Free space monitoring (vdostats, dmevent) ● Enhancements to DHT for better dedupe at scale ● Deduplication enhancements for replication ● Documentation / Best Practices / Automation ● Questions? Complaints? I’m Jered Floyd <jered@redhat.com> ● Join us at vdo-devel@redhat.com: https://www.redhat.com/mailman/listinfo/vdo-devel Next Steps, Q&A