Linux Cluster next generation




                                                      Vladislav Bogdanov

Copyright © 2013. SaM Solutions
Heartbeat
       ●   Simple (mostly two-node) clusters
       ●   IP (UDP: unicast, broadcast, multicast) or serial communication
       ●   Limited functionality (esp. haresources mode), no clustered
           storage support
       ●   Later: added crm
       ●   Then split:
             –   Heartbeat (messaging + membership)
             –   Resource agents (OCF)
             –   Cluster-glue (lrmd, stonith, “plumbing”, IPC, logging)
             –   Cluster Resource Manager -> pacemaker
       ●   Now: deprecated (except resource-agents and pacemaker)


RHCS
       ●   CMAN – messaging + quorum + API (then plugin to OpenAIS, so
           only quorum + API)
       ●   DLM – kernel + userspace
       ●   GFS (then GFS2) – kernel + userspace
       ●   CLVM
       ●   Fencing framework
       ●   Rgmanager – cluster resource manager
       ●   Later: pacemaker support for CMAN
       ●   Later: fencing via pacemaker
       ●   Then: project split, DLM and GFS-utils are now separate
           packages
       ●   Now: everything except DLM and GFS-utils is deprecated after RHEL 6
OpenAIS
       ●   Totem protocol messaging – virtual synchrony
       ●   SA Forum APIs – CPG, AMF, CKPT, EVT, LCK
       ●   UDP multicast and broadcast communication
       ●   CMAN became a plugin to OpenAIS
       ●   Pacemaker supports OpenAIS via plugin
       ●   DLM, GFS2 (and OCFS2) work with OpenAIS via CMAN only
       ●   CLVM (mostly) works with OpenAIS
       ●   DLM, GFS2 and OCFS2 do not work reliably without CMAN
       ●   Later: project split, Totem and CPG go to corosync (1.x)
       ●   Now: discontinued


Pacemaker (2-3 years ago)
        ●   Runs on top of heartbeat, openais (corosync 1.x)
        ●   Provides quorum
        ●   Supports:
              –   Resource
              –   Clone
                   ●   Anonymous
                   ●   Globally-unique
              –   Master-slave (with multi-master support)
              –   Resource migration
              –   Order and colocation constraints
              –   Groups
              –   Node attributes (volatile and non-volatile)
        ●   Fencing (STONITH) – heartbeat agents
        ●   Later: support for CMAN (rather ugly) and RHCS fence agents
Corosync 2.x
       ●   Totem + CPG
       ●   UDP unicast in addition to multicast and broadcast
       ●   Multi-ring (active and passive)
       ●   Libqb based
       ●   Quorum (votequorum)
       ●   Single-threaded
       ●   No plugins anymore (OpenAIS, CMAN and pacemaker via plugin are
           not compatible)
       ●   CMAP – dynamic cluster reconfiguration
            –   Nodelist
       ●   Dynamic nodelist (UDPU) members via CMAP
       ●   More info – corosync_overview(8), cmap_keys(8), corosync.conf(5)

Votequorum (corosync 2.x)
       ●    Configurable node votes
       ●    Expected votes (cluster-wide)
       ●    Special features
             –    Two-node mode
             –    WFA (wait-for-all) – no quorum until all configured nodes are seen
                  simultaneously
             –    LMS (last-man-standing) – dynamic expected_votes and quorum
                  recalculation (down)
             –    ATB (auto-tie-breaker) – the partition containing the node with the lowest
                  known ID remains quorate even with exactly 50% of votes
             –    AD (allow-downscale) – decrease expected_votes and quorum on
                  clean shutdown, down to configured expected_votes
       ●    More info – votequorum(5)
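
The vote arithmetic behind these options can be sketched in a few lines of Python. This is a simplified model and not corosync code: the function names and parameters are hypothetical, and only the majority rule, two-node mode and the auto-tie-breaker are modelled. See votequorum(5) for the real behaviour.

```python
def quorum_threshold(expected_votes: int, two_node: bool = False) -> int:
    """Votes a partition needs to be quorate (simplified model)."""
    if two_node and expected_votes == 2:
        # Two-node mode: a single surviving node keeps quorum.
        return 1
    # Otherwise a strict majority of the expected votes is required.
    return expected_votes // 2 + 1


def is_quorate(partition_votes: int, expected_votes: int, *,
               two_node: bool = False,
               auto_tie_breaker: bool = False,
               has_lowest_node_id: bool = False) -> bool:
    """Decide whether a partition is quorate under the modelled options."""
    if partition_votes >= quorum_threshold(expected_votes, two_node):
        return True
    # ATB: on an exact 50/50 split, the side holding the node with the
    # lowest known id stays quorate.
    exact_half = partition_votes * 2 == expected_votes
    return auto_tie_breaker and exact_half and has_lowest_node_id
```

For example, in a four-node cluster with one vote per node, a two-node partition is quorate only when ATB is enabled and it contains the node with the lowest id.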
DLM
       ●   Kernel component:
            –   TCP/SCTP communications for locking itself
            –   Now: in-kernel interface for fs-control
            –   Now: Additional sysctls instead of user-space configuration
       ●   User-space component (dlm_controld)
            –   Before:
                 ●   CMAN for both membership and quorum
                 ●   RHCS (fenced) for fencing
            –   Then:
                 ●   CPG for membership (but CPG_NODE_DOWN event is not handled)
                 ●   CMAN for quorum
                 ●   RHCS (fenced) API for fencing
            –   Now:
                 ●   CPG for membership
                 ●   Corosync quorum
                 ●   Stonithd (pacemaker) for fencing
                 ●   FS-control support is deprecated but still exists
GFS2
        ●    Kernel component:
               –     DLM for locking
               –     Before: FS-control via user-space
               –     Now (Fedora 17+ and RHEL7): In-kernel FS-control
        ●    User-space component (gfs_controld)
               –     Before:
                        ●    DLM for locking control
                        ●    CMAN for membership
                        ●    RHCS (fenced) for fencing
               –     Now (Fedora 17+ and RHEL7): obsolete
CLVM
       ●     Locking support: DLM
       ●     Membership support:
              –    Before:
                      ●   corosync (1.x)
                      ●   OpenAIS (with buggy LCK service)
                      ●   CMAN
              –    Now:
                      ●   CPG (corosync 2.x)
                      ●   Legacy
       ●     Quorum: missing (!!!), global stop on graceful shutdown
             of cluster node. Developers are insane. Patch exists.

Pacemaker (now)
       ●   Support for corosync 2.x (messaging, quorum, parts of configuration
           via CMAP)
       ●   CPG membership
       ●   CMAN, heartbeat, and openais (with plugin) support is deprecated
       ●   Libqb IPC
       ●   Order-sets (A then (B and C) then D)
       ●   Colocation sets (A (B C) D)
            –    Sequential yes/no
       ●   Full stonithd (fencing daemon) rewrite
            –    Client API
            –    Fencing topology (A or (B and C))
            –    Used by DLM 4.x
            –    Can work without the rest of pacemaker
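
The two bracketed expressions above, the order set "A then (B and C) then D" and the fencing topology "A or (B and C)", can be illustrated with a small Python sketch. It is a model of the semantics only, not Pacemaker code: `can_start`, `fence_node` and the `try_device` callback are hypothetical names, and a fencing level is assumed to succeed only when every device in it succeeds.

```python
from typing import Callable, Sequence, Set


def can_start(resource: str,
              order_sets: Sequence[Sequence[str]],
              started: Set[str]) -> bool:
    """A resource may start once every earlier set is fully started.

    Members of the same set do not wait for each other.
    """
    for group in order_sets:
        if resource in group:
            return True
        if not set(group) <= started:
            return False
    raise ValueError(f"{resource!r} is not part of the order sets")


def fence_node(levels: Sequence[Sequence[str]],
               try_device: Callable[[str], bool]) -> bool:
    """Try fencing levels in order; the first fully successful level wins."""
    for devices in levels:
        # all() stops at the first failing device in a level.
        if devices and all(try_device(device) for device in devices):
            return True
    return False
```

With `order_sets = [["A"], ["B", "C"], ["D"]]`, B and C may start as soon as A is up, while D waits for both. With `levels = [["A"], ["B", "C"]]`, fencing succeeds if device A works, or if both B and C work.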
Pacemaker (cont)
       ●    Full LRMD rewrite
       ●    Systemd resources
       ●    Nagios checks for container resources (VM)
       ●    Tickets
             –    Geo-clustering (with booth, an implementation of the Paxos algorithm)
             –    Tickets are volatile cluster attributes
       ●    Still missing:
             –    Non-volatile cluster attributes (can be implemented with tickets and
                  migratable 'Ticketer' resource-agent)
             –    Migration of 'first' resource causes 'then' resources to restart
       ●    Being developed
             –    Remote LRMD execution
Management tools
        ●    Console:
               –     crmsh (from SUSE). Actively developed.
               –     pcs (from Red Hat). New development.
        ●    Web-based:
               –     hawk (from SUSE)
               –     pcs (from Red Hat)
        ●    GUI:
               –     Pacemaker-GUI (almost not supported)
               –     LCMC (Java)
Linux cluster (Fedora 17 and RHEL7)
       ●   Corosync 2.x
           –   Provides CPG and quorum
       ●   Fence-agents from RHCS
       ●   OCF resource-agents from heartbeat
       ●   Pacemaker:
           –   Uses:
               ●   CPG and quorum from corosync
               ●   Fence-agents for fencing
               ●   Resource-agents for resource management
           –   Provides:
               ●   CIB (Cluster Information Base)
               ●   CRMd (Cluster Resource Manager daemon)
               ●   Stonith (fencing) daemon and API
       ●   DLM 4.x
           –   Uses CPG and quorum from corosync
           –   Uses fencing via stonith API (pacemaker)
       ●   CLVM
           –   Uses CPG (and quorum if patched) from corosync
       ●   GFS2 (with in-kernel FS-control)
           –   Uses DLM kernel component directly
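
The "uses" relations on this slide form a dependency graph. The sketch below encodes them as a Python dictionary and derives one valid bring-up order with the standard library's `graphlib` (Python 3.9 or later). The component keys are informal labels chosen for illustration, not package or service names, and the CLVM dependency on DLM is taken from the CLVM slide ("Locking support: DLM").

```python
from graphlib import TopologicalSorter

# Component -> components it uses, as listed on the slides.
DEPENDS_ON = {
    "corosync": set(),
    "fence-agents": set(),
    "resource-agents": set(),
    "pacemaker": {"corosync", "fence-agents", "resource-agents"},
    "dlm": {"corosync", "pacemaker"},   # CPG/quorum + fencing via stonith API
    "clvm": {"corosync", "dlm"},        # CPG membership + DLM locking
    "gfs2": {"dlm"},                    # uses the DLM kernel component
}


def startup_order() -> list:
    """Return one order in which every component follows what it uses."""
    return list(TopologicalSorter(DEPENDS_ON).static_order())


if __name__ == "__main__":
    print(" -> ".join(startup_order()))
```

Several orders are valid; the only guarantee is that each component appears after everything it depends on.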

Questions?
                                     www.sam-solutions.com




                                  Value of Talent. Delivered.




