Distributed Monitoring and Redundancy with Merlin

op5

Agenda
About op5 and op5 Monitor

Need for distributed monitoring

Concept and implementation

Examples and scenarios

Customer cases

More Info

op5 Monitor

op5 Monitor delivers:
• Reports and trend analysis
• Service levels
• Availability
• Easy configuration
• Data status visualization
• SLA reports, maps, etc.
• Monitoring of virtual, cloud and outsourced services
• Distributed and load-balanced setups
• Automated intelligent alarms and actions on events

[Diagram labels: Infrastructure, Storage, Application, Network, Status, VMs]

Need For Distributed Monitoring


Reliability
• 24x7 SLA for mission-critical services
• Availability and SLA reporting

Performance
• High number of active system checks
• Limitations of the operating system
• Growth

Distributed Monitoring
• Locations
• Combinations of networks
• IP address conflicts
• Security

Why Merlin?

• Other solutions for redundancy are clunky at best
• Redundancy is important
• Load balancing and automatic fail-over
• Network functionality (as its users see it) is hardly ever measurable from a single place

Key Features

Redundancy
– Ensure availability

Performance
– Handle larger networks

Load balancing
– Share the workload

Distributed monitoring
– Geographical coverage

Our solution for scalable monitoring is a system that:
• Is easy to use
• Can change constantly to fit the needs of your business
• Gives stability and performance

op5 Merlin Open Source Project

Merlin - Module for Effortless Redundancy and Load balancing in Nagios

For setting up distributed Nagios installations

Brief project info

• Started in 2006 as a prototype for a huge installation
• First used as a redundancy engine in 2009
• Used in production at 800+ installations
• Largest production installation has 3 masters and 14 pollers
• v2.0.0 (with Nagios 4 support) to be released officially next week
• Current bleeding edge is v2.0.0-beta2-p10

Key design concepts

• Peer load balancing is 100% transparent
• Pollers take care of one or more hostgroups
• Pollers can be (and often are) peered
• Binary protocol for extreme performance
• 32-bit and 64-bit machines can't play together :-/
• Object config of two peers must be identical
• Pollers must never know about objects they're not responsible for
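
One way to picture transparent peer load balancing is the Python sketch below. It is an illustration under stated assumptions, not Merlin's actual algorithm: it assumes every peer sorts the list of active peers the same way and takes the checks whose index maps to its own slot. Because the object config of peers is identical, each node can work this out locally. The function names and peer names are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def assigned_peer(check_index: int, active_peers: list[str]) -> str:
    """Return the peer that should run the check at position check_index.

    Assumption: all peers hold the same ordered list of checks, so each
    one can compute the same answer without asking the others.
    """
    ordered = sorted(active_peers)  # identical ordering on every node
    return ordered[check_index % len(ordered)]


def share_of_work(num_checks: int, active_peers: list[str]) -> Counter:
    """Count how many checks land on each active peer."""
    return Counter(assigned_peer(i, active_peers) for i in range(num_checks))


if __name__ == "__main__":
    peers = ["peer-a", "peer-b", "peer-c"]  # hypothetical node names
    print(share_of_work(9, peers))
    # If peer-b drops out, the remaining peers absorb its checks.
    print(share_of_work(9, ["peer-a", "peer-c"]))
```

With three peers and nine checks each peer runs three. With `peer-b` removed from the active list, the same function hands all nine checks to the two survivors, which is the fail-over behaviour in miniature.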

Merlin System Design

[Diagram. Components shown: Merlin module, Merlin daemon, config file, socket, backlogs, dbi, database]

Merlin System Design

• Peer + Peer system design

[Diagram: two nodes side by side, each with the same components: Merlin module, Merlin daemon, config file, socket, backlogs, dbi, database]
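
The system design slides label several backlogs but do not describe how they behave. The Python sketch below shows one common reading, offered as an assumption: events are queued while the next hop (socket or database) is unreachable and replayed in order once it comes back. The class names, the `max_backlog` limit and the `FlakyLink` test double are all hypothetical and are not taken from Merlin.

```python
from collections import deque


class BackloggedSender:
    """Buffer events while the link is down and replay them in order."""

    def __init__(self, send, max_backlog=1000):
        # send: callable that raises ConnectionError when the link is down
        self._send = send
        # When the backlog is full, the oldest events are dropped first.
        self._backlog = deque(maxlen=max_backlog)

    def publish(self, event):
        """Queue the event, then try to deliver everything queued so far."""
        self._backlog.append(event)
        return self.flush()

    def flush(self):
        """Drain the backlog; stop at the first failure and keep the rest."""
        while self._backlog:
            try:
                self._send(self._backlog[0])
            except ConnectionError:
                return False
            self._backlog.popleft()
        return True

    def pending(self):
        return len(self._backlog)


class FlakyLink:
    """Stand-in for a socket or database connection that can be down."""

    def __init__(self):
        self.up = False
        self.delivered = []

    def send(self, event):
        if not self.up:
            raise ConnectionError("link down")
        self.delivered.append(event)


if __name__ == "__main__":
    link = FlakyLink()
    sender = BackloggedSender(link.send)
    sender.publish("check result 1")   # link down: kept in the backlog
    link.up = True
    sender.publish("check result 2")   # link up: both events delivered
    print(link.delivered)
```

An event is removed from the queue only after a successful send, so a failure in the middle of a flush loses nothing and preserves ordering.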

Peered Setup

Scalability / High Availability

The backend allows a variety of high-availability setups and allows almost infinite scalability by adding more "peers".

[Diagram labels: Peer (x3), Config, Check results, Poll/check, Monitored objects]

Master/Poller Setup

Remote Modules

Remote modules allow the monitoring of individual services and devices using a dedicated, but centrally managed, monitoring system.

[Diagram labels: Master; Poller, Poller, Cloud Poller; Config; Check results; Poll/check; Monitored objects (x3)]

Combined Setup

[Diagram labels: Master/Peer; Poller/Peer (x3)]


Configuration And Management

• Merlin automatically distributes object config
• Split and push of config in master/poller configurations
• Straight-up sync for peers when needed
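
The split step in a master/poller setup can be pictured with a short Python sketch. It assumes the split is driven by hostgroup membership, as the design concept "Pollers take care of one or more hostgroups" suggests. The function name, host names and poller names are hypothetical, and this is not Merlin's own code.

```python
from __future__ import annotations


def split_config(
    hosts: dict[str, set[str]],
    pollers: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Return, per poller, the hosts that belong to its hostgroups.

    hosts maps host name -> hostgroups the host is a member of.
    pollers maps poller name -> hostgroups the poller is responsible for.
    Hosts outside a poller's hostgroups are left out of its config.
    """
    result = {}
    for poller, groups in pollers.items():
        result[poller] = sorted(
            name for name, member_of in hosts.items() if member_of & groups
        )
    return result


if __name__ == "__main__":
    hosts = {
        "web-sto-1": {"stockholm"},
        "db-sto-1": {"stockholm"},
        "web-gbg-1": {"gothenburg"},
        "core-rtr": {"hq"},  # in no poller's hostgroup: stays on the master
    }
    pollers = {
        "poller-sto": {"stockholm"},
        "poller-gbg": {"gothenburg"},
    }
    for poller, subset in split_config(hosts, pollers).items():
        print(poller, subset)
```

Each poller receives only the objects in its own hostgroups, in line with the rule that pollers must never know about objects they are not responsible for. Peers need no split, since their config is identical and can be synced as-is.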

Early Adopters

• Mogul Services AB
• Hosts critical services for operators, call centers, banking, online media and emergency broadcast channels
• Very early implementation (beta stage in a POC deal)
• Quite complex setup (peered masters, multiple pollers)
• Very high availability demands

Performance

• Add more peers if needed, to scale out performance monitoring rather than scale up on hardware
• The monitoring system grows with the requirements of the company

[Diagram: six peers]

Reliability

• The individual nodes are peered and service checks are distributed dynamically among them. This setup also provides redundancy.

[Diagram: six peers]

Security

• Safety zones in the network
• Monitoring as a Service
• DMZ
• Branch offices with “one-way” availability

[Diagram labels: Master; Poller (x3); DMZ, Customer Network, Secure Network]

Cloud Monitoring

• Monitoring of publicly available services
• Monitor services outside your own network

[Diagram labels: Master; Cloud Poller, Poller; Monitored objects (x2); DMZ]

Customer case study: Merlin Ahoy!

Company
Since late 1959, Viking Line ships have sailed daily from Finland to Sweden. The shipping company Viking Line Abp is based in Mariehamn, the capital of the autonomous Åland Islands in Finland.

Challenge:
Unreliable network uplinks to the core system, and the change between cable connections, wireless and satellite networks depending on the location of the ship, make it difficult to monitor all on-board IP services such as IPTV, VoIP, WiFi hotspot and infotainment.

“There have been improvements in functionality when we communicate via satellite links. It is easier than before to have the server on board communicate with our main servers on shore.”
Jonas Lindroos, IT department at Viking Line

Solution:
An op5 Monitor instance was installed on each ship. This gives Viking Line distributed monitoring of all services on all vessels, managed centrally.

Questions?

• http://git.op5.org
• http://www.op5.org
