Marcelo Perazolo, Lead Software Architect, IBM Corporation - In this session, Marcelo will describe how Nagios can be integrated and extended to monitor a typical Power-based converged infrastructure, and how it interfaces with existing element managers to provide a single point of integration for passive and active monitoring.
2. About Me
• Software Architect with IBM
– Worked in different IBM divisions:
Tivoli, WebSphere, Systems
– 25 years of experience with:
Management of “anything under the Sun”
(Systems, Network, Storage, Middleware, Applications, Cloud, etc.)
– Emphasis on:
Open Source software
Power Systems / OpenPOWER
– Limited previous exposure to the Open Source community.
Trying to convert to the “light” side of the Force,
be a good user, contributor and open community citizen!
3. Why Power?
MariaDB on POWER8 S822L delivers 1.87X performance per core and up to
40% better price-performance than Intel Xeon E5-2660 v3 Haswell
Reduce operating costs with fewer systems at a lower acquisition cost
[Chart: Transactions per Minute, POWER8 vs. E5-2660 v3, for READ and 90/10 Rd/Wr workloads; data labels: 13293, 10267, 5467, 6406]
[Chart: TpM/$, POWER8 vs. E5-2660 v3, for READ and 90/10 Rd/Wr workloads; data labels: 0.48, 0.38, 0.63, 0.45]
• Results are based on IBM internal testing of single system image systems running Sysbench OLTP version 0.5 @ 32M and are current as of May 29, 2015. Performance improvement figures are based on multiple G2 processes running a 32 million record workload. Individual results will vary depending on individual workloads, configurations and conditions.
• IBM Power System S822L: 20 cores / 160 threads; POWER8; 3.4 GHz; 128 GB memory; MariaDB 10.1, RHEL 7.1, RHEV
• Competitive stack: Dell R730: 20 cores / 40 threads; Intel E5-2660 v3; 2.6 GHz; 128 GB memory; MariaDB 10.1, RHEL 7.1, RHEV
• Continuous data load
• Massive IO bandwidth
• Flash for extreme performance
• Parallel processing
• Large-scale memory processing
IBM POWER server is the first server that has made its systems, processor, and chip
design and architecture fully available to an open development alliance for comprehensive
licensing and collaborative design allowing third parties to co-innovate.
Innovation
4. Why Power?
Porting Linux applications to Power Systems is quick and easy
Most applications port with a simple recompile and test
• 95% of Linux on x86 applications written in C/C++ port to Linux on Power
with no source code change, just a simple recompile and test (note 1)
– Canonical reported an average of 250 open source applications ported
per day on Ubuntu. 95% of the Ubuntu 14.04 LTS compiled software ported
with a simple recompile and test
• 100% of hardware-agnostic Linux on x86 applications written in scripting
(Java) or interpretive languages will run as is with no changes (note 2)
• IBM is committed to further simplifying porting and development
on Linux on Power
– Embrace open standards and partner with open communities such as OpenPOWER, OpenStack, Ubuntu,
and Cloud Foundry
– New tooling and function such as BlueMix
– Provide easier means to build apps leveraging existing code in the open communities
1. Includes C/C++ and other compiled languages. Assumes 16 hours of dedicated time and prior experience with the application code and its dependencies (e.g. language, libraries, web application, database) and that dependencies are already ported and installed. Assumes no platform or device specific dependencies.
2. Interpretive languages include PHP, Python, Perl, Ruby, Java, etc. Assumes 8 hours of dedicated time and prior experience with the application code and its dependencies (e.g. language, libraries, web application, database) and that dependencies are already ported and installed. Assumes no platform or device specific dependencies.
5. Converged Infrastructure?
Award Winning Hardware Design
Open Linux Environment
• 3000+ Applications
• Little Endian Support
Consolidated Support
Unified Console
Single Point of Access
Competitive Pricing & Financing
Management Stack
• PowerVM
• PowerVC
• Nagios
• Shareable Compute resources
• Intra-rack Networking
• Storage Fabric / Distributed Elastic Storage
• Upward Integration
• Virtualization / Cloud-capable
Decreased maintenance
Increased flexibility & control
“Data Center in-a-Box”
6. Management Node(s) Overview
[Diagram labels]
PureMgr (rhel7.1)
Hypervisor: KVM (rhel7.1)
PowerVC (rhel7.1)
Service (rhel7.1)
UI integration
Inventory
Configuration
Monitoring (Nagios)
Virtualization Management
Service
Troubleshooting
Bring-Up
Host
Thin
RHEL/KVM
Vanilla
Future components
Future function
Backed by OSS to be added here
Secondary/HMC V 1.0 only
Virtualized later
PurePower Integrated Manager
…
Future failover function
Increased OpenStack adoption later:
- ICM with OpenStack
- BlueBox
- Etc.
Direction to move to Power-based nodes
Primary
7. PureMgr Architecture
Components: Nagios Core, Nagios UI, PureMgr UI, PureMgr Core
Deliverables to Manufacturing: activation and configuration scripts, virtual appliance image
Virtual Appliance
• Automated virtual image deployment and configuration
• Capability to conserve customizations on image update
Hardware Inventory & Monitoring
• Automated configuration of all rack devices
• Automated configuration of Nagios & SNMP monitoring infrastructure
• Capability to discover and auto-configure new devices
UI/CLI support
• UI to serve as integration point to all rack element managers
• CLI for management and integration operations
Integrated Management PureMgr
11. Hardware Inventory
• Machine Type Model
• Serial Number
• IP Address
• Type of resource/device
• Label for easy identification
• Rack number and EIA location
• Administration user id
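The inventory fields listed above map naturally onto a simple record type. The following is a minimal sketch, not PureMgr's actual code: the class name, the field names, the sample values, and the `generic-host` template name are all assumptions made for illustration. It shows how one inventory record could be rendered as a standard Nagios host object definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    """One rack device. Field names are illustrative, not PureMgr's schema."""
    machine_type_model: str
    serial_number: str
    ip_address: str
    device_type: str
    label: str
    rack_number: int
    eia_location: int
    admin_user: str


def to_nagios_host(item: InventoryItem, template: str = "generic-host") -> str:
    """Render a Nagios host object definition for one inventory record.

    The template name is an assumption; use whatever host template
    the local Nagios configuration actually defines.
    """
    alias = f"{item.device_type} {item.machine_type_model} S/N {item.serial_number}"
    lines = [
        "define host {",
        f"    use        {template}",
        f"    host_name  {item.label}",
        f"    alias      {alias}",
        f"    address    {item.ip_address}",
        "}",
    ]
    return "\n".join(lines)
```

For example, `to_nagios_host(InventoryItem("0000-AAA", "SN0001", "192.0.2.10", "compute-node", "power-node-1", 1, 10, "admin"))` returns a `define host { ... }` block whose `host_name` is the device label and whose `address` is the management IP. All values here are placeholders.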
12. Network Integration
• Drives reconfiguration of all resources in the management network
• Subnet mask
• Gateway address (for routing)
• IP addresses can be individually changed (validated with mask)
• Changing subnet automatically changes all device IPs
• Supports devices with multiple management interfaces
Immediate on-premises network integration! (Time-To-Value)
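The two behaviours described above (validating an individual address against the mask, and re-addressing every device when the subnet changes) can be sketched with Python's standard `ipaddress` module. This is an illustrative sketch only: the function names and the rule of keeping each device's host offset when moving subnets are assumptions, not a description of how PureMgr performs the change.

```python
from __future__ import annotations

import ipaddress


def validate_address(address: str, subnet: str) -> bool:
    """Return True if the address lies inside the subnet (address/mask check)."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(subnet)


def move_to_subnet(address: str, old_subnet: str, new_subnet: str) -> str:
    """Re-base one address onto a new subnet, keeping its host offset (assumed rule)."""
    old_net = ipaddress.ip_network(old_subnet)
    new_net = ipaddress.ip_network(new_subnet)
    if not validate_address(address, old_subnet):
        raise ValueError(f"{address} is not in {old_subnet}")
    offset = int(ipaddress.ip_address(address)) - int(old_net.network_address)
    # Reject offsets that would land on the network or broadcast address.
    if not 0 < offset < new_net.num_addresses - 1:
        raise ValueError(f"{address} does not fit in {new_subnet}")
    return str(new_net.network_address + offset)


def renumber(devices: dict[str, list[str]], old_subnet: str, new_subnet: str) -> dict[str, list[str]]:
    """Re-address every management interface of every device."""
    return {
        name: [move_to_subnet(addr, old_subnet, new_subnet) for addr in addrs]
        for name, addrs in devices.items()
    }
```

Each device maps to a list of addresses so that devices with multiple management interfaces are handled the same way as single-interface ones.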
13. Monitoring of Power nodes
Example below shows monitoring of LED status for Power compute nodes and VIOS RMC connection availability
VIOS
• Warning event shows the first Power compute node can be used for LPAR deployment, but without failover capabilities
• Critical events show 3 Power compute nodes without LPAR deployment capabilities
LED
• All Power compute node LEDs are in attention state
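The warning/critical distinction for the VIOS check can be expressed with the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below assumes that each compute node is served by two VIOS partitions and that the state is derived from how many of them have an active RMC connection; that mapping is an interpretation of the slide, and the function name and message text are hypothetical.

```python
from __future__ import annotations

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def vios_rmc_state(active: int, expected: int = 2) -> tuple[int, str]:
    """Map the number of active VIOS RMC connections to a Nagios state.

    Assumption: `expected` VIOS partitions per node (two by default).
    """
    if active < 0 or active > expected:
        code, detail = UNKNOWN, f"unexpected RMC connection count {active}/{expected}"
    elif active == 0:
        code, detail = CRITICAL, "no VIOS RMC connection, LPAR deployment not possible"
    elif active < expected:
        code, detail = WARNING, f"{active}/{expected} VIOS RMC connections, no failover"
    else:
        code, detail = OK, f"{active}/{expected} VIOS RMC connections active"
    # Plugin output format: status text, then optional performance data after '|'.
    return code, f"VIOS {STATE_NAMES[code]} - {detail} | rmc_active={active}"
```

A wrapper script would print the returned message and exit with the returned code, which is how Nagios reads a plugin result.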
14. Monitoring of HMC / Storwize
HMC services
• Metrics for CPU/Memory/Swap
• Availability of SSH daemon (for HMC CLI usage)
• Number of RMC processes (connections)
• Trap service shows any SNMP Traps sent by the HMC (or the Power compute nodes it monitors)
V7000 services
• Status of Pools, Nodes, Pool Capacity, FC Ports
• Availability of HTTPS server (for V7000 UI usage)
• Trap service shows any SNMP Traps sent by the V7000 (or the Storage expansions it manages)
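The two availability checks listed above (SSH daemon on the HMC, HTTPS server on the V7000) reduce, at their simplest, to testing whether a TCP port accepts connections. The sketch below is one hedged way to do that with the standard `socket` module; the function name is hypothetical, and it only verifies that the port is reachable, not that a valid SSH or TLS handshake completes.

```python
from __future__ import annotations

import socket

# Nagios plugin return codes used by this check.
OK, CRITICAL = 0, 2


def check_tcp(host: str, port: int, timeout: float = 5.0) -> tuple[int, str]:
    """Return (state, message) depending on whether host:port accepts a TCP connection."""
    try:
        # create_connection resolves the host and connects within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} is accepting connections"
    except OSError as exc:  # covers refused connections, timeouts, resolution errors
        return CRITICAL, f"TCP CRITICAL - {host}:{port} unreachable ({exc})"
```

Under these assumptions, the HMC check would call `check_tcp(hmc_address, 22)` and the V7000 check `check_tcp(v7000_address, 443)`, where both address variables stand in for values taken from the hardware inventory.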
16. Monitoring Energy Consumption
PurePower racks are equipped with Smart PDUs
Nagios leverages the SNMP monitoring capabilities of the Smart PDUs and shows a full rack view of energy/power consumed
Data shown per PDU:
• Cumulative kWh
• Total Power in W
• Total VA utilization (for more accurate capacity planning)
• Total Energy consumed in Wh
+ Whole rack energy consumption summary
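The whole-rack summary is an aggregation over the per-PDU readings. The sketch below assumes the readings have already been collected (for example over SNMP) into simple records; the class, field names, and summary keys are hypothetical. It sums power, apparent power, and energy across PDUs, converts Wh to kWh, and derives the power factor as real power divided by apparent power.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PduReading:
    """One Smart PDU sample. Field names are illustrative."""
    name: str
    power_w: float            # total real power, in W
    apparent_power_va: float  # total apparent power, in VA
    energy_wh: float          # cumulative energy, in Wh


def rack_summary(readings: list[PduReading]) -> dict[str, float | int | None]:
    """Aggregate per-PDU readings into a whole-rack view."""
    total_w = sum(r.power_w for r in readings)
    total_va = sum(r.apparent_power_va for r in readings)
    total_wh = sum(r.energy_wh for r in readings)
    return {
        "pdu_count": len(readings),
        "total_power_w": total_w,
        "total_va": total_va,
        "total_energy_kwh": total_wh / 1000.0,
        # Power factor = real power / apparent power; undefined with no load.
        "power_factor": (total_w / total_va) if total_va else None,
    }
```

Reporting VA alongside W matters for capacity planning because PDU and circuit ratings are constrained by apparent power, which is at least as large as real power.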
17. Possible Future Directions
Integrated Management PureMgr
Add more management functions (e.g. Updates/Compliance)
Leverage more open source (e.g. Ganglia)
Support Analytics space (e.g. Hadoop)
Support additional OpenStack certified applications
Hybrid Cloud, On/Off-Premises, Cloud Management
Support PowerKVM
Lower-cost distributed Storage (e.g. Ceph, GPFS)
New OSs (Ubuntu, SLES, IBM i)
Application patterns, ready for deployment & monitoring
Integrate additional element managers
OpenPOWER, OpenStack drivers, Calamari, etc.
New Hardware, mainstream / lower cost
OpenPOWER, iSCSI Storage, etc.
18. References
Converged Infrastructure offering:
http://www-03.ibm.com/systems/power/hardware/purepower/
Nagios Plugins contributed to open source:
https://exchange.nagios.org/directory/Plugins/Hardware/Others/SX1710-monitoring-plugin/details
https://exchange.nagios.org/directory/Plugins/Hardware/Network-Gear/Others/G8052-2FG8264-monitoring-plugin/details
https://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/IBM-Brocade/Check-IBM-2498-2DF48-status/details
(others under submission, more to come in the near future)
Please use and vote!