What's New in Moab 8.0

© 2014 ADAPTIVE COMPUTING, INC.
HPC, Cloud & Big Workflow:
What’s New in Moab 8.0
Trev Harmon
Adaptive Computing
ISC'14

Adaptive Computing Highlights
▪ Innovating world-class HPC solutions for over 12 years
▪ Pioneers of HPC schedulers, grid, power management, HPC-Cloud,
optimization, scale, dynamic provisioning, Big Workflow and more
▪ 50+ patents issued or pending
▪ Backed by top-tier investors
▪ Many customers in the Top 100 and Fortune 500
▪ Top systems including: #2 Titan, Cascade, Cielo, Hopper, & Blue Waters
▪ Major multi-nationals including: DOW, Exxon, & Boeing
▪ Largest provider of HPC workload management software to HPC sites*
▪ Global partnerships include Intel, HP, IBM, Cray, SGI, & Microsoft
Cloud System Management Innovator

Broad HPC Customer Base
Oil and Gas, Financial, Manufacturing,
Research and Government

Expanding HPC Value & Use
Greater access to technical
computing resources
Expanding HPC applicability
across industries
Demand for more simplified
access & management (e.g.
SLAs)

Growth in System Size & Complexity
100 cores --> 1 Million+ cores
Diversity of environments and
processing needs
Growing organizational complexity

Greater Need for Alignment with
Organization / Business Directives
Increased tracking & accountability
Increasing global competition
Ability to quickly adapt is vital
Growing Collaboration

Themes for Accelerating Insights
▪ Unify data center resources
▪ As a single, adaptive ecosystem
▪ Technical computing (HPC & Big Data)
▪ Public and private cloud
▪ Bare metal & virtual machines
▪ Optimize the analysis process
▪ Increase throughput and productivity
▪ Ensure SLAs, maximize uptime
▪ Reduce cost, complexity and errors
▪ Guarantee service to the business
▪ Policies that model your organization
▪ Prove services were delivered
▪ Job completion in spite of failures
▪ Verify resources were allocated fairly

What’s New in 8.0
Enhancing Big Workflow

What’s new in 8.0 - Unify
▪ OpenStack
▪ Breaks down siloed environments
▪ Offers virtual and physical resource
provisioning for IaaS and PaaS
▪ Select Beta Customers
▪ Intersect360
▪ Moab and TORQUE - top two job
management packages
▪ Received 40% of the mentions
“Adaptive’s Big Workflow…is to provide a way for big data, HPC, and cloud environments to
interoperate, and do so dynamically based on what applications are running. With the added
benefits of a unified platform, OpenStack is a promising platform to interoperate multiple
environments.”
-Addison Snell, CEO Intersect360

What’s New in 8.0 – Optimize
▪ Moab Performance Boost
▪ 2-3x overall performance improvements
▪ 100K Job Submission
▪ High Throughput Computing with Nitro
▪ Advanced Data Staging
▪ Multi-job workflow
▪ Staging job runtime prediction
▪ Improved cluster utilization
▪ Multiple transfer methods
▪ Advanced Power Management
▪ New power state options
▪ Suspend
▪ Hibernate
▪ Shutdown
▪ Clock Frequency Control
[Figure: nodes with Input, Compute, and Output stages]

What’s New in 8.0 – Guarantee
▪ Next Generation Viewpoint
▪ Enhanced Web-based UI
▪ Next Generation dashboard
▪ Monitors and reports current workload and resource utilization
▪ Cray 3D Torus topology awareness

Ascent Project

Performance Boost with Operation Ascent
▪ 3x the Performance Boost
▪ Reduce Command Latency
▪ Decrease Scheduling Cycle Time
▪ Improve Multi-Threading
▪ Faster Moab/TORQUE Communication
▪ Advanced High-throughput Computing with
Nitro

Nitro – High Throughput Computing
▪ Removes launch speed bottlenecks
▪ Achieves exascale computing
▪ Localizes decision making
▪ Up to 100x faster throughput on
short jobs
▪ Launches 10 jobs per node per
second
▪ Reduces latency
▪ Runs on Moab/TORQUE
environments

How Does Nitro Work?
▪ Ultra high-speed message queue
▪ Different approach to scheduling
▪ Combines small, similar jobs
▪ Creates policies for the entire batch job
▪ Schedules the batch as one job
▪ Incurs scheduling overhead only once
▪ Not once per individual small job
▪ Limitations
▪ Speed of your processor & job size
▪ Nitro sacrifices some granularity in management
▪ i.e. individual tasks in a large batch cannot be cancelled or pre-empted in
isolation
▪ The batch is the unit of management and reporting
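
The batching idea on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Nitro's implementation: the `Task` type, the `batch_tasks` and `total_time` helpers, and the one-second overhead figure are all hypothetical, chosen only to show why paying the scheduling overhead once per batch rather than once per task matters for short jobs.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Task:
    name: str
    kind: str         # hypothetical label: tasks with the same kind count as "similar"
    runtime_s: float


def batch_tasks(tasks):
    """Group similar tasks so each group can be submitted as one batch job."""
    ordered = sorted(tasks, key=lambda t: t.kind)
    return {kind: list(group) for kind, group in groupby(ordered, key=lambda t: t.kind)}


def total_time(tasks, overhead_s, batched):
    """Serial time estimate: scheduling overhead plus the sum of task runtimes."""
    run = sum(t.runtime_s for t in tasks)
    # Overhead is paid once per batch when batched, once per task otherwise.
    n_scheduled = len(batch_tasks(tasks)) if batched else len(tasks)
    return n_scheduled * overhead_s + run


# 1000 short tasks of two kinds; the overhead value is an assumed figure.
tasks = [Task(f"t{i}", "render" if i % 2 else "align", 0.1) for i in range(1000)]
per_task = total_time(tasks, overhead_s=1.0, batched=False)
per_batch = total_time(tasks, overhead_s=1.0, batched=True)
print(f"scheduled per task: {per_task:.1f}s, scheduled per batch: {per_batch:.1f}s")
```

The sketch also mirrors the limitation listed above: once tasks are folded into a batch, the scheduler in this model only sees the batch, so per-task cancellation or pre-emption is no longer available at that level.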

Topology-aware
Node Allocation

Topology-aware Node Allocation
▪ Cray Gemini 3D Torus
▪ Network characteristics-aware
▪ 3D torus
▪ Y-dimension bandwidth
▪ Dateline zones
▪ Shape-fitting
▪ Six shapes
▪ Built-in Moab node allocation
policy
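
To make the torus idea concrete, the sketch below computes hop distances on a 3D torus with wraparound links and uses them to compare how compact two candidate allocations are. It is an illustration only, not Moab's node allocation policy: the function names and the 8x8x8 size are assumptions, and it ignores the Y-dimension bandwidth and dateline-zone characteristics listed above.

```python
from itertools import combinations


def torus_hops(a, b, dims):
    """Minimum hop count between coordinates a and b on a torus of size dims.

    In each dimension the shorter of the direct path and the wraparound
    path is taken.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def max_hops(nodes, dims):
    """Worst-case pairwise hop distance within a candidate allocation."""
    return max((torus_hops(a, b, dims) for a, b in combinations(nodes, 2)), default=0)


dims = (8, 8, 8)  # assumed torus size, for illustration
compact = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
scattered = [(0, 0, 0), (4, 0, 0), (0, 4, 0), (4, 4, 4)]

print(max_hops(compact, dims), max_hops(scattered, dims))
```

A shape-fitting allocator would prefer the compact placement, since a smaller worst-case hop count generally means less traffic crossing shared links.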

Data-staging

Data-staging Refactor
▪ Data-staging using Moab “system” jobs
▪ Input and output data-staging system jobs
▪ System jobs separately scheduled by Moab
▪ Dependencies between system jobs and user job
▪ Calculate system jobs’ data-staging wall time estimates
▪ Support additional file transfer utilities
▪ Linux rsync in addition to scp utility
▪ Commercial data-transfer products (e.g. Aspera)
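
The relationship between staging jobs and the user job can be sketched as a small dependency chain. The slides do not say how Moab calculates its wall time estimates, so the size-over-bandwidth formula, the safety margin, and every name below (`Job`, `staging_walltime`, `build_workflow`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    walltime_s: float
    depends_on: list = field(default_factory=list)


def staging_walltime(size_bytes, bandwidth_bytes_per_s, margin=1.2):
    """Estimate transfer time as size / bandwidth, padded by a safety margin."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return margin * size_bytes / bandwidth_bytes_per_s


def build_workflow(user_walltime_s, in_bytes, out_bytes, bandwidth):
    """Chain stage-in -> compute -> stage-out through explicit dependencies."""
    stage_in = Job("stage-in", staging_walltime(in_bytes, bandwidth))
    compute = Job("compute", user_walltime_s, depends_on=[stage_in])
    stage_out = Job("stage-out", staging_walltime(out_bytes, bandwidth),
                    depends_on=[compute])
    return [stage_in, compute, stage_out]


# Assumed inputs: 10 GB in, 1 GB out, 100 MB/s link, one-hour user job.
workflow = build_workflow(3600, in_bytes=10e9, out_bytes=1e9, bandwidth=100e6)
for job in workflow:
    deps = ", ".join(d.name for d in job.depends_on) or "none"
    print(f"{job.name}: {job.walltime_s:.0f}s, depends on {deps}")
```

Because each staging step is its own job with its own wall time, a scheduler can place it separately from the compute job instead of holding compute nodes idle during the transfer.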

Data-staging Refactor (continued)
▪ Node Allocation Timing Exception
▪ If data staged to local file system, compute nodes allocated
during data-staging system jobs
▪ Why? Preserve job execution time consistency!
▪ Grid data-staging
▪ Part of data-staging initiative
▪ Grid Moab chooses cluster
▪ Grid Moab stages data
▪ Can run data-staging system jobs on dedicated
data transfer servers

Power Management /
Green Computing

Power/Performance Profiles
▪ Minimizing energy consumption requires an application-specific optimal clock frequency
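
One way to read this point: energy is average power multiplied by runtime, so the slowest clock is not automatically the cheapest, because the job runs longer. The sketch below picks the minimum-energy frequency from a power/performance profile. The function names are hypothetical and the sample profile holds placeholder values chosen for illustration, not measurements.

```python
def energy_joules(avg_power_w, runtime_s):
    """Energy consumed by a run: average power (W) times runtime (s)."""
    return avg_power_w * runtime_s


def best_frequency(profile):
    """Return the frequency with the lowest energy.

    profile maps frequency in MHz -> (average power in W, runtime in s).
    """
    return min(profile, key=lambda f: energy_joules(*profile[f]))


# Placeholder values for illustration only, not measured data.
example_profile = {
    1200: (150.0, 1000.0),  # low power, long runtime
    1800: (180.0, 700.0),   # middle of the range
    2400: (250.0, 600.0),   # high power, short runtime
}

print(best_frequency(example_profile))
```

With a different application the runtime would scale differently with frequency, which is why the optimum has to be found per application.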

CPU Clock Frequency Control
▪ New cpuclock= job submission option
▪ Absolute Clock Frequency Number
▪ Example: cpuclock=2200 or cpuclock=1800mhz
▪ Linux Power Governor Policy
▪ Example: cpuclock=conservative
▪ Relative P-state Number
▪ Values 0-15
▪ 0=“turbo” frequency
▪ 15=slowest frequency
▪ Example: cpuclock=0 or cpuclock=P2
▪ Can set in job templates
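
The three value forms above can be told apart with a small classifier. This is a sketch of how such a request string might be interpreted, not TORQUE's or Moab's actual parser: the function name, the set of accepted governor names, and the rule that a bare number from 0 to 15 is a P-state while larger numbers are MHz are assumptions inferred from the examples on this slide.

```python
import re

# Common Linux cpufreq governor names; which ones are accepted is an assumption.
GOVERNORS = {"performance", "powersave", "ondemand", "conservative"}


def classify_cpuclock(value):
    """Return (kind, normalized_value) for a cpuclock= request string."""
    v = value.strip().lower()
    if v in GOVERNORS:
        return ("governor", v)

    # Explicit P-state form, e.g. "P2".
    m = re.fullmatch(r"p(\d+)", v)
    if m:
        n = int(m.group(1))
        if n > 15:
            raise ValueError(f"P-state out of range 0-15: {value!r}")
        return ("pstate", n)

    # Bare number or number with an "mhz" suffix.
    m = re.fullmatch(r"(\d+)(mhz)?", v)
    if m:
        n = int(m.group(1))
        if m.group(2) is None and n <= 15:
            return ("pstate", n)  # assumed: small bare numbers are P-states
        return ("mhz", n)

    raise ValueError(f"unrecognized cpuclock value: {value!r}")


for example in ["2200", "1800mhz", "conservative", "0", "P2"]:
    print(example, "->", classify_cpuclock(example))
```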

CPU Clock Frequency Control (continued)
▪ TORQUE pbs_mom sets clock frequency
▪ Logs clock frequency changes in pbs_mom log
▪ Moab records
▪ Job’s requested clock frequency in job record
▪ Nodes’ clock frequency in node statistics
▪ Uses
▪ Energy conservation for lower operational costs
▪ Power/performance profile generation
▪ Diagnostics

Green Policy Configuration
▪ New Moab Web Services
RM Plug-in
▪ Contains power management
logic
▪ Specifies power state Moab
should place a compute node in
when applying “green” policy
▪ Standby
▪ Suspend
▪ Hibernate
▪ Shutdown
▪ Off
▪ Multi-threaded
▪ New power management
“reference” scripts

Administrator Portal

Admin Portal (8.0 - 2014)
▪ Exciting Features
▪ Dashboards: Workload and Resource Views
▪ Simplified management of credentials
▪ Easy to use Policies
▪ Priority
▪ Fairshare
▪ Backfill
▪ Node Policies
▪ Cluster Management
▪ Historical Database
▪ HPC Web Services

Persistent Database
▪ Relational database for historical data!
▪ Published View Schema
▪ Easily extract reports using standard reporting
tools / frameworks
▪ Prepopulated Views
API

Dashboard

Credential Management - Details

Graphical Policy Management

Resource Management – Zoom levels

Resource Management – Historic Utilization

Submitting a job…

Questions?
