IBM continues to invest aggressively in its Java on System z technology, with ongoing deep vertical integration of Java with the z ecosystem.
As such, the evolution of Java on z shows consistent delivery of
deep co-design and exploitation of next-generation hardware function,
industry-leading runtime support with the IBM J9 Virtual Machine, and
evolution of the Java language,
demonstrating impressive release-to-release improvements in performance, consumability and function.
Language:
Java 8 brings more than 34 significant new features to the Java language. The two that are likely the most notable are
Lambdas for streams and parallelism
Virtual extension methods (default methods) for enabling transparent extension of existing libraries
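The two features above can be illustrated with a minimal sketch. The class, interface and method names are hypothetical and chosen only for illustration; the java.util calls are standard Java 8.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; only the java.util APIs are standard Java 8.
class LambdaStreamSketch {

    // describe() is a default ("virtual extension") method: implementations
    // written before it existed still compile and inherit this body.
    interface Priced {
        long cents();

        default String describe() {
            return "item costing " + cents() + " cents";
        }
    }

    // Priced has a single abstract method, so a lambda can implement it.
    static Priced priced(long cents) {
        return () -> cents;
    }

    // Sums the prices above a threshold. parallelStream() allows the runtime
    // to split the work across the available hardware threads.
    static long totalAbove(List<Priced> items, long minCents) {
        return items.parallelStream()
                .mapToLong(Priced::cents)
                .filter(c -> c > minCents)
                .sum();
    }

    public static void main(String[] args) {
        List<Priced> items = Arrays.asList(priced(250L), priced(1200L), priced(990L));
        System.out.println(totalAbove(items, 500L));   // 1200 + 990 = 2190
        System.out.println(items.get(0).describe());   // item costing 250 cents
    }
}
```

Whether the parallel stream actually runs faster depends on the data size and the machine; for a list this small the sequential form would do just as well.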
z13:
New 5.0 GHz 8-Core Processor Chip – best single thread perf out there
480 MB L4 cache to optimize for data serving – best cache/thread ratio out there
Simultaneous Multi Threading (SMT) – two h/w threads/core allow independent execution of two software threads per-core. Provides more efficient use of core resources.
Vector Processing – Single Instruction Multiple Data (SIMD) to exploit data parallelism (array processing, strings, loops)
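As an illustration of the data parallelism mentioned for SIMD, the sketch below shows the shape of loop that suits vector processing: independent iterations over contiguous arrays. The class and method names are hypothetical, and whether a given JIT compiler actually maps this loop to vector instructions is an assumption, not something the code can guarantee.

```java
import java.util.Arrays;

// Hypothetical example; the JIT decides at run time whether to vectorize.
class VectorLoopSketch {

    // Element-wise sum. No iteration depends on another, so several
    // elements could be processed by a single SIMD instruction.
    static int[] add(int[] a, int[] b) {
        int n = Math.min(a.length, b.length);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] sum = add(new int[] {1, 2, 3}, new int[] {10, 20, 30});
        System.out.println(Arrays.toString(sum));   // [11, 22, 33]
    }
}
```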
Crypto:
Public key function for Elliptic Curve Cryptography (ECC) accelerated by up-to 4X.
CP Assist for Cryptographic Function (CPACF) is hardware co-processor technology on the z chips. The IBM Java Crypto Engine (IBMJCE) in Java 8 leverages CPACF transparently to accelerate a significant set of crypto function.
This includes:
Block cipher algorithms: AES/DES/3DES
Block cipher modes: CBC/CFB/ECB/OFB
Secure hashing: SHA1/SHA2
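Because the acceleration is transparent, application code keeps using the standard Java cryptography API. A minimal sketch, assuming AES in CBC mode and SHA-256 from the lists above; the class and method names are hypothetical, and which provider (and therefore which hardware path) serves the request depends on the JVM in use.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class; only the javax.crypto / java.security calls are standard.
class JceSketch {

    // Encrypts or decrypts with AES/CBC through the default provider.
    static byte[] aesCbc(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Computes a SHA-256 (SHA2 family) digest.
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // All-zero key and IV keep the sketch short; real code must use random values.
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        byte[] plain = "point-of-sale record".getBytes(StandardCharsets.UTF_8);

        byte[] encrypted = aesCbc(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] decrypted = aesCbc(Cipher.DECRYPT_MODE, key, iv, encrypted);

        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
        System.out.println(sha256(plain).length);   // a SHA-256 digest is 32 bytes
    }
}
```

Nothing in this code names CPACF; that is the point of the transparent exploitation described above.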
JMX – Java Management Extensions
A new set of probes (beans) has been added to enable a precise CPU breakdown across
JVM System threads (JIT, GC etc)
JVM application threads
JVM monitoring threads
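The names of the new beans are not given here, so the sketch below uses the standard java.lang.management ThreadMXBean as a stand-in to show how per-thread CPU time can be read over JMX. It reports CPU time per thread name; grouping threads into JIT, GC, application and monitoring categories is what the new probes add and is not reproduced here. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Stand-in using the standard ThreadMXBean, not the IBM-specific probes.
class ThreadCpuSketch {

    // Returns CPU time in nanoseconds keyed by thread name.
    // Threads that share a name are summed.
    static Map<String, Long> cpuNanosByThread() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new TreeMap<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return result;   // the platform does not report per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            long cpu = threads.getThreadCpuTime(id);
            // info is null if the thread has exited; -1 means no measurement.
            if (info != null && cpu >= 0) {
                result.merge(info.getThreadName(), cpu, Long::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        cpuNanosByThread().forEach((name, nanos) ->
                System.out.println(name + ": " + nanos / 1_000_000 + " ms"));
    }
}
```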
The importance of Java on System z
Java is a critically important language for System z. For data serving and transaction serving, which are traditional strengths of the z platform, Java has become foundational. For instance, WebSphere applications, written in Java and running on System z, provide a key advantage through co-location. Co-location results in better response times, greater throughput and reduced system complexity when driving CICS, IMS and DB2 transactions. Beyond this, as clients seek to extend and modernize their business logic, Java has become a language of choice for CICS, IMS and DB2 transactions.
Java is also critical for enabling next-generation workloads in cloud, analytics, mobile and security (CAMS). Cloud and mobile applications can access data and transactions on z/OS using z/OS Connect (http://www-01.ibm.com/support/docview.wss?uid=tss1wp102439) and other WebSphere solutions, which are all inherently Java-based. IBM Operational Decision Manager (ODM) is written in Java. ODM is a platform for managing and executing business rules and business events to help make decisions faster, improve responsiveness, minimize risks and seize opportunities. The IBM MobileFirst Platform Developer Edition, formerly known as IBM Worklight Developer Edition, provides developers with the tools to quickly deploy mobile solutions using Java. System z Java also provides a full set of cryptographic functions to implement secure solutions.
A key strength of Java applications is the ability to benefit immediately from the latest hardware performance improvements through the Just-In-Time (JIT) compiler in the latest Java SDK release.
Highlights for IBM Java 8 on IBM z13
IBM z13 and IBM Java 8 are taking Java performance to new heights! The combined benefits of IBM Java 8 and z13 features, including the Single Instruction Multiple Data (SIMD) vector engine, Simultaneous Multi-Threading (SMT) and improved CP Assist for Cryptographic Function (CPACF), provide up to a 2X improvement in throughput-per-core for security-enabled applications and up to a 50% improvement for other generic applications.
Java 8 exploitation of System z CPACF is the default for System z9 and above on both z/OS and Linux on System z. The Java 8 SIMD exploitation requires z13 and z/OS 2.1 with PTFs. z13 zIIP SMT enablement also requires z/OS 2.1 with PTFs.
Java Store Inventory and Point of Sale Application
The Java Store Inventory and Point of Sale Application is a stand-alone Java application based on the IT infrastructure used by a real-world retail company. The benchmark combines mixed point-of-sale, online purchases and data-mining, and exercises many new language features as well as compression and cryptographic functions.
This figure shows the z/OS aggregate improvement in throughput from IBM Java 8 exploitation of cryptographic functions (CPACF), z13 SIMD, and z13 SMT for zIIPs.
Some highlights of the measurements show
1) a 40% improvement for zEC12 Java 8 versus Java 7 SR4, and
2) an additional 35% improvement for Java 8 z13 SMT versus Java 8 zEC12
Data presumes 3 other CPs are already configured for other workloads
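The two highlights can be combined into a single figure. The sketch below assumes the improvements compound multiplicatively, which is how "an additional 35%" on top of 40% is read here; the class and method names are hypothetical.

```java
// Hypothetical helper; assumes successive improvements multiply.
class CompoundSketch {

    // Each gain is a fraction (0.40 for 40%); the result is the overall
    // throughput factor relative to the starting point.
    static double aggregate(double... gains) {
        double factor = 1.0;
        for (double gain : gains) {
            factor *= 1.0 + gain;
        }
        return factor;
    }

    public static void main(String[] args) {
        // 1.40 * 1.35 = 1.89
        System.out.printf("%.2fx%n", aggregate(0.40, 0.35));   // prints 1.89x
    }
}
```

Under that assumption, Java 8 on z13 with SMT delivers about 1.89 times the throughput of Java 7 SR4 on zEC12 for this benchmark.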
Business Rules Processing
Business Rules Processing applications feature an easy-to-use platform for capturing, automating and governing frequent, repeatable operational business rules. The building blocks of decision services are the business rules that drive and support your decision-making approach and the analytics that ensure that decision-making is accurate.
Business rules processing with Java 8 takes advantage of the z13 SIMD vector instructions and SMT for zIIPs to achieve significant improvements in throughput-per-core.
Some highlights of the measurements show
1) a 60% improvement for z13 no SMT Java 8 versus zEC12 Java 7 SR4, and
2) an additional 31% improvement from z13 SMT zIIPs with Java 8.
Data presumes 3 other CPs are already configured for other workloads
Business Rules Processing
Business Rules Processing applications feature an easy-to-use platform for capturing, automating and governing frequent, repeatable operational business rules. The building blocks of decision services are the business rules that drive and support your decision-making approach and the analytics that ensure that decision-making is accurate.
Business rules processing with Java 8 takes advantage of SMT for IFLs to achieve significant improvements in throughput-per-core.
Some highlights of the measurements show
1) a 40% improvement for z13 no SMT Java 8 versus zEC12 Java 7 SR4, and
2) an additional 42% improvement from z13 SMT IFLs with Java 8.
A 15.8% improvement was observed when switching from Java 7.1 SR3 to Java 8, using a CICS V5.3 GM candidate build comprising Liberty 8.5.5.7 with z/OS Connect 1.2.
This graph shows the improvement in JMP performance for z13 over zEC12 over z196.
The Java component of the test is minimal, so the improvement is really a measure of the JMP region overhead. It represents an expected lower bound for a real-world scenario, where the Java work in the transaction would typically be more substantial.
Extended zAAP support was also added for type 2 connectivity across the z platform.
Java Store Inventory and Point of Sale Application
The Java Store Inventory and Point of Sale Application is a stand-alone Java application based on the IT infrastructure used by a real-world retail company. The benchmark combines mixed point-of-sale, online purchases and data-mining, and exercises many new language features as well as compression and cryptographic functions.
This figure shows the aggregate improvement in throughput from IBM Java 8 exploitation of cryptographic functions (CPACF) and z13 SMT for zIIPs.
Some highlights of the z/VM Linux on z measurements show
1) a 60% improvement for z13 Java 8 versus Java 7 SR4, and
2) an additional 30% improvement for Java 8 z13 SMT versus Java 8 zEC12
Secure Application Serving
z/OS WebSphere Application Server (WAS) 8.5.5.5 with Secure Sockets Layer (SSL) exploits the new Java 8 clear key CPACF support and SIMD vector instructions for string manipulation. This graph shows the improvements from Java 8 CPACF exploitation on both zEC12 and z13. It also shows the Java 8 exploitation of z13 SIMD and other new z13 machine instructions. The bar on the far right shows the improvement gained from enabling SMT-2 on the specialty zIIP processing units. Although the measurements were obtained on z/OS 2.1 on z13, SSL also exploits clear key CPACF by default with Java 8 on System z9 and higher.
Some highlights of the measurements show
a 50% improvement for zEC12 Java 8 versus Java 7 SR4, more than half of which came from cryptographic functions exploiting CPACF System z hardware instructions, and
an additional 75% improvement for Java 8 on z13 SMT versus Java 8 zEC12
Data presumes 1 other CP is already configured for other workloads
Secure Application Serving
z/VM Linux on z WebSphere Application Server (WAS) 8.5.5.5 with Secure Sockets Layer (SSL) exploits the new Java 8 clear key CPACF support. This graph shows the improvements from Java 8 CPACF exploitation on both zEC12 and z13. It also shows the Java 8 exploitation of new z13 machine instructions. The bar on the far right shows the improvement gained from enabling SMT-2 on the z/VM IFL processing units. Although the measurements below were obtained on z13, SSL also exploits clear key CPACF by default with Java 8 on System z9 and higher.
Some highlights of the measurements show
a 40% improvement for zEC12 Java 8 versus Java 7 SR4, more than half of which came from cryptographic functions exploiting CPACF System z hardware instructions, and
an additional 85% improvement for Java 8 on z13 SMT versus Java 8 zEC12
Simultaneous Multi-Threading (SMT) is available on IFLs on z13. The graph shows the relative improvements with SMT-1 and SMT-2 on WAS V8.5.5.7 Liberty Profile with IBM Java 8. The results show up to 1.5x performance improvement with SMT-2 up through 8 IFLs.
Scala-based map-reduce operations on Spark can provide up to 3x better performance on Linux on z under z/VM on a z13 versus Linux on x86 Haswell. The process of co-location can be assumed to be similar to that for other open source SQL databases.
Analytics oper decision mgmt
CPU-Intensive Benchmark
The CPU-Intensive benchmark suite includes a range of applications that exercise core Java functions such as compression, cryptography, scientific floating point computing, serialization, graphics and XML processing. The benchmark is CPU-intensive by design, and hence reflects CPU and cache/memory performance.
Key observations of the CPU-Intensive suite are
1) A 61% composite improvement from IBM Java 8 and z13 was observed
2) Some Java applications have more significant improvements than others
3) The cryptography suite observed a 4x improvement, a reflection of IBM Java 8 leveraging CPACF to accelerate the default Java Cryptography Engine
4) With the exception of MP3 Library, all suites observed a 30% or better improvement from IBM Java 8 and z13
Start at the bottom of the picture:
Thread and Port Libraries are the secret to portability and consistency.
Note the hard API boundaries between Core VM and pluggable components: secret to customization
Application code sits atop portable Java Platform API (validated by compliance suites)
Class Library agnostic: -Xj9 in 1.4.2 using Java 5 JVM.
* Beyond better caches and core technology for Java, we also have lots of innovative new architecture in zEC12. This includes
Hardware Transaction Memory:
IBM’s zEnterprise EC12 is the first general-purpose IBM server to incorporate transactional memory technology, first used commercially to help make the IBM Blue Gene/Q-based “Sequoia” system at Lawrence Livermore National Lab the fastest supercomputer in the world.
In zEC12, IBM adapted this technology to enable software to better support concurrent operations that use a shared set of data such as financial institutions processing transactions against the same set of accounts.
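The account scenario can be sketched in Java as a coarse lock guarding a shared set of balances. With a conventional lock, every transfer is serialized even when two transfers touch different accounts; transactional memory is aimed at letting such non-conflicting critical sections proceed concurrently. The class and method names are hypothetical, and whether a particular JVM applies transactional execution to this code is an assumption. The code is correct either way.

```java
import java.util.Arrays;

// Hypothetical example of many threads updating one shared set of accounts.
class TransferSketch {
    private final long[] balances;

    TransferSketch(int accounts, long openingBalance) {
        balances = new long[accounts];
        Arrays.fill(balances, openingBalance);
    }

    // One lock protects all accounts, so transfers never interleave.
    synchronized void transfer(int from, int to, long amount) {
        balances[from] -= amount;
        balances[to] += amount;
    }

    synchronized long total() {
        long sum = 0;
        for (long b : balances) {
            sum += b;
        }
        return sum;
    }

    // Runs concurrent transfers and returns the final total, which must
    // equal the opening total because money is only moved, never created.
    static long runDemo(int threads, int transfersPerThread) {
        TransferSketch bank = new TransferSketch(8, 1_000L);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int seed = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < transfersPerThread; i++) {
                    bank.transfer((seed + i) % 8, (seed + i + 3) % 8, 1L);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return bank.total();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));   // 8 accounts * 1000 = 8000
    }
}
```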
Runtime Instrumentation:
An innovative new facility used by the JVM to gather execution traces and heavy event data (d-cache miss, branch miss, etc.) at runtime.
The information can be used by the JVM to adapt to the behaviour of the application more efficiently.
2 Gig Pages:
Java application heap sizes have grown consistently. Today, it is not uncommon to see heap sizes greater than 16G, with some customers using heaps on the order of 100G. 2G pages were added to help address this growth in storage.
1Meg Pageable Large Pages Using Flash Express
The JVM will exploit pageable large pages for the JIT code-cache and the object heap.
The use of 1M pageable large pages for the JIT codecache has been observed to enhance the runtime performance of some Java applications.
The use of 1M pageable pages for the object heap provides much of the same runtime performance benefits as non-pageable 1M pages, with the additional benefit of also offering better versatility for managing memory to improve system availability and responsiveness
The 1M pageable large page size is available only with the following minimum requirements:
IBM zEnterprise EC12 with the Flash Express feature (#0402)
z/OS V1.13 with PTFs and the z/OS V1.13 RSM Enablement Offering web deliverable installed. The web deliverable is planned to be available at http://www.ibm.com/systems/z/os/zos/downloads/
Hints/directives
Enable more efficient use of the core resources for branch prediction and data/instruction cache hierarchy by the JVM
Traps
Reduce the overhead incurred by implicit checks (NULL and array-bounds) that are required by the Java language
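The implicit checks in question are visible in ordinary Java code: every array access must behave as if the reference were tested for null and the index tested against the array length. A minimal sketch with hypothetical names:

```java
// Hypothetical example showing the checks the Java language requires.
class ImplicitCheckSketch {

    // The single expression data[index] carries two implicit checks:
    // data must not be null, and index must lie in [0, data.length).
    static String probe(int[] data, int index) {
        try {
            return String.valueOf(data[index]);
        } catch (NullPointerException e) {
            return "null check failed";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "bounds check failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(new int[] {7, 8, 9}, 1));   // 8
        System.out.println(probe(null, 0));                  // null check failed
        System.out.println(probe(new int[3], 5));            // bounds check failed
    }
}
```

In the common case neither check fails, which is why reducing their cost on the successful path matters.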
Statement of Direction
IBM has made a statement of direction regarding exploitation of these features by the JVM
IBM plans for future maintenance roll-ups of IBM 31-bit and 64-bit SDK7 for z/OS Java(TM) Technology Edition, Version 7 (5655-W43 and 5655-W44) (IBM SDK7 for z/OS Java), to provide exploitation of new IBM zEnterprise EC12 features, including: Flash Express and pageable large pages, Transactional Execution Facility, Miscellaneous-Instruction-Extension Facility, and 2 GB pages. In addition, IBM SDK7 for z/OS Java is available for use by IBM middleware products running Java, such as IBM IMS 12 (5635-A03), IBM DB2 10 for z/OS (5605-DB2), and the Liberty profile of IBM WebSphere Application Server for z/OS v8.5 (5655-W65); and is planned for use by a future release of CICS Transaction Server for z/OS.
This is a multi-threaded benchmark that performs business logic for an online transaction processing framework. The benchmark incrementally increases the number of worker threads, and hence the system transaction processing throughput, until the number of worker threads exceeds the number of hardware threads (16-way).
zEC12 provides a 45% improvement in throughput to this benchmark
Java 7 SR3 provides an additional 13% improvement to this benchmark when exploiting the -Xaggressive and -Xlp options.
zEC12 and Java 7 SR3 provide an impressive 60% aggregate improvement to the Multi-threaded benchmark!
Linux on System z does not currently provide support for exploiting Flash Express for 1M paging. Performance improvements provided by Java 7 SR3 were a result of additional exploitation of fixed pages for backing the JIT code-cache when the -Xlp option is used.
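When comparing measurements like these, it helps to confirm which options a JVM was actually launched with. A minimal sketch using the standard RuntimeMXBean; the class and method names are hypothetical, and the option strings are simply the ones named in these notes.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; getInputArguments() is the standard JMX call.
class LaunchOptionSketch {

    // True if any JVM argument starts with the given prefix.
    static boolean hasOption(List<String> jvmArgs, String prefix) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xlp set: " + hasOption(jvmArgs, "-Xlp"));
        System.out.println("-Xaggressive set: " + hasOption(jvmArgs, "-Xaggressive"));
    }
}
```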