Presented at All Things Open 2022
Presented by Jarek Gawor & Harry L. Hoots, III
Title: No Compromise - Better, Stronger, Faster Java in the Cloud
Abstract: Innovation in the cloud-era is about driving efficiencies, agility, and greater opportunities to deploy workloads to the cloud of your choice. Join us as we explore critical challenges faced by organizations in their move to cloud-native architectures along with the innovation in Java standards, including MicroProfile and Jakarta EE, and emerging technologies that help them build and deploy their applications on any cloud, faster and with better performance. Throughout, we showcase Open Liberty, the open-source, cloud-optimized runtime, that is delivering on the promise of this innovation to enable rapid delivery of highly scalable and performant applications, without compromise.
2. Who we are
Harry L. Hoots III
hlhoots@us.ibm.com
Jarek Gawor
jgawor@us.ibm.com
3. Agenda
This presentation outlines the challenges faced by developers and presents innovative solutions to meet the demands of cloud-native Java.
– The Java ecosystem
– Developer experience
– Challenges in the cloud
• Liberty InstantOn
– Demos
4. The Java Ecosystem
Java remains the language of choice for many developers. This is attributed to the vast number of libraries and tools available to developers, along with reliable and performant runtime engines.
– Widely adopted and used from 1996 to the
present in the industry
• My first Java experience was the Beta v2.0 in
college
– Vibrant open source community
• OS Libraries
• OS Tools and Frameworks
– Fast paced innovation in open standards
• Java 9 onward: 6-month release cadence
– Java is here to stay, so optimizing your apps and workloads that run in the cloud is essential.
5. Developer Experience
The key to success in a fast-moving world of cloud computing is enhancing developer experience. This increases productivity while enabling greater innovation.
– Monolith to Microservices in the Cloud
• Requires evolving the Dev Org and Delivery process, as Cloud-Native is much more than replicating your traditional setup in the Cloud.
– Developer experience improves developer
efficiency and fosters innovation.
– What you'll need for Developer efficiency:
• Open cloud-native APIs
• Quality runtime and Development tools
7. Cloud-native with Open Liberty
A lightweight framework for building fast and
efficient cloud-native Java applications. Open
Liberty is a flexible runtime that provides an
end-to-end cloud-native experience
Developer friendly:
• Just enough runtime; composable
• Zero Migration; no app code or configuration
change to move to newest release
• API Rich : Jakarta EE, Java EE, MicroProfile,
Spring Boot
• Cloud ready; production-ready container
images
• Immediate feedback with Open Liberty Dev
mode
[Figure: delivery lifecycle: Plan, Code, Build, Test, Release, Deploy, Run, Monitor, with Security, Continuous Integration, and Continuous Delivery]
8. Open Liberty Development tools
[Figure: Open Liberty development loop: New, Code, Build, Edit, Test, Debug, Commit, Push]
• Dev Mode (local or containers, Hot Code Replace): fast, iterative development
• Testcontainers: unit, integration and in-container testing
• Leading IDEs: editing support for APIs and configuration (server config, Docker/Containerfile, deployment YAMLs)
• start.openliberty.io
What is Dev mode?
• Allows you to develop apps with any
text editor or IDE by providing:
• Hot reload and deployment of
source or config changes.
• On demand testing or auto test
• Debugger support
• Local and Local Container
support
Why Dev mode?
• Automatic compile and deploy to
your running server; easy to iterate
on your changes, and save time!
Liberty Tools:
• IntelliJ, Eclipse, VS Code IDE support
• Maven and Gradle build automation
support.
start.openliberty.io
• Starter application generation
10. Challenges in the era of Cloud-native
The shift to cloud-native has
changed the demands placed
on the underlying JVM
technologies that drive
application frameworks and
runtimes
– Serverless computing
– Cloud economics
– Cloud native JVM
11. Cloud Economics
Cloud providers offer a pay-as-you-go pricing model
based on CPU and memory usage. For businesses built
on the cloud, the ability to scale up to meet
rising demand and scale down when demand
drops is crucial for success.
Requirements:
• Scale-to-zero
• Low memory usage
• Minimal latency – fast startup
[Figure: Cloud Native JVM vs. Legacy JVM]
Doing more with less!
12. Cloud Native Runtimes
The JVM was designed with portability and
flexibility in mind
• This meant that bytecodes were loaded lazily and
optimized while the application was running
• The result was slower startup times but high peak
throughput
• The cloud demands a shift in the performance
characteristics of JVMs and finding the right balance
Solutions:
• Dynamic AOT compilation and class
metadata persistence
• Static compilation – native image
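The lazy loading described above can be observed from plain Java. The following is a minimal sketch, not tied to any particular JVM implementation; the class names `LazyInitDemo` and `Heavy` are made up for illustration. It shows that a class is only initialized on its first active use, which is one reason work piles up during startup and early requests.

```java
// Illustration only: the JVM initializes a class lazily, on first active use.
final class LazyInitDemo {
    // Flipped by Heavy's static initializer so we can observe when it runs.
    static boolean heavyInitialized = false;

    // Hypothetical class standing in for any library or framework class.
    static final class Heavy {
        static {
            LazyInitDemo.heavyInitialized = true;
        }

        static int answer() {
            return 42;
        }
    }

    public static void main(String[] args) {
        System.out.println("initialized before first use: " + heavyInitialized);
        int value = Heavy.answer(); // first active use triggers initialization
        System.out.println("initialized after first use: " + heavyInitialized
                + " (value=" + value + ")");
    }
}
```

Checkpoint/restore and static compilation both aim to move this kind of first-use work out of the deployment startup path.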
[Figure: Native Image vs. JDK compared on startup time, peak performance, total image footprint, fast build time, and usability]
13. Liberty InstantOn
Liberty InstantOn is a holistic
solution that provides fast
startup without compromise.
• Use Checkpoint / Restore
• CRIU Project - https://criu.org
• Linux technology used to freeze a running process state
• Checkpoint state used to restore an application
– Semeru Runtimes
– Checkpoint/Restore with Semeru InstantOn
– Where to checkpoint?
– Container support
14. Semeru Runtimes
IBM Semeru Runtimes is a production-ready JDK
based on the Eclipse OpenJ9 JVM. The OpenJ9
JVM is designed with memory efficiency and startup
performance in mind. OpenJ9 continues to innovate
by introducing new features like CRIU Support
which is the basis of Semeru InstantOn and Liberty
InstantOn.
Key features:
• Broad platform support
• Tuned for the cloud
• Zero usage restrictions
[Figure: layer stack: Containers, Semeru InstantOn, Liberty InstantOn, MicroProfile Application]
15. Semeru InstantOn
OpenJ9 CRIU Support provides a third alternative
between static compilation and traditional JVMs in
the form of checkpoint and restore. It offers fast
startup while keeping the benefits of traditional
JVMs:
- Dynamic loading
- Advanced JIT and GC
- Not closed world
- Run existing applications and libraries unchanged
Workflow:
• Checkpoint run at build phase
• Restore run in deployment
[Figure: startup timelines. Traditional run: JVM startup, application initialization, application ready state. With checkpoint/restore: the build phase produces a checkpoint image and the deployment run initiates a restore from it, giving much faster startup.]
17. Where to checkpoint?
Liberty InstantOn leverages Semeru to provide a
seamless checkpoint/restore solution for
developers. With checkpoint restore, there is a
tradeoff between startup time and the complexity of
the restore.
Checkpoint phases:
• features
• deployment
• applications
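The ordering of the three phases can be sketched as a small enum. This is a hypothetical helper written for illustration, not part of Liberty; the only names taken from the slides are the phase values and the `WLP_CHECKPOINT` variable used later in the container steps.

```java
import java.util.Locale;

final class CheckpointPhaseDemo {
    // Hypothetical model of the checkpoint phases named on the slide.
    enum CheckpointPhase {
        // Declared in the order the server reaches them during startup.
        FEATURES, DEPLOYMENT, APPLICATIONS;

        // Parse a value such as the one passed through WLP_CHECKPOINT.
        static CheckpointPhase parse(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }

        // A later phase captures more startup work in the checkpoint image.
        boolean isLaterThan(CheckpointPhase other) {
            return compareTo(other) > 0;
        }
    }

    public static void main(String[] args) {
        CheckpointPhase phase = CheckpointPhase.parse("applications");
        System.out.println(phase + " is later than FEATURES: "
                + phase.isLaterThan(CheckpointPhase.FEATURES));
    }
}
```

A later phase leaves less work for restore, at the price of more state that has to stay valid across the checkpoint.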
[Figure: Liberty startup timeline (kernel start and feature runtime processing, process application, start application, accept requests) with the Features, Deployment, and Applications checkpoint phases marked. A later checkpoint means a faster restore time and also more complexity. Timings shown: 300 – 2200ms, 100 – 3000ms, 0 – ???ms.]
18. Container support
Liberty InstantOn provides tools for users to easily
create container images with their checkpointed
applications for fast startup in the cloud.
Steps:
• Build application image
Dockerfile:
FROM icr.io/appcafe/open-liberty:beta-checkpoint
COPY --chown=1001:0 server.xml /config/server.xml
COPY --chown=1001:0 demo.war /config/dropins/demo.war
RUN configure.sh
Build:
podman build -t demo-application .
19. Container support
• Run application to checkpoint in container
Run checkpoint:
podman run \
  --name demo-checkpoint-container \
  --privileged \
  --env WLP_CHECKPOINT=applications \
  demo-application
20. Container support
• Commit container to checkpoint image
Commit:
podman commit demo-checkpoint-container demo
21. Container support
• Restore checkpointed image
Restore:
podman run \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=NET_ADMIN \
  --cap-add=SYS_PTRACE \
  demo
23. Semeru: CRIU Support API
org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl
– Java API that allows a user to specify the various
arguments that are used by the native CRIU API
criu_dump() and the associated setup using
routines such as criu_init_opts()
• Allows options to be set at the CRIU level
– Examples : logging, image directory, etc.
• Helpful error codes in case something goes wrong
• Allows Liberty to register a prepare and restore
hook to perform compensations so that the user
application does not need to
• Performs several key compensations within JVM
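Because the CRIUSupport class only exists on OpenJ9-based JDKs, code that wants to stay portable can probe for it reflectively. The sketch below is an illustration under assumptions: the class name comes from the slide, while the static no-argument `isCRIUSupportEnabled` method is assumed here and looked up by name, so the probe degrades gracefully if it is absent. `CriuSupportProbe` itself is a made-up helper.

```java
import java.lang.reflect.Method;

final class CriuSupportProbe {
    // Class name taken from the slide; only present on OpenJ9-based JDKs.
    static final String CRIU_SUPPORT_CLASS = "org.eclipse.openj9.criu.CRIUSupport";

    // Returns a human-readable description of what this JVM offers.
    static String describe() {
        try {
            // Load without initializing, so probing has no side effects.
            Class<?> criuSupport = Class.forName(
                    CRIU_SUPPORT_CLASS, false, CriuSupportProbe.class.getClassLoader());
            // Assumption: a public static no-argument method with this name exists.
            Method enabled = criuSupport.getMethod("isCRIUSupportEnabled");
            return "CRIUSupport present, enabled=" + enabled.invoke(null);
        } catch (ClassNotFoundException e) {
            return "CRIUSupport class not found on this JVM";
        } catch (ReflectiveOperationException | RuntimeException e) {
            return "CRIUSupport present, but the probe failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

With Liberty InstantOn the application does not need a probe like this at all, since Liberty and the JVM handle CRIU on its behalf.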
Liberty and the JVM stack up on top of CRIU at the framework and JDK layers to shield your application from complexities of CRIU
[Figure: layer stack]
• User application: does not need awareness of anything CRIU related!
• Liberty: uses the JVM’s checkpointJVMImpl API to control CRIU
• JVM: uses CRIU native APIs from within the JVM
• CRIU
24. Semeru: CRIU Support - Pre-checkpoint hooks
JVM’s CRIU Support offers hooks for Liberty to
participate in preparing for a checkpoint taking
CRIU behaviors into account so that it works as
expected after a restore
Pre-checkpoint hooks:
• Liberty and JVM run different hook methods to
prepare for taking a checkpoint
• Example: close some file handles
• All the hooks are run on a single thread before
the checkpoint is generated
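The pre-checkpoint idea can be sketched with plain Java. This is a simplified model, not the OpenJ9 or Liberty implementation: `HookRegistry`, `FileHandle`, and their methods are hypothetical names. Hooks are registered ahead of time and then run one after another on a single thread before the (simulated) checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

final class PreCheckpointHooksDemo {
    // Hypothetical stand-in for a resource that should not stay open across a checkpoint.
    static final class FileHandle {
        private boolean open = true;

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    // Hypothetical registry; hooks run sequentially on the calling thread.
    static final class HookRegistry {
        private final List<Runnable> preCheckpointHooks = new ArrayList<>();

        void registerPreCheckpointHook(Runnable hook) {
            preCheckpointHooks.add(hook);
        }

        void prepareForCheckpoint() {
            for (Runnable hook : preCheckpointHooks) {
                hook.run();
            }
        }
    }

    // Returns whether the handle is still open once all hooks have run.
    static boolean handleOpenAfterHooks() {
        FileHandle log = new FileHandle();
        HookRegistry registry = new HookRegistry();
        registry.registerPreCheckpointHook(log::close);
        registry.prepareForCheckpoint();
        return log.isOpen();
    }

    public static void main(String[] args) {
        System.out.println("handle open after hooks: " + handleOpenAfterHooks());
    }
}
```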
[Figure: application threads and the hook thread between "Initiate checkpoint" and "Checkpoint JVM"]
25. Semeru: CRIU Support - Post-restore hooks
JVM’s CRIU Support offers hooks for Liberty to
participate in recovering application state after a
CRIU restore
Post restore Hooks:
• Modify JVM state to compensate for new env
• Example: Re-open file handles that were
closed in pre-checkpoint hook
• Run post-restore hooks in single thread mode
and then resume all application threads
afterwards from where the checkpoint was done
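The ordering described above (hooks first, application threads afterwards) can be modelled with a latch. This is only a simulation of the sequencing using standard `java.util.concurrent` classes; the real JVM pauses and resumes threads internally, and the thread roles and event strings here are invented for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

final class PostRestoreHooksDemo {
    // Runs the simulation and returns the events in the order they happened.
    static List<String> runDemo() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch restoreComplete = new CountDownLatch(1);

        // Application threads stay blocked until the hook thread has finished.
        Thread[] appThreads = new Thread[2];
        for (int i = 0; i < appThreads.length; i++) {
            appThreads[i] = new Thread(() -> {
                try {
                    restoreComplete.await();
                    events.add("application thread resumed");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            appThreads[i].start();
        }

        // Hook thread: run post-restore hooks alone, then release the application threads.
        Thread hookThread = new Thread(() -> {
            events.add("post-restore hook: re-open file handles");
            restoreComplete.countDown();
        });
        hookThread.start();

        try {
            hookThread.join();
            for (Thread t : appThreads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for demo threads", e);
        }
        return events;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

The latch guarantees that the hook's event is recorded before any application thread continues, mirroring the single-thread mode on the slide.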
[Figure: application threads and the hook thread between "Initiate JVM restore" and "Restore JVM"]
26. Semeru: CRIU Support - Compensations
CRIU Support compensates for adverse effects that
may occur when the JVM is restored. This feature
offers users guarantees (security and functional)
about expectations of JVM state upon restore
Compensations that are handled by the JVM:
• Environment variables
• Time sensitive APIs
• Security stack (SSL, SecureRandom)
[Figure: JVM in the checkpoint environment and restored JVMs in the restore environment]
27. Semeru: Compensation for environment variables
Problem: CRIU freezes the env var space as it was
when the process was checkpointed
Consequence: Env vars set in the restore environment
are not picked up by CRIU
Compensation: JVM reads relevant env vars from a
new file specified by the user with
CRIUSupport.registerRestoreEnvFile() on
restore
Easy to set values for env vars not used before restore
For env vars used before restore
• Mutable: pick up new values since app can cope
• Immutable: exception if it changed, since app cannot cope
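The mutable and immutable cases can be sketched as a small merge routine. This is an illustration of the rule stated above, not the JVM's actual logic; `applyRestoreEnv` and its parameters are hypothetical, and the set of immutable names is assumed to be known to the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class RestoreEnvDemo {
    /**
     * Hypothetical merge of the environment frozen at checkpoint with the
     * values read from the restore env file.
     */
    static Map<String, String> applyRestoreEnv(Map<String, String> checkpointEnv,
                                               Map<String, String> restoreFileEnv,
                                               Set<String> immutableNames) {
        Map<String, String> result = new HashMap<>(checkpointEnv);
        for (Map.Entry<String, String> entry : restoreFileEnv.entrySet()) {
            String name = entry.getKey();
            String oldValue = checkpointEnv.get(name);
            boolean changed = oldValue != null && !oldValue.equals(entry.getValue());
            if (changed && immutableNames.contains(name)) {
                // Immutable case: the app already relied on the old value.
                throw new IllegalStateException("Immutable env var changed on restore: " + name);
            }
            // New variables and mutable variables simply take the restore value.
            result.put(name, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> merged = applyRestoreEnv(
                Map.of("ENV1", "value1"),
                Map.of("ENV1", "value2", "ENV2", "value2"),
                Set.of());
        System.out.println(merged);
    }
}
```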
[Figure: checkpoint environment vs. restore environment, three cases]
• ENV1=value1 exported at checkpoint, env var file created with ENV2=value2 at restore: works as expected
• Mutable case: ENV1=value1 exported at checkpoint, env var file created with ENV1=value2 at restore: works as expected
• Immutable case: ENV1=value1 exported at checkpoint, env var file created with ENV1=value2 at restore: unexpected exception changing immutable env var
28. Semeru: Compensation for time APIs
Problem: Time between checkpoint and restore is
counted towards elapsed time
Consequence: Time appears to have passed even
though the process was paused between checkpoint
and restore
Compensation: JVM compensates by subtracting
elapsed time between checkpoint and restore
• System.nanoTime, Object.wait, Unsafe.park have
all been changed to compensate
• System.currentTimeMillis cannot be compensated
similarly, so avoid using it to measure elapsed time
across a checkpoint/restore operation; use
System.nanoTime instead
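The recommended pattern for elapsed time is a short one. The sketch below uses only standard JDK calls; `ElapsedTimeDemo` and `elapsedMillis` are names chosen for the example. It measures a duration with System.nanoTime, which is the clock the slide lists as compensated across checkpoint/restore.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

final class ElapsedTimeDemo {
    // Measure how long a task takes using the monotonic nanoTime clock.
    static long elapsedMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Park for roughly 50 ms as a stand-in for real work.
        long millis = elapsedMillis(() -> LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(50)));
        System.out.println("elapsed: " + millis + " ms");
    }
}
```

System.currentTimeMillis remains the right call for wall-clock timestamps; it is only unsuitable for durations that may span a checkpoint.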
[Figure: checkpoint environment vs. restore environment, three cases]
• startTime = System.nanoTime() before checkpoint, elapsedTime = System.nanoTime() - startTime after restore: works as expected
• Object.wait(1000) started before checkpoint triggers after 1 second, ignoring the paused state: works as expected
• startTime2 = System.currentTimeMillis() before checkpoint, elapsedTime2 = System.currentTimeMillis() - startTime2 after restore: unexpected, elapsed time includes paused time
29. Semeru: Compensation for Random APIs
[Figure: r = new Random(); val = r.nextInt() runs in the checkpoint environment. In each of three restore environments the generator is re-seeded (r.setSeed(s1), r.setSeed(s2), r.setSeed(s3)) before val = r.nextInt(); each case works as expected.]
Problem: Random and SecureRandom seeds are initialized at
creation time
Consequence: Entropy guarantees from Random and
SecureRandom may not be met if restored several times
Compensation: JVM compensates by changing the random seed
on every restore
• Random and SecureRandom objects will be re-seeded
independently in each restore
• Entropy guarantees will be met after restore since each
object will be re-seeded independently on restore and hence
generate random number sequences distinct from each
other
• Future: offer API for application to opt out of compensation
for some objects
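Why re-seeding matters can be shown with java.util.Random alone. This is a simulation, not the JVM's compensation code: two generators created from the same seed stand in for two restores of one checkpoint image, and the explicit setSeed calls stand in for the per-restore re-seeding. The seed values are arbitrary.

```java
import java.util.Random;

final class ReseedDemo {
    // Two restores of the same image without compensation: identical state.
    static boolean sameWithoutReseed(long checkpointSeed) {
        Random restoreA = new Random(checkpointSeed);
        Random restoreB = new Random(checkpointSeed);
        return restoreA.nextInt() == restoreB.nextInt();
    }

    // With compensation, each restore gets its own seed before the next use.
    static boolean sameAfterReseed(long checkpointSeed, long seedA, long seedB) {
        Random restoreA = new Random(checkpointSeed);
        Random restoreB = new Random(checkpointSeed);
        restoreA.setSeed(seedA);
        restoreB.setSeed(seedB);
        return restoreA.nextInt() == restoreB.nextInt();
    }

    public static void main(String[] args) {
        System.out.println("same value without re-seed: " + sameWithoutReseed(42L));
        System.out.println("same value after re-seed: " + sameAfterReseed(42L, 1L, 2L));
    }
}
```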
30. Summary
Adapting and thriving in the cloud
requires evolving your
development, deployment, and
runtime for a cloud-native world
Liberty is a versatile runtime
for these purposes
InstantOn is a key capability
that makes Liberty a great runtime
for serverless use cases
– http://ibm.biz/InstantOn_HowToBlog
– https://openliberty.io
– https://github.com/eclipse-openj9/openj9
– https://blog.openj9.org/2022/09/26/getting-started-with-openj9-criu-support
– https://github.com/ibmruntimes/semeru17-ea-binaries/releases