Jeanie Schwenk, Jireh Semiconductor
Jireh Semiconductor bought the Hillsboro fab and its contents including the manufacturing tools, servers, and software running the fab. The previous company had been winding down for years so server and software upgrades had not been on the radar for some time. In 2011 Jireh became the proud owner of the building, the tools, and its legacy software running on servers that weren’t even made any more.
I started my adventure with Jireh in September 2016 with a charter to modernize the applications that run the manufacturing process and move them into VMs with no impact to manufacturing. That led me down a path of exploration and questions, beginning with “What’s the goal?”
The goal wasn't to move to VMs. It was to become independent of the aging PA-RISC architecture, bring forward the ~230 Java 1.4.2 applications (10-15 years old), and scale the software and hardware to carry the increased load needed to ramp factory output to numbers never seen previously, all without manufacturing downtime.
The solution included a transition from waterfall and siloed development to agile scrum. It also became obvious that simply migrating to VMs was not enough: the linchpin for a successful software transition with the required uptime, flexibility, and scalability was Docker Enterprise.
Join me for this session where I'll talk about my journey modernizing 15+ year old applications and infrastructure at Jireh.
2. Professional Background
• 20+ years experience in software industry
• BS in Computer Science
• MS in Computer Science
• Masters Certificate in Software Engineering
• PMP (PMI.org)
• PSM (scrum.org)
• PSPO (scrum.org)
9. Joining the Jireh Team
What’s your goal?
• Coach the team through the transformation to Agile Scrum
• Modernize the ~230 applications
• Become independent of the PA-RISC servers
• Ramp factory output to numbers never seen before
• No downtime
• No impact to manufacturing
11. The Plan to Meet the Goal
• Transition from Waterfall to Agile Scrum
• Transition from Silos to Teams
• Modernize the Hardware
• Modernize the OS
• Modernize the applications
• No downtime/impact to manufacturing while doing it
12. Where to Start
1. Team assessment / Teach agile and scrum
2. System communication. How does this system work?
3. Build, run, and understand the Java applications
4. Look into options for virtualization of applications
- Fit in our manufacturing environment
- Maturity of the product
- High availability
- Support
- Cost
5. Look into Legacy Operating System
13. People and Processes
People resist
We like what is familiar
We don’t like change
Agile mindset
“It's a competitive world out there. If you aren't looking to improve how you do things, and you're not willing to change, you'll become irrelevant.”
14. The Team’s Response
• “We have to test the MES manually.”
• “We’re too small.”
• “We can’t do agile or scrum, it just won’t fit here.”
• “We’re too interrupt driven.”
• “We’re too specialized.”
• “We can’t add unit testing – it’s too big.”
• “Our PA-RISC servers can’t be upgraded.”
• “No forward path.”
• “We can’t virtualize it. It’s tied to the hardware.”
15. Legacy Servers
[Diagram: nine HP-UX servers]
• 1 HP-UX Server: WorkStream + DB (MES - WorkStream, COMETS, Remotes, Database 9.4)
• 1 HP-UX Server: WSMsrv
• 3 HP-UX Servers: DBS Applications (1 of each)
• 4 HP-UX Servers: Equipment Controllers and Drivers
• Shared memory
16. Getting Legacy Monolith to Containers
• Know your system
• Set a clear goal for short-term and long-term
- pull apart and isolate each component
- be able to migrate or replace any single piece without impact to a running factory
• Create a transition plan
• Risk analysis (deal breakers)
• Proof of concept to obtain buy-in
17. Systems and Communication
[Diagram: systems and the message paths between them]
• HP-UX Server: WorkStream + DB (MES - WorkStream, COMETS, Remotes, Database 9.4)
• WSMsrv (20), with shared memory
• 3 servers, 9 applications: Job Setup (3), Fab View (3), Job Controller (3)
• GUI (n)
• 4 servers, ~240 EQCs (Equipment Controllers), ~240 drivers
• Chain shown: Equipment Controller, Driver, Equipment
• Message path labels: CMMSG, VFEI, SECS II
18. The Plan for Enterprise buy-in
1. Install Docker Engine (CE) on my PC
2. Obtain VM running Ubuntu
3. Get three main applications to run in the VM
4. Install Docker Engine (CE) on the VM
5. Get three main applications to run in a container (test VM)
Stages:
A. Docker Engine (CE) on PC
B. Docker Engine (CE) on VM
C. Docker Enterprise on 10 VMs
19. Eliminate deal breakers
• Message bus across versions and systems
- Get message bus to run on VM
- Get message bus to run in container
• Send/receive messages via command line
• Start applications from command line
• Change operating system
• Migrate Java version as far forward as possible
• Separate applications
• Accessible application logs
20. Transition to Docker Container
• Create Docker image in CE – identical to VM
• Run in container in Docker Engine (CE)
• Get buy-in for Docker Enterprise
• Determine minimal set for container
- Library dependencies
- $PATH and environment vars
- Start application from the command line
• Move to smaller footprint OS - Debian
• Run in Production swarm in Docker Enterprise***
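One way to work out the "$PATH and environment vars" part of the minimal set is to start a process with an explicit environment and nothing inherited, then add variables until it starts cleanly. The sketch below is an illustration only: `run_with_env`, `APP_HOME`, and the probe command are hypothetical names, and the slides do not say which variables the real applications needed.

```python
import subprocess
import sys


def run_with_env(command, env):
    """Run a command with exactly the given environment, nothing inherited."""
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Hypothetical minimal environment; the real variable names are not in the slides.
minimal_env = {"PATH": "/usr/bin:/bin", "APP_HOME": "/opt/app"}

# Stand-in for "start application from the command line": a child process
# that reports one variable it was given and one it was not.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('APP_HOME'), 'HOME' in os.environ)",
]

result = run_with_env(probe, minimal_env)
print(result.stdout.strip())
```

Once the list of required variables is known, the same names can be carried into the image definition rather than relying on whatever the build host happened to export.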
21. Moments of Failure
• Running Java 1.4.2 under a different OS
• Finding the right OS
• Finding the right JDK
• Can’t run from command line
• Non-existent runtime flags
• Library dependencies
- Libraries not available
- Incompatible library errors
22. Moments of Failure
• Environment variables
• Startup errors in the logfile
• Wrong files in repo
• Files in production not in repo
• Unmet dependencies/broken packages
• “Undefined” dependencies
• ELFCLASS32/64 architecture mismatch errors
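An ELFCLASS32/64 mismatch means a 32-bit process tried to load a 64-bit native library, or the reverse. The class is recorded in the fifth byte of every ELF file (1 for 32-bit, 2 for 64-bit), so it can be checked before the JVM ever loads the library. A minimal sketch, with hypothetical function names:

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(header: bytes) -> str:
    """Return the ELF class name given the first bytes of a file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # Byte 4 of the identification block (EI_CLASS) holds the class.
    return ELF_CLASSES.get(header[4], "unknown")


def elf_class_of_file(path: str) -> str:
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as handle:
        return elf_class(handle.read(5))


# Synthetic headers, so the example runs without any real library present.
print(elf_class(b"\x7fELF\x01"))
print(elf_class(b"\x7fELF\x02"))
```

Running `elf_class_of_file` over both the JVM binary and each native library shows at a glance which ones disagree.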
23. Moments of Failure
• Test with a send/recv program, not just the command line
- Big-endian vs. little-endian errors
• Unable to install message bus in container as part of Dockerfile - learn expect
• Disabled stack guard errors (execstack)
• File too big to get out of repo
• Core dumps (IPv6 issue)
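The big-endian/little-endian errors are what one would expect when binary messages cross from PA-RISC (big-endian) to x86 (little-endian). The sketch below shows the failure mode with a hypothetical 4-byte length field; the message bus's actual wire format is not described in the slides.

```python
import struct


def pack_length(length: int) -> bytes:
    """Encode a 4-byte unsigned length the way a big-endian host would."""
    return struct.pack(">I", length)


def read_length(raw: bytes, byte_order: str) -> int:
    """Decode a 4-byte unsigned length; byte_order is '>' (big) or '<' (little)."""
    return struct.unpack(byte_order + "I", raw)[0]


raw = pack_length(1)

# Decoded with the sender's byte order, the value survives.
print(read_length(raw, ">"))  # 1

# Decoded with the native order of a little-endian host, it does not.
print(read_length(raw, "<"))  # 16777216
```

A dedicated send/recv test program catches this kind of error because it decodes real payloads, which a command-line smoke test may never do.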
24. Example Breadcrumb
# A fatal error has been detected by the Java Runtime Environment:
# Internal Error (os_linux_x86.cpp:291), pid=7026, tid=0xf66a1b40
# fatal error: An irrecoverable SI_KERNEL SIGSEGV has occurred due to unstable signal handling in this distribution.
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) Server VM (25.181-b13 mixed mode linux-x86 )
#
# An error report file with more information is saved as: … hs_err_pid7026.log
#
# The crash happened outside the Java Virtual Machine in native code.
Aborted (core dumped)
…
An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : 11 occurred at PC=0xF768C3A1
Function=inet_pton+0x161
Library=/lib32/libc.so.6
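The useful part of a breadcrumb like this is the native crash site at the end. `inet_pton` is the C library routine that converts a text network address (IPv4 or IPv6) to binary form, which points toward networking rather than application logic. A small sketch for pulling those two fields out of a log in the format shown above; `native_crash_site` is a hypothetical helper, not part of any JVM tooling.

```python
import re


def native_crash_site(log_text: str) -> dict:
    """Extract the Function= and Library= fields from a crash breadcrumb."""
    fields = {}
    for key in ("Function", "Library"):
        match = re.search(rf"^{key}=(\S+)", log_text, flags=re.MULTILINE)
        if match:
            fields[key.lower()] = match.group(1)
    return fields


sample = (
    "Unexpected Signal : 11 occurred at PC=0xF768C3A1\n"
    "Function=inet_pton+0x161\n"
    "Library=/lib32/libc.so.6\n"
)

print(native_crash_site(sample))
```

Collecting these fields across many crash logs makes it easier to see whether the failures share one native function or several.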
27. Legacy Servers: Where we started
[Diagram: nine HP-UX servers]
• 1 HP-UX Server: WorkStream + DB (MES - WorkStream, COMETS, Remotes, Database 9.4)
• 1 HP-UX Server: WSMsrv
• 3 HP-UX Servers: DBS Applications (1 of each)
• 4 HP-UX Servers: Equipment Controllers and Drivers
• Shared memory
28. Virtualization: Where we’re going
[Diagram: target layout]
• HP-UX Server: WorkStream, COMETS, Remotes, Database 12.10
• HP-UX Server: WSMsrv
• VM - RedHat
• Equipment Controllers, Drivers, and DBS Applications (several boxes of each)
29. Future - MES Server
3 options for future
• Refurbished rp**** servers
• Refurbished rx**** servers
• HP9000 PA-RISC emulation
We’ll be seeing a demo from them in May
30. Architecture: Docker Enterprise
[Diagram: node layout]
• CE Node: Nginx, LB for UCP
• CE Node: Nginx, LB for DTR
• Management Plane: 3 Manager nodes
• Registry: 3 DTR nodes
• 4 Worker nodes
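The slide does not say why there are three manager nodes, but an odd count fits how swarm managers work: they keep cluster state with Raft, which needs a majority of managers reachable. The arithmetic can be sketched as follows (function names are illustrative):

```python
def quorum(managers: int) -> int:
    """Smallest number of managers that forms a majority."""
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be lost while a majority remains."""
    return (managers - 1) // 2


for count in (1, 3, 4, 5):
    print(count, quorum(count), tolerated_failures(count))
```

With three managers the cluster keeps working after losing one. A fourth manager raises the quorum to three without tolerating any additional failure, which is why odd counts are the usual choice.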
31. Is Containerization For You?
• Maybe a container isn’t your first stop
• What are your risks?
• What’s your goal?
• Does the legacy application have a future?
• Do you have a plan B?
• Do you have the time and resources to devote?
• Does anyone know Docker?
• Start new services?
• New development in containers and keep the legacy around until replacement is ready?
• Do you know your application?
• How will you manage the moments of failure?