Accelrys Enterprise Platform scales from laptops to grids, but how does it do that, and how can it scale to meet the demands of an Enterprise Application? Enterprise Architects and Developers will get a detailed view of how Pipeline Pilot handles job management, job queuing, job pooling, security, memory management, process isolation and more. This session provides background information that will support later presentations in the platform and developers tracks.
2. The information on the roadmap and future software development efforts is
intended to outline general product direction and should not be relied on in making
a purchasing decision.
3. Agenda
• Core services and security
• Job launching
• Process management
• Latency and scalability data
• Clustering methods
4. Accelrys Enterprise Platform
[Diagram: platform stack, top to bottom]
• Applications: Web Applications, Thick Client Applications, SharePoint & Office Applications, 3rd Party Applications
• Integration: Web Application Framework, Client Integration APIs, MS Office Integration, SOA Integration
• Accelrys Enterprise Platform: Scientific and Generic Services, Data Management Services
• Capabilities and systems shown: Work Request, Reports, Experiment Workflow, Instrument Interfaces, Notebook, ORACLE, Docs, Isentris, Vault, Experiment Design, Scheduling, Virtual Chemistry, Data Mining & Analytics, LIMS, LEA, Other, Modeling & Simulation, Biology, Registration, Imaging, …
5. Accelrys Enterprise Platform Integration
Client Integration: build clients that connect to Pipeline Pilot and run protocol services.
• Clients: Professional Client, Run Protocol Command Line Client, Web Port, Web Browser JavaScript Client, .NET Client, Java Client, SOAP Client
• SDKs and APIs: JavaScript Client SDK, .NET Client SDK, Java Client SDK, REST API, Web Services API
Pipeline Pilot Enterprise Server
• Web Apps, Web Services API, Admin Portal, Help Portal
• Grid System Integration (optional)
• Protocol Runtime Environment (scisvr)
Server Integration: extend pipelines with new components that integrate your code, data and services.
• Components: Run Program, VB Script (On Client), VB Script (On Server), Java, Perl, Python, .NET, SOAP & HTTP, Telnet / SSH / SCP / FTP, ODBC / JDBC
• Integration targets: Cmd Line, VB Script, Java Classes, Perl Scripts, .NET Classes, REST Service, SOAP Service, DBs
6. Pipeline Server Architecture
[Diagram: server components]
• Apache HTTP Server with the Authentication and Authorization Security Module, connected to the Corporate Directory
• Apache modules: Mod_balancer, Mod_proxy
• Services: File Access Service, Locator Service, XMLDB Service, Protocol Runner Service, Logging Service, Web Services
• CGI and portals: runjob CGI, WSDL CGI, Admin Portal, Help Portal
• Data Flow Services (multiplicity labels 1 .. 1 and 1 .. N), connected to DB’s
• Apache Tomcat: Query Service, Scheduler, SOA, CMS
• File System: XMLDB, User Data, Job Data
7. Launching Asynchronous (polling) Jobs
[Diagram: asynchronous job flow through the Apache HTTP Server, its Authentication and Authorization Security Module, and the Runner Service]
1. Create job directory with compressed protocol.xml and uploaded input files
2. Need to fork scisvr?
3. Monitor job existence via lck file and process status
4. Poll job status via sts file
5. Read result file from disk and return to client through Apache
Diagram annotations: write lck file to lck directory; scisvr(.exe), hosting the JVM and CLR, writes sts file and results files to the job directory (Job Folder)
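The polling side of this flow can be sketched in a few lines. This is a minimal illustration, not Pipeline Pilot code: the file names `job.lck` and `job.sts`, the status strings, and the function `poll_job` are all assumptions made for the example. Only the idea of a lock file that signals a live job and a status file that is polled comes from the slide.

```python
import tempfile
import time
from pathlib import Path


def poll_job(job_dir: Path, interval: float = 0.05, timeout: float = 5.0) -> str:
    """Poll a job folder until its status file reports a final state.

    Hypothetical layout: 'job.lck' exists while the worker is alive and
    'job.sts' holds the latest status string.
    """
    deadline = time.monotonic() + timeout
    sts, lck = job_dir / "job.sts", job_dir / "job.lck"
    while time.monotonic() < deadline:
        status = sts.read_text().strip() if sts.exists() else "pending"
        if status in ("finished", "failed"):
            return status
        if status == "running" and not lck.exists():
            # Status says running but the lock is gone: treat the worker as dead.
            return "failed"
        time.sleep(interval)
    return "timeout"


# Simulate a worker that has already completed its job.
job_dir = Path(tempfile.mkdtemp())
(job_dir / "job.sts").write_text("finished")
final_status = poll_job(job_dir)
print(final_status)
```

Checking the lock file as well as the status file lets the poller distinguish a slow job from a crashed one, which a status file alone cannot do.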
8. Launching Synchronous (blocking) Jobs
[Diagram: synchronous job flow through the Apache HTTP Server, its Authentication and Authorization Security Module, the XMLDB, and the Runner Service]
1. Get protocol XML from Protocol DB (XMLDB)
2. Need to fork scisvr?
3. Connect to scisvr pipe. Send protocol XML and request parameters on pipe
4. Send notification to Apache via pipe when done
5. Stream results back to Apache
Diagram annotations: write lck file to lck directory; scisvr(.exe) hosting the JVM and CLR; XMLDB
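The blocking pattern, where the caller sends a request over a pipe and waits for the reply, can be illustrated with standard-library pipes. This is a sketch under stated assumptions: the `WORKER` one-liner stands in for scisvr, `run_blocking` is a hypothetical name, and for simplicity the example starts a fresh worker per call rather than connecting to an already running one as the slide describes.

```python
import subprocess
import sys

# Stand-in for a worker: reads the request from its pipe, replies when done.
WORKER = "import sys; data = sys.stdin.read(); sys.stdout.write('done:' + str(len(data)))"


def run_blocking(protocol_xml: str, timeout_s: float = 30.0) -> str:
    """Send the request over a pipe and block until the worker answers."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=protocol_xml,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
    return proc.stdout


reply = run_blocking("<protocol/>")
print(reply)
```

The caller is held for the full duration of the job, which is why a bounded timeout matters for this launch mode.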
9. Job Settings
• Set Max Running Jobs to 2x available cores
• Set Blocking Job timeout to 10-30 seconds; longer timeouts risk client starvation
• Maximum Number of Parallel Processing is a guideline, not a strict maximum. Set to 2x cores
• Set Maximum Job Daemons Per Pool to 2x available cores
• Job Readiness Refresh Rate assists with multipurpose servers which can become “cold”
• Read application-specific recommendations for more details
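As a worked example of the 2x-cores rule of thumb, the following sketch derives starting values from the core count of the current machine. The dictionary keys and the function name are illustrative only and do not correspond to actual configuration keys.

```python
import os


def suggested_job_settings(cores: int) -> dict:
    """Apply the 2x-cores rule of thumb from the slide."""
    return {
        "max_running_jobs": 2 * cores,
        "max_parallel_processing": 2 * cores,  # guideline, not a hard cap
        "max_job_daemons_per_pool": 2 * cores,
        "blocking_job_timeout_s": (10, 30),  # recommended range in seconds
    }


cores = os.cpu_count() or 1
print(suggested_job_settings(cores))
```

On an 8-core server this gives 16 for each of the three core-based settings.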
10. Process Management - Pools
– Identified by the __poolid=<name> parameter on the request
– Needs to be sent from the client, not from the saved protocol
– Latency of 20-200 ms
– Creates a pool of scisvr.exe processes dedicated to that pool
– Enables caching of expensive resources:
• JVM
• CLR
• Database connections
• Protocol DB Shortcuts and References
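Because the pool is selected by a request parameter, a client only needs to add `__poolid` to the query string. The sketch below shows one way to do that with the standard library; the server URL, the `protocol` parameter, the pool name, and the helper `with_pool` are hypothetical and used only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_pool(url: str, pool_name: str) -> str:
    """Return the URL with __poolid=<pool_name> set in its query string."""
    parts = urlsplit(url)
    # Drop any existing __poolid so the parameter appears exactly once.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != "__poolid"
    ]
    query.append(("__poolid", pool_name))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical endpoint and protocol parameter, for illustration only.
pooled_url = with_pool("https://server.example.com/run?protocol=MyProtocol", "chem_search")
print(pooled_url)
# → https://server.example.com/run?protocol=MyProtocol&__poolid=chem_search
```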
11. Process Management – Pools w/ Impersonation
– Impersonation creates a small pool for each user for each pool
– Lower the pool sizes to accommodate this behavior
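A quick calculation shows why pool sizes need to come down under impersonation. It assumes, for illustration, that every per-user pool may grow to the same maximum size; the function name and the example numbers are not from the slides.

```python
def worst_case_processes(pools: int, users: int, max_servers_per_pool: int) -> int:
    """Upper bound on scisvr processes when every user gets a copy of every pool."""
    return pools * users * max_servers_per_pool


# Example: 3 named pools and 20 impersonated users.
print(worst_case_processes(3, 20, 16))  # → 960 with 16 servers per pool
print(worst_case_processes(3, 20, 2))   # → 120 with the pool size reduced to 2
```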
12. Scisvr Pool Settings – Config files
Setting                        Default        Description
Start Servers                  0              Number of initial processes in this pool, created when Apache starts
Min Spare Servers              1              Minimum number of idle processes to keep alive
Max Spare Servers              1              Maximum number of “available” processes to keep alive
Max Spare Servers Trim Time    0              Time to wait (seconds) before pruning “available” servers exceeding the Max Spare Servers value
Max Servers                    16             Total number of servers to allow for this pool
Max Queue Depth                32             Maximum number of jobs to queue before new jobs are rejected; 0 or -1 means infinite
Max Requests Per Server        -1             Maximum number of requests a single server handles before exiting; -1 means infinite
Time to Live                   300            Idle timeout (seconds) for a pooled server
Warm-up Protocol               (none listed)  Path to an initial protocol to run
Memory Threshold               80             Maximum % of physical memory used by all processes before pruning
Individual Usage Threshold     15             Maximum % of physical memory used by one process before pruning
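To make the interaction between these settings concrete, here is a small model of request admission and memory-based pruning. It is a sketch of one plausible reading of the descriptions above, not the server's actual logic; the class, field, and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PoolSettings:
    # Defaults copied from the table above.
    max_servers: int = 16
    max_queue_depth: int = 32             # 0 or -1 means unlimited
    memory_threshold_pct: float = 80      # all processes combined
    individual_threshold_pct: float = 15  # a single process


def admit(s: PoolSettings, busy: int, queued: int) -> str:
    """Decide what happens to a new request: run, queue, or reject."""
    if busy < s.max_servers:
        return "run"
    if s.max_queue_depth in (0, -1) or queued < s.max_queue_depth:
        return "queue"
    return "reject"


def should_prune(s: PoolSettings, total_pct: float, proc_pct: float) -> bool:
    """True when either memory threshold is exceeded."""
    return total_pct > s.memory_threshold_pct or proc_pct > s.individual_threshold_pct


settings = PoolSettings()
print(admit(settings, busy=16, queued=32))
print(should_prune(settings, total_pct=50, proc_pct=20))
```

With the defaults, a request arriving when 16 servers are busy and 32 jobs are already queued is rejected, and a single process using 20% of physical memory triggers pruning even though total usage is well under 80%.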
13. Web Job Launch Scalability Improvements
Framework overhead on blocking, pooled jobs on 8 core Windows 2008 R2 (64 bit)
14. Web Job Launch Scalability Improvements Linux
For simple chemistry fetch of 10 records to JSON on 8 core RedHat Linux ES5 (64 bit)
Identical tests on Windows 2008 R2 on identical hardware
15. Performance Tuning Document
• Guide available on Accelrys Forums
– http://doc.accelrys.com/library/PipelinePilot/doc/performance_tuning.pdf
16. Public Cluster
[Diagram labels: Users, Clients, Login, Execute, Primary Pipeline Pilot Server, Secondary Pipeline Pilot Servers, NFS]
17. Private Cluster
[Diagram labels: Users, Clients, Login, Execute, Primary Pipeline Pilot Server, Secondary Pipeline Pilot Servers, NFS]
18. Grid (SGE, PBS, LSF, other)
[Diagram labels: Users, Clients, Login, Execute, Primary Pipeline Pilot server and grid submission server, Grid software and SOAP, NFS]
• Grid Nodes: do not require Apache HTTPD
19. IP-based Load Balancing 1
[Diagram labels: Users, Clients, Login, Execute, Reverse Proxy or IP-based Load Balancer, Symmetrical Pipeline Pilot Server Nodes]
• Shared Storage: XMLDB, File share, Job Folders, User Folders
20. Summary
• What we learned
– Apache service and launching system
– Job launching and settings
– Process management for pooling
– How pooling has improved latency (snappiness)
– Clustering and grids
21. The information on the roadmap and future software development efforts is
intended to outline general product direction and should not be relied on in making
a purchasing decision.
For more information on the Accelrys Tech Summits and other IT & Developer
information, please visit:
https://community.accelrys.com/groups/it-dev