This talk gives an overview of the new features and architectural options in Alfresco 4 around scalability, performance and benchmarking. With a solution-oriented focus on the most common large-scale Alfresco use cases, we show the scalability and consistency implications of, among others, Apache Solr integration, optional in-transaction indexing, redesigned permission checking and clustering of filesystem interfaces (e.g. CIFS). Finally, we introduce the objectives of the ongoing Alfresco 4 benchmark and the practical results expected from it.
1. Alfresco Scalability and Performance
How Alfresco 4.x will solve all your headaches around scalable ECM solutions
2. Agenda
ECM high end scenarios
• ECM platform use cases
• What is ECM scalability?
Alfresco ECM scalability and performance
• What we (should) already know
• Alfresco 3.4 improvements
• Alfresco 4.0 “hardening” features
• Apache Solr
• Clustered filesystems
• Post 4.0 scalability opportunities
Alfresco Platform benchmarks
• Benchmarks rationales and progress
• Alfresco Benchmark Tools
3. Alfresco ECM Solutions
What do you expect from an ECM system?
• ECM semantics grow alongside the ‘content explosion’
• The Classic Alfresco ECM Trio
• System of Record (Massive Injection & Retrieval Content Platform)
• System of Engagement (Enterprise Collaboration Platform)
• Web Content Publishing (Multichannel XML/HTML delivery)
• The present (and future)
• Social Content Management
• Records Management & Archival
• Business Intelligence (Content & Workflow OLAP)
• Each solution has specific requirements around
1. Scalability
2. Information retrieval Isolation
4. ECM Scalability defined
What does scalability mean for ECM?
• Performance
• “Acceptable” to “No” search degradation as users/content grow
• “Acceptable” to “No” browsing degradation as users/content grow
• Geo-independence of the system (if global)
• Availability
• Offer service continuity upon
• Disasters
• Functional / corrective maintenance
• No single point of failure
• Load distribution
• Parallelization
• Optimization of CPU usage ($$$) per transaction
Scalability requirements are solution dependent …
So pick your battles!
5. Alfresco is designed to scale
At any level of the infrastructure
• Repository
• Cluster nodes can be added dynamically
• Ehcache supports distributed cache replication
• User interface
• Alfresco Share is stateless and HTTP based
• Share can scale out independently from the repository
• Database
• Alfresco supports Master / Slave DB replication
• Solutions a la Oracle RAC are also supported
• Storage
• Content Addressable Storage (XAM) support – Enterprise Only
• Content Store Selector – Enterprise Only
Check out the Scale your Alfresco Solutions paper!
http://support.alfresco.com/ics/support/DLRedirect.asp?fileID=18158
6. Full-Blown Multi-layer scalable architecture
[Architecture diagram: a load balancer in front of four Share instances running on application servers; a second load balancer in front of four Alfresco repository nodes, each with its own EHCache and index; a master/slave database pair with failover (database clustering); content stores 1 and 2 behind the content store selector.]
7. Levels of ECM Information Isolation
Think about it as “transaction isolation” for Databases
1. SERIALIZABLE
• System of records
2. REPEATABLE_READS
3. READ_COMMITTED
4. READ_UNCOMMITTED
8. Where does your solution stand?
Identify where your solution is in the graph, then trade off between consistency & performance!
There is NO one size fits all, so pick your battles!
9. 10 tips you MUST know to scale Alfresco
1. Disable quotas when unneeded
system.usages.enabled=false
2. Disable audit
audit.enabled=false
3. At ~1M documents and above, fine-tune the number of index segments
http://wiki.alfresco.com/index.php?title=Index_Merging_Performance
4. Tune DB pool size (default for evaluation mode)
db.pool.max=225
5. And don’t forget to tune DB accordingly
Ask your DBA to allow enough incoming connections (especially in high concurrency)
6. Use multi-operation batches for your transactions
Transaction setup and teardown are expensive!
7. For bulk injection you can disable in-transaction indexing
index.tracking.disableInTransactionIndexing=true
8. Tune permission checking behavior
system.acl.maxPermissionChecks and system.acl.maxPermissionCheckTimeMillis
9. Read the “Scale your Alfresco Solutions” paper!
10. Call me (or any other Alfresco Consultant)
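Tip 6 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for any transactional store; it is not Alfresco API, and the `docs` table, the function name and the batch size are assumptions made for illustration. The point is only that one commit per batch amortizes transaction setup and teardown over many operations.

```python
import sqlite3


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_batched(conn, names, batch_size=100):
    """Insert all names, committing once per batch instead of once per row.

    Returns the number of transactions used.
    """
    transactions = 0
    for batch in chunks(names, batch_size):
        # The connection context manager commits on success and rolls back on error.
        with conn:
            conn.executemany(
                "INSERT INTO docs(name) VALUES (?)",
                [(name,) for name in batch],
            )
        transactions += 1
    return transactions


# Hypothetical usage: 1000 rows in 10 transactions rather than 1000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs(name TEXT)")
used = insert_batched(conn, [f"doc-{i}" for i in range(1000)], batch_size=100)
print(used, "transactions")
```

The right batch size is workload dependent: larger batches amortize more overhead but hold locks longer and lose more work on a rollback.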
10. 5 scalability gotchas prior to Alfresco 3.4
1. In process (or in transaction) content indexing
Alfresco spending transaction time to update Lucene index
2. Lucene index replicated & tracked per cluster node
Additional DB and Alfresco load for IndexTransactionTracker
3. Query ‘bottlenecks’ during index maintenance
High (blocking) CPU spike during index merging
4. Not all interfaces available in High Availability mode
File system interfaces (e.g. CIFS) could not be clustered
5. Time Limited permission checking
Non deterministic search results on large user / content bases
11. Alfresco 3.4 scalability improvements
• Hibernate removal
• Improved / optimized DB querying with iBATIS
• Faster commit time
• Permission checking improvements
• Ongoing work in all 3.x versions
• Content Replication
• Geographic master/slave distribution of content
• Can be used also for archival, WCM deployment, etc.
• Site performance project (Enterprise 3.4.6)
• High Share concurrency scenarios
• Tested and usable up to 60,000 Share sites!
12. Alfresco 4.0 radical answers to scalability
• Introduction of a logically separate indexing tier
• Apache Solr Integration
• NOTE: Eventual vs transactional index consistency
• Clustering for File System interfaces (e.g. CIFS)
• ContentDiskDriver2
• Scenario-specific, session-linked state distributed using Hazelcast
• Deterministic permission checking
• Refactored DB canned queries allow in query checking
• Solr filters allow in search permission checking
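Why limited permission checking gives unstable results, and why in-query checking does not, can be shown with a toy model. This is not Alfresco's implementation: `post_filter`, `in_query_filter` and `can_read` are hypothetical names, and `max_checks` only plays a role similar to the check budget set by system.acl.maxPermissionChecks.

```python
def post_filter(candidates, can_read, max_checks):
    """Budget-limited post-filtering: stop evaluating permissions after max_checks nodes."""
    visible = []
    for checked, node in enumerate(candidates):
        if checked >= max_checks:
            break  # budget exhausted: the remaining candidates are never evaluated
        if can_read(node):
            visible.append(node)
    return visible


def in_query_filter(candidates, can_read):
    """Permission check applied as part of the query: every candidate is evaluated."""
    return [node for node in candidates if can_read(node)]


nodes = list(range(10))


def can_read(node):
    # In this toy setup the user may only read nodes 5..9.
    return node >= 5


print(post_filter(nodes, can_read, max_checks=5))        # [] : budget spent on unreadable nodes
print(post_filter(nodes[::-1], can_read, max_checks=5))  # [9, 8, 7, 6, 5]
print(in_query_filter(nodes, can_read))                  # [5, 6, 7, 8, 9]
```

With a budget, the answer depends on the order in which candidates happen to arrive; with in-query filtering it depends only on the data and the permissions.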
13. The Apache Solr subsystem
Rationale
• Removes Lucene load from Alfresco Repository
• Externalize and centralize a logical indexing tier
• Avoid per cluster node index tracking
Architectural features
• Pull (vs. Push) indexing
• Solr polls Alfresco periodically for index updates
• Default interval of 15s is configurable
• Can be scaled out
• Multi Solr architectural options
• NOTE: sharding/clustering not available yet
• Dedicated or shared Enterprise search engine
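The pull model can be sketched as follows. This is a simplified simulation, not the actual Solr tracker: the `Repository` and `IndexTracker` classes and their methods are hypothetical. It assumes the index tier remembers the last transaction it has seen and, on each poll, asks the repository only for newer ones.

```python
class Repository:
    """Stand-in for the repository: an append-only log of committed transactions."""

    def __init__(self):
        self._txns = []  # list of (txn_id, node_name)

    def commit(self, node_name):
        """Commit without touching any index; returns the new transaction id."""
        txn_id = len(self._txns) + 1
        self._txns.append((txn_id, node_name))
        return txn_id

    def transactions_after(self, txn_id):
        """Return all transactions committed after the given id."""
        return [txn for txn in self._txns if txn[0] > txn_id]


class IndexTracker:
    """Stand-in for the index tier: pulls new transactions when polled."""

    def __init__(self, repo):
        self.repo = repo
        self.last_txn_id = 0
        self.index = set()

    def poll(self):
        """Index everything committed since the last poll; returns the count."""
        new_txns = self.repo.transactions_after(self.last_txn_id)
        for txn_id, node_name in new_txns:
            self.index.add(node_name)
            self.last_txn_id = txn_id
        return len(new_txns)


repo = Repository()
tracker = IndexTracker(repo)
repo.commit("report.pdf")
print("report.pdf" in tracker.index)  # False: committed but not yet indexed
tracker.poll()                        # would run on a timer, e.g. every 15s
print("report.pdf" in tracker.index)  # True
```

Because the repository never pushes to the index, commits do not wait for indexing, and the window between the two print calls is the source of the eventual consistency noted on the previous slide.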
16. Solr Implications
Do I have to migrate?
• No, you can still use Lucene indexing subsystem
• Solr can be configured to run and index in parallel
Key Features
• No in-txn indexing
• One core per Alfresco store (e.g. WorkspaceStore, ArchiveStore)
• Cores can be configured separately
• In query permission checking with Solr filters
• Deterministic
Alfresco impacts
• Wherever “transactional consistency” was needed, index queries have been replaced with DB “canned queries”
• Authentication, Doclib, Bootstrap, Check-in/out amongst others
• Implementers should be aware of eventual index consistency
17. Transactional vs. Eventual index update
Transactional (prior to 4.0)
• Indexes updated within the database txn
Pros
• Indexes consistent with DB at any time
• Applications can work independently with DB or indexes
Cons
• Resource intensive
• Slower commit time
• Index locking and contention with concurrent user growth
Eventual (4.0)
• Index server periodically polls repository (default = 15s)
Pros
• Faster commit time (50%)
• Separately scalable tier
• Configurable index delay
• Independent from #(concurrent users)
Cons
• Dirty or non-repeatable reads are possible
• Cannot be used where transactional consistency is needed
• E.g. RM, AVM, custom apps
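For custom code that writes content and then searches for it, one way to cope with the index delay is to poll until the content becomes visible or a timeout expires. The helper below is a generic sketch, not an Alfresco API: `wait_until_indexed` and the caller-supplied `is_indexed` callable are hypothetical. Where true transactional consistency is required, a database-backed query remains the safer choice.

```python
import time


def wait_until_indexed(is_indexed, timeout_s=30.0, interval_s=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll is_indexed() until it returns True or timeout_s elapses.

    Returns True if the content became visible in time, False otherwise.
    clock and sleep are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_indexed():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass a function that runs the search and reports whether the expected node is in the results, with a timeout comfortably above the configured polling interval.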
18. Clustered file systems
What’s new?
• ContentDiskDriver2: brand new implementation
• JLAN-Alfresco interface binds state to sessions
• No clustering required
• JLAN Clustering
• Hazelcast provides distributed locking
http://www.hazelcast.com/
Hazelcast configuration
filesystem.cluster.enabled=[true,false]
Enables or disables the filesystem cluster.
filesystem.cluster.configFile
Location of Hazelcast configuration file
http://wiki.alfresco.com/wiki/Configuring_Hazelcast_for_JLAN_clustering
19. Post 4.0 frontiers
Solr
• Index Shards
• Clustering (Read replication)
• Solr on EC2
• Faceted Search
• Term highlighting
Product evolution
• Cloud offering
• Benchmarking process
20. Recap - Remember the gotchas?
1. In process (or in transaction) content indexing
Asynchronous indexing with Solr
2. Lucene index replicated & tracked per cluster node
Centralized Solr indexing tier
3. Query ‘bottlenecks’ during index maintenance
Index load / maintenance moved to a separate tier
4. Not all interfaces available in High Availability mode
ContentDiskDriver2 & Hazelcast enable CIFS clustering
5. Time Limited permission checking
Solr filters provide deterministic, query-time permission checking
23. Alfresco Platform Benchmarks (EE 4.0)
What are we going to test?
• Scenario driven
• Bulk loading / massive injection
• Enterprise Collaboration Platform
• Dimensions
• (#content nodes, #users, #cluster nodes)
• Over time: architectural options / versions
• Initial data points identified starting from field experience
What are we going to measure?
• Throughput
• Min/Avg/Max Response time
• Cluster scalability
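The three measurements can be made concrete with a small sketch. The function names, the sample values and the definition of scaling efficiency (measured speedup divided by the number of cluster nodes) are assumptions made for illustration, not the benchmark suite's actual reporting.

```python
def summarize(response_times_ms, wall_clock_s):
    """Throughput and min/avg/max response time for one benchmark run."""
    if not response_times_ms or wall_clock_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    return {
        "throughput_rps": len(response_times_ms) / wall_clock_s,
        "min_ms": min(response_times_ms),
        "avg_ms": sum(response_times_ms) / len(response_times_ms),
        "max_ms": max(response_times_ms),
    }


def scaling_efficiency(throughput_one_node, throughput_n_nodes, nodes):
    """Speedup relative to one node, divided by node count (1.0 = linear scaling)."""
    if throughput_one_node <= 0 or nodes <= 0:
        raise ValueError("throughput and node count must be positive")
    return (throughput_n_nodes / throughput_one_node) / nodes


# Hypothetical numbers, for illustration only.
print(summarize([100, 200, 300], wall_clock_s=2.0))
print(scaling_efficiency(100.0, 360.0, nodes=4))
```

An efficiency well below 1.0 as nodes are added suggests a shared bottleneck, such as the database or the index tier, rather than the repository nodes themselves.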
What can you expect?
• Updated Scalability paper with quantitative information
• Initial comparative data with 3.4
24. Alfresco Benchmarks – Why?
For Community and Enterprise network
• Provide quantitative evaluation of Alfresco scalability
• Offer tooling to self benchmark Alfresco in your context
For Engineering Research
• Determine impact of new ideas
• Profile performance issues
For Sizing Guidelines
• How many CPUs do you require?
• How many documents can you store?
For QA
• Performance Regression Tests
• Not a one-off exercise
• Become part of engineering process
25. Alfresco Benchmarks Tools
Repository benchmark suite
• JMeter scripts executable from ANT
• CMIS (Mixed and Sequential)
• WebDav (Mixed and Sequential)
• Available at HEAD\code\root\projects\repository-bm
Alfresco Bulk Import Tool
• Hosted on Google Code
• Now Multi-threaded!
• Offers a “content streaming free” mode
• Great performance (especially with no in-txn indexing)
Collaboration Platform benchmark suite
• JMeter scripts executable from ANT
• Testing Alfresco Share functionality with a configurable number of concurrent users
• Not publicly available at the time of this writing