Mike Farman
Product Manager, Alfresco
Peter Monks
Director, Professional Services, Alfresco
Derek Hulley
Senior Engineer, Alfresco




Many areas to consider...

• Core Repository
• Web-tier load balancing and caching
• Scale-up/scale out - horizontal vs. vertical
• Components tuning
• Replication strategies (3.4)
• Profiling and benchmarking
• ....


We're going to focus on the Core Repository

What happens when you create a node?

• 1. Begin transaction
• 2. Create node in DB
• 3. Write stream to disk
• 4. Update DB content URL
• 5. Begin commit
• 6. Transform (extract) text
• 7. Update index (props & content)
     • 7a. Index fulltext (background)
• 8. Commit (transaction ID for index tracking)
• 9. Add to L2 cache

Note: Content indexing is automatically moved to the background if text extraction exceeds 20 ms
What happens when you query for nodes?

• 1. Query (Lucene)
• 2. Results set
• 3. Batch pre-fetch
• 4. In cache
     • 4a. DB fetch
• 5. Result set
• 6. Check permissions
     • Max permission checks
     • Timeout
• 7. Deliver results




What happens when you read a node's content?

• 1. Node read request
• 2. Cached
• 3. DB lookup
• 4. Fetch content
• 5. Stream response




Example Use Cases:

• UC01: Bulk Loading
    • High batch throughput, ongoing
       • e.g. scanning, archival solutions, systems of record
    • Migration
       • One-off migration to Alfresco from legacy system
          • Then UC02...
• UC02: Enterprise Collaboration Platform
    • Concurrent users, variety of interfaces
    • e.g. Team/Project Collaboration, Document/Knowledge
      Management




Typical Characteristics
• Large number of documents and throughput
     • Tens of thousands of documents injected per day, often during nightly hours
     • Tens of millions of documents per year
• Low User concurrency
     • 100-1000 users (read only access)
• Application profile – System of Record
     •   End users mostly search & read
     •   Document formats: PDF, TIFF, JPG (i.e. no full text indexing)
     •   Typically fixed metadata
     •   No or little version control
     •   Few to no rules, actions, workflows, content transformations
• Client Interfaces
     • Share/Explorer or Custom e.g. Web Scripts, CMIS
     • Typically little CIFS/WebDAV/FTP


Primary Objective is to Maximise Throughput
• Parallel processing
     • Load nodes simultaneously
• Avoid unnecessary in-transaction processing
     • In-transaction services often not required when loading
        • e.g. Transformation, Indexing
• Disable unneeded services
     • Many standard services are not required when loading
• Minimise network and file I/O operations
     • Get source content as close to server storage as possible


• Always benchmark and tune...
     • JVM, Network, Threads, DB Connections...

Architectural considerations
• Creation is CPU, memory, network intensive
     • Always 64 bit
     • Rule of thumb: Prefer scale up over scale out – simpler deployment and
       management
     • Rule of thumb: get the content as close as possible to Alfresco
• Nature of the data set (i.e. batches) is KEY
     • If batches are sequential -> minimize time-per-batch
          • Scale up in CPU and memory
     • If batches are parallelizable -> maximize number of batches processed
          • Scale out with multi-threaded uploads
     • Consider dedicated server(s) for ingestion
          • Use production servers for migration use case and then reconfigure
• Design content storage around your data
     • How can you get the source content as close as possible to repository content
       storage?
• Note: Avoid Sparc T and related series
     • Highly parallel but not suited for atomic heavy serial operations

Tuning best practices - JVM

• 64 bit
• Make NewSize as large as possible to avoid spill over to OldGen
• See http://wiki.alfresco.com/wiki/JVM_Tuning

Tuning – Application Server

• Pay attention to the machine capacity i.e.
     • Threads
     • CPU Utilization
     • I/O

Sample JVM Config: 64-bit, dual 2.6GHz Xeon / dual-core per CPU, 8GB RAM environment

-server
-Xss1M
-Xms2G
-Xmx3G
-XX:NewSize=1G
-XX:MaxPermSize=256M
[Figure: "Bad" and "Good" examples shown side by side; images not included in this transcript]

Tuning best practices – I/O
• Network
     • Alfresco to Database is Key
         • Latency is key e.g. 10 ms is the absolute maximum
         • JDBC fetch size should be 150
             • See BP-1_Alfresco_Environment_Validation_and_Day_Zero_Configuration
     • Alfresco to storage (if remote)
         • If possible, avoid it completely for file transfers - Stage content on local disks
         • Use a dedicated network for storage e.g. Fibre channel
     • Incoming to Alfresco – Typically not relevant for bulk loading use case

• Disk
     • Lucene index operations are disk I/O intensive
         • Fast read/writes i.e. local disk
         • Avoid indexing if not required
     • Avoid unnecessary content file copying
         • Stage content on local disks
         • Consider setting the cm:content property directly e.g.
             • contentUrl=store://mypath/mydocument.docx|mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document|size=51142|encoding=UTF-8|locale=en_GB_
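The cm:content example above is a pipe-separated list of key=value fields. As a minimal sketch of assembling such a string for already-staged files, assuming only the five fields shown on the slide, the hypothetical helper `content_property` below joins them in the same order. It is an illustration of the string layout, not an Alfresco API.

```python
def content_property(
    content_url: str,
    mimetype: str,
    size: int,
    encoding: str = "UTF-8",
    locale: str = "en_GB_",
) -> str:
    """Build a cm:content value in the layout shown on the slide (hypothetical helper)."""
    fields = [
        f"contentUrl={content_url}",
        f"mimetype={mimetype}",
        f"size={size}",
        f"encoding={encoding}",
        f"locale={locale}",
    ]
    # Fields are joined with "|" and no surrounding whitespace.
    return "|".join(fields)


docx_mimetype = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
value = content_property("store://mypath/mydocument.docx", docx_mimetype, 51142)
```

The file itself must already exist at the location the store URL refers to; the string only records where the content lives.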

Tuning best practices - Database
• Connections – Relevant if you are loading concurrently
     • See BP-1_Alfresco_Environment_Validation_and_Day_Zero_Configuration
• DB Indexes & Statistics
     • Plan your batch loads to allow for periodic statistics maintenance
• Make sure the database hardware/software is sized appropriately e.g.
     • Log sizes, flush on transaction commit, cache tuning, lock
       management....
     • Use of multiple physical volumes/RAID....
• All databases provide many options to optimise performance
     • Get a DB administrator or partner involved


Tuning best practice - Repository Services

• Force background indexing
     • alfresco-global.properties
         • Everything: index.tracking.disableInTransactionIndexing=true
         • Just Content: lucene.maxAtomicTransformationTime=0
     • Is content indexing required at all?
         • DoNotIndex aspect


• “Run As” system user to avoid permission checking




Tuning best practice - Repository Services
• Use an optimised custom bulk loader
     • Process docs in batches - not 1 doc per transaction or 1 transaction for entire
       content set
         • Example: 100 documents per batch
     • Use Foundation (Java) API if possible
• Design multi-threaded import code
     • Partition your data set so you can use multiple threads loading in different areas
     • Scale up CPU accordingly
• Consider direct APIs (e.g. the lower-case “nodeService” bean instead of the public “NodeService” bean)
     • Public services are heavily wrapped with interceptors for transactions, auditing,
       permissions, multilingual translations, etc.
• Disable behaviours
     • Rules evaluations, cm:auditable, versioning, quotas (system.usages.enabled=false)
• Use proper transaction demarcation
     • Complete all operations on a node in a single transaction
     • Batching – group multiple updates in a single transaction
     • Avoid mixing reads and writes
• See session CS2-Repository_Internals for more details on API specifics
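The batching and multi-threading advice above can be sketched as follows. This is a minimal outline, not a real loader: `load_batch` is a hypothetical callback standing in for "create these nodes in one transaction" and is assumed to return the number of nodes it created. The batch size of 100 comes from the slide; the thread count of 4 is an arbitrary placeholder to tune against available CPU.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Sequence


def batches(items: Sequence[str], size: int = 100) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_all(
    items: Sequence[str],
    load_batch: Callable[[Sequence[str]], int],
    batch_size: int = 100,
    threads: int = 4,
) -> int:
    """Run one load_batch call (one transaction) per batch on a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each batch is independent, so batches can be loaded in parallel.
        return sum(pool.map(load_batch, batches(items, batch_size)))
```

In practice the items handed to each thread should target different areas of the repository, as the slide suggests, so that concurrent transactions do not contend on the same parent folder.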
Tuning best practices – Repository Services

• Disable modified timestamp propagation to parent folders
     • system.enableTimestampPropagation=false (default)
• Deleting large numbers of nodes
     • Skip deleted items (archive) by adding the sys:temporary aspect to your
       content before deletion
• Partition your content within the repository
     • Depends on read access requirements
     • Consider partitioning spaces that hold more than 2000 nodes if users
       browse space children
     • Note: Performance much improved in later releases 3.3.3, 3.4;
       test for your use case


Scale Out Using Dedicated Bulk Load Server(s)

• Alfresco can support a non-clustered, injection-only tier

     • Objective: Separate input write process from front end read load

     • Solution: Dedicated injection tier pointing to same DB/Content
       store(s) as front end servers. No need to cluster caches from this
       tier with the front end. Index properties and/or content in the
       background; indexes will catch up from DB transactions.

     • Benefits: No cache update/invalidation overhead. Indexing does not
       block loading process



Bulk load server(s) not clustered but share storage and DB; production servers will 'catch up' via index tracking

[Diagram: Bulk Load Process (creates only) runs on Bulk Load A and Bulk Load B. Runtime Clients use Production A, Production B and Production C. Each server runs Tomcat with its own EHCache and its own Lucene index. All servers share one Content Store and one Database (MySQL).]

Load Server(s) Configuration Tips
• Bulk Load Server(s)
     • To exclude server(s) from cluster:
        • Do not set cluster name for bulk load servers in alfresco-global.properties
           • alfresco.cluster.name=
     • Force background indexing in the local alfresco-global.properties using:
        • Everything:
           • index.tracking.disableInTransactionIndexing=true
        • Just Content:
           • lucene.maxAtomicTransformationTime=0
     • Note: The load process should perform creates only, no updates or
       reads
• Production Server(s)
     • Ensure index tracking is enabled:
        • index.tracking.cronExpression=0/5 * * * * ?
        • index.recovery.mode=AUTO



Example: In-transaction vs. Background Indexing

• 10,000 docs, 1,000 folders
• 50 KB Word documents
• FTP with 10 sessions
• Laptop

• Foreground Indexing:
     • 33 mins
• Background Indexing:
     • 5 mins
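To put the two timings above on a common scale, the short calculation below converts them to approximate throughput. It uses only the figures from the slide (10,000 documents, 33 minutes, 5 minutes) and ignores the 1,000 folders, so treat the numbers as a rough illustration from a laptop test and not as a benchmark.

```python
DOCS = 10_000
FOREGROUND_MINUTES = 33
BACKGROUND_MINUTES = 5


def docs_per_minute(docs: int, minutes: float) -> float:
    """Average load rate over the whole run."""
    return docs / minutes


foreground_rate = docs_per_minute(DOCS, FOREGROUND_MINUTES)  # about 303 docs/min
background_rate = docs_per_minute(DOCS, BACKGROUND_MINUTES)  # 2000 docs/min
speedup = FOREGROUND_MINUTES / BACKGROUND_MINUTES            # 6.6 times faster
```

The load finishes sooner with background indexing, but the documents only become searchable once index tracking has caught up.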



UC02: Enterprise Collaboration Platform




Requirements
• High (and potentially highly distributed) user concurrency
     • 1,000s to 10,000s of users (read & write)
• Medium/High number of documents
     • 10,000 to 1 million+ documents
     • 1000 document updates per day
• Complex enterprise content and permission models
     • Multiple content models/Dynamic ACL
     • Versioning and full text indexing on all documents
     • Document types: Office, drawing, images
• Advanced content management
     • Multiple rules and actions
     • Heavy use of content transformations/workflow
• Interfaces (All)
     • Share, WebDAV, CIFS ....


Architectural considerations
• Fully fledged platform deployment
     • Need to consider maintenance window
• Scale out Share independently from Repo
     • Front and intermediate Load balancer/Web Cache layers
     • Read/write split and scheduled repository exclusion for maintenance
• Scale out transformation server
     • Enterprise only: JOD OpenOffice subsystem
• Scale out and up infrastructure
     • Cluster CIFS with DFS (Distributed File System)
     • All HTTP based protocols scale seamlessly (SSP on port 7070)
• Balance multi-CPU (scale up) and multi-node clusters (scale out)
     • Overhead of index tracking


Design best practices
• Distribute your content within the repository
     • Otherwise search and retrieval performance degradation is likely
     • Use versioning and indexing where appropriate, not just because it's
       there
     • e.g. don't simply apply cm:versionable to all cm:content
• Modelling
     • Prefer aspects over types
        • Remember aspects support inheritance as well
     • Content Model indexing options
        • Tune what you need to index
• Quotas (aka Usages)
     • Might save your repo from content explosion but also have an
       overhead!


Tuning best practices – Note: Also see bulk load use case!
• RDBMS
     • Number of connections much more important for this use case
     • Formula: HTTP Worker Threads + 75 per cluster node
         • For Tomcat defaults this is 275
• Cache Configuration
     • L2 Cache: increase with RAM to include more objects in cache
     • Use the ehcache tracing tool to identify which caches have low hit ratios and increase them if you have available memory
     • See http://wiki.alfresco.com/wiki/Repository_Cache_Configuration#Tracing_cache_sizes for details
• Alfresco Configuration optimization
     • VFS thread pool tuning (default: <threadPool init="25" max="50" />)
     • Tune ACLs and preload common searches (if needed)
         system.acl.maxPermissionCheckTimeMillis=10000
         system.acl.maxPermissionChecks=10000
         Query via node browser as different users, not only admin
     • Consider bulk loading large user bases (10,000s) to a single (un-clustered) node and then clustering
         • Disable eager home folder creation
                •   home.folder.creation.eager=false in alfresco-global.properties
     • Use multi-threaded and incremental LDAP sync once initial sync has been completed
         • Differential sync is the default
• Lucene Tuning
     • lucene.maxAtomicTransformationTime=20
• Monitor the network performance when adding nodes to a cluster
     • Watch for ehcache waiting for the network via thread dumps
     • Consider disabling some/all of the L2 caches
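The RDBMS connection formula above is simple arithmetic. As a worked example, the sketch below applies "HTTP worker threads + 75" to one cluster node; `db_connections_per_node` is a hypothetical helper, and the value 200 is Tomcat's default maxThreads, which is what makes the slide's figure of 275 come out.

```python
TOMCAT_DEFAULT_MAX_THREADS = 200  # Tomcat connector default for maxThreads


def db_connections_per_node(http_worker_threads: int, overhead: int = 75) -> int:
    """Connections one cluster node may need: worker threads plus a fixed overhead."""
    return http_worker_threads + overhead


per_node = db_connections_per_node(TOMCAT_DEFAULT_MAX_THREADS)  # 275
three_node_cluster = 3 * per_node  # total the database must accept from 3 nodes
```

The per-node figure is what db.pool.max would need to cover on each node, while the cluster total is what the database server itself has to be sized for.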




Example Windows ECM Production Cluster Install - Local & Shared Content Store

• Clients
     • HTTP clients (e.g. Share) via an HTTP load balancer
     • CIFS clients via alfrescocifs, DFS round robin
• Active Directory
     • User/Group sync
     • NTLM authentication
• Application servers: alfappsrv01 (Tomcat 1) and alfappsrv02 (Tomcat 2), with clustered EHCache
     • Local alf_data on each server
          • Lucene index: d:alf_storelucene-indexes
          • Content store: d:alf_storecontentstore
          • Replicating content store: in- and outbound replication between the local content store and the shared content store on the SAN
• Database: JDBC to oraclecluster
     • alfclustsrv01 (Oracle 1) and alfclustsrv02 (Oracle 2) with failover, in an MSCS cluster
• SAN
     • Shared Content Store: sharedContentStore (alfdataDatastore)
     • Oracle:
          • Data (o:oradataalfresco), Control (o:oradataalfresco) & Logfiles (L:oradataalfresco)
          • Oracle Backup (o:flash_recovery_area)
     • Lucene Index Backup (alfdataHold)

Replication (3.4) offers new deployment options

• Replication may be appropriate for specific contexts
     • Provides selective replication of content between distinct Alfresco
       repositories
     • On demand or scheduled via Replication Jobs
     • Reporting and Tracking of Replication Jobs


• Read and viewing performance: Content is served from a
local server




For any system...
• Do not use the OOTB settings for application server, database, Alfresco etc.;
you must always tune for your use case
• Balance your resources
     • Separate tiers for Database, Content, App Servers
• Indexes should always be on fast, local disk e.g. not NFS mounts,
USB drives etc
• Run on a supported stack
     • e.g. there are issues with JDK 1.6u10, so use JDK 1.6u20; use MySQL 5.1.39 or later
• Don't starve your database of connections:
     • db.pool.max=XXX
• Use appropriate application server worker threads
     • Configuration details are application server specific e.g. Tomcat: server.xml
• When clustering, use JGroups and Unicast
• Use the latest Alfresco version/service pack e.g.
     • 3.3.3, 3.4


Things you should NOT change

• The database transaction isolation level
     • Use defaults for all databases except MS SQLServer
     • FYI. SQLServer should be:
        • db.txn.isolation=4096
        • ALTER DATABASE alfresco SET ALLOW_SNAPSHOT_ISOLATION ON;

• The ehcache default configuration i.e. Replicate async
• The Lucene indexing defaults unless you know what you
are doing and why!
• Note: Also do not do a full-index rebuild unless you know
what was wrong in the first place!
     • Use the index checker

Benchmark your solutions




Alfresco Benchmarks

• Alfresco Benchmark Tools
     • alfresco-bm – http://wiki.alfresco.com/wiki/Server_Benchmarks
     • SimpleInjector – (check partners.alfresco.com)
     • For CIFS loading -> JMeter + SMB mount
• Alfresco Benchmark Results
     • Unisys benchmark results
     • JCR Benchmarks
• WIP
     • “Scale your Alfresco Solutions” (in http://partners.alfresco.com)
     • More Platform benchmark ongoing – watch this space!



Profiling your Alfresco solution
• Alfresco Application Profiling
     • JMX (for Enterprise Only see Admin Guide)
       http://wiki.alfresco.com/wiki/JMX
     • Audit Surf
       http://forge.alfresco.com/projects/auditsurf/
     • Nagios integration
       http://forge.alfresco.com/projects/nagios4alfresco/
• Infrastructure Profiling
     • VisualVM (JVM)
       http://ur.ly/esjZ
     • Thread Dump Analyzer
       https://tda.dev.java.net/
     • YourKit (JVM)
       http://wiki.alfresco.com/wiki/JMX
     • WireShark (Network)
       http://www.wireshark.org/
     • MySQL Query Profiler (DBMS)
       http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html



Q/A & Feedback

• Any Questions?
• Share your experiences (good and bad) with us so we can
all learn!
     •   Successful scaled up/out architectures
     •   Limitations, bottlenecks
     •   Use case parameters => Implementation => Results
     •   What worked, what didn't




Alfresco Day Brussels 2016 - Alfresco One Product Suite Update + DemoAlfresco Software
 

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

Scale your Alfresco Solutions

  • 1. Mike Farman, Product Manager, Alfresco • Peter Monks, Director, Professional Services, Alfresco • Derek Hulley, Senior Engineer, Alfresco
  • 2. Many areas to consider... • Core Repository • Web-tier load balancing and caching • Scale-up/scale-out - horizontal vs. vertical • Components tuning • Replication strategies (3.4) • Profiling and benchmarking • .... We're going to focus on the Core Repository
  • 3. What happens when you create a node? • 1. Begin transaction • 2. Create node in DB • 3. Write stream to disk • 4. Update DB content URL • 5. Begin commit • 6. Transform (extract) text • 7. Update index (props & content) • 7a. Index full text (background) • 8. Commit (transaction ID for index tracking) • 9. Add to L2 cache • Note: content indexing is automatically moved to the background if text extraction exceeds 20 ms
  • 4. What happens when you query for nodes? • 1. Query • 2. Results set (Lucene) • 3. Batch pre-fetch • 4. In cache • 4a. DB fetch • 5. Result set • 6. Check permissions (max permission checks, timeout) • 7. Deliver results
  • 5. What happens when you read a node's content? • 1. Node read request • 2. Cached • 3. DB lookup • 4. Fetch content • 5. Stream response
  • 6. Example Use Cases: • UC01: Bulk Loading • High batch throughput, ongoing • e.g. scanning, archival solutions, systems of record • Migration • One-off migration to Alfresco from legacy system • Then UC02... • UC02: Enterprise Collaboration Platform • Concurrent users, variety of interfaces • e.g. Team/Project Collaboration, Document/Knowledge Management
  • 7. Typical Characteristics • Large number of documents and throughput • Tens of thousands of documents injected per day, often during nightly hours • Tens of millions of documents per year • Low user concurrency • 100-1000 users (read-only access) • Application profile – System of Record • End users mostly search & read • Document formats: PDF, TIFF, JPG (i.e. no full text indexing) • Typically fixed metadata • Little or no version control • Few to no rules, actions, workflows, content transformations • Client Interfaces • Share/Explorer or custom, e.g. Web Scripts, CMIS • Typically little CIFS/WebDAV/FTP
  • 8. Primary Objective is to Maximise Throughput • Parallel processing • Load nodes simultaneously • Avoid unnecessary in-transaction processing • In-transaction services often not required when loading • e.g. Transformation, Indexing • Disable unneeded services • Many standard services are not required when loading • Minimise network and file I/O operations • Get source content as close to server storage as possible • Always benchmark and tune... • JVM, Network, Threads, DB Connections...
  • 9. Architectural considerations • Creation is CPU, memory and network intensive • Always 64 bit • Rule of thumb: Prefer scale up over scale out – simpler deployment and management • Rule of thumb: Get the content as close as possible to Alfresco • Nature of the data set (i.e. batches) is KEY • If batches are sequential -> minimize time per batch • Scale up in CPU and memory • If batches are parallelizable -> maximize number of batches processed • Scale out with multi-threaded uploads • Consider dedicated server(s) for ingestion • Use production servers for migration use case and then reconfigure • Design content storage around your data • How can you get the source content as close as possible to repository content storage? • Note: Avoid SPARC T and related series • Highly parallel but not suited for atomic, heavy serial operations
  • 10. Tuning best practices • JVM Tuning • 64 bit • Make NewSize as large as possible to avoid spill over to OldGen • See http://wiki.alfresco.com/wiki/JVM_Tuning • Application Server • Pay attention to the machine capacity i.e. • Threads • CPU Utilization • I/O • Sample JVM Config (64-bit, dual 2.6GHz Xeon / dual-core per CPU, 8GB RAM environment): -server -Xss1M -Xms2G -Xmx3G -XX:NewSize=1G -XX:MaxPermSize=256M
  • 11. [Figure: "Bad" and "Good" examples; images not preserved]
  • 12. Tuning best practices – I/O • Network • Alfresco to Database is key • Latency is key, e.g. 10 ms is the absolute maximum • JDBC fetch size should be 150 • See BP-1_Alfresco_Environment_Validation_and_Day_Zero_Configuration • Alfresco to storage (if remote) • If possible, avoid it completely for file transfers - stage content on local disks • Use a dedicated network for storage, e.g. Fibre Channel • Incoming to Alfresco – Typically not relevant for bulk loading use case • Disk • Lucene index operations are disk I/O intensive • Fast reads/writes, i.e. local disk • Avoid indexing if not required • Avoid unnecessary content file copying • Stage content on local disks • Consider setting the cm:content property directly, e.g. • contentUrl=store://mypath/mydocument.docx|mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document|size=51142|encoding=UTF-8|locale=en_GB_
  • 13. Tuning best practices - Database • Connections – Relevant if you are loading concurrently • See BP-1_Alfresco_Environment_Validation_and_Day_Zero_Configuration • DB Indexes & Statistics • Plan your batch loads to allow for periodic statistics maintenance • Make sure the database hardware/software is sized appropriately e.g. • Log sizes, flush on transaction commit, cache tuning, lock management.... • Use of multiple physical volumes/RAID.... • All databases provide many options to optimise performance • Get a DB administrator or partner involved
  • 14. Tuning best practice - Repository Services • Force background indexing • alfresco-global.properties • Everything: index.tracking.disableInTransactionIndexing=true • Just Content: lucene.maxAtomicTransformationTime=0 • Is content indexing required at all? • DoNotIndex aspect • “Run As” system user to avoid permission checking
  • 15. Tuning best practice - Repository Services • Use an optimised custom bulk loader • Process docs in batches - not 1 doc per transaction or 1 transaction for the entire content set • Example: 100 documents per batch • Use Foundation (Java) API if possible • Design multi-threaded import code • Partition your data set so you can use multiple threads loading in different areas • Scale up CPU accordingly • Consider direct APIs (e.g. the low-level “nodeService” bean rather than the public “NodeService” bean) • Public services are heavily wrapped with interceptors for transactions, auditing, permissions, multilingual translations, etc. • Disable behaviours • Rules evaluations, cm:auditable, versioning, quotas (system.usages.enabled=false) • Use proper transaction demarcation • Complete all operations on a node in a single transaction • Batching – group multiple updates in a single transaction • Avoid mixing reads and writes • See session CS2-Repository_Internals for more details on API specifics
  • 16. Tuning best practices – Repository Services • Disable modified timestamp propagation to parent folders • system.enableTimestampPropagation=false (default) • Deleting large numbers of nodes • Skip deleted items (archive) by adding the sys:temporary aspect to your content before deletion • Partition your content within the repository • Depends on read access requirements • Consider partitioning once a space holds more than 2000 nodes, if users browse the space's children • Note: Performance much improved in later releases (3.3.3, 3.4) – test for your use case
  • 17. Scale Out Using Dedicated Bulk Load Server(s) • Alfresco can support a non-clustered, injection-only tier • Objective: Separate input write process from front-end read load • Solution: Dedicated injection tier pointing to the same DB/content store(s) as the front-end servers. No need to cluster caches from this tier with the front end. Index properties and/or content in the background; indexes will catch up from DB transactions. • Benefits: No cache update/invalidation overhead. Indexing does not block the loading process
  • 18. Bulk load server(s) are not clustered but share storage and DB; production servers will 'catch up' via index tracking • [Diagram: the bulk load process (creates only) feeds Bulk Load A and Bulk Load B; runtime clients use Production A, Production B and Production C. Each of the five servers runs Tomcat with its own EHCache and Lucene index. All servers share one database (MySQL) and one content store.]
  • 19. Load Server(s) Configuration Tips • Bulk Load Server(s) • To exclude server(s) from the cluster: • Do not set a cluster name for bulk load servers in alfresco-global.properties • alfresco.cluster.name= • Force background indexing in the local alfresco-global.properties using: • Everything: • index.tracking.disableInTransactionIndexing=true • Just Content: • lucene.maxAtomicTransformationTime=0 • Note: The load process should perform creates only, no updates or reads • Production Server(s) • Ensure index tracking is enabled: • index.tracking.cronExpression=0/5 * * * * ? • index.recovery.mode=AUTO
  • 20. Example: In-transaction vs. Background Indexing • 10,000 docs, 1,000 folders • 50 KB Word documents • FTP with 10 sessions • Laptop • Foreground Indexing: • 33 mins • Background Indexing: • 5 mins
  • 22. Requirements • High (and potentially highly distributed) user concurrency • 1,000s to 10,000s of users (read & write) • Medium/High number of documents • 10,000-1 million+ documents • 1000 document updates per day • Complex enterprise content and permission models • Multiple content models/Dynamic ACL • Versioning and full text indexing on all documents • Document types: Office, drawing, images • Advanced content management • Multiple rules and actions • Heavy use of content transformations/workflow • Interfaces (All) • Share, WebDAV, CIFS ....
  • 23. Architectural considerations • Fully fledged platform deployment • Need to consider maintenance window • Scale out Share independently from Repo • Front and intermediate Load balancer/Web Cache layers • Read/write split and scheduled repository exclusion for maintenance • Scale out transformation server • Enterprise only: JOD OpenOffice subsystem • Scale out and up infrastructure • Cluster CIFS with DFS (Distributed File System) • All HTTP based protocols scale seamlessly (SSP on port 7070) • Balance multi-CPU (scale up) and multi-node clusters (scale out) • Overhead of index tracking
  • 24. Design best practices • Distribute your content within the repository • Otherwise search and retrieval performance degradation is likely • Use versioning and indexing where appropriate, not just because it's there • e.g. don't simply apply cm:versionable to the full cm:content • Modelling • Prefer aspects over types • Remember aspects support inheritance as well • Content Model indexing options • Tune what you need to index • Quotas (aka Usages) • Might save your repo from content explosion but also have an overhead!
  • 25. Tuning best practices – Note: Also see bulk load use case! • RDBMS • Number of connections much more important for this use case • Formula: HTTP Worker Threads + 75 per cluster node • For Tomcat defaults this is 275 • Cache Configuration • L2 Cache: increase with RAM to include more objects in cache • Use the ehcache tracing tool to identify which caches have low hit ratios and increase them if you have available memory • See http://wiki.alfresco.com/wiki/Repository_Cache_Configuration#Tracing_cache_sizes for details • Alfresco Configuration optimization • VFS thread pool tuning (default: <threadPool init="25" max="50" />) • Tune ACLs and preload common searches (if needed) • system.acl.maxPermissionCheckTimeMillis=10000 • system.acl.maxPermissionChecks=10000 • Query via node browser as different users, not only admin • Consider bulk loading large user bases (10,000s) to a single (un-clustered) node and then clustering • Disable eager home folder creation • home.folder.creation.eager=false in alfresco-global.properties • Use multi-threaded and incremental LDAP sync once initial sync has been completed • Differential sync is the default • Lucene Tuning • lucene.maxAtomicTransformationTime=20 • Monitor the network performance when adding nodes to a cluster • Watch for ehcache waiting for the network via thread dumps • Consider disabling some/all of the L2 caches
  • 26. Example Windows ECM Production Cluster Install • Clients: HTTP clients (e.g. Share) via an HTTP load balancer; CIFS via alfrescocifs with DFS round robin • Local & Shared Content Store • Active Directory User/Group Sync • NTLM Authentication • Application servers: alfappsrv01 (Tomcat 1) and alfappsrv02 (Tomcat 2), EHCache clustered • Each with local alf_data: • Lucene Index d:\alf_store\lucene-indexes • Content Store d:\alf_store\contentstore • Replicating Content Store: in & outbound replication between local content store and shared content store on SAN • JDBC to oraclecluster: alfclustsrv01 (Oracle 1) <- Failover -> alfclustsrv02 (Oracle 2), MSCS Cluster • SAN: • Shared Content Store: sharedContentStore (alfdataDatastore) • Oracle: Data (o:\oradata\alfresco), Control (o:\oradata\alfresco) & Logfiles (L:\oradata\alfresco) • Oracle Backup (o:\flash_recovery_area) • Lucene Index Backup (alfdataHold)
  • 27. Replication (3.4) offers new deployment options • Replication may be appropriate for specific contexts • Provides selective replication of content between distinct Alfresco repositories • On demand or scheduled via Replication Jobs • Reporting and Tracking of Replication Jobs • Read and viewing performance: Content is served from a local server
  • 28. For any system... • Do not use the OOTB settings for the application server, database, Alfresco, etc.; you must always tune for your use case • Balance your resources • Separate tiers for Database, Content, App Servers • Indexes should always be on fast, local disk, e.g. not NFS mounts, USB drives etc. • Run on a supported stack • e.g. there are issues with JDK 1.6u10, so use JDK 1.6u20; use MySQL 5.1.39 or later • Don't starve your database of connections: • db.pool.max=XXX • Use appropriate application server worker threads • Configuration details are application server specific e.g. Tomcat: server.xml • When clustering, use JGroups and Unicast • Use the latest Alfresco version/service pack e.g. • 3.3.3, 3.4
  • 29. Things you should NOT change • The database transaction isolation level • Use defaults for all databases except MS SQL Server • FYI: SQL Server should be: • db.txn.isolation=4096 • ALTER DATABASE alfresco SET ALLOW_SNAPSHOT_ISOLATION ON; • The ehcache default configuration i.e. Replicate async • The Lucene indexing defaults unless you know what you are doing and why! • Note: Also do not do a full-index rebuild unless you know what was wrong in the first place! • Use the index checker
  • 31. Alfresco Benchmarks • Alfresco Benchmark Tools • alfresco-bm – http://wiki.alfresco.com/wiki/Server_Benchmarks • SimpleInjector – (check partners.alfresco.com) • For CIFS loading -> JMeter + SMB mount • Alfresco Benchmark Results • Unisys benchmark results • JCR Benchmarks • WIP • “Scale your Alfresco Solutions” (in http://partners.alfresco.com) • More platform benchmarks ongoing – watch this space!
  • 32. Profiling your Alfresco solution • Alfresco Application Profiling • JMX (Enterprise only; see Admin Guide) http://wiki.alfresco.com/wiki/JMX • Audit Surf http://forge.alfresco.com/projects/auditsurf/ • Nagios integration http://forge.alfresco.com/projects/nagios4alfresco/ • Infrastructure Profiling • VisualVM (JVM) http://ur.ly/esjZ • Thread Dump Analyzer • https://tda.dev.java.net/ • YourKit (JVM) http://wiki.alfresco.com/wiki/JMX • WireShark (Network) http://www.wireshark.org/ • MySQL Query Profiler (DBMS) http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html
  • 33. Q/A & Feedback • Any Questions? • Share your experiences (good and bad) with us so we can all learn! • Successful scaled up/out architectures • Limitations, bottlenecks • Use case parameters => Implementation => Results • What worked, what didn't
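
Slide 15's advice to process documents in batches (for example 100 per batch) on multiple threads, rather than one document per transaction or one transaction for the whole set, can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python and not Alfresco code: `batches`, `load_in_batches` and `fake_load_batch` are hypothetical names, the thread count of 4 is an arbitrary assumption, and the placeholder loader only counts documents where a real loader (the slides suggest the Foundation Java API) would create the nodes of one batch inside a single transaction.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int = 100) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def load_in_batches(
    documents: Sequence[T],
    load_batch: Callable[[Sequence[T]], int],
    batch_size: int = 100,
    threads: int = 4,
) -> int:
    """Run load_batch once per batch on a thread pool.

    Each call to load_batch stands in for ONE repository transaction.
    Returns the total number of documents reported as loaded.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(load_batch, batches(documents, batch_size)))


def fake_load_batch(batch: Sequence[str]) -> int:
    # Hypothetical placeholder: a real loader would create every node in
    # `batch` within a single transaction and commit once at the end.
    return len(batch)


documents = [f"doc-{i:05d}.pdf" for i in range(1050)]
loaded = load_in_batches(documents, fake_load_batch)
print(f"loaded {loaded} documents")
```

The sketch does not cover the slide's other recommendation to partition the data set so that each thread loads into a different area of the repository; in practice the batches would be grouped by target folder before being handed to the pool.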

Editor's notes

  1. We won’t be going into details on how to set up clustering and the web tier
  2. [Check with AH the background indexing stuff, i.e. is it indexing or extraction that exceeds 20 ms]
  3. These are typical; specifics will obviously vary.
  4. [Derek]
  5. [Derek]
  6. [PM – how does the custom loading fit into this??]