This document summarizes a presentation about Alfresco Search Services 2.0. Key points include:
- Solr was updated to remove the custom content store and leverage more built-in Solr features like replication and backups. This improved performance and reduced disk usage.
- New date fields were added that break dates down into individual components like year, month, day, etc. to enable more granular search queries.
- Asynchronous maintenance actions were introduced to schedule and retry tasks like reindexing, purging, and fixing index issues in the background.
- Security was enhanced with support for mutual TLS and storing passwords in JVM properties instead of plain-text files. Performance tracking and indexing controls were also introduced.
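As a sketch of how the new date-component fields could be used, the snippet below builds an AFTS query combining a standard date range with a month component field. Note that the `cm:created_month` field name is a hypothetical illustration; check the schema of your Search Services 2.0 installation for the actual component field names.

```javascript
// Build an AFTS query string that mixes a standard date-range clause with a
// date-component clause. The "cm:created_month" field name is hypothetical;
// Search Services 2.0 exposes date parts under installation-specific names.
function buildDateComponentQuery(year, month) {
  var parts = [
    "cm:created:[" + year + "-01-01 TO " + year + "-12-31]", // standard range
    "cm:created_month:" + month                              // hypothetical component field
  ];
  return parts.join(" AND ");
}

console.log(buildDateComponentQuery(2020, 6));
// → cm:created:[2020-01-01 TO 2020-12-31] AND cm:created_month:6
```

Component fields make queries like "everything created in June, any year" expressible without range arithmetic on full timestamps.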
This document discusses reindexing large repositories in Alfresco. It covers the Alfresco SOLR architecture, the indexing process, scenarios that require reindexing, alternatives for deployment during reindexing to minimize downtime, monitoring and profiling tools, and future improvements planned for Search Services 2.0 to optimize indexing performance. Benchmark results are presented showing improvements that reduced reindexing time for 1.2 billion documents from 21 days to 10 days.
The document discusses performance tuning of Alfresco. It covers JVM tuning including memory and garbage collection settings. It also discusses analyzing garbage collection logs and common problems. The document outlines different cache mechanisms in Alfresco including L1, L2 caches and Hazelcast caching. Tuning caches based on data change frequency and hit ratios is recommended. Finally, the document provides guidance on investigating performance issues by examining logs, threads, databases, storage and Alfresco/Solr configurations and settings.
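As an illustrative starting point for the JVM tuning discussed above (heap sizes and paths are assumptions, not recommendations — size to your own workload and measure), a typical set of options might look like:

```
# Illustrative JVM options for the Alfresco process (sizes are assumptions):
-Xms8g -Xmx8g                                     # fixed-size heap to avoid resize pauses
-XX:+UseG1GC                                      # G1 garbage collector
-Xlog:gc*:file=/opt/alfresco/gc.log:time,uptime   # GC logging (JDK 9+) for later analysis
```

The GC log produced by the last flag is the input for the garbage-collection analysis the document describes.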
The document provides an overview and best practices for tuning an Alfresco installation. It discusses disabling unused services, limiting group hierarchies, monitoring resources, optimizing Solr configuration, indexing processes, and query caching. General tips include separating custom configurations, testing backups and changes, and using support tools for troubleshooting performance issues.
Alfresco node lifecycle, services and zones (Sanket Mehta)
This presentation explains the details of an Alfresco node lifecycle (including which Alfresco database tables are affected by node operations such as creation and deletion). It also explains which case-sensitive Alfresco service should be used (nodeService vs. NodeService, searchService vs. SearchService) in order to maintain security in your application. Lastly, it covers zones in Alfresco (authentication-related zones and application-related zones).
How to migrate from Alfresco Search Services to Alfresco Search Enterprise (Angel Borroy López)
Presentation on how to move from the Alfresco Search Services product, based on Apache Solr, to the new Alfresco Search Enterprise, integrated with Elasticsearch and Amazon OpenSearch.
This is the session delivered during the Alfresco Developers Conference in Lisbon, January 2018. Learn everything you need to know to implement a proper backup and disaster recovery strategy, from a single-server installation with hundreds of documents to a large deployment with multiple nodes, layers, databases and millions of documents. What is the best approach for each case?
The document provides an overview and best practices for tuning an Alfresco installation for performance. It discusses disabling unused services, limiting folder hierarchies and group nesting, monitoring resources, tuning Solr indexes and caches, and using separate servers for specific tasks like indexing. General tips include testing changes thoroughly before deploying, adjusting sizing for increased usage, and following the standard performance methodology.
Features of Alfresco Search Services.
Features of Alfresco Search & Insight Engine.
Future plans for the product
---
DEMO GUIDE
[1] Queries: Share > Node Browser
ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'
SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')
[2] Queries: Share > JS Console
var ctxt = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var searchService = ctxt.getBean('SearchService', org.alfresco.service.cmr.search.SearchService);
var StoreRef = Packages.org.alfresco.service.cmr.repository.StoreRef;
var SearchService = Packages.org.alfresco.service.cmr.search.SearchService;
var resultSet =
  searchService.query(
    StoreRef.STORE_REF_WORKSPACE_SPACESSTORE,
    SearchService.LANGUAGE_FTS_ALFRESCO,
    "ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'");
logger.log(resultSet.getNodeRefs());
---
var ctxt = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var searchService = ctxt.getBean('SearchService', org.alfresco.service.cmr.search.SearchService);
var StoreRef = Packages.org.alfresco.service.cmr.repository.StoreRef;
var SearchService = Packages.org.alfresco.service.cmr.search.SearchService;
var resultSet =
  searchService.query(
    StoreRef.STORE_REF_WORKSPACE_SPACESSTORE,
    SearchService.LANGUAGE_CMIS_ALFRESCO,
    "SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')");
logger.log(resultSet.getNodeRefs());
---
var def =
{
query: "ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'",
language: "fts-alfresco"
};
var results = search.query(def);
logger.log(results);
[3] Queries: api-explorer
{
"query": {
"language": "afts",
"query": "ASPECT:\"cm:titled\" AND cm:title:\"*Sample\" AND TEXT:\"code\""
}
}
---
{
"query": {
"language": "cmis",
"query": "SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')"
}
}
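The JSON bodies above can also be built programmatically before POSTing them to the Search REST API. A minimal sketch (the endpoint path below is the default public Search API location; host and credentials are assumptions about a local installation):

```javascript
// Build the request body for the Alfresco Search REST API.
function buildSearchBody(language, query) {
  return JSON.stringify({ query: { language: language, query: query } });
}

var aftsBody = buildSearchBody("afts",
  "ASPECT:\"cm:titled\" AND cm:title:\"*Sample*\" AND TEXT:\"code\"");
var cmisBody = buildSearchBody("cmis",
  "SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')");

// POST either body (Content-Type: application/json, basic auth) to an
// assumed local endpoint:
//   http://localhost:8080/alfresco/api/-default-/public/search/versions/1/search
console.log(aftsBody);
```

This is the same request api-explorer issues under the hood when you submit the forms above.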
[4] Queries: CMIS Workbench > Groovy Console
rs = session.query("SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')", false)
for (res in rs) {
println(res.getPropertyValueById('cmis:objectId'))
}
[5] Queries: SOLR Web Console > (alfresco) > Query
/afts
ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'
---
/cmis
SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')
---
Infrastructure, use cases and performance considerations for an Enterprise Grade ECM implementation of up to 1B documents on AWS (Amazon Web Services EC2 and Aurora), based on the Alfresco Platform (http://www.alfresco.com), the leading Open Source Enterprise Content Management system.
This session will provide a guide to Alfresco truststores and keystores. Several live examples will be shown, including the replacement of existing cryptographic stores or certificates. Additionally, a troubleshooting configuration guide for mTLS communication will be provided.
Jose Portillo DevCon presentation 1138 (Jose Portillo)
This document discusses best practices for implementing Solr sharding in Alfresco. It defines what sharding is and explains that it involves splitting a single index into multiple parts or shards to improve search performance, distribute indexing load, and scale horizontally. The document outlines different types of sharding, considerations for the number of shards, high availability, backup procedures, and common configuration settings when using Solr sharding in Alfresco.
The objective of this article is to describe what to monitor in and around Alfresco in order to have a good understanding of how the applications are performing and to be aware of potential issues.
The document discusses best practices for upgrading to Alfresco 6 from a previous version. It recommends backing up the database and content store from the source Alfresco, identifying any customizations, installing the new Alfresco from scratch, restoring the backups, applying customizations, and patching the database in stages if needed through intermediate "halfway" Alfresco instances. It also covers identifying deprecated features, adapting custom code to be compatible with Alfresco 6, monitoring the new installation, and addressing potential issues.
Alfresco DevCon 2019 Performance Tools of the TradeLuis Colorado
Discover tips and tools that will help you to keep your Alfresco environment in shape. Most of the best tools are free or Open Source, and this presentation will guide you through the steps to improve the performance of your system.
In this session, we'll discuss architectural, design and tuning best practices for building rock solid and scalable Alfresco Solutions. We'll cover the typical use cases for highly scalable Alfresco solutions, like massive injection and high concurrency, also introducing 3.3 and 3.4 Transfer / Replication services for building complex high availability enterprise architectures.
Important work-arounds for making ASS multi-lingual (Axel Faust)
Slides from my Alfresco DevCon 2018 Lightning Talk (5 min, 15 s per main slide, auto-advancing) about the Alfresco Search Services product, its current limitations with regard to usage in an organisation with mixed user locales, and the work-arounds (as well as the long-term solution) for making it work nonetheless. The recording of the Lightning Talk session will be uploaded to the Alfresco YouTube channel in the coming days or weeks.
This document discusses backup and disaster recovery strategies for Alfresco. It recommends scheduling regular backups of the Solr and Lucene indexes, database, and file system. Full backups should be done periodically, with incremental backups in between. Backups can be cold (system offline), warm (some services offline), or hot (live system). Restores involve recovering the indexes, database, files and configuration. Planning includes defining recovery objectives for data loss and downtime.
Moving From Actions & Behaviors to Microservices (Jeff Potts)
My DevCon 2019 talk discusses how to make it easier to integrate Alfresco with other systems using an event-based approach. Two real world examples are discussed and demonstrated. The first is about reporting against Alfresco metadata. The second is about enriching metadata by running content through a Natural Language Processing (NLP) model. Both solutions work by listening to generic events generated by Alfresco and placed on an Apache Kafka queue. For the reporting example, the Spring Boot consumer subscribes to Kafka events, then fetches metadata via CMIS and indexes that into Elasticsearch. For the NLP example, a separate Spring Boot consumer subscribes to the same events, but in this case, fetches the content, extracts text using Apache Tika, runs the text through Apache OpenNLP, then writes back extracted entities to Alfresco via CMIS. These are relatively simple examples, but illustrate how a de-coupled, asynchronous, event-based approach can make integrating Alfresco with other systems easier.
This document provides an overview of Storage Foundation and Alfresco solutions. It discusses hardware storage concepts including drive types, interfaces, and RAID. It also covers Alfresco storage-related solutions such as the S3 connector, XAM connector, content store selector, and replication capabilities. Partnership solutions from Xenit, Star Storage, and community solutions are also mentioned. The document concludes with best practices around content store, indexes, logs, and backup/recovery.
The document discusses Alfresco security best practices. It covers topics such as hardening the network and operating system, implementing firewall rules, assessing vulnerabilities, and compliance with standards. Best practices for the Alfresco implementation include staying current with patches, enforcing strong permissions, and deleting content when it is removed. The document provides an overview of security considerations for the Alfresco architecture, mobile access, and other deployment aspects.
Alfresco DevCon 2019: Encryption at-rest and in-transit (Toni de la Fuente)
To guarantee data integrity and confidentiality in Alfresco, we need to implement authentication and encryption at-rest and in-transit. With the proliferation of microservices, orchestration platforms, complex service topologies and multiple programming languages, there is a demand for new ways to manage service-to-service communication, in some cases without the application needing to be aware. In addition, compliance requirements around encryption and authentication require new ways to handle them. This talk will review encryption at-rest solutions for ADBP and will also discuss solutions for encryption and authentication between services. It will serve as an introduction to service mesh and TLS/mTLS. We will see a demo of ACS running with Istio over EKS, along with tools like Weave Scope, Kiali, Jaeger, Grafana, Service Graph and Prometheus.
The document introduces the ELK stack, which consists of Elasticsearch, Logstash, Kibana, and Beats. Beats ship log and operational data to Elasticsearch. Logstash ingests, transforms, and sends data to Elasticsearch. Elasticsearch stores and indexes the data. Kibana allows users to visualize and interact with data stored in Elasticsearch. The document provides descriptions of each component and their roles. It also includes configuration examples and demonstrates how to access Elasticsearch via REST.
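As an illustrative sketch of how those ELK components fit together, a minimal Logstash pipeline might look like the following (the log path, grok pattern and Elasticsearch host are assumptions to adapt to your environment):

```
# Minimal Logstash pipeline: read a log file, parse each line, send to Elasticsearch.
input {
  file {
    path => "/var/log/alfresco/alfresco.log"   # assumed log location
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]          # assumed local Elasticsearch
    index => "alfresco-logs-%{+YYYY.MM.dd}"     # daily index, browsable from Kibana
  }
}
```

Beats can replace the `file` input when logs are shipped from remote hosts.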
Alfresco DevCon 2019 (Edinburgh)
"Transforming the Transformers" for Alfresco Content Services (ACS) 6.1 & beyond
https://community.alfresco.com/community/ecm/blog/2019/02/07/alfresco-transform-service-new-with-acs-61
Alfresco provides various content transformation options across the Digital Business Platform (DBP). In this talk, we will explore the new independently-scalable Alfresco Transform Service. This enables a new option for transforms to be asynchronously off-loaded by Alfresco Content Services (ACS).
https://devcon.alfresco.com/speaker/jan-vonka/
The document summarizes Jan Vonka's presentation on Alfresco's exciting new REST APIs. It provides an overview of the REST API architecture and components. It highlights many new features in the Content Services 5.2 and Process Services 1.6 APIs, including new endpoints, operations, and enhanced APIs for sites and people. It demonstrates using the APIs via Postman. It discusses the API documentation and upcoming features such as exposing more services and improvements to the REST framework.
This document discusses solutions for generating unique identifiers at high speeds. It compares auto-increment, UUID, hash, and Snowflake approaches. Snowflake is highlighted as able to generate up to 4 billion IDs per second while maintaining order, supporting distribution and sharding, and providing security benefits. The document outlines how Snowflake works by combining a timestamp, node ID determined via file, random number, IP address or ZooKeeper, and an increasing sequence number stored in Redis to generate the IDs at high speeds with strong ordering properties.
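The Snowflake scheme described above can be sketched in a few lines. This is a simplified illustration, not the production algorithm: the bit widths follow the commonly cited layout (41-bit timestamp, 10-bit node id, 12-bit sequence), while the epoch is an arbitrary assumed value.

```javascript
// Simplified Snowflake-style ID generator: 41-bit timestamp offset,
// 10-bit node id, 12-bit per-millisecond sequence. The epoch is an
// arbitrary illustrative choice, not Twitter's.
const EPOCH = 1600000000000n; // custom epoch in ms (assumption)

function makeGenerator(nodeId) {
  let lastTs = -1n;
  let seq = 0n;
  return function nextId() {
    let ts = BigInt(Date.now());
    if (ts === lastTs) {
      seq = (seq + 1n) & 4095n;            // wrap the 12-bit sequence
      if (seq === 0n) {                     // sequence exhausted this ms:
        while (BigInt(Date.now()) <= lastTs) { /* spin until next ms */ }
        ts = BigInt(Date.now());
      }
    } else {
      seq = 0n;                             // new millisecond, reset sequence
    }
    lastTs = ts;
    return ((ts - EPOCH) << 22n) | (BigInt(nodeId) << 12n) | seq;
  };
}

const nextId = makeGenerator(1);
console.log(nextId() < nextId()); // ids are strictly increasing
```

Because the timestamp occupies the high bits, IDs sort chronologically, which is the ordering property the document highlights.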
This document provides an introduction to Apache Solr, including:
- Solr is an open-source search engine and REST API built on Lucene for indexing and searching documents.
- Solr architecture includes nodes, cores, schemas, and concepts like SolrCloud which uses Zookeeper for coordination across collections and shards.
- Documents are indexed, queried, updated, and deleted via the REST API or client libraries. Queries support various types including range, date, boolean, and proximity queries.
- Installation and configuration of standalone Solr involves downloading, extracting, and running bin/solr scripts to start the server and create cores.
- Resources for learning more include tutorials, documentation, and integration options.
The document discusses new features and improvements in MySQL 5.6, including significant performance gains over MySQL 5.5. Key highlights include improved InnoDB performance through features like online DDL and buffer pool pre-loading, up to 151-234% performance gains on benchmarks. Other enhancements cover full-text search in InnoDB, NoSQL support through memcached integration, replication improvements with GTIDs and crash-safe slaves, and strengthened security with audit logging and password policies.
SOUG Day Oracle 21c New Security Features (Stefan Oehrli)
With the Innovation Release 21c, Oracle has introduced several new security features. These include small improvements that make DB operation more secure and easier, as well as completely new concepts like DB Nest, which introduces a new approach to implementing DB security in multitenant databases.
In this deck from the 2015 PBS Works User Group, Sarah Storms from Lockheed Martin presents: A New Multi-Level Security Initiative.
"Historically cyber security in HPC has been limited to detecting intrusions rather than designing security from the beginning in a holistic, layered approach to protect the system. SELinux has provided the needed framework to address cyber security issues for a decade, but the lack of an HPC and data analysis eco-system based on SELinux and the perception that the resulting configuration is “hard” to use has prevented SELinux configurations from being widely accepted. This presentation will discuss the eco-system that has been developed and certified, debunk the “hard” perception, and illustrate approaches for both government and commercial applications. The presentation includes discussions on SELinux architecture and features, Altair PBS Professional Queuing System, Scale-out Lustre Storage, Applications Performance on SELinux (Vectorization and Parallelization), Relational Databases, and Security Functions (Auditing and other Security Administration actions)."
Learn more: http://www.pbsworks.com/pbsug/2015/agenda.aspx
Watch the video presentation: https://www.youtube.com/watch?v=kBNKmGCg4ho
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
DataStax | Best Practices for Securing DataStax Enterprise (Matt Kennedy) | C...DataStax
This talk will review the advanced security features in DataStax Enterprise and discuss best practices for secure deployments. In particular, topics reviewed will cover: Authentication with Kerberos & LDAP/Active Directory, Role-based Authorization and LDAP role assignment, Auditing, Securing network communication, Encrypting data files and using the Key-Management Interoperability Protocol (KMIP) for secure off-host key management. The talk will also suggest strategies for addressing security needs not met directly by the built-in features of the database such as how to address applications that require Attribute Based Access Control (ABAC).
About the Speaker
Matt Kennedy Sr. Product Manager, DataStax
Matt Kennedy works at DataStax as the product manager for DataStax Enterprise Core. Matt has been a Cassandra user and occasional contributor since version 0.7 and was named a Cassandra MVP in 2013 shortly before joining DataStax. Unlike Cassandra, Matt is not partition tolerant.
The document summarizes an upcoming talk on Azure Site Recovery and business continuity by Janaka Rangama of Empired Ltd. on August 9th at Index Consultants in Melbourne. It provides details on the speaker, topic, date, location and includes links to recent Microsoft Azure announcements on new government datacenter regions, Azure Stack ordering, and Azure Batch Rendering in public preview.
The document provides an overview of enhancements in CICS Transaction Server V5.1 focused on improved scalability. Key areas addressed include greater use of 64-bit storage to relieve virtual storage constraints, improved threadsafe support to reduce TCB switching and increase workload capacity, and doubling the maximum task limit. Monitoring and statistics were also enhanced to provide deeper insight into performance and capacity to help optimize hardware and software configurations.
The document provides an overview of enhancements in CICS Transaction Server V5.1 to improve scalability. Key points discussed include:
1) CICS V5.1 includes improvements to horizontal and vertical scalability through enhancements such as improved support for threadsafe programming, greater use of 64-bit storage, and increased maximum task limits.
2) Specific scalability enhancements discussed include open transaction environment improvements to reduce TCB switching; virtual storage constraint relief to reduce pressure on below-the-line storage; and increased maximum task limits.
3) Instrumentation and monitoring enhancements provide additional performance metrics and statistics to help understand system load and potential bottlenecks.
The current trends to work in Agile and DevOps are challenging for database developers. Source control is a standard for non-database code but it’s a challenge for databases. This talk has an ambition to change that situation and help developers and DBA take over control of source code and data.
The document discusses benchmarking the performance of Apache Solr. It describes testing the indexing performance of SolrCloud clusters of varying sizes. The results show that indexing performance scales nearly linearly as nodes are added. It also discusses using the Solr Scale Toolkit, which is a set of tools for deploying, managing, and benchmarking SolrCloud clusters. Future work mentioned includes benchmarking mixed workloads and integrating chaos monkey tests.
The document discusses steps taken to optimize a Magento stack for performance, scalability, and high availability. Key changes included removing unnecessary modules, adding optimized community modules, improving database and caching performance, optimizing indexers, implementing a Redis cache backend, handling long tasks asynchronously, caching blocks and layouts efficiently, optimizing product and navigation blocks, adding cache locking, and deploying the infrastructure on an autoscaling architecture with services like Galera, Varnish, and Elasticsearch. The goal was to make the core lightweight, improve scaling capabilities, and ensure a self-healing and highly available Magento deployment.
Cloudflare and Drupal - fighting bots and traffic peaksŁukasz Klimek
This document discusses using Cloudflare to improve the performance, security, and reliability of Drupal websites. It outlines the problems Drupal sites often face like spam, traffic peaks, and complex infrastructure needs. Cloudflare is presented as a solution by providing a content delivery network, web application firewall, code optimizations and other features. The document reviews Cloudflare's specific capabilities and provides guidance on preparing a Drupal site for deployment with Cloudflare, including cache invalidation strategies and modules to integrate the two platforms. Areas for future work by the Drupal community are also identified.
KP Partners: DataStax and Analytics Implementation MethodologyDataStax Academy
Apache Cassandra is the leading distributed database in use at thousands of sites with the world’s most demanding scalability and availability requirements. Apache Spark is a distributed data analytics computing framework that has gained a lot of traction in processing large amounts of data in an efficient and user-friendly manner. The joining of both provides a powerful combination of real-time data collection with analytics. After a brief overview of Cassandra and Spark, this class will dive into various aspects of the integration.
The document discusses MySQL 5.6 replication features including:
- Multi-threaded replication which allows parallel application of transactions to different databases for increased slave throughput.
- Binary log group commit which increases master performance by committing multiple transactions as a group to the binary log.
- Optimized row-based replication which reduces binary log size and network bandwidth by only replicating changed row elements.
- Global transaction identifiers which simplify tracking replication across clusters and identifying the most up-to-date slave for failover.
- Crash-safe slaves which store replication metadata in tables, allowing automatic recovery of slaves and binary logs after failures.
The document discusses upcoming changes and new features in MySQL 5.7. Key points include:
- MySQL 5.7 development has focused on performance, scalability, security and refactoring code.
- New features include online DDL support for additional DDL statements, InnoDB support for spatial data types, and cost information added to EXPLAIN output.
- Benchmarks show MySQL 5.7 providing significantly higher performance than previous versions, with a peak of 645,000 queries/second on some workloads.
A duplicate (clone or snapshot) database is useful for a variety of purposes, most of which involve testing &
upgrade
• You can perform the following tasks in a duplicate database:
• Test backup and recovery procedures
• Test an upgrade to a new release of Oracle Database
• Test the effect of applications on database performance
• Create a standby database (Dataguard) with DG Broker
• Leverage on Transient Logical Standby to perform an upgrade
• Generate reports
Pythian is a global leader in database administration and consulting services. The document discusses the speaker's first 100 days of experience with an Oracle Exadata database machine. It provides an overview of Exadata components and features like Hybrid Columnar Compression and Smart Scan, which offloads processing from database servers to storage cells.
2. 22
Discovering the 2 in Search Services 2.0
Tech Talk Live
• Solr Core and Solr Schema
• Security, Performance and Precision
• Enterprise Enhancements
• One more thing...
• Q&A
14th October 2020
4. 4
Solr Content Store Removal
[Diagram: the ACS Repository (DB + Content Store) is indexed by Search Services 1.4, which keeps both a Solr Index and its own Content Store]
COMMUNITY
5. 5
Solr Content Store Removal
[Diagram: Search Services 1.4 keeps a Solr Index plus its own Content Store; in Search Services 2.0 the Solr-side Content Store is removed and only the Solr Index remains alongside the ACS Repository's DB and Content Store]
COMMUNITY
6. 6
Solr Content Store Removal Benefits
Removed custom code
9,311 lines of code removed
https://github.com/Alfresco/SearchServices/blob/master/search-services/alfresco-search/doc/architecture/solr-content-store-removal/00001-solr-content-store-removal.md
Helps leverage built-in Solr features
It's now possible to make use of built-in Solr features
(e.g. replication and backups)
Reduces I/O work
Particularly in systems with replication
Reduced disk usage
Search Services Version                1.4       2.0
Index Size (bytes per doc)             1         3,000
Content Store Size (bytes per doc)     40,000    0
COMMUNITY
7. 7
Solr Content Store Removal Reindex
• Moving data from the content store to the index requires a reindex
Reindexing with sharding: Demo later
For more information see:
https://github.com/aborroy/solr-sharding-reindex
For more information about reindexing see:
https://www.alfresco.com/events/webinars/tech-talk-live-reindexing-large-repositories
COMMUNITY
TTL #120
8. 8
Solr Content Store Removal Impact
● More efficient replication, as we're now using the default Solr mechanism
○ Docker Compose example available at https://github.com/aborroy/search-services-replication
● Now using atomic updates instead of removing and recreating documents
○ To achieve this we enabled the SOLR Transaction Log
● Review your backup and restore procedures, as the folder $SOLR_HOME/contentstore is no longer created

$ du -h /opt/alfresco-search-services/data/alfresco
4.7M ./index
8.5M ./tlog
4.0K ./snapshot_metadata
COMMUNITY
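One of the newly usable built-in features is Solr's replication handler backup; an illustrative request (host, core name, and backup location are assumptions for this sketch, not values from the deck):

```
http://localhost:8983/solr/alfresco/replication?command=backup&location=/opt/backups&numberToKeep=3
```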
FTSSTATUS
9. 9
Full information for a Document can still be recovered by using Solr queries:
Solr Content Store Removal Impact
http://127.0.0.1:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q=DBID:563
COMMUNITY
10. 10
New Destructured Date Fields
Solr schema simplification (solrhome/core/conf/schema.xml)
Improved storage of DATE fields
quarter
day_of_month
day_of_year
day_of_week
COMMUNITY
11. 11
New fields *_unit_of_time_* can be used to build queries
Get all the documents created in 2020
SOLR FTS
N.B. CMIS is also supported, but not for this example:
● cm:created is not supported, as the cm:auditable aspect is not exposed through the CMIS protocol
New Destructured Date Fields
COMMUNITY
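As a sketch, a direct Solr query for the "created in 2020" example might look like the following; the field name here is hypothetical, just following the *_unit_of_time_* pattern above (check the generated Solr schema for the actual name in your deployment):

```
http://localhost:8983/solr/alfresco/select?q=created_unit_of_time_year:2020
```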
12. 12
Asynchronous Actions and Maintenance
[Diagram: a Search Services Administrator schedules a Retry action (t1) into the Maintenance Queue; a Commit Tracker sits between the queue and the Index]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
13. 13
Asynchronous Actions and Maintenance
[Diagram: the Administrator has now scheduled Retry (t1) and Reindex (t2) in the Maintenance Queue]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
14. 14
Asynchronous Actions and Maintenance
[Diagram: the Maintenance Queue now holds Retry (t1), Reindex (t2) and Purge (t3)]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
15. 15
Asynchronous Actions and Maintenance
[Diagram: the Maintenance Queue now holds Retry (t1), Reindex (t2), Purge (t3) and Fix (t4)]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
16. 16
Asynchronous Actions and Maintenance
[Diagram: at t5 the Commit Tracker dequeues the scheduled work (Retry, Reindex, Purge, Fix) from the Maintenance Queue]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
17. 17
Asynchronous Actions and Maintenance
[Diagram: at t6 the Commit Tracker performs the index management, applying the dequeued work to the Index]
https://docs.alfresco.com/search-community/concepts/solr-admin-asynchronous-actions.html
COMMUNITY
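The maintenance actions above are queued through the Solr core admin endpoint; as an illustration (host, port and the core parameter follow the examples used elsewhere in this deck), scheduling a RETRY on the alfresco core:

```
http://localhost:8983/solr/admin/cores?action=RETRY&core=alfresco
```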
18. 18
The FIX tool finds transactions and ACL change sets which are mismatched between the DB and Solr
It adds them to be reindexed on the next maintenance cycle performed by the CommitTracker
FIX Tool
{
  "responseHeader": {
    "QTime": 1,
    "status": 0
  },
  "action": {
    "status": "scheduled",
    "txToReindex": [1, 2],
    "aclChangeSetToReindex": [3, 4]
  }
}
Old Response Shape
● “status” is always scheduled
● Only two error categories
● Each category contains the corresponding transaction identifiers
COMMUNITY
19. 19
{
  "responseHeader": {
    // As before
  },
  "action": {
    "dryRun": true,
    "status": "notScheduled",
    "txToReindex": {
      "txInIndexNotInDb": {
        "192": 282, // Tx 192 is associated to 282 nodes
        "827": 99   // Tx 827 is associated to 99 nodes
      },
      "duplicatedTxInIndex": {...},
      "missingTxInIndex": {...}
    },
    "aclChangeSetToReindex": {
      // Very similar to txToReindex, but for ACLs
    }
  }
}
FIX Tool New Features
● dryRun (defaults to true): If true, the output report is generated but no reindex work is scheduled.
● fromTxCommitTime: The lower bound (the minimum transaction commit time) of the target transactions that you want to check or fix.
● toTxCommitTime: The upper bound (the maximum transaction commit time) of the target transactions that you want to check or fix.
● maxScheduledTransactions: The maximum number of transactions that will be scheduled. The default is 500, but this can be overridden in solrcore.properties.
COMMUNITY
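A dry run is useful for sizing the reindex work before actually scheduling it. As a minimal sketch (not part of the product tooling), tallying the affected nodes from a report with the shape shown on this slide; the transaction ids and node counts are illustrative:

```python
import json

# Sample FIX dry-run response following the new (2.0) report shape shown
# on this slide; the transaction ids and node counts are illustrative.
response = json.loads("""
{
  "responseHeader": {"QTime": 1, "status": 0},
  "action": {
    "dryRun": true,
    "status": "notScheduled",
    "txToReindex": {
      "txInIndexNotInDb": {"192": 282, "827": 99},
      "duplicatedTxInIndex": {},
      "missingTxInIndex": {}
    }
  }
}
""")

def nodes_to_reindex(report):
    """Sum node counts across every error category under txToReindex."""
    categories = report["action"]["txToReindex"]
    return sum(n for category in categories.values() for n in category.values())

print(nodes_to_reindex(response))  # 381
```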
20. 20
Enable/Disable Indexing
Motivation: Disable indexing in order to cancel a huge maintenance load
• Enable / disable indexing on a specific core or on all master/standalone cores
• MetadataTracker, ContentTracker, CascadeTracker, AclTracker are affected
• CommitTracker, ModelTracker, ShardStatePublisher are not affected
• When disabled, some admin endpoints (e.g. PURGE, INDEX) won't execute
• When disabled, the FIX endpoint will be forced to run in dryRun mode
• If indexing is disabled in the middle of a tracking process, trackers will be set to rollback mode
• Commands are idempotent
• For more information see https://issues.alfresco.com/jira/browse/SEARCH-2330
Examples:
Disable indexing on all master/standalone cores
http://localhost:8983/solr/admin/cores?action=disable-indexing
Disable indexing on a specific (master or standalone) core
http://localhost:8983/solr/admin/cores?action=disable-indexing&core=alfresco
COMMUNITY
21. 21
FIX Tool Demo
Postman Collection containing the example requests used in the demo
https://www.getpostman.com/collections/4c2fbe407a0134729546
COMMUNITY
23. 23
● Communication between the Repository and SOLR (for searching and indexing) may be protected using the mTLS protocol with client authentication [1]
● A new password handling mechanism has been introduced as of ASS 2.0 / ACS 6.2.N [2]:
○ Switches from storing configuration in property files with passwords in plain text to JVM system properties
○ The old way of configuring should still work for backwards compatibility, but is discouraged for security reasons
[2] ACS 6.2.N is not released yet!
New mTLS Configuration
[1] https://hub.alfresco.com/t5/alfresco-content-services-blog/alfresco-6-1-is-coming-with-mutual-tls-authentication-by-default/ba-p/287905
COMMUNITY
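An illustrative sketch of the new style, passing keystore passwords as JVM system properties at Solr startup; the property names and values here are examples, so check the Search Services installation docs for the exact names in your version:

```shell
# Example only: supply keystore passwords via JVM system properties
# instead of the plain-text *-passwords.properties files.
export JAVA_TOOL_OPTIONS="-Dssl-keystore.password=changeit -Dssl-truststore.password=changeit"
./bin/solr start
```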
24. 24
alfresco-ssl-generator: command-line tool to generate self-signed certificates (classic and current formats)
https://github.com/Alfresco/alfresco-ssl-generator
alfresco-solr-docker-mtls: sample configuration (repo using classic and solr using current)
https://github.com/aborroy/alfresco-solr-docker-mtls
Additional resources
Installing and configuring Search Services with mutual TLS using the
distribution zip
https://docs.alfresco.com/search-community/tasks/solr-install.html
Alfresco mTLS Configuration Deep Dive
https://hub.alfresco.com/t5/alfresco-content-services-blog/alfresco-mtls-configuration-deep-dive/ba-p/296422
New mTLS Configuration
COMMUNITY
26. 26
Trackers Reworking
Increasing the Transaction Batch Size for nodes and ACLs improves performance until the optimal value for your deployment is reached; beyond that point, larger batch sizes bring no further performance change
alfresco.transactionDocsBatchSize (default 2000)
alfresco.changeSetAclsBatchSize (default 500)
Increasing the Node Batch Size can improve performance up to an optimal point for your deployment; beyond that point, larger batch sizes penalise performance
alfresco.nodeBatchSize (default 100)
alfresco.cascade.tracker.nodeBatchSize (default 10)
alfresco.contentUpdateBatchSize (default 2000)
alfresco.aclBatchSize (default 100)
Increasing the maximum number of parallel threads improves performance until the maximum for the deployment is reached
alfresco.metadata.tracker.maxParallelism (default 32)
alfresco.cascade.tracker.maxParallelism (default 32)
alfresco.content.tracker.maxParallelism (default 32)
alfresco.acl.tracker.maxParallelism (default 32)
[Chart: execution time vs. parameter size for the three parameter groups above, with hotspots marked]
solrcore.properties
COMMUNITY
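Putting the three groups together, a tuning pass in solrcore.properties might look like the sketch below; the values are illustrative starting points, not recommendations, and the optimal figures must be found by benchmarking your own deployment:

```
# Batch sizes: raise gradually from the defaults, measuring each step
alfresco.transactionDocsBatchSize=2000
alfresco.nodeBatchSize=200
# Parallelism: bounded by the CPU cores available to Solr
alfresco.metadata.tracker.maxParallelism=32
alfresco.content.tracker.maxParallelism=32
```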
27. 27
FTS operator = has changed behaviour in 2.0.0
● Detailed information is available in https://hub.alfresco.com/t5/alfresco-content-services-blog/exact-term-queries-in-search-services-2-0/ba-p/302200
● Thanks @AFaust for noticing this issue: https://issues.alfresco.com/jira/browse/SEARCH-2461
Exact Search
COMMUNITY
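For illustration, the two query forms in FTS (the field and value here are made up):

```
cm:title:budget     <- tokenized search: matches variants produced by the analyzer
=cm:title:budget    <- exact term search with the = operator, affected by the 2.0.0 change
```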
29. 29
In previous releases, the Shard State was communicated to the repository as part of the retrieval of information by the Metadata Tracker.
That could cause problems when the Metadata Tracker cycle takes a long time to execute.
A new Shard State Publisher tracker has been added to report the state to the repository on a regular basis.
The new configuration for this tracker includes the following property:
alfresco.nodestate.tracker.cron
If this property is not specified, the default cron is applied:
alfresco.cron=0/10 * * * * ? *
ShardState Tracker
solrcore.properties
ENTERPRISE
Sharding
30.
DB_ID_RANGE Sharding
• When a shard goes down, search can now be restored more quickly
For more details, see MNT-21591
[Diagram: two ACS nodes querying SOLR Shard 1 and SOLR Shard 2, both using DB_ID_RANGE, each shard with Replica 1 and Replica 2]
ACS (alfresco-global.properties):
search.solrShardRegistry.shardInstanceTimeoutInSeconds = 30
(Historically this should be set to more like 300 seconds)
InsightEngine (solrcore.properties):
alfresco.nodestate.tracker.cron=0/10 * * * * ? *
This should be more frequent than the value set in ACS
ENTERPRISE
Sharding
31.
Solr Sharding Reindex
When re-indexing a live Alfresco Repository that uses SOLR Sharding with
solr.useDynamicShardRegistration enabled, the new SOLR Shard Indexer services should be
configured with the Alfresco NodeState Tracker turned off.
With this approach, the SOLR Indexer services are not registered with the live Alfresco Repository as
available SOLR Shards, so the live system can operate normally.
Sharding Reindex (Demo)
https://github.com/aborroy/solr-sharding-reindex
This configuration uses two Docker Compose templates:
● living: an ACS server running 2 SOLR Shards configured with the DB_ID
method and Alfresco Search Services 1.4.3
● indexer: an indexer service running 2 SOLR Shards configured with the
DB_ID method and Alfresco Search Services 2.0.0.1
ENTERPRISE
Sharding
32.
● Improved SOLR JDBC support
● Added support for Excel and Tableau to Alfresco Search and Insight Engine using an ODBC driver
provided by a third-party company, CData
○ Download the driver from https://www.cdata.com/drivers/alfresco/
BI Tool Support
ENTERPRISE
BI Tools
33.
Improvements to SQL Support (JDBC & ODBC)
• Support for Date Functions in SELECT Clause
• Support for Date Functions in WHERE Clause
• Support for Date Functions in GROUP BY Clause
• Support for SQL avg(field) with multiple GROUP BY
• Support for Date Functions in ORDER BY Clause
• Support SQL TIMESTAMP format
• Support for CAST AS TIMESTAMP function
• Support for QUARTER function
• Support for DAYOFMONTH, DAYOFWEEK, DAYOFYEAR functions
• Support for TIMESTAMPADD(timeUnit, integer, datetime) function
ENTERPRISE
BI Tools
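A sketch of a query exercising several of these functions follows. The alfresco virtual table and the cm_created / cm_content_size field names are assumptions based on the default Insight Engine SQL schema; adjust them to your deployment.

```sql
-- Average content size per quarter for documents created since 2020
SELECT QUARTER(cm_created) AS q,
       AVG(cm_content_size) AS avg_size
FROM alfresco
WHERE cm_created > CAST('2020-01-01T00:00:00Z' AS TIMESTAMP)
GROUP BY QUARTER(cm_created)
ORDER BY q
```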
34.
JDBC Driver with DBVisualizer (Demo)
ENTERPRISE
BI Tools
>> A working JDBC client sample is available at https://github.com/aborroy/solr-jdbc-client
35.
CDATA ODBC installation
The driver is simple to install on your machine using the steps on the following page:
http://cdn.cdata.com/help/SJF/odbc/
Installation and setup is a simple two-step process, performed on the end user's machine:
1. Install the driver
2. Configure the ODBC data source
Configuration is fully documented by CData.
ENTERPRISE
BI Tools
36.
ODBC for Tableau
• Can connect to the relevant data source and display the results in a table drawn from the source.
• Results can be displayed by using the table directly or by entering a custom SQL query that returns exactly
what the user wants to see.
• Tableau consists of worksheets where views of the data can be built from fields and graphs.
• Each worksheet builds the results of one query through the use of the fields.
• Results can be visualised as pie charts, bar charts, stacked bar charts, continuous line graphs and many more.
• Results can be refined by applying filters within Tableau on the selected fields.
• Tableau can create dashboards that gather all of the related queries from the individual sheets in one place.
• Can preview the results on different devices such as desktop, tablet and more.
ENTERPRISE
BI Tools
37.
ODBC for Excel
• Simply start by doing a data dump into Excel.
• Connecting to the ODBC source works much as in Tableau: connect and view all the results from the
table, or provide a custom SQL query.
• Excel gives a preview of the results before displaying them on a separate sheet.
• The data can be filtered from the preview by clicking the 'Transform' button and then narrowing
it down to what you want.
• Native Excel functionality can be used on the chosen dataset without relying heavily on SQL, in
comparison to using Zeppelin.
ENTERPRISE
BI Tools
38.
Supported Stack
Server OS
• Linux (Red Hat Enterprise v7.6 x64)
• CentOS 7 x64
• Ubuntu 18.04
• SUSE 12.0 SP1 x64
• Windows Server 2012 R2 (x64)
• Windows Server 2016
Solr
• Solr 6.6.5
Java
• OpenJDK 11.0.8
• Oracle JDK 11.0.1
Alfresco Content Services
• Alfresco Enterprise Edition (ACS) 6.2
• Alfresco Community Edition 201911 GA
COMMUNITY
ENTERPRISE
Release notes
https://hub.alfresco.com/t5/alfresco-content-services-blog/search-services-2-0-0-release/ba-p/301308
39.
shared.properties
2.0.0.0
• Suggestable Properties and Cross Locale fields enabled by default
• This may have an impact on the SOLR index
• Spellcheck and Tokenisation work by default
2.0.0.1
• Settings changed back to commented out by default, as in previous versions
COMMUNITY
ENTERPRISE
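For reference, the settings in question look roughly like the fragment below. The property names come from the stock shared.properties shipped with Search Services; treat the exact list and values as an assumption. In 2.0.0.0 they shipped uncommented; 2.0.0.1 reverted them to commented out, as in earlier releases:

```properties
# Suggestable properties (commented out again by default in 2.0.0.1)
#alfresco.suggestable.property.0={http://www.alfresco.org/model/content/1.0}name
#alfresco.suggestable.property.1={http://www.alfresco.org/model/content/1.0}title

# Cross-locale datatypes, required for exact term search on those types
#alfresco.cross.locale.datatype.0={http://www.alfresco.org/model/dictionary/1.0}text
#alfresco.cross.locale.datatype.1={http://www.alfresco.org/model/dictionary/1.0}content
```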
42.
Index Checker Tool
https://github.com/AlfrescoLabs/index-checker
Simple report
$ java -jar target/index-checker-0.0.1-SNAPSHOT.jar --report.detailed=false --run.fix.actions=false
Count SOLR documents = 814
Count DB nodes = 815
The database contains 2 nodes more than SOLR Index for {http://www.alfresco.org/model/content/1.0}category
SOLR indexed 1 nodes more than the existing in database for {http://www.alfresco.org/model/content/1.0}content
Count SOLR permissions = 58
Count DB permissions = 58
>> Available from Search Services 1.4.3
43.
Index Checker Tool
Detailed report
$ java -jar target/index-checker-0.0.1-SNAPSHOT.jar --report.detailed=true --run.fix.actions=false
Count SOLR documents = 814
Count DB nodes = 815
The database contains 2 nodes more than SOLR Index for {http://www.alfresco.org/model/content/1.0}category
TYPE {http://www.alfresco.org/model/content/1.0}category: DbIds present in DB but missed in SOLR [212, 213]
SOLR indexed 1 nodes more than the existing in database for {http://www.alfresco.org/model/content/1.0}content
TYPE {http://www.alfresco.org/model/content/1.0}content: DbIds present in SOLR but missed in DB [584]
Count SOLR permissions = 58
Count DB permissions = 58
(Checks are performed in batches of 1,000 elements)
44.
Fix actions
$ java -jar target/index-checker-0.0.1-SNAPSHOT.jar --report.detailed=true --run.fix.actions=true
Count SOLR documents = 814
Count DB nodes = 815
...
No Database Rows Were Harmed in the Fixing of This Solr Index
$ java -jar target/index-checker-0.0.1-SNAPSHOT.jar --report.detailed=false --run.fix.actions=false
Count SOLR documents = 815
Count DB nodes = 815
Index Checker Tool
>> Watch the live demo at https://youtu.be/YU-WyNgCH2U