#SummitNow
Super Size Your Search
6th November 2013
Piergiorgio Lucidi (Sourcesense)
Fran Alvarez (Zaizi)
Piergiorgio Lucidi
• Open Source ECM Specialist at Sourcesense
• Alfresco Certified Trainer / Engineer
• Alfresco Wiki Gardener / Community Star
• Alfresco forum supporter
• Global Moderator of the Italian forum
• Author and Technical Reviewer at Packt
• PMC Member and Mentor at ASF
• Project Leader in the JBoss Community
Overview
How to build and manage your search server:
1. Scenario
2. Introducing Apache ManifoldCF
3. Zaizi Integrated Search Solution
Scenario
An overview of the typical complex search architecture
Scenario - Alfresco limitations
Alfresco supports these search engines:
• Apache Lucene (embedded)
• Apache Solr (provided by Alfresco)
Development is needed if other repositories must be involved.
Every other approach must be implemented (ScheduledActions, WebScripts, etc.)
Scenario – Embedded
Simple search architecture
Alfresco is the only repository involved in the architecture and uses the embedded search engine:
• the repository must also take care of the indexes, including index transactions
[Diagram: FrontEnd applications, Alfresco, Apache Lucene, Indexes]
Scenario – Embedded - Cluster
Not easy to scale out with embedded Lucene
1. Every cluster node must have its own search indexes
2. The cluster must synchronize the indexes
[Diagram: two Alfresco nodes, each with Apache Lucene and its own Indexes; JGroups]
Scenario – Simple Architecture
Simple search architecture
Alfresco is the only repository involved in the architecture, with an external search server
1. The search server can be used to publish content in the front-end architecture
2. The repository stays in the logical back end
[Diagram: FrontEnd applications, Search Engine, Indexes, Alfresco]
Scenario – Publish with search
A search engine can be used for:
• advanced management of search indexes
• scaling out
• executing complex searches on content
• publishing content in the front-end architecture
Scenario – Publish with search
Publish with search architecture
Alfresco is the only repository involved in the architecture, with an external search server
1. The search server can be used to publish content (HTML) in the front-end architecture
2. The repository stays in the logical back end
[Diagram: Alfresco, Lucene / Solr, Search Engine, Indexes and FrontEnd applications, split into BackEnd and FrontEnd]
Scenario – Complex Architecture
1. Alfresco is only one of the platforms that must be involved in your search architecture
2. You don’t want to increase the development effort
3. You just want something to configure
Scenario – Complex Architecture
Architecture with different ECM systems
Alfresco is one of the content platforms that must be involved in the indexing process
[Diagram: Alfresco, SharePoint, FileNet, CMIS, JIRA, Google Drive and DropBox, with Search Engine and Indexes]
Introducing Apache ManifoldCF
Apache ManifoldCF - History
MetaCarta granted the ManifoldCF code base to the Apache Software Foundation in December 2009.
The MetaCarta effort represented more than five years of successful development and testing in multiple, challenging enterprise environments.
The project graduated as an Apache Top Level Project in July 2012.
Apache ManifoldCF – What is it?
Open Source crawler
• crawling model (add, change, delete)
• schedules jobs to create indexes
• gets content from repositories
• pushes content to search servers
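The bullets above can be read as a small pipeline: read from a repository, decide what was added, changed or deleted, and push the result to a search server. The sketch below is a hypothetical in-memory illustration of that crawling model, not ManifoldCF code; `Repository`, `SearchServer` and `run_job` are names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical content source: maps document id -> (version, text)."""
    documents: dict = field(default_factory=dict)


@dataclass
class SearchServer:
    """Hypothetical search server: holds whatever the job pushes to it."""
    index: dict = field(default_factory=dict)


def run_job(repo, server, seen):
    """Run one crawl pass.

    `seen` maps document id -> version recorded by the previous pass.
    Returns the new `seen` map and a list of (action, doc_id) tuples.
    """
    actions = []
    for doc_id, (version, text) in repo.documents.items():
        if doc_id not in seen:
            actions.append(("add", doc_id))
            server.index[doc_id] = text
        elif seen[doc_id] != version:
            actions.append(("change", doc_id))
            server.index[doc_id] = text
    # Anything seen before but missing now has been deleted at the source.
    for doc_id in sorted(seen.keys() - repo.documents.keys()):
        actions.append(("delete", doc_id))
        server.index.pop(doc_id, None)
    new_seen = {doc_id: version for doc_id, (version, _) in repo.documents.items()}
    return new_seen, actions
```

Calling `run_job` repeatedly with the `seen` map returned by the previous call plays the role of a scheduled job: each pass only touches the index for documents that actually changed.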
Apache ManifoldCF – What is it?
[Diagram: Repositories 1 to 4, Apache ManifoldCF, Search Servers 1 to 4]
Apache ManifoldCF – What is it?
Out of the box it is distributed as a webapp:
• REST API
• Authority Service
• ACL indexes
• Crawler UI
It can also be embedded in any Java application
Apache ManifoldCF – Why?
• Reliability
• Incremental
• Flexible
• Multi repositories
• Security model
• Monitoring
ManifoldCF – Why? - Reliability
Job scheduling and configuration are stored in the database, which maintains the state of every execution
[Diagram: Repositories 1 to 4, Apache ManifoldCF with Pull Agent Daemon and Database, Search Servers 1 to 4]
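Why keeping job state in a database helps reliability can be shown with a small sketch: if every status change is committed, a restarted job can ask the database what is still left to do. This is an illustration under assumptions, not ManifoldCF's schema; SQLite stands in for the real database, and the `job_state` table and function names are hypothetical.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open (or create) the hypothetical job-state table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_state ("
        "job_id TEXT, doc_id TEXT, status TEXT, "
        "PRIMARY KEY (job_id, doc_id))"
    )
    return conn


def mark(conn, job_id, doc_id, status):
    """Record a document's status; the commit is what survives a restart."""
    conn.execute(
        "INSERT OR REPLACE INTO job_state (job_id, doc_id, status) "
        "VALUES (?, ?, ?)",
        (job_id, doc_id, status),
    )
    conn.commit()


def pending(conn, job_id):
    """Return the documents a restarted job still has to process."""
    rows = conn.execute(
        "SELECT doc_id FROM job_state "
        "WHERE job_id = ? AND status != 'done' ORDER BY doc_id",
        (job_id,),
    )
    return [row[0] for row in rows]
```

With a file path instead of `:memory:`, calling `pending` after a crash returns exactly the documents that were queued but never marked `done`.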
ManifoldCF – Why? - Incremental
Content changesets are obtained from the repository API
[Diagram: Apache ManifoldCF (Pull Agent Daemon, Database) queries Repository 1 and receives complete changesets]
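Incremental crawling against a repository that exposes complete changesets can be sketched as a cursor over a change log. The example below is a simplified illustration: the list of `(sequence_number, doc_id, action)` tuples is an assumption standing in for whatever change API a repository offers, and `fetch_changes` is a hypothetical name.

```python
def fetch_changes(change_log, since):
    """Return (entries newer than `since`, new token).

    `change_log` is a list of (sequence_number, doc_id, action) tuples in
    ascending order, standing in for a repository's change API.
    """
    fresh = [entry for entry in change_log if entry[0] > since]
    # Keep the old token when nothing new arrived, so no change is skipped.
    token = fresh[-1][0] if fresh else since
    return fresh, token
```

Storing the returned token between runs means each run only processes what happened after the previous one.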
ManifoldCF – Why? - Flexible
If the repository can't supply all the changes, ManifoldCF can discover them through crawling
[Diagram: Apache ManifoldCF (Pull Agent Daemon, Database) queries the repository, receives incomplete changesets and runs change discovery]
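One way to picture change discovery is a repository that reports additions and changes but never deletions. A full listing of document ids can then be compared with the ids indexed earlier to find what disappeared. The sketch below assumes exactly that situation; the function names and the string actions are hypothetical.

```python
def discover_deletions(known_ids, listed_ids):
    """Ids indexed earlier that no longer appear in a full repository listing."""
    return set(known_ids) - set(listed_ids)


def merge_changes(reported_changes, known_ids, listed_ids):
    """Combine changes the repository reported with deletions found by crawling."""
    changes = dict(reported_changes)
    for doc_id in discover_deletions(known_ids, listed_ids):
        # A reported change wins over a discovered one for the same document.
        changes.setdefault(doc_id, "delete")
    return changes
```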
ManifoldCF – Why? – Multi repo
Jobs can retrieve content from the following repositories:
• Google Drive
• Dropbox
• HDFS
• CMIS-compliant
• Alfresco
• IBM FileNet
• EMC Documentum
• Microsoft SharePoint
• OpenText LiveLink
• Autonomy Meridio
• Memex Patriarch
• Windows Share/DFS
• Generic JDBC
• Generic Filesystem
• Generic RSS and Web
ManifoldCF – Why? – Multi repo
Jobs can ingest content into the following search servers:
• Apache Solr
• ElasticSearch
• OpenSearchServer
• MetaCarta GTS
ManifoldCF – Why? - Security
Retrieve per-content ACLs
[Diagram: Repositories 1 to 4, Apache ManifoldCF, Search Servers 1 to 4; an Authority Service backed by Authority 1 and Authority 2 supplies user access tokens, which yield user-specific search results]
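The idea of user-specific results can be sketched as a filter: each indexed document carries the access tokens from its ACL, and a hit is returned only if the user holds at least one of them. This is a deliberately simplified model with allow-style tokens only; the token strings and the `authorize` function are hypothetical and do not reproduce how any particular search server applies the filter.

```python
def authorize(results, user_tokens):
    """Keep only documents whose ACL shares at least one token with the user."""
    allowed = set(user_tokens)
    return [doc["id"] for doc in results if allowed & set(doc["acl"])]


# Tokens an authority service might return for one user (made-up values).
alice_tokens = ["group:hr", "user:alice"]

hits = [
    {"id": "d1", "acl": ["group:hr"]},
    {"id": "d2", "acl": ["group:it"]},
    {"id": "d3", "acl": ["user:alice", "group:it"]},
]
visible = authorize(hits, alice_tokens)
```

Because the tokens are stored next to the content, the check happens inside the search step instead of calling back to each repository for every hit.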
ManifoldCF – Why? – Monitoring
The Crawler UI allows you to:
• configure jobs and connectors
• monitor job execution
• monitor content ingestion
• status reports: document status, queue status
• history reports: simple history, maximum activity, maximum bandwidth, result histogram
ManifoldCF – Architecture
[Diagram: Repository, Job, Search Server, ACLs]
• Repository Connector: query to retrieve contents
• Output Connector: metadata mapping, content ingestion
• Authority Connector: retrieve content ACEs
• Job: verbal description, crawling model, scheduling
Who is using ManifoldCF?
ManifoldCF - Resources
The project is available at
http://manifoldcf.apache.org/
From this website you can access the mailing lists, documentation, and download links for binaries and source.
ManifoldCF – Resources - Book
ManifoldCF in Action
by Karl Wright
published by Manning
Karl is the original developer and the
principal committer of Apache
ManifoldCF
The book is available at
http://www.manning.com/wright
Zaizi Integrated Search Solution
Fran Alvarez
• Director of Zaizi Iberia and Lead Architect
• Alfresco Certified Engineer
• Responsible for large Alfresco architectures
• Semantic Consultant for Sensefy
• Alfresco Meetups Organizer
Alfresco + Solr Approach
Quite a good architecture
• Performance issues are solved
• Different architectures depending on business requirements
However…
• It does not cover some use cases or scenarios
• It does not leverage Cloud benefits or the latest technologies
• For huge data volumes there are other approaches
How can we solve the limitations and enhance the benefits?
Alfresco + Solr Approach
• Decouples the Search solution from Alfresco
• Allows implementing different Search solutions
• Allows changing the Search solution without changing anything in Alfresco
• Not even a property!
• Provides an API to integrate it with Alfresco as a search engine
• Even with other repository vendors! E.g. Filesystem, SharePoint, Documentum, FileNet, Drupal…
• Preserves security permissions in the results
• Alfresco permissions are indexed and used during search
It’s included in our Semantic solution: Sensefy!
What we’ve done in Manifold
Repository Connector:
• Alfresco Repository Connector: new implementation
• Removes the dependency on the Alfresco Solr API
Output Connectors:
• Cloud Search Output Connector: design & development
• Elastic Search Output Connector: improvements
• Solr Cloud Output Connector: configuration for Alfresco
Authority Connector:
• Alfresco Authority Connector: design & development
• Similar approach to Alfresco Solr
• ACL reads for users and groups in Alfresco
Scenarios
Let’s see some examples
I: Several Alfresco instances
Current Approach:
• Each Alfresco has its own Search subsystem
• They can’t share indexes
Implications:
• Federated search is not an option
• Results can’t be merged
• Even if they could, which result set should come first?
Conclusion:
Results could be presented to users in different tabs or “manually” merged. Not the best approach.
I: Several Alfresco instances
Zaizi Approach:
• Our solution acts as the search box
• It manages a single index
Implications:
• All documents are sent to the same index
• Users can select results from either all Alfresco instances or a subset
Conclusion:
Search across repositories. It could be based on Elastic Search, Solr Cloud, Amazon Cloud, etc.
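A single shared index with a per-document source field is enough to illustrate searching all instances or only a subset. The sketch below is a toy in-memory version under that assumption; the `source` field name, the instance labels and the `search` function are invented for the example and say nothing about how the Zaizi solution is implemented.

```python
def search(index, term, sources=None):
    """Search one shared index; `sources` optionally limits the instances."""
    hits = []
    for doc in index:
        if sources is not None and doc["source"] not in sources:
            continue
        # Case-insensitive substring match stands in for a real query engine.
        if term.lower() in doc["text"].lower():
            hits.append((doc["source"], doc["id"]))
    return hits


shared_index = [
    {"source": "alfresco-1", "id": "a", "text": "Annual report"},
    {"source": "alfresco-2", "id": "b", "text": "Report draft"},
    {"source": "alfresco-3", "id": "c", "text": "Meeting notes"},
]
everywhere = search(shared_index, "report")
only_second = search(shared_index, "report", sources={"alfresco-2"})
```

Since every hit comes from the same index, ranking and merging are handled in one place and no result set has to be placed "first".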
II: Alfresco + Other data providers
Current Approach:
• Alfresco has its own Search subsystem
• Other repositories may or may not have their own Search subsystem
Implications:
• Different data providers mean different formats
• E.g. Filesystem does not support CMIS
• Alfresco can’t reach external data
Conclusion:
No way to merge results and present them uniformly to end users
II: Alfresco + Other data providers
Zaizi Approach:
• Both Alfresco and other repositories share the Search subsystem (Manifold)
Implications:
• Results from Alfresco and other providers will have the same format in our solution
• They will speak ‘our’ language
• Alfresco reaches external data when communicating with our solution
Conclusion:
Results are presented uniformly and are accessible across data providers
III: Alfresco + O(TB) data
Current Approach:
• Alfresco has its own Search subsystem
• All data is in one Solr instance (or several, if clustered)
Implications:
• Every Solr node manages the whole index
• No chance to apply scaling techniques for indexing: sharding, replication…
Conclusion:
Huge servers are required and performance might be compromised
III: Alfresco + O(TB) data
Zaizi Approach:
• Alfresco uses our solution
• Data is indexed in whichever search solution suits best: Amazon Cloud, Solr Cloud, Elastic Search…
Implications:
• The Cloud Search solution manages the index
• Indexing techniques can be applied according to use cases: sharding, replication
Conclusion:
A search strategy can be adopted and easily implemented with whichever search solution fits best
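Sharding and replication, the two techniques named above, can be illustrated in a few lines: a stable hash of the document id picks the shard, and every replica of that shard receives the document. This is a generic sketch, not the routing scheme of any of the products mentioned; the function names and the shard and replica counts are assumptions.

```python
import hashlib


def shard_for(doc_id, shard_count):
    """Pick a shard from a stable hash of the document id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def index_documents(doc_ids, shard_count, replicas):
    """Return shards[shard][replica] -> list of document ids."""
    shards = [[[] for _ in range(replicas)] for _ in range(shard_count)]
    for doc_id in doc_ids:
        # Every replica of the chosen shard receives the same document.
        for replica in shards[shard_for(doc_id, shard_count)]:
            replica.append(doc_id)
    return shards
```

Each node then holds only its shard of the index instead of the whole index, and a replica can answer queries if another copy of the same shard is unavailable.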
Apache Manifold: Other benefits
Can extract, index and map information from any other source
• Apache Stanbol, RedLink, any other data enricher
• Our solution will gather everything in one place: documents, entities…
Permissions are checked just once
• Everything is in the same place, even user authorization capabilities
• Performance and scalability are improved
• Faceted search and other search capabilities are combined with this permission feature
Demo
Conclusions
The Zaizi solution allows searching and indexing in the most popular Cloud Search solutions
• Other Search solutions can be integrated as well
The Zaizi solution allows retrieving information from the most popular repositories
• Other data providers can be integrated too
• It solves plenty of current issues related to search and indexing in Alfresco
• It can be used outside Alfresco, or with Alfresco and any other data repository
The Zaizi solution manages permissions and security from the most popular repositories and the latest Cloud search technologies
Fully supported by us!
What’s coming
Powerful User Interface
• Admin functions
• Wide range of facets
• UI for Share
Benchmarking
New connectors
• Filesystem authority
• RedLink repository
• Stanbol repository
Alfresco Search Subsystem?
#SummitNow

Contenu connexe

Tendances

Integrating Alfresco @ Scale (via event-driven micro-services)
Integrating Alfresco @ Scale (via event-driven micro-services)Integrating Alfresco @ Scale (via event-driven micro-services)
Integrating Alfresco @ Scale (via event-driven micro-services)J V
 
They why behind php frameworks
They why behind php frameworksThey why behind php frameworks
They why behind php frameworksKirk Madera
 
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...Martin Bergljung
 
Rapid application development with spring roo j-fall 2010 - baris dere
Rapid application development with spring roo   j-fall 2010 - baris dereRapid application development with spring roo   j-fall 2010 - baris dere
Rapid application development with spring roo j-fall 2010 - baris dereBaris Dere
 
Gr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and OutGr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and Outgraemerocher
 
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerUsing ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerBizTalk360
 
Introduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceIntroduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceOleksii Sukhovii
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAPLDAPCon
 
Alfresco 5.2 REST API
Alfresco 5.2 REST APIAlfresco 5.2 REST API
Alfresco 5.2 REST APIJ V
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in AlfrescoAngel Borroy López
 
Artifacts management with DevOps
Artifacts management with DevOpsArtifacts management with DevOps
Artifacts management with DevOpsChen-Tien Tsai
 
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...Sravan Lingam
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
Grails 3.0 Preview
Grails 3.0 PreviewGrails 3.0 Preview
Grails 3.0 Previewgraemerocher
 
Real world microservice architecture
Real world microservice architectureReal world microservice architecture
Real world microservice architectureViacheslav Poturaev
 
How to win skeptics to aggregated logging using Vagrant and ELK
How to win skeptics to aggregated logging using Vagrant and ELKHow to win skeptics to aggregated logging using Vagrant and ELK
How to win skeptics to aggregated logging using Vagrant and ELKSkelton Thatcher Consulting Ltd
 
5 steps to take setting up a streamlined container pipeline
5 steps to take setting up a streamlined container pipeline5 steps to take setting up a streamlined container pipeline
5 steps to take setting up a streamlined container pipelineMichel Schildmeijer
 
Middleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeMiddleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeCale Hoopes
 
Update on the OpenDJ project
Update on the OpenDJ projectUpdate on the OpenDJ project
Update on the OpenDJ projectLDAPCon
 

Tendances (20)

Integrating Alfresco @ Scale (via event-driven micro-services)
Integrating Alfresco @ Scale (via event-driven micro-services)Integrating Alfresco @ Scale (via event-driven micro-services)
Integrating Alfresco @ Scale (via event-driven micro-services)
 
They why behind php frameworks
They why behind php frameworksThey why behind php frameworks
They why behind php frameworks
 
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...
Alfresco DevCon 2018: SDK 3 Multi Module project using Nexus 3 for releases a...
 
Rapid application development with spring roo j-fall 2010 - baris dere
Rapid application development with spring roo   j-fall 2010 - baris dereRapid application development with spring roo   j-fall 2010 - baris dere
Rapid application development with spring roo j-fall 2010 - baris dere
 
Gr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and OutGr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and Out
 
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerUsing ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
 
Railsで作るBFFの功罪
Railsで作るBFFの功罪Railsで作るBFFの功罪
Railsで作るBFFの功罪
 
Introduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceIntroduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function Interface
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAP
 
Alfresco 5.2 REST API
Alfresco 5.2 REST APIAlfresco 5.2 REST API
Alfresco 5.2 REST API
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco
 
Artifacts management with DevOps
Artifacts management with DevOpsArtifacts management with DevOps
Artifacts management with DevOps
 
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...
Hyderabad MuleSoft Meetup - Anypoint Studio Tips and Tricks & Salesforce Comp...
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
Grails 3.0 Preview
Grails 3.0 PreviewGrails 3.0 Preview
Grails 3.0 Preview
 
Real world microservice architecture
Real world microservice architectureReal world microservice architecture
Real world microservice architecture
 
How to win skeptics to aggregated logging using Vagrant and ELK
How to win skeptics to aggregated logging using Vagrant and ELKHow to win skeptics to aggregated logging using Vagrant and ELK
How to win skeptics to aggregated logging using Vagrant and ELK
 
5 steps to take setting up a streamlined container pipeline
5 steps to take setting up a streamlined container pipeline5 steps to take setting up a streamlined container pipeline
5 steps to take setting up a streamlined container pipeline
 
Middleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeMiddleware in Golang: InVision's Rye
Middleware in Golang: InVision's Rye
 
Update on the OpenDJ project
Update on the OpenDJ projectUpdate on the OpenDJ project
Update on the OpenDJ project
 

Similaire à Super Size Your Search

Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFPiergiorgio Lucidi
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and ThenAngel Borroy López
 
The Need For Speed - NxtGen Cambridge
The Need For Speed - NxtGen CambridgeThe Need For Speed - NxtGen Cambridge
The Need For Speed - NxtGen CambridgePhil Pursglove
 
The Need For Speed - NEBytes
The Need For Speed - NEBytesThe Need For Speed - NEBytes
The Need For Speed - NEBytesPhil Pursglove
 
The Need for Speed - EpiCenter 2010
The Need for Speed - EpiCenter 2010The Need for Speed - EpiCenter 2010
The Need for Speed - EpiCenter 2010Phil Pursglove
 
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010Phil Pursglove: Velocity, the Need for Speed - epicenter 2010
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010IrishDev.com
 
Alfresco overview EDM
Alfresco overview EDMAlfresco overview EDM
Alfresco overview EDMsang nguyen
 
0910 cagliari- spring surf and cmis - the dynamic duo
0910 cagliari- spring surf and cmis - the dynamic duo0910 cagliari- spring surf and cmis - the dynamic duo
0910 cagliari- spring surf and cmis - the dynamic duoSymphony Software Foundation
 
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...Nicole Szigeti
 
Extend soa with api management spoug- Madrid
Extend soa with api management   spoug- MadridExtend soa with api management   spoug- Madrid
Extend soa with api management spoug- MadridVinay Kumar
 
Alfresco Day Roma 2015: Platform Update
Alfresco Day Roma 2015: Platform UpdateAlfresco Day Roma 2015: Platform Update
Alfresco Day Roma 2015: Platform UpdateAlfresco Software
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfJeff Smith
 
Tech Talk Live - 5.2 REST APIs
Tech Talk Live - 5.2 REST APIsTech Talk Live - 5.2 REST APIs
Tech Talk Live - 5.2 REST APIsGavin Cornwell
 

Similaire à Super Size Your Search (20)

Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCF
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and Then
 
The Need For Speed - NxtGen Cambridge
The Need For Speed - NxtGen CambridgeThe Need For Speed - NxtGen Cambridge
The Need For Speed - NxtGen Cambridge
 
Velocity - Edge UG
Velocity - Edge UGVelocity - Edge UG
Velocity - Edge UG
 
Elastic-Engineering
Elastic-EngineeringElastic-Engineering
Elastic-Engineering
 
The Need For Speed - NEBytes
The Need For Speed - NEBytesThe Need For Speed - NEBytes
The Need For Speed - NEBytes
 
The Need for Speed - EpiCenter 2010
The Need for Speed - EpiCenter 2010The Need for Speed - EpiCenter 2010
The Need for Speed - EpiCenter 2010
 
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010Phil Pursglove: Velocity, the Need for Speed - epicenter 2010
Phil Pursglove: Velocity, the Need for Speed - epicenter 2010
 
Alfresco overview EDM
Alfresco overview EDMAlfresco overview EDM
Alfresco overview EDM
 
0910 cagliari- spring surf and cmis - the dynamic duo
0910 cagliari- spring surf and cmis - the dynamic duo0910 cagliari- spring surf and cmis - the dynamic duo
0910 cagliari- spring surf and cmis - the dynamic duo
 
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...
Alfresco Coding mit dem Alfresco SDK (auf Englisch) - Julien Bruinaud, Techni...
 
Extend soa with api management spoug- Madrid
Extend soa with api management   spoug- MadridExtend soa with api management   spoug- Madrid
Extend soa with api management spoug- Madrid
 
Alfresco Day Roma 2015: Platform Update
Alfresco Day Roma 2015: Platform UpdateAlfresco Day Roma 2015: Platform Update
Alfresco Day Roma 2015: Platform Update
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Elements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdfElements_Architecture_and_Technology.pdf
Elements_Architecture_and_Technology.pdf
 
Tech Talk Live - 5.2 REST APIs
Tech Talk Live - 5.2 REST APIsTech Talk Live - 5.2 REST APIs
Tech Talk Live - 5.2 REST APIs
 

Plus de Piergiorgio Lucidi

Embracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationEmbracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationPiergiorgio Lucidi
 
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Piergiorgio Lucidi
 
Smart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StorySmart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StoryPiergiorgio Lucidi
 
Design your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesDesign your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesPiergiorgio Lucidi
 
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyAlfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyPiergiorgio Lucidi
 
The Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesThe Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesPiergiorgio Lucidi
 
Implementing portlets using Web Scripts
Implementing portlets using Web ScriptsImplementing portlets using Web Scripts
Implementing portlets using Web ScriptsPiergiorgio Lucidi
 
Alfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesenseAlfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesensePiergiorgio Lucidi
 
Alfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankAlfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankPiergiorgio Lucidi
 
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomePiergiorgio Lucidi
 

Super Size Your Search

  • 1. #SummitNow Super Size Your Search 6th November 2013 Piergiorgio Lucidi (Sourcesense) Fran Alvarez (Zaizi)
  • 2. Piergiorgio Lucidi • Open Source ECM Specialist at Sourcesense • Alfresco Certified Trainer / Engineer • Alfresco Wiki Gardener / Community Star • Alfresco forum supporter • Global Moderator of the Italian forum • Author and Technical Reviewer at Packt • PMC Member and Mentor at ASF • Project Leader in the JBoss Community
  • 3. Overview. How to build and manage your search server: 1. Scenario 2. Introducing Apache ManifoldCF 3. Zaizi Integrated Search Solution
  • 4. Scenario. An overview of the typical complex search architecture
  • 5. Scenario - Alfresco limitations. Alfresco supports these search engines: • Apache Lucene (embedded) • Apache Solr (provided by Alfresco) • Development is needed if other repositories must be involved. Every other approach must be implemented (ScheduledActions, WebScripts, etc.)
  • 6. Scenario – Embedded. Simple search architecture: Alfresco is the only repository involved in the architecture, using the embedded search engine: • the repository must take care of the indexes, including managing index transactions. [Diagram: Indexes, Alfresco, FrontEnd applications, Apache Lucene]
  • 7. Scenario – Embedded - Cluster. Not easy to scale out with Lucene: 1. Every cluster node must have its own search indexes 2. The cluster must synchronize the indexes. [Diagram: two nodes, each with Alfresco, Apache Lucene, and Indexes; JGroups]
  • 8. Scenario – Simple Architecture. Simple search architecture: Alfresco is the only repository involved in the architecture, with an external search server. 1. The search server can be used to publish content in the front-end architecture 2. The repository stays in the logical back end. [Diagram: Search Engine, Indexes, Alfresco, FrontEnd applications]
  • 9. Scenario – Publish with search. A search engine can be used for: • advanced management of search indexes • scaling out • executing complex searches on content • publishing content in the front-end architecture
  • 10. Scenario – Publish with search. Publish-with-search architecture: Alfresco is the only repository involved in the architecture, with an external search server. 1. The search server can be used to publish content in the front-end architecture (HTML) 2. The repository stays in the logical back end. [Diagram: Search Engine, Indexes, Alfresco, FrontEnd applications, BackEnd, FrontEnd, Lucene / Solr, Indexes]
  • 11. Scenario – Simple Architecture. Simple search architecture: Alfresco is the only repository involved in the architecture, with an external search server. 1. The search server can be used to publish content in the front-end architecture 2. The repository stays in the logical back end. [Diagram: Search Engine, Indexes, Alfresco, FrontEnd applications]
  • 12. Scenario – Complex Architecture. 1. Alfresco is only one of the platforms that must be involved in your search architecture 2. You don’t want to increase the development effort 3. You just want something to configure
  • 13. Scenario – Complex Architecture. Architecture with different ECM systems: Alfresco is one of the content platforms that must be involved in the indexing process. [Diagram: Alfresco, SharePoint, FileNet, CMIS, JIRA, Google Drive, DropBox; Search Engine, Indexes]
  • 17. Apache ManifoldCF - History. The ManifoldCF code base was granted by MetaCarta to the Apache Software Foundation in December 2009. The MetaCarta effort represented more than five years of successful development and testing in multiple, challenging enterprise environments. The project graduated as an Apache Top-Level Project in July 2012.
  • 18. Apache ManifoldCF – What is it? An open source crawler: • crawling model (add, change, delete) • schedules jobs to create indexes • gets content from repositories • pushes content to search servers
  • 19. Apache ManifoldCF – What is it? [Diagram: Repository 1, Repository 2, Repository 3, Repository 4; Apache ManifoldCF; Search Server 1, Search Server 2, Search Server 3, Search Server 4]
  • 20. Apache ManifoldCF – What is it? Out of the box it is distributed as a webapp: • REST API • Authority Service • ACL indexes • Crawler UI. It can be embedded in any Java application
  • 21. Apache ManifoldCF – Why? • Reliability • Incremental • Flexible • Multi repositories • Security model • Monitoring
  • 22. ManifoldCF – Why? - Reliability. Job scheduling and configuration are stored in the database to maintain the state of all the executions. [Diagram: Repository 1, Repository 2, Repository 3, Repository 4; Apache ManifoldCF with Pull Agent Daemon and Database; Search Server 1, Search Server 2, Search Server 3, Search Server 4]
  • 23. ManifoldCF – Why? - Incremental. Content changesets are obtained from the repository API. [Diagram: Repository 1; Apache ManifoldCF with Pull Agent Daemon and Database; query; Complete Changesets]
  • 24. ManifoldCF – Why? - Flexible. If the repository can't supply all the changes, ManifoldCF can discover them through crawling. [Diagram: Apache ManifoldCF with Pull Agent Daemon and Database; query; Incomplete Changesets; Change Discovery]
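
The crawling model on slides 18, 23 and 24 (add, change, delete, with change discovery when the repository cannot report changesets) can be illustrated with a minimal sketch. It assumes each crawl produces a snapshot mapping document identifiers to a version string; the function name and the snapshot format are hypothetical and are not the ManifoldCF API.

```python
from typing import Dict, List


def discover_changes(previous: Dict[str, str],
                     current: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify documents by comparing two {document_id: version} snapshots.

    Hypothetical helper for illustration only: 'previous' is what the crawler
    recorded on the last run, 'current' is what the repository reports now.
    """
    # Documents seen now but not on the last run must be indexed.
    added = sorted(doc for doc in current if doc not in previous)
    # Documents that disappeared must be removed from the search index.
    deleted = sorted(doc for doc in previous if doc not in current)
    # Documents present in both runs are re-indexed only if the version moved.
    changed = sorted(doc for doc in current
                     if doc in previous and current[doc] != previous[doc])
    return {"add": added, "change": changed, "delete": deleted}
```

When a repository exposes complete changesets, this comparison is unnecessary because the repository already reports the three lists; the sketch covers the case where they have to be derived by crawling.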
  • 25. ManifoldCF – Why? – Multi repo. Jobs can retrieve content from the following repositories: • Google Drive • Dropbox • HDFS • CMIS-compliant • Alfresco • IBM FileNet • EMC Documentum • Microsoft SharePoint • OpenText LiveLink • Autonomy Meridio • Memex Patriarch • Windows Share/DFS • Generic JDBC • Generic Filesystem • Generic RSS and Web
  • 26. ManifoldCF – Why? – Multi repo. Jobs can ingest content into the following search servers: • Apache Solr • ElasticSearch • OpenSearchServer • MetaCarta GTS
  • 27. ManifoldCF – Why? - Security. Retrieve per-content ACLs. [Diagram: Repository 1, Repository 2, Repository 3, Repository 4; Apache ManifoldCF; Search Server 1, Search Server 2, Search Server 3, Search Server 4; Authority Service; Authority 1, Authority 2; access tokens]
  • 28. ManifoldCF – Why? - Security. Retrieve per-content ACLs. [Diagram: Repository 1, Repository 2, Repository 3, Repository 4; Apache ManifoldCF; Search Server 1, Search Server 2, Search Server 3, Search Server 4; Authority Service; Authority 1, Authority 2; user; access tokens; user-specific search results]
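
The security model on slides 27 and 28 (per-content ACLs indexed with each document, access tokens returned for a user, user-specific search results) can be sketched as a filter over search hits. This is a minimal illustration under stated assumptions: the `IndexedDoc` type, the token strings, and the rule that a deny token overrides an allow token are all hypothetical choices here, not the behaviour of any specific ManifoldCF component.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class IndexedDoc:
    """A search hit carrying the access tokens that were indexed with it."""
    doc_id: str
    allow_tokens: FrozenSet[str]
    deny_tokens: FrozenSet[str] = frozenset()


def visible_results(hits: Iterable[IndexedDoc],
                    user_tokens: Iterable[str]) -> List[str]:
    """Keep hits the user may see: at least one allow token, no deny token."""
    tokens = set(user_tokens)
    return [
        doc.doc_id
        for doc in hits
        if (doc.allow_tokens & tokens) and not (doc.deny_tokens & tokens)
    ]
```

In this sketch the user's tokens would come from an authority service at query time, so permission changes in the source repository take effect once the tokens or the indexed ACLs are refreshed.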
  • 29. ManifoldCF – Why? – Monitoring. The Crawler UI allows you to: • configure jobs and connectors • monitor job execution • monitor content ingestion • view status reports: document status, queue status • view history reports: simple history, maximum activity, maximum bandwidth, result histogram
  • 31. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector]
  • 32. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector, Output Connector]
  • 33. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector, Output Connector, Authority Connector]
  • 34. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector: query to retrieve content; Output Connector; Authority Connector]
  • 35. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector: query to retrieve content; Output Connector: metadata mapping, content ingestion; Authority Connector]
  • 36. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector: query to retrieve content; Output Connector: metadata mapping, content ingestion; Authority Connector: retrieve content ACEs]
  • 37. ManifoldCF – Architecture. [Diagram: Repository, Job, Search Server, ACLs; Repository Connector: query to retrieve content; Output Connector: metadata mapping, content ingestion; Authority Connector: retrieve content ACEs] • verbal description • crawling model • scheduling
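
The job flow in the architecture slides (a repository connector queries for content, the output side applies a metadata mapping and ingests into the search server) can be sketched as a small pipeline. Everything here is an assumption made for illustration: `run_job`, the callables standing in for connectors, and the sample field names are hypothetical, and real connectors are Java components configured through the Crawler UI.

```python
from typing import Callable, Dict, Iterable

Document = Dict[str, str]


def run_job(fetch: Callable[[], Iterable[Document]],
            field_mapping: Dict[str, str],
            ingest: Callable[[Document], None]) -> int:
    """Pull documents, rename their metadata fields, and push them onward.

    fetch         stands in for the repository connector's query
    field_mapping maps repository field names to search-index field names
    ingest        stands in for the output connector's content ingestion
    """
    count = 0
    for doc in fetch():
        # Fields without a mapping entry keep their original name.
        mapped = {field_mapping.get(name, name): value
                  for name, value in doc.items()}
        ingest(mapped)
        count += 1
    return count
```

A usage example: with `index = []`, calling `run_job(lambda: [{"id": "1", "cm:title": "Report"}], {"cm:title": "title"}, index.append)` leaves one document in `index` whose title is stored under the `title` field.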
  • 39. ManifoldCF - Resources. The project is available at http://manifoldcf.apache.org/ From this website you can access the mailing lists, documentation, and download links for binaries and source.
  • 40. ManifoldCF – Resources - Book. ManifoldCF in Action by Karl Wright, published by Manning. Karl is the original developer and the principal committer of Apache ManifoldCF. The book is available at http://www.manning.com/wright
  • 42. Fran Alvarez • Director of Zaizi Iberia and Lead Architect • Alfresco Certified Engineer • Responsible for large Alfresco architectures • Semantic Consultant for Sensefy • Alfresco Meetups Organizer
  • 43. Alfresco + Solr Approach. Quite a good architecture: • Performance issues are solved • Different architectures depending on business requirements. However… • It does not cover some use cases or scenarios • It does not leverage Cloud benefits or the latest technologies • With huge data volumes there are other approaches. How can we solve the limitations and enhance the benefits?
  • 44. Alfresco + Solr Approach • Decouples the Search solution from Alfresco • Allows implementing different Search solutions • Allows changing the Search solution without changing anything in Alfresco • Not even a property! • Provides an API to integrate it with Alfresco as a search engine • Even other repository vendors! E.g. Filesystem, Sharepoint, Documentum, Filenet, Drupal… • And preserves security permissions in the results • Alfresco permissions are indexed and used during search. It’s included in our Semantic solution: Sensefy!
  • 45. What we’ve done in Manifold. Repository Connector: • Alfresco Repository Connector: New implementation • Removing the dependency on the Alfresco Solr API. Output connectors: • Cloud Search Output Connector: Design & Development • Elastic Search Output Connector: Improvements • Solr Cloud Output Connector: Configuration for Alfresco. Authority Connector: • Alfresco Authority Connector: Design & Development • Similar approach to Alfresco Solr • ACL reads for Users and Groups in Alfresco
  • 47. I: Several Alfresco instances. Current Approach: • Each Alfresco has its own Search subsystem • They can’t share indexes. Implications: • Federated search is not an option • Results can’t be merged • If they could, which result set should come first? Conclusion: Results could be presented to users in different tabs or “manually” merged. Not the best approach
  • 48. I: Several Alfresco instances. Zaizi Approach: • Our solution acts as the search box • It manages a single index. Implications: • All documents are sent to the same index • Users can select results from either all Alfresco instances or a subset. Conclusion: Search across repositories. It could be based on Elastic Search, Solr Cloud, Amazon Cloud, etc.
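
The single-index approach on slide 48 (all documents in one index, with users choosing all Alfresco instances or a subset) can be sketched as a query with an optional source filter. This is a toy in-memory illustration; the `source` field, the instance names, and the substring matching are assumptions and do not reflect how Sensefy or any search server implements it.

```python
from typing import Dict, Iterable, List, Optional


def search(index: List[Dict[str, str]],
           term: str,
           sources: Optional[Iterable[str]] = None) -> List[str]:
    """Return ids of documents matching 'term', optionally limited by source.

    Each indexed document records which repository instance it came from,
    so one index can answer both "everywhere" and "only these instances".
    """
    wanted = None if sources is None else set(sources)
    needle = term.lower()
    return [
        doc["id"]
        for doc in index
        if needle in doc["text"].lower()
        and (wanted is None or doc["source"] in wanted)
    ]
```

Because every hit comes from the same index, ordering is decided in one place, which avoids the question raised on slide 47 of which result set should come first.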
  • 49. II: Alfresco + Other data providers. Current Approach: • Alfresco has its own Search subsystem • Other repositories may or may not have their own Search subsystem. Implications: • Different data providers mean different formats • E.g. Filesystem does not support CMIS • Alfresco can’t reach external data. Conclusion: No way to merge results and present them uniformly to end users
  • 50. II: Alfresco + Other data providers. Zaizi Approach: • Both Alfresco and other repositories share the Search subsystem (Manifold). Implications: • Results from Alfresco and other providers will have the same format in our solution • They will speak ‘our’ language • Alfresco reaches external data when communicating with our solution. Conclusion: Results are presented and accessible across data providers
  • 51. III: Alfresco + O(TB) data. Current Approach: • Alfresco has its own Search subsystem • All data is in one Solr instance (or several, if clustered). Implications: • Every Solr node manages the whole index • No chance to apply scaling techniques for indexing: • Sharding, Replication… Conclusion: Huge servers are required and performance might be compromised
  • 52. III: Alfresco + O(TB) data. Zaizi Approach: • Alfresco uses our solution • Data is indexed in the search solution that suits best: • Amazon Cloud, Solr Cloud, Elastic Search… Implications: • The Cloud Search solution manages the index • Indexing techniques can be applied according to use cases • Sharding, Replication. Conclusion: A search strategy can be adopted and easily implemented with the search solution that fits best
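
Sharding, mentioned on slides 51 and 52, means that each node holds only part of the index instead of the whole of it. A minimal sketch of hash-based document routing follows, assuming a fixed number of shards; `shard_for` is a hypothetical helper, and the search servers named on the slide perform their own routing internally, so this only illustrates the idea.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard number in the range [0, num_shards).

    A stable hash is used so the same id always lands on the same shard,
    regardless of which process or machine computes it.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Replication is the complementary technique: each shard is copied to more than one node so that queries can be spread across copies and a node failure does not lose part of the index.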
  • 53. Apache Manifold: Other benefits. Can extract, index and map information from any other sources: • Apache Stanbol, RedLink, any other data enricher • Our solution will gather everything in one place • Documents, entities… Permissions are checked just once: • Everything is in the same place, even user authorization capabilities • Performance and scalability are improved • Faceted search and other search capabilities are combined with this permission feature
  • 55. Conclusions. The Zaizi solution allows searching and indexing in the most popular Cloud Search solutions • Other Search solutions can be integrated as well. The Zaizi solution allows retrieving information from the most popular repositories • Other data providers can be integrated too • It solves plenty of current issues related to search and indexing in Alfresco • It can be used outside Alfresco, or with Alfresco and any other data repository. The Zaizi solution manages permissions and security from the most popular repositories and the latest Cloud search technologies. Fully supported by us!
  • 57. What’s coming. Powerful User Interface: • Admin functions • Wide range of facets • UI for Share. Benchmarking. New connectors: • Filesystem authority • RedLink repository • Stanbol repository. Alfresco Search Subsystem?