Effective searching
Integrating External Search Engines with Adobe AEM
Who am I?
Dominik Kornaś
3 years at Cognifide, exactly today
Senior software engineer & technical lead
Focused on systems integration tasks
The "search guy" at Cognifide
What we won’t talk about
• Sorting
• Document structure
• Indexing
• Managed relevancy model
• Input data processing
• Highlighter
• Faceted search
• Wildcard search
• Statistics
• Autocomplete
• Spellchecking
• Lemmatization
• Sentence search
• Pagination
• Content normalization
• Metadata
• Data collections & views
The goal of searching
"What is the best British football team?"
If we ask such a question, will the search engine find the answer?
The search engine will find the question, not the answer.
"What is the best British football team?" vs. "best team football UK"
Are we asking questions or issuing queries?
Effective searching is about finding keywords:
• in the shortest possible time
• close to each other in a block of text
• that are in a desired context
and being sure the engine knows about the data we are looking for!
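One way to make "close to each other in a block of text" measurable is to find the shortest run of words that contains every keyword. The sketch below is an illustration only, not how any particular search engine scores proximity; the function name `smallest_window` is made up for this example.

```python
import re


def smallest_window(text, keywords):
    """Return the length in words of the shortest span holding every keyword, or None."""
    words = re.findall(r"\w+", text.lower())
    wanted = {keyword.lower() for keyword in keywords}
    if not wanted:
        return None
    counts = {}   # keyword -> occurrences inside the current window
    best = None
    left = 0
    for right, word in enumerate(words):
        if word in wanted:
            counts[word] = counts.get(word, 0) + 1
        # Shrink from the left while the window still covers every keyword.
        while len(counts) == len(wanted):
            size = right - left + 1
            if best is None or size < best:
                best = size
            dropped = words[left]
            if dropped in counts:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    del counts[dropped]
            left += 1
    return best
```

A smaller window means the keywords sit closer together. Matching here is exact, so "teams" does not count as "team"; handling that is the job of lemmatization, which this talk does not cover.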
Effective searching
Indexing
The Past
Microsoft FAST
The first major external search integration with AEM (then: CQ 5.4) at Cognifide.
Push-like indexing using the CQ-FAST connector from Adobe.
Implemented as a dedicated replication agent, triggered by content replication.
http://wem.help.adobe.com/enterprise/en_US/10-0/wem/administering/cq2fast.html
[Diagram: replication agent components (Content builder, Transport handler, MS FAST) and the data they handle (Markup, Metadata).]
Replication agent processing workflow:
• HTTP request for the content. We can decide which instance the content should be read from.
• metadata.ecma evaluation.
• Data upload. The content is sent to MS FAST. The "cq5" suffix in the URI is a document collection: a named subset of documents in the entire FAST index.
• Indexing.
The replication agent is fine for one site stored in a single FAST collection of documents.
It becomes complicated in a multi-site environment, where each site must be located in a separate index area and the search results must not contain data coming from other sites.
A complex ACL configuration was used to ensure that only the one proper agent delivers each document to FAST.
It was hard to set up and maintain without proper tools automating the whole process.
The Present Day
Google Search Appliance
For the AEM & GSA integration, we considered reusing the CQ-FAST connector approach.
But, aware of its issues, we decided to develop our own micro-framework that takes care of the indexing process.
It is installed as a single OSGi bundle and provides a set of services and utilities to help with the indexing.
[Diagram: the indexing pipeline across the Author and Publish instances: Content replication → Filtering → Push to Publish → Indexing queue(s) → Content gathering → Metadata processing → Push to external engine, with process status tracking & persistence for every stage.]
The indexing process spans the author and the publish AEM instances.
All stages are tracked, so it is possible to recover from a failure and retry the indexing.
Content replication
The process starts with content replication, or programmatically from the backend, e.g. triggered by the scheduler service.
Filtering
Each replicated content path is filtered against a whitelist & a blacklist.
There's an option to use a custom OSGi service able to decide if the content should be indexed, removed or ignored.
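The filtering step can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the path patterns, the `Decision` names and the idea that an activation means "index" while a deactivation means "remove" are all assumptions made for the example, and a plain callable stands in for the custom OSGi service.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    INDEX = "index"
    REMOVE = "remove"
    IGNORE = "ignore"


# Hypothetical path patterns; "*" also matches "/" here.
WHITELIST = ["/content/site-a/*", "/content/site-b/*"]
BLACKLIST = ["/content/site-a/archive/*"]


def decide(path, action, custom_filter=None):
    """Decide what to do with a replicated path.

    action is "activate" or "deactivate" (assumed replication event types).
    custom_filter may return a Decision, or None to fall back to the lists.
    """
    if custom_filter is not None:
        verdict = custom_filter(path, action)
        if verdict is not None:
            return verdict
    allowed = any(fnmatchcase(path, pattern) for pattern in WHITELIST)
    blocked = any(fnmatchcase(path, pattern) for pattern in BLACKLIST)
    if not allowed or blocked:
        return Decision.IGNORE
    return Decision.INDEX if action == "activate" else Decision.REMOVE
```

In this sketch the blacklist wins over the whitelist, and the custom filter is consulted first so that it can override both lists.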
Push to Publish
The indexing information is persisted in a special kind of repository node and replicated to the publish instance.
We can choose which publish instance(s) will receive the data.
Indexing queue(s)
The information is received and instantly dispatched to the indexing queue(s).
We can handle indexing in a single search engine or in multiple different ones.
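Dispatching one piece of indexing information to several engines can be pictured as one queue per engine. The sketch below is only an illustration under that assumption; the `Dispatcher` class, the engine names and the job fields are invented for the example, and in-memory queues stand in for whatever the framework really uses.

```python
from collections import deque


class Dispatcher:
    def __init__(self, engine_names):
        # One FIFO queue per configured search engine.
        self.queues = {name: deque() for name in engine_names}

    def dispatch(self, job):
        # Each queue gets its own copy so engines can progress independently.
        for queue in self.queues.values():
            queue.append(dict(job))

    def next_job(self, engine_name):
        """Return the oldest waiting job for the engine, or None if it is idle."""
        queue = self.queues[engine_name]
        return queue.popleft() if queue else None
```

With a single engine name this degenerates to one queue; adding a second engine needs no change to the code that calls `dispatch`.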
Content gathering
The content is gathered using the SlingRequestProcessor OSGi service.
It's like a request for an HTML page sent from the Java code and consumed by that same code.
Metadata processing
Metadata is collected according to multiple different rules:
• the content resource type
• the content path
• values of the component properties
• custom rules
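The rule kinds listed above can be sketched as small functions that each contribute a few fields. This is an illustration only: the resource type, the `/content/<site>/<language>/...` layout, the property names and the shape of the `page` dictionary are all assumptions made for the example.

```python
def by_resource_type(page):
    # Hypothetical resource type used only for this example.
    if page.get("resourceType") == "myapp/components/page/news":
        return {"contentType": "news"}
    return {}


def by_path(page):
    parts = page["path"].strip("/").split("/")
    # Assumes the layout /content/<site>/<language>/...
    if len(parts) >= 3 and parts[0] == "content":
        return {"site": parts[1], "language": parts[2]}
    return {}


def by_properties(page):
    props = page.get("properties", {})
    return {name: props[name] for name in ("title", "author") if name in props}


RULES = [by_resource_type, by_path, by_properties]


def collect_metadata(page, rules=RULES):
    """Run every rule in order and merge the results."""
    metadata = {}
    for rule in rules:
        metadata.update(rule(page))   # later rules override earlier ones
    return metadata
```

A custom rule is just one more function appended to the list, e.g. `collect_metadata(page, RULES + [my_rule])`.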
Push to external engine
The content and metadata are combined and sent to the search engine.
Depending on the implementation, this can be done for each single document or in batches.
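The choice between per-document and batched pushes can be reduced to a batch size. The sketch below assumes a hypothetical `send` callable that delivers a list of documents to the engine; the document shape (`id`, `content` plus metadata fields) is also an assumption for the example.

```python
def combine(path, html, metadata):
    """Merge the gathered markup and the collected metadata into one document."""
    return {"id": path, "content": html, **metadata}


def push_in_batches(documents, send, batch_size=50):
    """Call send(list_of_documents) once per batch and return the number of calls."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = 0
    for start in range(0, len(documents), batch_size):
        send(documents[start:start + batch_size])
        calls += 1
    return calls
```

Setting `batch_size=1` gives the "each single document" behaviour; larger values trade latency for fewer round trips.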
Failure or timeout: retry
In case of any failure, indexing is rescheduled and launched again, up to the configured number of times.
If the server goes down, indexing will restart when the machine is up again.
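Retrying with a configured limit, and surviving a restart, both depend on persisting the job status after every attempt. The sketch below is a simplified illustration: a JSON file stands in for the repository node that holds the status, the class and field names are invented, and retries run immediately instead of being rescheduled.

```python
import json
import os


class IndexingJob:
    def __init__(self, status_file, max_attempts=3):
        self.status_file = status_file
        self.max_attempts = max_attempts

    def _load(self):
        # Resume from the persisted status if a previous run left one behind.
        if os.path.exists(self.status_file):
            with open(self.status_file) as handle:
                return json.load(handle)
        return {"state": "pending", "attempts": 0}

    def _save(self, status):
        with open(self.status_file, "w") as handle:
            json.dump(status, handle)

    def run(self, push):
        """Call push() until it succeeds or the attempt limit is reached."""
        status = self._load()
        while status["state"] != "done" and status["attempts"] < self.max_attempts:
            status["attempts"] += 1
            try:
                push()
                status["state"] = "done"
            except Exception as error:
                status["state"] = "failed"
                status["last_error"] = str(error)
            self._save(status)   # persisted after every attempt
        return status
```

Because the status is written after each attempt, a new `IndexingJob` created after a restart continues counting from where the previous one stopped.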
The flexible nature of our solution saved us when some fancy requirements came in.
The Future
Apache Solr
The search engine, which is:
• free & open source
• powerful
• customizable
• scalable
And, most importantly, it is part of Jackrabbit Oak (formerly known as Jackrabbit 3), the repository engine used by AEM 6.
AEM with integrated Solr is right there.
The solution developed for GSA has been ported to work with Solr.
Changes:
• Replaced the "glue code" that does the final data push with code that uses the SolrJ Java library.
• Names of the document metadata fields have been changed to follow the Solr naming convention for dynamic fields.
Everything else remained untouched.
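The dynamic-field convention means a field's type is signalled by a suffix on its name, so new metadata fields need no schema change. The sketch below uses suffixes from Solr's example schema (`_s`, `_i`, `_f`, `_b`, `_dt`, `_ss`); a real schema may define different ones, lists are assumed to hold strings, and the function names are invented for the example.

```python
from datetime import datetime


def dynamic_field_name(name, value):
    """Append a type suffix in the style of Solr's example schema."""
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        suffix = "_b"
    elif isinstance(value, int):
        suffix = "_i"
    elif isinstance(value, float):
        suffix = "_f"
    elif isinstance(value, datetime):
        suffix = "_dt"
    elif isinstance(value, (list, tuple)):
        suffix = "_ss"                   # multi-valued strings (assumed element type)
    else:
        suffix = "_s"
    return name + suffix


def to_solr_document(doc_id, metadata):
    """Rename every metadata field; the id field keeps its fixed name."""
    document = {"id": doc_id}
    for name, value in metadata.items():
        document[dynamic_field_name(name, value)] = value
    return document
```

Only the field names change, which fits the deck's point that the rest of the pipeline stayed untouched.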
Search driven components
No server-side processing.
Search engine used as a mini database of metadata.
Configuration via query parameters.
Pure front-end implementation.
The whole page can be read from the dispatcher cache.
An AJAX request gets the content directly from the search engine.
The response is JSON-structured, easy to parse and to display using JavaScript.
{
"id": "223344",
"firstName": "Michael",
"lastName": "Johnson",
"phone": "(123)-777-8888",
"office": "Office UK",
"department": "504",
"title": "Lead Architect"
}
Search results component configured to return employee data.
User profile: the name, mobile, email, image path etc. are all metadata values of the document.
Carousel with news: by changing the maximum number of search results, we can control the number of slides in the carousel.
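Configuration via query parameters can be sketched as building a request URL and reading the documents out of the JSON reply. The deck does this in front-end JavaScript and does not say which engine backs these components; the example below assumes a Solr-style select endpoint (parameters `q`, `rows`, `fl`, `wt` and a `response.docs` array), and the base URL, query and field names are placeholders.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, query, max_results, fields):
    params = {
        "q": query,
        "rows": max_results,        # caps the number of returned documents
        "fl": ",".join(fields),     # restricts the returned metadata fields
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def slides_from_response(body):
    """Turn a Solr-style JSON response body into a list of slide titles."""
    docs = json.loads(body)["response"]["docs"]
    return [doc.get("title", "") for doc in docs]
```

Changing `max_results` is all it takes to change how many slides the carousel can show, since the engine never returns more documents than requested.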
Thank you!

Adobe Managed Services: Complicated Cloud Deployments by Adam Pazik, Mike Til...
 
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger MarsenAdobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
 
Responsive Websites and Grid-Based Layouts by Gabriel Walt
Responsive Websites and Grid-Based Layouts by Gabriel Walt Responsive Websites and Grid-Based Layouts by Gabriel Walt
Responsive Websites and Grid-Based Layouts by Gabriel Walt
 
When Sightly Meets Slice by Tomasz Niedźwiedź
When Sightly Meets Slice by Tomasz NiedźwiedźWhen Sightly Meets Slice by Tomasz Niedźwiedź
When Sightly Meets Slice by Tomasz Niedźwiedź
 
Creativity without comprise by Cleve Gibbon
Creativity without comprise by Cleve Gibbon Creativity without comprise by Cleve Gibbon
Creativity without comprise by Cleve Gibbon
 
REST in AEM by Roy Fielding
REST in AEM by Roy FieldingREST in AEM by Roy Fielding
REST in AEM by Roy Fielding
 
Adobe Summit 2015 - Penguin Random House - Accelerating Digital Transformation
Adobe Summit 2015 - Penguin Random House - Accelerating Digital TransformationAdobe Summit 2015 - Penguin Random House - Accelerating Digital Transformation
Adobe Summit 2015 - Penguin Random House - Accelerating Digital Transformation
 
Socialize your Exceptional Web Experience – Adobe AEM & IBM Connections by He...
Socialize your Exceptional Web Experience – Adobe AEM & IBM Connections by He...Socialize your Exceptional Web Experience – Adobe AEM & IBM Connections by He...
Socialize your Exceptional Web Experience – Adobe AEM & IBM Connections by He...
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Effective Searching by Dominik Kornas

  • 1. Effective searching: Integrating External Search Engines with Adobe AEM. Dominik Kornaś
  • 2. Who am I? 3 years at Cognifide, exactly today. Senior software engineer & technical lead. Focused on systems integration tasks. The “search guy” at Cognifide.
  • 3. What we won’t talk about: sorting, document structure, indexing, managed relevancy model, input data processing, highlighter, faceted search, wildcard search, statistics, autocomplete, spellchecking, lemmatization, sentence search, pagination, content normalization, metadata, data collections & views.
  • 4. The goal of searching
  • 5. The goal of searching: “What is the best British football team?” If we ask such a question, will the search engine find the answer?
  • 6. The goal of searching: “What is the best British football team?” The search engine will find the question, not the answer.
  • 7. The goal of searching: “What is the best British football team?” vs. “best team football UK”. Are we asking questions or issuing queries?
  • 8. The goal of searching: effective searching is about finding keywords (1) in the shortest possible time, (2) close to each other in a block of text, (3) in a desired context, while being sure the engine knows about the data we are looking for!
  • 11. Microsoft FAST: the first major external search integration with AEM (then CQ 5.4) at Cognifide. Push-style indexing using the CQ-FAST connector from Adobe.
  • 12. Microsoft FAST: implemented as a dedicated replication agent, triggered by content replication. http://wem.help.adobe.com/enterprise/en_US/10-0/wem/administering/cq2fast.html
  • 13. Microsoft FAST, replication agent processing workflow: HTTP request for the content. Diagram labels: Content builder, Transport handler, MS FAST, Markup, Metadata.
  • 14. Microsoft FAST: we can decide which instance the content should be read from.
  • 15. Microsoft FAST, replication agent processing workflow: metadata.ecma evaluation.
  • 16. Microsoft FAST, replication agent processing workflow: data upload.
  • 17. Microsoft FAST: sends content to MS FAST. The “cq5” suffix in the URI names a document collection, i.e. a named subset of the documents in the entire FAST index. http://wem.help.adobe.com/enterprise/en_US/10-0/wem/administering/cq2fast.html
  • 18. Microsoft FAST, replication agent processing workflow: indexing.
  • 19. Microsoft FAST: the replication agent works well for one site stored in a single FAST document collection. It becomes complicated in a multi-site environment, where each site must live in a separate index area and search results must not contain data coming from other sites.
  • 21. Microsoft FAST: a complex ACL configuration was used to ensure that only the one proper agent delivers each document to FAST. It was hard to set up and maintain without proper tools automating the whole process.
  • 23. Google Search Appliance: for the AEM & GSA integration, we considered reusing the CQ-FAST connector approach. Aware of its issues, we decided to develop our own micro-framework that takes care of the indexing process. It is installed as a single OSGi bundle and provides a set of services and utilities to help with indexing.
  • 24. Google Search Appliance, indexing pipeline: content replication, filtering, push to publish, indexing queue(s), content gathering, metadata processing, push to external engine, with process status tracking & persistence throughout. The indexing process spans the author and publish AEM instances. All stages are tracked, so it is possible to recover from a failure and retry the indexing.
  • 25. Google Search Appliance, content replication: the process starts with content replication, or programmatically from the backend, e.g. triggered by a scheduler service.
  • 26. Google Search Appliance, filtering: each replicated content path is filtered against a whitelist and a blacklist. Optionally, a custom OSGi service can decide whether the content should be indexed, removed or ignored.
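The filtering stage can be illustrated with a minimal sketch. It is written in Python for brevity, while the framework described in the talk is an OSGi bundle, so this is not the author's code. The names `Decision` and `decide`, the glob-style patterns, and the precedence rules (custom decider first, then blacklist, then whitelist) are all assumptions made for illustration.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    INDEX = "index"
    REMOVE = "remove"
    IGNORE = "ignore"


def decide(path, whitelist, blacklist, custom=None):
    """Decide what to do with a replicated content path.

    Assumed precedence: an optional custom decider may return a final
    verdict; otherwise the blacklist wins over the whitelist, and paths
    matching neither list are ignored.
    """
    if custom is not None:
        verdict = custom(path)
        if verdict is not None:  # None means "no opinion, use the lists"
            return verdict
    if any(fnmatchcase(path, pattern) for pattern in blacklist):
        return Decision.IGNORE
    if any(fnmatchcase(path, pattern) for pattern in whitelist):
        return Decision.INDEX
    return Decision.IGNORE


# Hypothetical configuration for a single site.
WHITELIST = ["/content/site-a/*"]
BLACKLIST = ["*/archive/*"]
```

With this configuration, `decide("/content/site-a/en/news", WHITELIST, BLACKLIST)` yields `Decision.INDEX`, while a path under `/archive/` is ignored even though it matches the whitelist.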
  • 27. Google Search Appliance, push to publish: the indexing information is persisted in a special kind of repository node and replicated to the publish instance. We can choose which publish instance(s) will receive the data.
  • 28. Google Search Appliance, indexing queue(s): the information is received and instantly dispatched to the indexing queue(s). We can handle indexing in a single search engine or in several different ones.
  • 29. Google Search Appliance, content gathering: the content is gathered using the SlingRequestProcessor OSGi service. It works like a request for an HTML page that is issued from Java code and consumed by that same code.
  • 30. Google Search Appliance, metadata processing: metadata is collected according to several kinds of rules: the content resource type, the content path, values of the component properties, and custom rules.
  • 31. Google Search Appliance, push to external engine: the content and metadata are combined and sent to the search engine. Depending on the implementation, this can be done for each single document or in batches.
  • 32. Google Search Appliance, failure or timeout and retry: in case of any failure, indexing is rescheduled and launched again, up to the configured number of times. If the server goes down, indexing restarts when the machine is up again.
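The retry behaviour can be sketched as follows. This is a simplified Python illustration, not the framework's implementation: `IndexingJob`, `process`, `resume` and the status values are hypothetical, the job list stands in for the persisted process status, and the sketch retries immediately where a real system would reschedule with a delay.

```python
from dataclasses import dataclass


@dataclass
class IndexingJob:
    path: str
    attempts: int = 0
    status: str = "PENDING"  # PENDING, DONE or FAILED


def process(job, push, max_attempts):
    """Push one job, retrying until it succeeds or attempts run out."""
    while job.status == "PENDING":
        job.attempts += 1
        try:
            push(job.path)
            job.status = "DONE"
        except Exception:
            # Any failure or timeout counts as one used attempt.
            if job.attempts >= max_attempts:
                job.status = "FAILED"
    return job


def resume(store, push, max_attempts):
    """After a restart, continue every job still marked PENDING."""
    return [process(job, push, max_attempts)
            for job in store if job.status == "PENDING"]
```

Because the attempt counter and status live on the job record, a job interrupted by a restart keeps its history: `resume` only touches records that never reached `DONE` or `FAILED`.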
  • 33. Google Search Appliance: the flexible nature of our solution saved us when some fancy requirements came in.
  • 35. Apache Solr: a search engine that is free & open source, powerful, customizable and scalable. Most importantly, it is integrated into Jackrabbit Oak (the successor to Jackrabbit 2), the repository engine used by AEM 6. AEM with integrated Solr is right there.
  • 36. Apache Solr: the solution developed for GSA has been ported to work with Solr. Changes: (1) the “glue code” that does the final data push was replaced with code that uses the SolrJ Java library; (2) the document metadata fields were renamed to follow the Solr naming convention for dynamic fields. Everything else remained untouched.
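The renaming to dynamic-field names might look like the sketch below. The talk does not list the actual field names, so this is only an assumption: the suffixes (`_s`, `_i`, `_f`, `_b`, `_dt`, `_ss`) are those defined as dynamic fields in Solr's stock example schema, and a real schema may define different ones. The function name `to_solr_fields` is hypothetical.

```python
from datetime import datetime


def to_solr_fields(metadata):
    """Rename metadata keys with a type suffix matching a dynamic field.

    Assumes the stock Solr example schema: *_s string, *_i int,
    *_f float, *_b boolean, *_dt date, *_ss multi-valued strings.
    """
    doc = {}
    for name, value in metadata.items():
        if isinstance(value, bool):  # check bool first: bool is an int
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        elif isinstance(value, datetime):
            suffix = "_dt"
            # Solr dates are ISO 8601 in UTC; the value is assumed to be UTC.
            value = value.strftime("%Y-%m-%dT%H:%M:%SZ")
        elif isinstance(value, (list, tuple)):
            suffix = "_ss"
            value = [str(item) for item in value]
        else:
            suffix = "_s"
            value = str(value)
        doc[name + suffix] = value
    return doc
```

The benefit of this convention is that new metadata properties can be indexed without touching the Solr schema, since the suffix alone selects the field type.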
  • 38. Search-driven components: no server-side processing. The search engine is used as a mini database of metadata. Configuration via query parameters. Pure front-end implementation.
  • 39. Search-driven components: the whole page can be read from the dispatcher cache. An AJAX request gets the content directly from the search engine. The response is JSON, which is easy to parse and display using JavaScript. Example: { "id": "223344", "firstName": "Michael", "lastName": "Johnson", "phone": "(123)-777-8888", "office": "Office UK", "department": "504", "title": "Lead Architect" }
  • 40. Search-driven components: a search results component configured to return employee data.
  • 41. Search-driven components: user profile. The name, mobile, email, image path, etc. are all metadata values of the document.
  • 42. Search-driven components: carousel with news. By changing the maximum number of search results, we can control the number of slides in the carousel.
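The query side of a search-driven component can be sketched as below. The talk describes a pure JavaScript front end, so this Python version only illustrates the logic. It assumes the engine is Solr queried through its standard `select` handler with the `q`, `rows`, `fl` and `wt` parameters; the base URL, the `employees` core name and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, query, rows, fields=()):
    """Turn component configuration into a Solr select URL."""
    params = {"q": query, "rows": rows, "wt": "json"}
    if fields:
        params["fl"] = ",".join(fields)  # restrict returned metadata fields
    return base_url + "/select?" + urlencode(params)


def extract_docs(response_body):
    """Return the list of documents from a Solr JSON response."""
    return json.loads(response_body)["response"]["docs"]


# Hypothetical configuration: up to 5 employees of department 504.
url = build_query_url(
    "http://localhost:8983/solr/employees",
    "department:504",
    rows=5,
    fields=("firstName", "lastName"),
)
```

Here `rows` plays the role of the "maximum number of search results" setting: changing it is what would control, for example, the number of slides in a news carousel.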