Apache Solr & TYPO3
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2

mail: ingo@typo3.org
twitter: @ingorenner
Indexed Search
 •   Indexing Frontend / Crawler
 •   Respects access rights
 •   Respects languages
 •   Index in Database
 •   Totally OK for smaller websites



Sloooooooooooooowww
Apache Solr
So what is Apache Solr?

•   Enterprise Search Server
•   Based on Lucene Index
•   Apache Software Foundation Project
•   Many powerful features


•   CNet, Netflix, ilocal.nl, Zappos.com
Solr Concepts

•   Index = Collection of Documents
•   Document = Data stored in Fields
•   Field Type defines processing through
    Analyzers, Tokenizers, Filters
•   Dynamic Fields
•   Copy Fields

Flexibility
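
To make the concepts above concrete, here is a minimal Python sketch of how a field type might chain a tokenizer and filters, and how dynamic fields and copy fields could behave. It is a toy model, not Solr's implementation: the stopword and synonym tables, the `*_s` / `*_i` patterns, and the `text_all` destination field are all hypothetical values chosen for illustration.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical tables, for illustration only.
STOPWORDS = {"the", "a", "of"}
SYNONYMS = {"tv": "television"}
DYNAMIC_FIELD_TYPES = {"*_s": "string", "*_i": "int"}  # name pattern -> field type
COPY_FIELDS = {"title": "text_all"}                    # source field -> destination field


def tokenize(text):
    """Tokenizer: split raw text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Filter: normalize every token to lower case."""
    return [t.lower() for t in tokens]


def stopword_filter(tokens):
    """Filter: drop tokens listed as stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


def synonym_filter(tokens):
    """Filter: replace a token with its synonym, if one is configured."""
    return [SYNONYMS.get(t, t) for t in tokens]


def analyze(text, filters=(lowercase_filter, stopword_filter, synonym_filter)):
    """Analyzer: run the tokenizer, then each filter in order."""
    tokens = tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


def resolve_field_type(name):
    """Dynamic fields: pick a field type from the first matching name pattern."""
    for pattern, field_type in DYNAMIC_FIELD_TYPES.items():
        if fnmatchcase(name, pattern):
            return field_type
    return None


def apply_copy_fields(doc):
    """Copy fields: append a source field's value to its destination field."""
    result = dict(doc)
    for source, destination in COPY_FIELDS.items():
        if source in doc:
            result.setdefault(destination, []).append(doc[source])
    return result


print(analyze("The History of TV"))  # ['history', 'television']
```

In Solr itself this behaviour is declared in the index schema rather than written as code; the sketch only shows the order of processing.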
Why Apache Solr?
•   Speed: Many times faster than IS
•   Better search results
•   Faceted search
•   Spellchecker: Did you mean ... ?
•   Similarity search: More like this ...
•   Editorial Content / paid search results
•   Synonyms, Stopwords
•   Boosting of specific index fields
•   Replication, distributed search

Speed & Power
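
As a rough illustration of faceted search from the list above, the following Python sketch builds the request parameters for a facet query and reads the facet counts out of a JSON response. `facet` and `facet.field` are standard Solr request parameters, and by default the JSON writer returns each facet as a flat list of alternating values and counts. The field name `type` and the sample response are made up for this example.

```python
from urllib.parse import urlencode


def build_facet_params(query, facet_field):
    """Build the query string for a search with one facet field."""
    return urlencode({
        "q": query,
        "wt": "json",
        "facet": "true",
        "facet.field": facet_field,
    })


def parse_facet_counts(response, facet_field):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][facet_field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical response fragment, shaped like Solr's default JSON output.
sample_response = {
    "facet_counts": {
        "facet_fields": {"type": ["pages", 12, "tt_news", 5]},
    },
}

print(build_facet_params("typo3", "type"))
print(parse_facet_counts(sample_response, "type"))
```

The resulting dictionary is what a frontend would typically render as a list of filter links with hit counts.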
How it works

•   REST like interface
•   Indexing of XML Documents through
    HTTP POST
•   Querying through HTTP GET
•   Results as XML, JSON, PHP

Easy API
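
The two HTTP interactions above can be sketched in Python using only the standard library. The example builds the XML body that would be POSTed to the update handler and the URL that would be requested with GET for a search; it does not send anything over the network. The base URL `http://localhost:8983/solr` and the `/select` path are assumptions that depend on the Solr version and set-up, and the document fields are invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed base URL; adjust to the actual Solr installation.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_xml(doc):
    """Serialize one document as a Solr <add><doc>...</doc></add> message."""
    add = ET.Element("add")
    doc_element = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_element, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, response_format="json"):
    """Build the GET URL for a query; wt selects the response format."""
    params = urlencode({"q": query, "wt": response_format})
    return f"{SOLR_BASE}/select?{params}"


# Hypothetical document with made-up field names.
print(build_add_xml({"id": 1, "title": "Hello Solr"}))
print(build_select_url("title:solr"))
```

Passing `response_format="xml"` or `"php"` instead selects the other response formats named on the slide.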
Disadvantages


•   Needs Java
•   We don't want to deal with Java
•   Solr shields us from Java once set up

Developers stay with PHP
Advantages

   •   Multiple times faster than IS
   •   NO database queries
   •   Easy installation / Configuration
   •   Respects access rights
   •   Respects languages
   •   Customizability

Fast, Easy to use, Powerful
EXT:solr
    +
Current Status
•   "Acts like Indexed Search"
•   Indexing through Frontend / Crawler
•   Search
•   Search Word Highlighting
•   Sorting
•   Spellchecker: Did you mean ... ?
•   Similarity Search: More like this ...
•   Faceted Search
•   Suggest / Autocompletion
Outlook
•   Backend Module
•   API, indexing through BE
•   Related Searches
•   Last Searches
•   Smart Reranking through user usage
•   Editorial Search Results
•   Editing of Stopwords, Synonyms
Development Model
•   Private financing of new features
•   Financing partners get
    Early Access and Support
•   Minimum stake of 5 man days
•   v2.0 end of Q2 next year
•   Development as Community
    Project in parallel
Community Edition

•   Released v1.0 on TER
•   Project on TYPO3 Forge
•   Open Development
•   Only a few differences
    compared to "our" version
Showcases
Making the sun shine on your search
Requirements, Setup

•   Requires any J2EE container:
    Tomcat, Jetty, Resin, ...


•   Run setup scripts provided with EXT:solr
•   Copy provided configuration files to Solr
•   config.index_enable = 1
Customization


•   Indexing of additional Data through
    hooks, interfaces, TS configuration
•   Individual index schema
•   En/Disable features through TS
•   Individual, flexible rendering of results
More than Solr
Projects around Solr


•   Lucene - Search Index Library


•   Tika - Content Extraction from Files


•   Nutch - Crawl External Sites
Thanks for listening.
mail: ingo@typo3.org
twitter: @ingorenner

Apache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
