SlideShare une entreprise Scribd logo
1  sur  19
Solr
What is it?
•   Text search index (engine)
•   Open source
•   Not a search product
•   A tool that allows you to create a search
    solution
What is it like?
•   Google, Google Appliance.
•   FAST
•   Oracle Secure Enterprise Search
•   etc.
Google Appliance:
•   Sucks data in
•   Can’t really configure
•   Stuck with results
•   Bonnet is locked
Solr:
•   You need to feed data in
•   Highly configurable
•   Search results can be tuned
•   There is no bonnet
Why am I doing a talk?
•   Did a course
•   LucidWorks content
•   Presented by FindWise
•   FindWise are a search specialist that use a
    range of search engines
Caveats
• Course was in Solr 4.1.0, we use 3.6.1 for
  APVMA
• Course focussed on search, not ingestion or
  presentation
• Java API recommended for ingestion
• ‘Browse’ interface uses Velocity templates for
  presentation, but probably isn’t good enough
  for most projects.
Where does Solr fit?
Application Architecture
Apache Tika
•   Data import handler
•   Used to be part of Lucene
•   XML
•   PDF
•   Word
•   Excel
•   etc.
Manifold CF
•   Apache
•   Connector framework
•   Used to connect to content repositories (source)
•   Sharepoint
•   Documentum
•   CMIS
•   JDBC
•   RSS
Hydra
• FindWise
• Although Solr supports validation (e.g.
  ‘required’), don’t use it for data cleanup.
• Validation failure inconvenient: whole job fails
• Feed in clean data.
• Use Hydra for cleanup.
Apache ZooKeeper
•   Used for SolrCloud
•   Clustering and sharding
•   Solr 4.1.0 only
•   Side project for Hadoop
•   Used to manage Hadoop clusters
Inside
General Approach
• Design schema
• Prototyping
• Integration
Design Schema
• A data modelling exercise
• schema.xml
• Dynamic fields can be useful in the first pass:
  <dynamicField name=“*" type="string"
  indexed="true" />
Prototyping
• Get the data in (index)
• csv, XML, JSON
• post.jar
• URL to search and inspect raw results
• ‘browse’ interface allows developer to
  understand how the search is working
• solrconfig.xml
Integration
•   Not covered
•   Content ingestion
•   Presentation of results
•   Up to you…
Demo

Contenu connexe

Tendances

Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesRahul Singh
 
Episerver and search engines
Episerver and search enginesEpiserver and search engines
Episerver and search enginesMikko Huilaja
 
Elastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraElastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraMikko Huilaja
 
Survey of the Microsoft Azure Data Landscape
Survey of the Microsoft Azure Data LandscapeSurvey of the Microsoft Azure Data Landscape
Survey of the Microsoft Azure Data LandscapeIke Ellis
 
Elasticsearch { "Meetup" : "talk" }
Elasticsearch { "Meetup" : "talk" }Elasticsearch { "Meetup" : "talk" }
Elasticsearch { "Meetup" : "talk" }Lutf Ur Rehman
 
Building Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and ElasticsearchBuilding Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and ElasticsearchRahul Singh
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solrsagar chaturvedi
 
Scot Hacker: Building a Killer Bucketlist Site with Python/Django
Scot Hacker: Building a Killer Bucketlist Site with Python/DjangoScot Hacker: Building a Killer Bucketlist Site with Python/Django
Scot Hacker: Building a Killer Bucketlist Site with Python/DjangoBayCHI
 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schemaDavide Mauri
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearchAnton Udovychenko
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-WebinarEdureka!
 
Alfresco Day Stockholm 2015 - Rapid UI Development
Alfresco Day Stockholm 2015 - Rapid UI DevelopmentAlfresco Day Stockholm 2015 - Rapid UI Development
Alfresco Day Stockholm 2015 - Rapid UI DevelopmentNicole Szigeti
 
AtlasCamp 2014: Preparing Your Plugin for JIRA Data Center
AtlasCamp 2014: Preparing Your Plugin for JIRA Data CenterAtlasCamp 2014: Preparing Your Plugin for JIRA Data Center
AtlasCamp 2014: Preparing Your Plugin for JIRA Data CenterAtlassian
 
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...Lucidworks
 
Elasticsearch for Autosuggest in Clojure at Workframe
Elasticsearch for Autosuggest in Clojure at WorkframeElasticsearch for Autosuggest in Clojure at Workframe
Elasticsearch for Autosuggest in Clojure at WorkframeBrian Ballantine
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveDavide Mauri
 
Digital Publishing Made Easy with the OSCI Toolkit
 Digital Publishing Made Easy with the OSCI Toolkit Digital Publishing Made Easy with the OSCI Toolkit
Digital Publishing Made Easy with the OSCI ToolkitKyle Jaebker
 
Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014Ike Ellis
 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For DevelopersDavide Mauri
 

Tendances (20)

Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
 
Episerver and search engines
Episerver and search enginesEpiserver and search engines
Episerver and search engines
 
Elastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraElastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case Evira
 
Survey of the Microsoft Azure Data Landscape
Survey of the Microsoft Azure Data LandscapeSurvey of the Microsoft Azure Data Landscape
Survey of the Microsoft Azure Data Landscape
 
Elasticsearch { "Meetup" : "talk" }
Elasticsearch { "Meetup" : "talk" }Elasticsearch { "Meetup" : "talk" }
Elasticsearch { "Meetup" : "talk" }
 
Building Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and ElasticsearchBuilding Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and Elasticsearch
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solr
 
Scot Hacker: Building a Killer Bucketlist Site with Python/Django
Scot Hacker: Building a Killer Bucketlist Site with Python/DjangoScot Hacker: Building a Killer Bucketlist Site with Python/Django
Scot Hacker: Building a Killer Bucketlist Site with Python/Django
 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schema
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 
Alfresco Day Stockholm 2015 - Rapid UI Development
Alfresco Day Stockholm 2015 - Rapid UI DevelopmentAlfresco Day Stockholm 2015 - Rapid UI Development
Alfresco Day Stockholm 2015 - Rapid UI Development
 
AtlasCamp 2014: Preparing Your Plugin for JIRA Data Center
AtlasCamp 2014: Preparing Your Plugin for JIRA Data CenterAtlasCamp 2014: Preparing Your Plugin for JIRA Data Center
AtlasCamp 2014: Preparing Your Plugin for JIRA Data Center
 
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
 
Elasticsearch for Autosuggest in Clojure at Workframe
Elasticsearch for Autosuggest in Clojure at WorkframeElasticsearch for Autosuggest in Clojure at Workframe
Elasticsearch for Autosuggest in Clojure at Workframe
 
Dev ops-presentation
Dev ops-presentationDev ops-presentation
Dev ops-presentation
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep Dive
 
Digital Publishing Made Easy with the OSCI Toolkit
 Digital Publishing Made Easy with the OSCI Toolkit Digital Publishing Made Easy with the OSCI Toolkit
Digital Publishing Made Easy with the OSCI Toolkit
 
Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014
 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For Developers
 

En vedette (10)

Hilton
HiltonHilton
Hilton
 
Gender in Media
Gender in MediaGender in Media
Gender in Media
 
Kids these days (at work)
Kids these days (at work)Kids these days (at work)
Kids these days (at work)
 
Rhetoric in Popular Culture
Rhetoric in Popular CultureRhetoric in Popular Culture
Rhetoric in Popular Culture
 
Diseños bioclimaticos
Diseños bioclimaticosDiseños bioclimaticos
Diseños bioclimaticos
 
Frank gehry
Frank gehryFrank gehry
Frank gehry
 
Interpersonal Communication in Cars
Interpersonal Communication in CarsInterpersonal Communication in Cars
Interpersonal Communication in Cars
 
Bad advises for broken heart
Bad advises for broken heartBad advises for broken heart
Bad advises for broken heart
 
boilers
boilersboilers
boilers
 
La belleza.
La belleza.La belleza.
La belleza.
 

Similaire à Solr

Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for DrupalChris Caple
 
Search api d8
Search api d8Search api d8
Search api d8Dropsolid
 
QueryPath, Mash-ups, and Web Services
QueryPath, Mash-ups, and Web ServicesQueryPath, Mash-ups, and Web Services
QueryPath, Mash-ups, and Web ServicesMatt Butcher
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoopgregchanan
 
Middleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeMiddleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeCale Hoopes
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes WorkshopErik Hatcher
 
Search in the Apache Hadoop Ecosystem: Thoughts from the Field
Search in the Apache Hadoop Ecosystem: Thoughts from the FieldSearch in the Apache Hadoop Ecosystem: Thoughts from the Field
Search in the Apache Hadoop Ecosystem: Thoughts from the FieldAlex Moundalexis
 
Intro to SharePoint 2010 development for .NET developers
Intro to SharePoint 2010 development for .NET developersIntro to SharePoint 2010 development for .NET developers
Intro to SharePoint 2010 development for .NET developersJohn Ferringer
 
Search all the things
Search all the thingsSearch all the things
Search all the thingscyberswat
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Data Science at Scale: Using Apache Spark for Data Science at Bitly
Data Science at Scale: Using Apache Spark for Data Science at BitlyData Science at Scale: Using Apache Spark for Data Science at Bitly
Data Science at Scale: Using Apache Spark for Data Science at BitlySarah Guido
 
Full Text Search with Lucene
Full Text Search with LuceneFull Text Search with Lucene
Full Text Search with LuceneWO Community
 
Intro to Solr in Drupal
Intro to Solr in Drupal Intro to Solr in Drupal
Intro to Solr in Drupal Mediacurrent
 
Drupal for programmers
Drupal for programmersDrupal for programmers
Drupal for programmersMichael Shahov
 
Zero to Sixty with Oracle ApEx
Zero to Sixty with Oracle ApExZero to Sixty with Oracle ApEx
Zero to Sixty with Oracle ApExBradley Brown
 

Similaire à Solr (20)

Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for Drupal
 
Search api d8
Search api d8Search api d8
Search api d8
 
QueryPath, Mash-ups, and Web Services
QueryPath, Mash-ups, and Web ServicesQueryPath, Mash-ups, and Web Services
QueryPath, Mash-ups, and Web Services
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 
Middleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeMiddleware in Golang: InVision's Rye
Middleware in Golang: InVision's Rye
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 
Apereo OAE - Bootcamp
Apereo OAE - BootcampApereo OAE - Bootcamp
Apereo OAE - Bootcamp
 
Search in the Apache Hadoop Ecosystem: Thoughts from the Field
Search in the Apache Hadoop Ecosystem: Thoughts from the FieldSearch in the Apache Hadoop Ecosystem: Thoughts from the Field
Search in the Apache Hadoop Ecosystem: Thoughts from the Field
 
Intro to SharePoint 2010 development for .NET developers
Intro to SharePoint 2010 development for .NET developersIntro to SharePoint 2010 development for .NET developers
Intro to SharePoint 2010 development for .NET developers
 
Search On Hadoop
Search On HadoopSearch On Hadoop
Search On Hadoop
 
Search all the things
Search all the thingsSearch all the things
Search all the things
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Data Science at Scale: Using Apache Spark for Data Science at Bitly
Data Science at Scale: Using Apache Spark for Data Science at BitlyData Science at Scale: Using Apache Spark for Data Science at Bitly
Data Science at Scale: Using Apache Spark for Data Science at Bitly
 
SolrCloud on Hadoop
SolrCloud on HadoopSolrCloud on Hadoop
SolrCloud on Hadoop
 
Full Text Search with Lucene
Full Text Search with LuceneFull Text Search with Lucene
Full Text Search with Lucene
 
Intro to Solr in Drupal
Intro to Solr in Drupal Intro to Solr in Drupal
Intro to Solr in Drupal
 
Wikipedia Cloud Search Webinar
Wikipedia Cloud Search WebinarWikipedia Cloud Search Webinar
Wikipedia Cloud Search Webinar
 
Drupal for programmers
Drupal for programmersDrupal for programmers
Drupal for programmers
 
Zero to Sixty with Oracle ApEx
Zero to Sixty with Oracle ApExZero to Sixty with Oracle ApEx
Zero to Sixty with Oracle ApEx
 

Dernier

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 

Dernier (20)

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

Solr

  • 2. What is it? • Text search index (engine) • Open source • Not a search product • A tool that allows you to create a search solution
  • 3. What is it like? • Google, Google Appliance. • FAST • Oracle Secure Enterprise Search • etc.
  • 4. Google Appliance: • Sucks data in • Can’t really configure • Stuck with results • Bonnet is locked
  • 5. Solr: • You need to feed data in • Highly configurable • Search results can be tuned • There is no bonnet
  • 6. Why am I doing a talk? • Did a course • LucidWorks content • Presented by FindWise • FindWise are a search specialist that use a range of search engines
  • 7. Caveats • Course was in Solr 4.1.0, we use 3.6.1 for APVMA • Course focussed on search, not ingestion or presentation • Java API recommended for ingestion • ‘Browse’ interface uses Velocity templates for presentation, but probably isn’t good enough for most projects.
  • 10. Apache Tika • Data import handler • Used to be part of Lucene • XML • PDF • Word • Excel • etc.
  • 11. Manifold CF • Apache • Connector framework • Used to connect to content repositories (source) • Sharepoint • Documentum • CMIS • JDBC • RSS
  • 12. Hydra • FindWise • Although Solr supports validation (e.g. ‘required’), don’t use it for data cleanup. • Validation failure inconvenient: whole job fails • Feed in clean data. • Use Hydra for cleanup.
  • 13. Apache ZooKeeper • Used for SolrCloud • Clustering and sharding • Solr 4.1.0 only • Side project for Hadoop • Used to manage Hadoop clusters
  • 15. General Approach • Design schema • Prototyping • Integration
  • 16. Design Schema • A data modelling exercise • schema.xml • Dynamic fields can be useful in the first pass: <dynamicField name=“*" type="string" indexed="true" />
  • 17. Prototyping • Get the data in (index) • csv, XML, JSON • post.jar • URL to search and inspect raw results • ‘browse’ interface allows developer to understand how the search is working • solrconfig.xml
  • 18. Integration • Not covered • Content ingestion • Presentation of results • Up to you…
  • 19. Demo