Interfaces to Xapian
Open source search day 2009
C++
#include <xapian.h>
// Open the existing database at `path` for writing.
Xapian::WritableDatabase db(path, Xapian::DB_OPEN);
Xapian::Document doc;
doc.add_term("foo");  // index the single term "foo"
db.add_document(doc);
Python: xapian
import xapian
db = xapian.WritableDatabase(path, xapian.DB_OPEN)
doc = xapian.Document()
doc.add_term("foo")
db.add_document(doc)
Python: xappy
import xappy
from xappy import FieldActions
db = xappy.IndexerConnection(path)
db.add_field_action("text", FieldActions.INDEX_FREETEXT)
doc = xappy.UnprocessedDocument()
doc.append("text", "foo")
db.add(doc)
Python: xappy
import xappy
from xappy import FieldActions
db = xappy.IndexerConnection(path)
db.add_field_action("text", FieldActions.INDEX_FREETEXT,
                    language="french")
doc = xappy.UnprocessedDocument()
doc.append("text", "foo")
db.add(doc)
Python: xappy2.core
from xappy2.core import *
db = xappy.IndexerConnection(path)
db.add_field_type("text", TEXT, language="french")
db.add_index("text", StandardAnalyser)
doc = xappy.UnprocessedDocument()
doc.append("text", "foo")
db.add(doc)
Python: xappy2.server
REST based API
Python: xappy2.server
PUT to /v1/dbs/dbname
POST to /v1/dbs/dbname/schema/fields/text
{ 'type': 'text', 'freetext': {'language': 'en'} }
POST to /v1/dbs/dbname/docs
{ 'text': ['foo'] }
(or PUT to /v1/dbs/dbname/docs/docid)
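The request sequence above can be sketched with the Python standard library. This is only an illustration: the server address, the JSON encoding of the bodies, and the helper names `build_request` and `index_requests` are assumptions, not part of xappy2.server. The requests are built but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical server address


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    # Assumption: the server accepts JSON request bodies.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def index_requests(dbname, field, text):
    """Return the three requests from the slide, in order."""
    return [
        # Create the database.
        build_request("PUT", "/v1/dbs/%s" % dbname),
        # Declare a free-text field.
        build_request("POST", "/v1/dbs/%s/schema/fields/%s" % (dbname, field),
                      {"type": "text", "freetext": {"language": "en"}}),
        # Add a document with one value in that field.
        build_request("POST", "/v1/dbs/%s/docs" % dbname, {field: [text]}),
    ]


for req in index_requests("dbname", "text", "foo"):
    print(req.get_method(), req.full_url)
```

Passing each request to `urllib.request.urlopen` would send it to a running server.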
Python: Zope: ore.xapian
Zope style layer on top of xappy:
class Content( object ):
... implements( interfaces.IIndexable )
Asynchronous loading/updating, event integration, etc.
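As a rough illustration of the asynchronous updating idea, index operations can be queued by event handlers and applied on a background thread. This is a generic standard-library sketch, not ore.xapian's actual implementation; `AsyncIndexer` and its methods are hypothetical names.

```python
import queue
import threading


class AsyncIndexer:
    """Queue index operations and apply them on a background thread."""

    def __init__(self, apply_operation):
        self._apply = apply_operation  # callable that does the real indexing
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, operation, docid):
        """Called from an event handler; returns immediately."""
        self._queue.put((operation, docid))

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                self._apply(*item)
            finally:
                self._queue.task_done()

    def close(self):
        """Apply all pending operations, then stop the worker."""
        self._queue.put(None)
        self._queue.join()
        self._worker.join()


# Stand-in for a real indexing backend: just record what was applied.
applied = []
indexer = AsyncIndexer(lambda op, docid: applied.append((op, docid)))
indexer.submit("add", "doc-1")
indexer.submit("modify", "doc-1")
indexer.close()
print(applied)
```

A single worker keeps operations in submission order, which matters when an add and a modify target the same document.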
Python: Django: Djapian
Django integration layer on top of xapian
import djapian
class EntryIndexer(djapian.Indexer):
    fields = ["text"]
    tags = [("content", "content.text")]
Python: Django: Haystack
Another Django integration layer on top of xapian
from haystack import indexes
class TextIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True, use_template=True)
Other
Similar stack of interfaces for Ruby, PHP
Java, C# just have bindings, so far
Image Searching with Xappy
db.add_field_action('image', FieldActions.IMGSEEK, terms=True)
doc.fields.append('image', path_to_image_file)
db.add(doc)
# sconn: a search connection opened on the same database
query = sconn.query_image_similarity('image', docid='0')
