Enrico Risa
The Dynamic Duo
OrientDB & Lucene
Outline
❖ Apache Lucene in a nutshell
❖ OrientDB Indexing
❖ OrientDB-Lucene
- Full Text Index
- Spatial Index
❖ Roadmap 2.0
What Is Lucene?
❖ Free-text indexing library
❖ Implements standard IR/search functionality
- Query models, ranking, indexing
❖ Written in Java
❖ Simple API
❖ Fast, mature and constantly evolving
❖ Many extension points
Who uses Lucene?
❖ Twitter
❖ LinkedIn
❖ Apple
❖ Solr
❖ Elasticsearch
❖ Neo4j
❖ and now OrientDB
Base Lucene workflow
Documents
❖ Basic unit for indexing and searching
❖ Contains a list of Fields
❖ Schema-less
Fields
❖ Basic component of a Document
❖ Field attributes
- name
- value
- store
- analyzed

Field Types & Options
❖ Types
- Field
- StringField
- TextField
- StoredField
- IntField
- … and more
❖ Options
- Stored or not
- Indexed or not
- Analyzed or not
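The effect of the "analyzed" option can be illustrated without Lucene itself. The sketch below is plain Java, not the Lucene API: the class `FieldOptionSketch` and the method `indexedTerms` are hypothetical names, and the lowercase, split-on-punctuation tokenizer is only a simplified stand-in for a real analyzer. In Lucene terms, a StringField behaves like the non-analyzed case and a TextField like the analyzed one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the "analyzed or not" option; not Lucene code.
class FieldOptionSketch {

    // Returns the terms that would be placed in the index for one field value.
    // analyzed = true  -> the value is split into lowercase tokens (simplified analyzer)
    // analyzed = false -> the whole value is kept as a single exact term
    static List<String> indexedTerms(String value, boolean analyzed) {
        List<String> terms = new ArrayList<>();
        if (!analyzed) {
            terms.add(value);
            return terms;
        }
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(indexedTerms("Rome, Italy", true));
        System.out.println(indexedTerms("Rome, Italy", false));
    }
}
```

An analyzed value can be matched word by word, while a non-analyzed value only matches a query for the exact original string.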



Directory
❖ RAMDirectory
- RAM-based index
❖ FSDirectory
- File-based index
❖ NIOFSDirectory
- Same as FSDirectory, but using the NIO API

Indexing Documents
Searching Index
Inverted Index
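As a rough illustration of the idea behind an inverted index, the following plain-Java sketch maps each term to the set of documents that contain it. It is a toy model under stated assumptions, not Lucene's data structure: the class `InvertedIndexSketch`, its methods, and the simple lowercase tokenizer are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index: maps each term to the sorted set of document ids containing it.
class InvertedIndexSketch {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Tokenizes the text (lowercase, split on non-alphanumerics) and records docId per term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looks up a single term; returns the ids of matching documents in ascending order.
    SortedSet<Integer> search(String term) {
        SortedSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.emptySortedSet()
                : Collections.unmodifiableSortedSet(hits);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "OrientDB is a graph database");
        index.add(2, "Lucene is a search library");
        index.add(3, "OrientDB meets Lucene");
        System.out.println(index.search("lucene"));
    }
}
```

A search is a single map lookup on the term rather than a scan of every document, which is why this layout suits full-text queries.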
Luke: a graphical user interface
❖ Open a Lucene index
❖ Browse documents
❖ Run queries
❖ …
OrientDB Indexing
❖ SBTree (Unique, Not unique, Full Text, Dictionary)
❖ HashIndex (Unique, Not unique, Full Text, Dictionary)
❖ MVRB-Tree (deprecated since 1.6)
❖ Lucene (OrientDB-Lucene)
❖ … https://github.com/orientechnologies/orientdb/wiki/Custom-Index-Engine
OrientDB Lucene
❖ Open source at https://github.com/orientechnologies/orientdb-lucene
❖ This project aims to bring the power of Lucene indexes into OrientDB.
❖ Currently supports only Spatial and Full Text indexes
Installing OrientDB Lucene
❖ Embedded Mode







❖ Server Mode
- Grab a JAR build and copy it into $ORIENTDB_HOME/plugins
Spatial Index
❖ No native implementation.
❖ Built on top of the Lucene Spatial module.
❖ Currently only points are supported.
❖ NEAR and WITHIN queries.
Lucene Spatial
❖ Spatial4j
- Handles shapes (Point, Circle, Rectangle, Polygon)
- Distance and area math utilities
- Reads WKT format
❖ Provides indexing strategies
- RecursivePrefixTree
❖ Spatial queries using shapes
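A prefix-tree strategy indexes locations as nested grid cells, so that a longer cell key always lies inside the cell named by its shorter prefix. Geohash is one well-known grid of this kind. The sketch below encodes a point as a geohash in plain Java to show the prefix property; it is an illustration only, the class `GeohashSketch` is a hypothetical name, and it does not claim to be the code or the grid that OrientDB-Lucene uses.

```java
// Geohash-style encoding: a plain-Java sketch of the prefix-cell idea, not Lucene's code.
class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encodes a point as a geohash of the given length by repeatedly halving
    // the longitude and latitude ranges, interleaving one bit from each.
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonTurn = true; // geohash starts with a longitude bit
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (lonTurn) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else            { value = value << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else            { value = value << 1;       latMax = mid; }
            }
            lonTurn = !lonTurn;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // A longer hash names a smaller cell nested inside the shorter hash's cell.
        System.out.println(encode(57.64911, 10.40744, 2));
        System.out.println(encode(57.64911, 10.40744, 6));
    }
}
```

Because nearby points tend to share a key prefix, a spatial query can be answered by looking up a small number of prefixes in an ordinary term index.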
Creating a Spatial Index
❖ SQL



❖ JAVA
Spatial Operators
❖ NEAR
- Finds all Points near a given location (latitude, longitude)
❖ WITHIN
- Finds all Points within a given bounding box
Near Operator
❖ Custom operator that relies on the Lucene index
❖ Special syntax to support spatial arguments ($spatial)
❖ Context variable $distance
❖ Result set sorted from nearest to farthest
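The behaviour described for NEAR (keep the points close to a location, return them nearest first) can be sketched as a brute-force scan in plain Java. This is only a model of the semantics: the real operator answers the query from the Lucene spatial index, and the class `NearSketch`, the kilometre unit, and the haversine distance used here are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Brute-force illustration of NEAR semantics; not how the index-backed operator works.
class NearSketch {
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometres between two (lat, lon) points (haversine formula).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns the points ({lat, lon} arrays) within maxDistanceKm of the centre, nearest first.
    static List<double[]> near(List<double[]> points, double lat, double lon, double maxDistanceKm) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (haversineKm(lat, lon, p[0], p[1]) <= maxDistanceKm) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble(p -> haversineKm(lat, lon, p[0], p[1])));
        return hits;
    }

    public static void main(String[] args) {
        List<double[]> cities = List.of(
                new double[] {45.4642, 9.1900},   // Milan
                new double[] {40.8518, 14.2681},  // Naples
                new double[] {41.9028, 12.4964}); // Rome
        for (double[] p : near(cities, 41.9, 12.5, 250.0)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```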
Within Operator
❖ Bounding box search
❖ Currently finds Points within a box
❖ Result set is not sorted
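A bounding-box test reduces to four comparisons per point. The plain-Java sketch below models that check; `WithinSketch` and its methods are hypothetical names, the box is assumed not to cross the antimeridian, and the real operator evaluates the query through the Lucene index rather than scanning every point.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of WITHIN semantics for points; not the index-backed implementation.
class WithinSketch {

    // True when (lat, lon) lies inside the box given by its minimum and maximum corners.
    // Assumes minLat <= maxLat and minLon <= maxLon (no antimeridian crossing).
    static boolean within(double lat, double lon,
                          double minLat, double minLon, double maxLat, double maxLon) {
        return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
    }

    // Keeps the points ({lat, lon} arrays) inside the box, in their original order (unsorted).
    static List<double[]> filter(List<double[]> points,
                                 double minLat, double minLon, double maxLat, double maxLon) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (within(p[0], p[1], minLat, minLon, maxLat, maxLon)) {
                hits.add(p);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<double[]> cities = List.of(
                new double[] {45.4642, 9.1900},   // Milan
                new double[] {41.9028, 12.4964}); // Rome
        System.out.println(filter(cities, 41.0, 12.0, 43.0, 13.0).size());
    }
}
```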
Full Text Index
❖ Native full-text implementation.
❖ Supports multiple fields.
❖ Supports Lucene query syntax.
❖ Lucene Analyzers
Creating a Full Text Index
❖ SQL



❖ JAVA
Full Text Operators
❖ LUCENE
[<fields>] LUCENE <exp>
- Query your index using Query Parser syntax
- Supports multiple fields
- Target all fields (MultiFieldQueryParser)
- Target a specific field (QueryParser)

Lucene Operator
❖ MultiFieldQueryParser
- Targets all fields
❖ QueryParser
- Targets a specific field
Indexing Performance
❖ Full Text
- 9M records in ~300 s with StandardAnalyzer and one field (roughly 30,000 records/s)
❖ Spatial
- 9M records in ~500 s with two fields (Point) (roughly 18,000 records/s)
Roadmap 2.0
❖ Production ready
❖ Monitoring of Lucene indexes
❖ More configuration options
❖ GUI tool integrated in Studio
Roadmap 2.0 (Spatial Index)
❖ Index more shapes
❖ More operators (Intersect, ...)
❖ Not only bounding boxes
❖ Support for GeoJSON (http://geojson.org)
Roadmap 2.0 (Full Text)
❖ Document & Field boosting
❖ Score in result set
❖ Custom Analyzers & Filters
❖ Search Engine
Thank You
Questions?
❖ Contact Me

- Enrico Risa e.risa@orientechnologies.com

- Twitter https://twitter.com/wolf4ood
