Kendall Clark, CEO
                           Clark & Parsia, LLC


Thursday, March 17, 2011                         1
About C&P
                    • We build semantic technology infrastructure
                           and enterprise solutions
                           • Pellet, the leading OWL reasoner
                           • POPS Expertise Location system
                    • Bootstrapped since 2005
                    • Offices in DC and Cambridge, MA
                    • Government & enterprise customers
                    • First talk ever was at LOC in 2005 :)
TLDR?
                    • Java RDF database (“quad store”) (no
                           native code)
                    • Freemium model:
                     • enterprise & community editions
                     • OEM
                    • Performance for complex SPARQL queries
                    • Best available reasoning support

NoSQL and SemWeb
                    • Semweb is schemaless and schema-rich
                     • As agile as NoSQL stores
                     • More expressive than SQL
                    • Standards based
                     • Graph DBs are all ad hoc
                    • Query Language and, you know, joins
                     • Do you really want to write map-reduce
                           programs...only?! We sure don’t...!

Why another RDF DB?
                    • We’re scratching our itch for fast query for
                           integration & decision support apps
                           • aimed at db-reasoner “tweener” space
                           • operationally agile
                    • There’s a hole in the market; or: markets
                           are normal distributions (probably)
                    • Gives us a complete semantic application
                           platform

Commercial Market
                    • 6 products
                    • Technically homogeneous:
                     • Sagan-like scale obsession
                     • Mostly ad hoc reasoning
                     • Weak perf on complex queries
                     • Ho-hum feature sets & integrations
                    • See http://bit.ly/92P8eN for more
Stardog 1.0: Overview
                    • Fast
                    • Lightweight
                    • Rich API support
                    • Logical & statistical inference
                    • Transactions
                    • Full-text search
                    • Graph algorithms and path language
                    • awesome mascot!
Fast? No, Really Fast!
                    • First design goal in Stardog is performance
                           of complex SPARQL query eval on single
                           machine in the default configuration
                    • Next, total queries per second
                    • In-memory mode available, when needed
                    • Early testing is promising: fastest RDF DB
                           on SP2B benchmark. Often several times
                           faster.


Performance
                    • Do yr own testing; the only queries that
                           matter are yours; don’t trust, test.
                    • It’s not ready till it’s very, very fast.
                    • Flatten the RDF performance tax
                    • About 256 GB for ~2B triples in main-
                           memory mode, i.e., $20k Dell box.
                    • When in doubt: Add. More. RAM.
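
A quick back-of-the-envelope check of the memory figure above, as a sketch only. It assumes "256 GB" means 256 GiB and "~2B" means exactly 2 billion triples; the slide states neither.

```python
# Rough bytes-per-triple implied by the slide's main-memory figure.
# Assumptions (not from the talk): GB is read as GiB, ~2B is exactly 2e9.
RAM_BYTES = 256 * 2**30
TRIPLES = 2_000_000_000

bytes_per_triple = RAM_BYTES / TRIPLES
print(f"{bytes_per_triple:.1f} bytes per triple")  # prints "137.4 bytes per triple"
```

Under those assumptions the budget works out to roughly 137 bytes per triple, indexes included.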

Scalability
                    • Stardog 1.0: scale up
                     • Disk-based joins for very large
                             intermediate structures
                           • Triples compression
                           • Ideally efficient on-disk indices
                    • Stardog 2.0: scale out (shared-disk cluster)
                    • We think it’s easier to scale a fast DB than
                           to speed up a scalable one...

Lightweight
                    • ~34 KLOC for core system, ~10 KLOC of
                           tests (1034 unit tests)
                    • Trivially simple installation:
                     • copy JAR & restart servlet container
                     • If you’ve ever used Sesame...
                    • May run: embedded, client-server; main
                           memory or disk-backed modes; any
                           combination of these

Interfaces
                    • SNARL (Stardog Native API for RDF
                           Language)
                    • Avro RPC—esp. the low-level TCP
                           transport (coming soon...)—for Java & non-
                           Java
                    • Sesame & Jena
                    • SPARQL Protocol (HTTP)

Logical Inference
                    1. OWL 2 QL, EL, and RL “query-time”
                       reasoning
                           • No materialization (so: fast bulk loading)
                           • reasoning enabled per-query
                    2. OWL 2 DL reasoning via Pellet 3.0
                           • in-memory, schema reasoning
                    3. Integrity Constraint Validation via OWL 2
                    4. user-defined & SWRL rules

OWL validation of RDF
                    • Use OWL ontologies to validate RDF
                           instance data in Stardog.
                    • May be used as a guard to database
                           modifications (so, if resulting data is invalid,
                           transaction fails).
                    • W3C Member Submission to formalize this
                           approach; stay tuned for details.
                    • See http://clarkparsia.com/pellet/icv/ for
                           details
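
The "guard" idea above can be sketched with a toy store, under loud assumptions: the `Store` class, the constraint function, and the data are all hypothetical, and nothing here is the ICV or Stardog API. The constraint is checked against the data as it would look after the write, and the write is dropped if the check fails.

```python
# Toy guard: a transaction commits only if the resulting data is valid.
# The constraint and all names are hypothetical, not the ICV API.

class ConstraintViolation(Exception):
    pass

def every_employee_has_a_manager(triples):
    """Closed-world check: each Employee needs a 'reportsTo' triple."""
    employees = {s for (s, p, o) in triples if p == "type" and o == "Employee"}
    managed = {s for (s, p, o) in triples if p == "reportsTo"}
    return employees <= managed

class Store:
    def __init__(self, constraint):
        self.triples = set()
        self.constraint = constraint

    def commit(self, additions):
        """Apply additions atomically, or raise and leave the data untouched."""
        candidate = self.triples | set(additions)
        if not self.constraint(candidate):
            raise ConstraintViolation("resulting data is invalid")
        self.triples = candidate

store = Store(every_employee_has_a_manager)
store.commit([("bob", "type", "Employee"), ("bob", "reportsTo", "alice")])
try:
    store.commit([("dave", "type", "Employee")])  # no manager: rejected
except ConstraintViolation as err:
    print("rejected:", err)
print(len(store.triples))  # still 2: the failed commit changed nothing
```

The point of the sketch is the closed-world reading: a missing `reportsTo` triple counts as a violation rather than as unknown information.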

OWL 2 Support
                    • Stardog 1.0: query-time, query rewriting
                           reasoner for SPARQL entailment regimes
                    • It will support all of OWL 2 QL, EL, and
                           RL, with exceptions:
                           • limited support for datatypes reasoning
                           • i.e., won’t support user-defined datatypes
                            • will depend on customer demand
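
To make "query-time, query rewriting" concrete, here is a minimal sketch with a made-up schema and data set. It is not Stardog's reasoner: it only shows a type query being expanded over the subclass hierarchy at query time, so the stored data never has to be materialized.

```python
# Toy illustration of query-time reasoning by query rewriting.
# All names and data here are hypothetical; this is not Stardog's reasoner.

SUBCLASS_OF = {            # schema: child class -> parent class
    "Manager": "Employee",
    "Employee": "Person",
}

DATA = {                   # instance data, stored exactly as asserted
    ("alice", "type", "Manager"),
    ("bob", "type", "Employee"),
    ("carol", "type", "Person"),
}

def subclasses_of(cls):
    """Return cls plus every class that is transitively a subclass of it."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result

def instances_of(cls, reasoning=False):
    """Answer '?x type cls'; with reasoning, rewrite into a union of patterns."""
    classes = subclasses_of(cls) if reasoning else {cls}
    return sorted(s for (s, p, o) in DATA if p == "type" and o in classes)

print(instances_of("Employee"))                  # ['bob']
print(instances_of("Employee", reasoning=True))  # ['alice', 'bob']
```

Because the `reasoning` flag is an argument of the query function, the sketch also mirrors the per-query switch mentioned on the Logical Inference slide.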

Statistical Inference
                    • Corleone is a machine learning system for
                           RDF and OWL
                    • Optimized for Stardog
                    • Multiple classifier & cluster algorithms
                    • Clusters (similarity) and classifies (predicts)
                           by RDF class & individual
                    • Machine learning must still be tuned; no
                           magic bullets


Transactions
                    • Supports optional ACID transactions on
                           database mutations
                    • 2-phase commit based on Java Transaction
                           API
                    • Tx’d writes 2x to 8x slower, depending on
                           lots of variables
                    • Writes may be asynchronous & queued

Search
                    • Indexes RDF individuals and literals
                    • Results are 2-tuples (url|value, score)
                    • Based on Lucene: very fast, very scalable
                    • Can use 1 of 6 algorithms to partition RDF
                           individuals from a graph
                           • via SPARQL DESCRIBE hook
                    • Will be integrated with SPARQL syntax...
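
The result shape described above, 2-tuples of (url|value, score), can be illustrated with a toy index. This is a sketch only: scoring is a plain term count standing in for Lucene's ranking, and the literals and example.org URLs are invented.

```python
# Toy full-text search over RDF literals returning (url, score) 2-tuples.
# Scoring is a plain term count, a stand-in for Lucene's ranking.
# The data and the example.org URLs are made up.
from collections import Counter

LITERALS = [
    ("http://example.org/doc1", "fast RDF database with fast queries"),
    ("http://example.org/doc2", "OWL reasoner"),
    ("http://example.org/doc3", "RDF graph algorithms"),
]

def search(term):
    """Return (url, score) pairs for literals containing term, best first."""
    term = term.lower()
    hits = []
    for url, text in LITERALS:
        score = Counter(text.lower().split())[term]
        if score:
            hits.append((url, score))
    return sorted(hits, key=lambda hit: -hit[1])

print(search("fast"))  # [('http://example.org/doc1', 2)]
```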

RDF as Graph
                    • SPARQL isn’t ideal for every use case
                    • Graph algorithm processing on RDF purely
                           as a graph
                    • Stardog supports Gremlin, the ad hoc
                           standard for graph database query
                           languages
                    • Gremlin makes graph algorithms easy to
                           write
                    • More optimized Gremlin support for 1.0
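
As a sketch of what "RDF purely as a graph" means, the snippet below runs a breadth-first search over triples, using subjects and objects as nodes and ignoring predicates. It is plain Python rather than Gremlin, and the triples are invented for illustration.

```python
# Treat RDF triples purely as a directed graph and run breadth-first search.
# Plain Python, not Gremlin; the triples are invented for illustration.
from collections import deque

TRIPLES = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "worksWith", "d"),
    ("a", "knows", "e"),
]

def shortest_path(start, goal):
    """Return the shortest node path from start to goal, ignoring predicates."""
    neighbours = {}
    for s, _p, o in TRIPLES:
        neighbours.setdefault(s, []).append(o)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path("a", "d"))  # ['a', 'b', 'c', 'd']
```

A shortest-path question like this is awkward to express in SPARQL 1.0, which is the kind of gap the slide has in mind.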
Implementations
                    • Architecture diagram (layers, top to bottom):
                     • Sesame, Jena, Empire
                     • Stardog API: HTTP API, Native API, Avro API
                     • Stardog Core: SPI Runtime, Transactions, Stardog RDF
                     • Query Exec, Plan API, Optimizer, Plan Filter API, Index API
                     • Query Rewriting/Reasoning, SPI
                     • CP Util, IO Util, Stardog Util, Sesame Ext



Status

                    • Stardog 0.4.6 alpha release to alpha testers
                           on 15 March 2011
                    • It feels damn good to ship code, even if it’s
                           just an alpha! :)
                    • Weekly updates till beta period starts, then
                           bimonthly updates till 1.0 release



The Private Beta
                    • Doin’ it old school: private beta, invitation
                           only
                    • Helps us keep commercial focus
                    • ~1 April to 30 May
                    • kendall@clarkparsia.com if yr interested:
                           give name, org, area of interest, etc.
                    • Rolling releases, new features, bug fixes, etc.
                    • ~90 organizations signed up for beta so far
Roadmap
                    • 1.0 in mid-Summer
                    • SPARQL 1.1, MRMW
                    • stored procedures in any JVM lang
                    • Shiro-based security layer
                    • native OWL 2 RL reasoner
                    • provenance API
                    • graph algorithms & an RDF path language
                    • performance improvements continuously
Thanks! Questions?
                    • http://stardog.com/
                    • http://clarkparsia.com/
                    • http://twitter.com/stardog_db
                    • http://twitter.com/candp




Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 

Dernier (20)

MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 

Stardog talk-dc-march-17

  • 1. Kendall Clark, CEO Clark & Parsia, LLC Thursday, March 17, 2011 1
  • 2. About C&P • We build semantic technology infrastructure and enterprise solutions • Pellet, the leading OWL reasoner • POPS Expertise Location system • Bootstrapped since 2005 • Offices in DC and Cambridge, MA • Government & enterprise customers • First talk ever was at LOC in 2005 :)
  • 4. TLDR? • Java RDF database (“quad store”) (no native code) • Freemium model: • enterprise & community editions • OEM • Performance for complex SPARQL queries • Best available reasoning support
  • 5. NoSQL and SemWeb • Semweb is schemaless and schema-rich • As agile as NoSQL stores • More expressive than SQL • Standards based • Graph DBs are all ad hoc • Query Language and, you know, joins • Do you really want to write map-reduce programs...only?! We sure don’t...!
  • 6. Why another RDF DB? • We’re scratching our itch for fast query for integration & decision support apps • aimed at db-reasoner “tweener” space • operationally agile • There’s a hole in the market; or: markets are normal distributions (probably) • Gives us a complete semantic application platform
  • 7. Commercial Market • 6 products • Technically homogeneous: • Sagan-like scale obsession • Mostly ad hoc reasoning • Weak perf on complex queries • Ho-hum feature sets & integrations • See http://bit.ly/92P8eN for more
  • 8. Stardog 1.0: Overview • Fast • Lightweight • Rich API support • Logical & statistical inference • Transactions • Full-text search • Graph algorithms and path language • awesome mascot!
  • 9. Fast? No, Really Fast! • First design goal in Stardog is performance of complex SPARQL query eval on a single machine in the default configuration • Next, total queries per second • In-memory mode available, when needed • Early testing is promising: fastest RDF DB on SP2B benchmark. Often several times faster.
  • 10. Performance • Do your own testing; the only queries that matter are yours; don’t trust, test. • It’s not ready till it’s very, very fast. • Flatten the RDF performance tax • About 256 GB for ~2B triples in main-memory mode, i.e., $20k Dell box. • When in doubt: Add. More. RAM.
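
The sizing figure on this slide implies a rough per-triple memory footprint. A minimal worked calculation, assuming decimal units (1 GB = 10^9 bytes) and that memory use scales linearly with triple count; neither assumption is stated in the talk, and the helper names are hypothetical:

```python
# Figures taken from the slide; decimal units are an assumption.
RAM_BYTES = 256 * 10**9   # "about 256 GB"
TRIPLES = 2 * 10**9       # "~2B triples"


def bytes_per_triple(ram_bytes: int, triples: int) -> float:
    """Average main-memory footprint per triple."""
    return ram_bytes / triples


def ram_needed_gb(triples: int, per_triple: float) -> float:
    """RAM in decimal GB for a dataset, assuming linear scaling."""
    return triples * per_triple / 10**9


per_triple = bytes_per_triple(RAM_BYTES, TRIPLES)
print(per_triple)                              # 128.0 bytes per triple
print(ram_needed_gb(500 * 10**6, per_triple))  # 64.0 GB for 500M triples
```

This is only a planning estimate; real footprints depend on literal sizes and index layout, which is why the slide's advice to test with your own data still applies.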
  • 11. Scalability • Stardog 1.0: scale up • Disk-based joins for very large intermediate structures • Triples compression • Ideally efficient on-disk indices • Stardog 2.0: scale out (shared-disk cluster) • We think it’s easier to scale a fast DB than to speed up a scalable one...
  • 12. Lightweight • ~34 KLOC for core system, ~10 KLOC of tests (1034 unit tests) • Trivially simple installation: • copy JAR & restart servlet container • If you’ve ever used Sesame... • May run: embedded, client-server; main memory or disk-backed modes; any combination of these
  • 13. Interfaces • SNARL (Stardog Native API for RDF Language) • Avro RPC, esp. the low-level TCP transport (coming soon...), for Java & non-Java • Sesame & Jena • SPARQL Protocol (HTTP)
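
Of the interfaces listed, the SPARQL Protocol is a W3C standard: a query can be sent as an HTTP GET with the query text in a `query` parameter. A minimal sketch of building such a request with the Python standard library; the endpoint URL is a hypothetical placeholder, not a documented Stardog address, and no network call is made:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://localhost:8080/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol query-via-GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, QUERY)
print(req.get_method())  # GET
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the query; that step is left out so the sketch runs without a server.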
  • 14. Logical Inference 1. OWL 2 QL, EL, and RL “query-time” reasoning • No materialization (so: fast bulk loading) • reasoning enabled per-query 2. OWL 2 DL reasoning via Pellet 3.0 • in-memory, schema reasoning 3. Integrity Constraint Validation via OWL 2 4. user-defined & SWRL rules
  • 15. OWL validation of RDF • Use OWL ontologies to validate RDF instance data in Stardog. • May be used as a guard to database modifications (so, if resulting data is invalid, transaction fails). • W3C Member Submission to formalize this approach; stay tuned for details. • See http://clarkparsia.com/pellet/icv/ for details
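
The "guard" idea on this slide can be illustrated with a toy in-memory store: a change is applied only if the resulting data passes every constraint. This is a sketch under stated assumptions, not Stardog's ICV implementation; the constraint is a plain Python function standing in for an OWL axiom, and all names (`guarded_add`, `every_employee_has_name`, the `:Employee` and `:name` terms) are hypothetical:

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]
Constraint = Callable[[Set[Triple]], List[str]]


class InvalidData(Exception):
    """Raised when a change would leave the data violating a constraint."""


def every_employee_has_name(data: Set[Triple]) -> List[str]:
    """Toy constraint: each :Employee must have at least one :name."""
    employees = {s for s, p, o in data if p == "rdf:type" and o == ":Employee"}
    named = {s for s, p, o in data if p == ":name"}
    return [f"{e} has no :name" for e in sorted(employees - named)]


def guarded_add(store: Set[Triple], additions: Set[Triple],
                constraints: List[Constraint]) -> Set[Triple]:
    """Return the updated store, or raise and leave the caller's store as-is."""
    candidate = store | additions
    violations = [v for check in constraints for v in check(candidate)]
    if violations:
        raise InvalidData("; ".join(violations))
    return candidate


rules = [every_employee_has_name]
store: Set[Triple] = set()
store = guarded_add(store, {(":alice", "rdf:type", ":Employee"),
                            (":alice", ":name", "Alice")}, rules)
try:
    store = guarded_add(store, {(":bob", "rdf:type", ":Employee")}, rules)
except InvalidData as err:
    print("rejected:", err)  # rejected: :bob has no :name
```

The failed update leaves `store` untouched, which mirrors the slide's point that an invalid result makes the transaction fail.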
  • 16. OWL 2 Support • Stardog 1.0: query-time, query rewriting reasoner for SPARQL entailment regimes • It will support all of OWL 2 QL, EL, and RL, with exceptions: • limited support for datatypes reasoning • i.e., won’t support user-defined datatypes • will depend on customer demand
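
Query rewriting answers a query under a schema by expanding the query itself rather than materializing inferred triples at load time. A minimal sketch that handles only a subclass hierarchy, which is far less than the OWL 2 profiles named above and is not Stardog's actual algorithm; the schema, class names, and function names are hypothetical:

```python
from typing import Dict, List, Set

# Hypothetical toy schema: class -> its direct superclasses.
SUBCLASS_OF: Dict[str, List[str]] = {
    ":Manager": [":Employee"],
    ":Employee": [":Person"],
}


def subclasses_of(cls: str) -> Set[str]:
    """Return cls plus every class that is transitively a subclass of it."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parents in SUBCLASS_OF.items():
            if child not in found and any(p in found for p in parents):
                found.add(child)
                changed = True
    return found


def rewrite_type_query(var: str, cls: str) -> str:
    """Rewrite '?var a cls' into a UNION over cls and all its subclasses."""
    branches = [f"{{ {var} a {c} }}" for c in sorted(subclasses_of(cls))]
    return f"SELECT {var} WHERE {{ " + " UNION ".join(branches) + " }"


query = rewrite_type_query("?x", ":Person")
print(query)
# SELECT ?x WHERE { { ?x a :Employee } UNION { ?x a :Manager } UNION { ?x a :Person } }
```

Because the expansion happens per query, the stored data stays exactly as loaded, which is consistent with the earlier note that skipping materialization keeps bulk loading fast.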
  • 17. Statistical Inference • Corleone is a machine learning system for RDF and OWL • Optimized for Stardog • Multiple classifier & cluster algorithms • Clusters (similarity) and classifies (predicts) by RDF class & individual • Machine learning must still be tuned; no magic bullets
  • 18. Transactions • Supports optional ACID transactions on database mutations • 2-phase commit based on Java Transaction API • Tx’d writes 2x to 8x slower, depending on lots of variables • Writes may be asynchronous & queued
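
Two-phase commit, mentioned above, has a coordinator collect a vote from every participant before any of them commits. The sketch below shows only the protocol shape in Python; it does not use or model the Java Transaction API, and the `Participant` class and resource names are hypothetical:

```python
from typing import List


class Participant:
    """Hypothetical resource (for example, one index) taking part in a commit."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        """Phase 1: vote yes only if the change can be made durable."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit everywhere if all participants vote yes, else roll back everywhere."""
    votes = [p.prepare() for p in participants]  # phase 1: collect every vote
    if all(votes):
        for p in participants:                   # phase 2: unanimous yes
            p.commit()
        return True
    for p in participants:                       # phase 2: at least one no
        p.rollback()
    return False


good = [Participant("data-index"), Participant("text-index")]
bad = [Participant("data-index"), Participant("text-index", can_commit=False)]
print(two_phase_commit(good), [p.state for p in good])  # True ['committed', 'committed']
print(two_phase_commit(bad), [p.state for p in bad])    # False ['aborted', 'aborted']
```

The extra round of voting is one plausible reason transactional writes cost more than plain writes, though the slide attributes the 2x to 8x range to many variables.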
  • 19. Search • Indexes RDF individuals and literals • Results are 2-tuples (url|value, score) • Based on Lucene: very fast, very scalable • Can use 1 of 6 algorithms to partition RDF individuals from a graph • via SPARQL DESCRIBE hook • Will be integrated with SPARQL syntax...
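
The result shape described above, a list of (url|value, score) pairs, can be illustrated with a toy ranker. The scoring here is simple term overlap, a stand-in chosen for brevity and not Lucene's scoring; the index contents and URLs are hypothetical:

```python
from typing import Dict, List, Tuple

# Hypothetical index: individual URL -> literal text attached to it.
LITERALS: Dict[str, str] = {
    "http://example.org/pellet": "Pellet OWL reasoner",
    "http://example.org/stardog": "Stardog RDF database with OWL reasoning",
    "http://example.org/pops": "POPS expertise location system",
}


def search(query: str) -> List[Tuple[str, float]]:
    """Return (url, score) pairs, best first; score = fraction of query terms matched."""
    terms = set(query.lower().split())
    if not terms:
        return []
    hits = []
    for url, text in LITERALS.items():
        words = set(text.lower().split())
        score = len(terms & words) / len(terms)
        if score > 0:
            hits.append((url, score))
    # Highest score first; ties broken by URL for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


results = search("owl reasoner")
print(results)
# [('http://example.org/pellet', 1.0), ('http://example.org/stardog', 0.5)]
```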
  • 20. RDF as Graph • SPARQL isn’t ideal for every use case • Graph algorithm processing on RDF purely as a graph • Stardog supports Gremlin, the ad hoc standard for graph database query languages • Gremlin makes graph algorithms easy to write • More optimized Gremlin support for 1.0
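
Treating RDF "purely as a graph" means reading each triple as a directed edge from subject to object. A minimal sketch of one graph algorithm, breadth-first shortest path, over a hypothetical triple list; it is plain Python, not Gremlin, and ignores predicate labels for simplicity:

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: each triple is an edge subject -> object.
TRIPLES: List[Triple] = [
    (":a", ":knows", ":b"),
    (":b", ":worksWith", ":c"),
    (":c", ":knows", ":d"),
    (":a", ":knows", ":e"),
]


def shortest_path(triples: List[Triple], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over subject -> object edges; None if unreachable."""
    adjacency: Dict[str, List[str]] = {}
    for s, _p, o in triples:
        adjacency.setdefault(s, []).append(o)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path(TRIPLES, ":a", ":d"))  # [':a', ':b', ':c', ':d']
print(shortest_path(TRIPLES, ":d", ":a"))  # None (edges are directed)
```

Expressing this kind of traversal in SPARQL is awkward, which is the motivation the slide gives for a dedicated graph language.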
  • 21. Implementations • [Architecture diagram. Box labels, in extraction order: Sesame, Jena, Empire; Stardog API; HTTP API, Native API, Avro API; Stardog Core; SPI; Runtime; Transactions; Stardog RDF; Query Exec; Plan API; Query Rewriting/Optimizer; Reasoning; Plan; Filter API; Index API; SPI; CP Util; IO Util; Stardog Util; Sesame Ext]
  • 22. Status • Stardog 0.4.6 alpha release to alpha testers on 15 March 2011 • It feels damn good to ship code, even if it’s just an alpha! :) • Weekly updates till beta period starts, then bimonthly updates till 1.0 release
  • 23. The Private Beta • Doin’ it old school: private beta, invitation only • Helps us keep commercial focus • ~1 April to 30 May • kendall@clarkparsia.com if you’re interested: give name, org, area of interest, etc. • Rolling releases, new features, bug fixes, etc. • ~90 organizations signed up for beta so far
  • 24. Roadmap • 1.0 in mid-Summer • SPARQL 1.1, MRMW • stored procedures in any JVM lang • Shiro-based security layer • native OWL 2 RL reasoner • provenance API • graph algorithms & an RDF path language • performance improvements continuously
  • 25. Thanks! Questions? • http://stardog.com/ • http://twitter.com/stardog_db • http://clarkparsia.com/ • http://twitter.com/candp