LibreCat::Catmandu
Philosophy
Kahn-Wilensky Web
[Diagram: a user ("I search a paper about...") with the components Handle, Repository, Repository, Service Provider]
Hypothesis 1: One Network & Common Schema
Hypothesis 2: Object Oriented Design
Hypothesis 3: The Resource is the Message
Hypothesis 1: One Network & Common Schema
[Diagram: OpenAIRE, Google, Europeana, CRIS, Student Papers, PhD, Inst. Repository, Digitized Material, Data Sets]
Hypothesis 2: Object Oriented Design

 • (Complex) Digital Objects
 • Unique identifiers
 • Key-Metadata + 1 or more data streams
 • Metadata schemes invariant over repositories
 • Specialized data types + service bindings on these types

 • Fragmented input
 • Incomplete
 • Metadata, Files, Metadata + Files
 • Many schemas
 • Simple data types (String, Array, Map); many native functions on these types
Hypothesis 3: The Resource is the Message
[Diagram: Researcher, DNS, Dr Müller, Repository, Google, ISI, PubMed, Researcher Homepage]
Extract, Transform & Load
[Diagram: Fulltext Search, Relational Database, Report, Excel]
Schemaless databases

    $store->add({ ... });
    $store->search({ ... });
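The two calls on this slide can be illustrated with a small in-memory stand-in. This is only a sketch: `TinyStore` and its exact-match `search` are hypothetical and written for this example, they are not Catmandu classes, and the sample records are made up.

```perl
use strict;
use warnings;

# TinyStore is a hypothetical in-memory stand-in, not a Catmandu class.
package TinyStore;

sub new { my ($class) = @_; return bless { records => [] }, $class }

# Accept any hash; no table definition or schema is declared up front.
sub add {
    my ($self, $record) = @_;
    push @{ $self->{records} }, $record;
    return $record;
}

# Return the records whose fields equal every key/value pair in the query.
sub search {
    my ($self, $query) = @_;
    my @hits = grep {
        my $rec = $_;
        !grep { !defined($rec->{$_}) || $rec->{$_} ne $query->{$_} } keys %$query;
    } @{ $self->{records} };
    return \@hits;
}

package main;

my $store = TinyStore->new;
$store->add({ title => 'Krank', subject => 'cycle' });
$store->add({ title => 'Other', year => 2011 });    # different fields are fine
my $hits = $store->search({ subject => 'cycle' });
print scalar(@$hits), " hit(s): $hits->[0]{title}\n";
```

The point of the sketch is that records with different fields sit side by side in the same store, which is what "schemaless" refers to here.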
Copy & Paste

Functional Style of Programming on native hashes and arrays

“It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.”
Alan J. Perlis
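A minimal sketch of that style in plain Perl, using only the built-ins `grep`, `map` and `sort`. The records and field names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical records: plain hashes in a plain array, no classes involved.
my @records = (
    { title => 'Open repositories', year => 2010, authors => ['Ann', 'Bob'] },
    { title => 'Schemaless stores', year => 2011, authors => ['Cy'] },
    { title => 'Linked data',       year => 2011, authors => [] },
);

# grep filters, map transforms, sort orders. The same few built-ins work
# on every record because every record is just a hash.
my @recent = grep { $_->{year} >= 2011 } @records;
my @titles = sort { $a cmp $b } map { uc $_->{title} } @recent;

# Count authors across all records, again with nothing but native data.
my $author_count = 0;
$author_count += scalar @{ $_->{authors} } for @records;

print join(', ', @titles), "\n";
print "authors: $author_count\n";
```

Because nothing here depends on a record class, the same three built-ins keep working when a record gains or loses a field.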
Project Catmandu
Download, Install & Play


• Perl
• http://librecat.org
• cpan Catmandu
• https://github.com/LibreCat/Catmandu
Anatomy of Search
[Diagram: database, export, convert, filter/map, fix, index, store, display, with index def, fix def and field def]
Store
Schemaless storage
[Diagram: JSON, store, Elasticsearch, Solr, MongoDB, DBI]

    $store->add({})
    $store->search()

    title:"krank" and subject.local:"cycle"
    title any "krank" and subject.local any "cycle"
Import
Import from many sources
[Diagram: Atom, LDAP, DBI, JSON, MARC, OAI, SRU, importer]

    $store->each({})
    $store->first()
    $store->rest()
    $store->select({})
    $store->any({})
    $store->many({})
ETL
[Diagram: JSON, fix, fix, JSON]

    upcase('job');
    capitalize('first');
    capitalize('last');
    capitalize('my.deep.nested.0');

    upcase('my.deep.nested.0');
    downcase('my.deep.nested.0');
    substring('my.deep.nested.0',0,2);

    add_field('test');
    add_field('income',0);
    add_field('a.0.0.0',1);

    marc_map('100','my.authors.$append');
    marc_map('710','my.authors.$append');
    marc_map('600x','my.subjects.$append');
    marc_map('008_/35-37','my.language');

    join_field('colors.0','/');
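How a dotted path such as 'my.deep.nested.0' can address a slot inside nested hashes and arrays is sketched below in plain Perl. This is not Catmandu's implementation: `slot_for` is a hypothetical helper, the two functions only borrow the names `upcase` and `add_field` from the slide, and the sketch makes the simplifying assumption that every all-digit step is an array index.

```perl
use strict;
use warnings;

# Hypothetical helper, not Catmandu code: walk a dotted path and return a
# reference to the addressed slot, creating missing containers on the way.
# Assumption: all-digit steps index arrays, every other step is a hash key.
sub slot_for {
    my ($data, $path) = @_;
    my $ref = \$data;
    for my $step (split /\./, $path) {
        if ($step =~ /^\d+$/) {
            $$ref = [] unless ref($$ref) eq 'ARRAY';
            $ref = \$$ref->[$step];
        }
        else {
            $$ref = {} unless ref($$ref) eq 'HASH';
            $ref = \$$ref->{$step};
        }
    }
    return $ref;
}

# Set the slot at $path to $value.
sub add_field {
    my ($data, $path, $value) = @_;
    ${ slot_for($data, $path) } = $value;
    return $data;
}

# Uppercase the slot at $path if it holds a plain string.
sub upcase {
    my ($data, $path) = @_;
    my $slot = slot_for($data, $path);
    $$slot = uc $$slot if defined($$slot) && !ref($$slot);
    return $data;
}

# Made-up record for illustration.
my $record = { 'job' => 'librarian', 'my' => { deep => { nested => ['value'] } } };
upcase($record, 'job');
upcase($record, 'my.deep.nested.0');
add_field($record, 'income', 0);
add_field($record, 'a.0.0.0', 1);

print "$record->{job} $record->{my}{deep}{nested}[0] $record->{a}[0][0][0]\n";
```

Each function takes the record and returns it, so calls can be chained one after another over the same hash, in the spirit of the list of fixes above.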
MVC
Dancer

    app/
     |-- bin/
     |-- public/
     |   |-- images/
     |   |-- css/
     |   `-- javascript/
     |-- views/
     |   `-- hello.tt
     `-- environments/

hello.tt

    <html>
    <body>
    <h1>[% txt %]</h1>
    [% FOREACH obj IN res %]
    <p>[% obj.title %]</p>
    [% END %]
    </body>
    </html>

    get '/hello' => sub {
        my $res = store->bag
                    ->search(query => ...)
                    ->reduce( ... );
        template 'hello', { res => $res, txt => "Hello, World!" };
    };
Project LibreCat
http://lup.lub.lu.se
[Diagram: ISI, PubMed, LibreCat Catalog, Fedora Store, Mongo Store]
http://pub.uni-bielefeld.de/
[Diagram: Any Store, SRU/REST/Lucene/JSON, Fix, CSL]
http://biblio.ugent.be
[Diagram: Any Store, SRU/REST/Lucene/JSON, Fix, Mongo, Elasticsearch, LDAP, JCR, Project, CRIS]
http://adore.ugent.be
[Diagram: Aleph Store, Fedora Store, Fix, MySQL, SOLR, IIPImage]
Project Plan

• Catmandu: Open Source Data Toolkit
• LibreCat: Example Programs:
  • LibreCat-Search, LibreCat-Citation,
    LibreCat-Grim, LibreCat-Archive ...

• Suite of repository add-ons:
  • Project Database, Research Groups,
    Authority Files
Thanks!

• Nicolas Steenlant - Ghent
• Nicolas Franck - Ghent
• Patrick Hochstenbach - Ghent
• Snorri Briem - Lund
• Dave Sherohman - Lund
• Jörgen Eriksson - Lund
• Maria Hedberg - Lund
• Friedrich Summann - Bielefeld
• Najko Jahn - Bielefeld
• Vitali Peil - Bielefeld
• Petra Kohorst - Bielefeld

http://librecat.org

