Cluj Napoca, 28 August 2008
2008 IEEE International Conference on Intelligent Computer Communication and Processing
Digital Libraries Workshop
Towards a GRID-Based Digital Library Management System
Gheorghe Sebestyén-Pál (1), Doina Banciu (2), Tünde Bálint (1), Bogdan Moscaiuc (1), and Ágnes Sebestyén-Pál (1)
(1) Technical University of Cluj-Napoca
(2) ICI Bucharest
Debrecen, 3-5 September 2008, DAPSYS’08
7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS
Content
• Classical vs. Digital Libraries
• Recent research on Digital Libraries (DL)
• Main issues and requirements for DLs
• An ontology-based DL model
• Grid-enabled DL
• Implementation considerations of a pilot DL
• Experiments
• Conclusions
Classical vs. Digital Libraries
• Classical library
  • A repository of knowledge organized mainly on paper
• Digital library
  • Not only a digitized version of a classical library
  • Adds a new set of functionalities and services (e.g. access control, resource management and allocation, complex search and processing services)
  • A data exchange and cooperation environment
  • DLs are becoming digital content management systems
  • Incorporates a wide variety of formats and data types (text, audio, video, multi-document complex digital objects)
  • Uses a variety of communication and data-exchange protocols and standards
IT and communication technologies involved in the implementation of digital libraries
http://mapageweb.umontreal.ca/turner/meta/english/metamap.html
Goals for modern DLs
• DELOS project’s vision:
  • “to enable any person to access all human knowledge anytime and anywhere, in a friendly, multi-modal, efficient, and effective way, by overcoming barriers of distance, language, and culture and by using multiple Internet-connected devices”
• DL: a knowledge repository and an information exchange infrastructure that allows:
  • data generation,
  • processing, and
  • seamless access to relevant information, regardless of the geographic distribution of hardware resources, databases or persons.
Research in digital libraries
• DELOS Network of Excellence
  • Goals: to define and implement digital libraries on new computing and communication technologies
  • Achievements: definition of functional and architectural requirements for DL implementation
• BRICKS project
  • Goals: to design a user- and service-oriented space to share knowledge and resources in the multi-cultural heritage domain
  • Achievements: definition of a digital library architecture for a very broad and heterogeneous user community; automatic indexing and annotation functionalities
• OpenDLib project
  • Goal: development of a software toolkit for generating dedicated DLs
  • Achievements: tools for content harvesting from existing resources
• Fedora, DSpace: open-source software for DLs
• Lucene: open-source search engine library
Research in digital libraries (cont.)
• Diligent project (part of the EGEE project)
  • Goal: the use of GRID infrastructure for DL implementation
  • Achievements: a new vision of the DL concept:
    • DL = a dynamic digital content repository and management system dedicated to a purpose (e.g. a project, an art collection, an academic course)
    • Definition of generic DL services mapped onto GRID services
    • DLs dedicated to different domains, with powerful processing capabilities
• SINRED project (National Excellence project)
  • Goal: development of a national framework for DLs specialized in technical sciences and research
  • Achievements: evaluation of requirements, evaluation of existing software, infrastructure development, DL model definition, implementation of a pilot DL
• SIPADOC project (National research program)
  • Goal: reevaluation of the national patrimony through DLs
  • Achievements: evaluation of digitizing tools
Key issues in DL implementation
• Architectural issues:
  • Distributed nature of storage, processing and access resources
  • Scalability, flexibility, interoperability
• Functional requirements:
  • Core functions: storage, indexing and annotation, data search, content retrieval, user management
  • Content organization should reflect semantic connections
• Processing facilities
  • Data processing services specialized for different fields
  • Pattern search and recognition
• QoS issues
  • Limited time to obtain relevant information
  • Reasonable time for complex data processing
• User and access control management
  • Virtual organizations
  • Role-based access
DL = Essence & Metadata Management
[Diagram. Content types: text, audio, video. Block labels: digital content generation and harvesting; management of essence; automatic feature (metadata) extraction; metadata management; cataloging, indexing, annotation; access and visualization; cataloging information system.]
An ontology-based Digital Library approach
• Ontology: concepts and relations together with a reasoning engine
• Ontology for technical and scientific domains
• Main concepts:
  • Digital objects
    • Association of content, metadata and procedures
    • Examples: articles, technical reports, prospects, PhD theses, patents
  • Digital collections
    • Set of digital objects structured for a given goal/purpose or based on a given criterion
    • Examples: articles of an author, documents of a domain
  • Events
    • Conferences, workshops, seminars
  • Processes
    • Projects
    • Courses
  • Virtual organizations
    • Roles
    • Users
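The two central concepts of the ontology slide, digital objects and digital collections, can be illustrated with a small data model. This is only a sketch under stated assumptions: the class names, field names, the `collect_by` helper and the sample records are hypothetical and not taken from the pilot system. The metadata keys borrow Dublin Core element names (`creator`, `type`), and the "procedures" part of a digital object is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Content plus metadata; the 'procedures' part is omitted in this sketch."""
    identifier: str
    content_uri: str
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class DigitalCollection:
    """A set of digital objects grouped for a purpose or by a criterion."""
    name: str
    objects: list[DigitalObject] = field(default_factory=list)


def collect_by(objects, key, value, name):
    """Build a collection from every object whose metadata[key] equals value."""
    selected = [o for o in objects if o.metadata.get(key) == value]
    return DigitalCollection(name=name, objects=selected)


# Hypothetical sample records, for illustration only.
papers = [
    DigitalObject("obj-1", "file:///tmp/a.pdf", {"creator": "A. Author", "type": "article"}),
    DigitalObject("obj-2", "file:///tmp/b.pdf", {"creator": "B. Writer", "type": "patent"}),
    DigitalObject("obj-3", "file:///tmp/c.pdf", {"creator": "A. Author", "type": "technical report"}),
]

# "Articles of an author" is one of the collection examples on the slide.
by_author = collect_by(papers, "creator", "A. Author", "Documents of A. Author")
```

Building a collection from a metadata criterion, rather than by listing members by hand, mirrors the "based on a given criterion" wording of the slide.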
Grid-enabled digital library services
• Why DLs on GRID infrastructure?
  • Huge volume of documents/digital objects
  • Concurrent access and multiple search engines (see Google)
  • Multimedia streaming
  • Automatic indexing and annotation
  • Complex processing requires prohibitive time
  • User management through virtual organizations
  • Job distribution facilities offered by GRID
DL functions mapped on GRID services
[Diagram. Layers: Digital Library; GRID Services; computing, storage and communication resources. Box labels: collections management; catalog and metadata management; digital objects management; users’ management; data visualization; virtual organizations management; resource management; task distribution; processing; data distribution and replication; data processing.]
Experiments
• Two approaches:
  • DL implementation on Alchemi GRID (Microsoft .NET-based)
    • Job distribution at thread level
    • Explicit GRID programming
    • Experiments with multimedia streaming (multimedia content distribution)
  • DL implementation on Condor GRID (open source)
    • Job distribution at task level
    • Job and data distribution is transparent to the DL application (distribution is done through separate scripts)
    • Experiments with keyword search over the whole DL content
      • The execution time decreased as the number of executor computers increased
      • For more than 5 executors, the scheduling and communication time becomes comparable with the execution time
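The task-level keyword search experiment can be pictured as splitting the document set into one chunk per executor, searching each chunk independently, and merging the partial results. The sketch below is an illustration only: it uses a local thread pool as a stand-in for grid executors, it does not use Condor or Alchemi, and the function names and sample documents are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_round_robin(doc_ids, n_executors):
    """Deal document ids into n_executors chunks of near-equal size."""
    return [doc_ids[i::n_executors] for i in range(n_executors)]


def search_chunk(documents, chunk, keyword):
    """Return ids in this chunk whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [doc_id for doc_id in chunk if needle in documents[doc_id].lower()]


def distributed_search(documents, keyword, n_executors):
    """Run one search task per chunk and merge the partial results."""
    chunks = split_round_robin(sorted(documents), n_executors)
    with ThreadPoolExecutor(max_workers=n_executors) as pool:
        partials = pool.map(lambda c: search_chunk(documents, c, keyword), chunks)
        return sorted(hit for part in partials for hit in part)


# Hypothetical sample content, for illustration only.
documents = {
    "d1": "Grid computing for digital libraries",
    "d2": "Metadata harvesting with Dublin Core",
    "d3": "Scheduling overhead on a GRID",
    "d4": "Ontology-based semantic search",
}
hits = distributed_search(documents, "grid", n_executors=2)
```

Each chunk is searched independently, so the search work per executor shrinks as executors are added, while the cost of creating and collecting tasks does not. That is consistent with the slide's observation that scheduling and communication time becomes comparable with execution time beyond 5 executors.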
A pilot implementation of a Digital library framework developed with GRID support
• Goal: implementation of a digital content storage and retrieval system dedicated to educational and scientific activities (courses, projects, etc.)
• Main requirements:
  • A DL adaptable to a given purpose/goal
  • Access controlled and restricted through virtual organizations
  • Ontology-based approach (concepts, relations, semantic search)
  • Advanced search procedures
  • GRID-enabled full-text search services, for better response time
  • Access through Internet browsers
• The result:
  • A distributed digital library application, which allows:
    • Management of digital objects (upload, storage, indexing, metadata creation)
    • Management of collections
    • Management of users and virtual organizations
Pilot DL details:
(www.bib-dig.utcluj.ro)
• Management of digital objects
  • Upload of digital documents
  • Annotation and metadata generation according to Dublin Core
  • Distributed storage of data
• Management of collections
  • Define a new collection
  • Attach new documents to an existing collection
  • Associate access rights with a collection
• Management of users and virtual organizations
  • Define new users and new virtual organizations
  • Define roles
  • Associate roles with users and collections
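The association of roles with users and collections can be sketched as a simple lookup: a user holds a role inside a virtual organization, and a collection lists the actions each role may perform. The classes, the `is_allowed` function, and the sample names below are hypothetical and only illustrate the idea. They are not the pilot system's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganization:
    """Hypothetical VO record: members and the role each member holds."""
    name: str
    member_roles: dict[str, str] = field(default_factory=dict)  # user -> role


@dataclass
class Collection:
    """Hypothetical collection record: which roles may perform which actions."""
    name: str
    rights: dict[str, set[str]] = field(default_factory=dict)  # role -> actions


def is_allowed(vo, collection, user, action):
    """A user may act on a collection only through a role held in the VO."""
    role = vo.member_roles.get(user)
    if role is None:
        return False  # not a member of this virtual organization
    return action in collection.rights.get(role, set())


# Hypothetical course setup, for illustration only.
course_vo = VirtualOrganization("course-vo", {"alice": "teacher", "bob": "student"})
lectures = Collection("lectures", {"teacher": {"read", "upload"}, "student": {"read"}})

can_upload = is_allowed(course_vo, lectures, "alice", "upload")
```

Keeping rights on the role rather than on the individual user means that adding a user to a virtual organization is enough to grant the matching access to every collection that names that role.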
Snapshots of the DL application’s interface
bib-dig.utcluj.ro
Search techniques in DLs
• Through keyword or index search:
  • Database techniques
• Through semantic information retrieval:
  • Semantic graph with documents and concepts
• Through non-semantic information retrieval:
  • Naive Bayes algorithm
    • Probabilistic approach
    • Based on probabilistic similarity between documents
  • Topic-Based Vector Space Model algorithm
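As an illustration of the probabilistic approach named on the slide, the following is a minimal multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is a generic textbook sketch, not the authors' implementation. The training sentences, class labels and function names are made up for the example, and tokenization is a plain whitespace split.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count words per class and documents per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior under add-one smoothing."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
training = [
    ("grid scheduling job distribution", "grid"),
    ("condor job executor nodes", "grid"),
    ("metadata catalog dublin core", "metadata"),
    ("annotation indexing metadata", "metadata"),
]
model = train(training)
label = classify("job scheduling on executor nodes", model)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing keeps a single unseen word from zeroing out a class.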
Experimental results
[Chart: execution time vs. number of executor nodes. X axis: nodes (1 to 5). Y axis: time (s), 0 to 8000. Series: search execution time; scheduling and communication time (case 1 and case 2); total time (case 1 and case 2).]
Conclusions
• DLs are complex content management systems that extend the functionalities of classical libraries:
  • Semantic organization of a wide variety of information formats
  • Multiple search and data retrieval techniques (including full-text and semantic search):
    • Keyword full-text search
    • Semantic search
    • Statistical and probabilistic retrieval and classification
  • Access control to distributed and remote data
• DLs are data exchange and cooperation environments
  • Useful for remote and cooperative work
• DLs must include powerful search and data retrieval engines
• GRID infrastructures may be a feasible support for the implementation of DLs
  • For more efficient parallel search, classification or automatic annotation
Thank you for your attention
Questions?

Contenu connexe

Tendances

User Focused Digital Library: A Practical Guide
User Focused Digital Library: A Practical GuideUser Focused Digital Library: A Practical Guide
User Focused Digital Library: A Practical Guide
Sophia Guevara
 
The open semantic enterprise enterprise data meets web data
The open semantic enterprise   enterprise data meets web dataThe open semantic enterprise   enterprise data meets web data
The open semantic enterprise enterprise data meets web data
Georg Guentner
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies
Shriram Pandey
 

Tendances (19)

Metadata and Scotland’s information environment: potential benefits of Web 2.0
Metadata and Scotland’s information environment: potential benefits of Web 2.0Metadata and Scotland’s information environment: potential benefits of Web 2.0
Metadata and Scotland’s information environment: potential benefits of Web 2.0
 
Dlindia
DlindiaDlindia
Dlindia
 
User Focused Digital Library: A Practical Guide
User Focused Digital Library: A Practical GuideUser Focused Digital Library: A Practical Guide
User Focused Digital Library: A Practical Guide
 
Hartley Presentation on Cataloging & Metadata Trends
Hartley Presentation on Cataloging & Metadata TrendsHartley Presentation on Cataloging & Metadata Trends
Hartley Presentation on Cataloging & Metadata Trends
 
New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...New challenges for digital scholarship and curation in the era of ubiquitous ...
New challenges for digital scholarship and curation in the era of ubiquitous ...
 
Rebecca Grant - DRI/ARA(I) Training: Introduction to EAD - Metadata and Metad...
Rebecca Grant - DRI/ARA(I) Training: Introduction to EAD - Metadata and Metad...Rebecca Grant - DRI/ARA(I) Training: Introduction to EAD - Metadata and Metad...
Rebecca Grant - DRI/ARA(I) Training: Introduction to EAD - Metadata and Metad...
 
Digital library presentation
Digital library presentationDigital library presentation
Digital library presentation
 
Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)
 
Digital library
Digital libraryDigital library
Digital library
 
Introduction to Digital libraries
Introduction to Digital librariesIntroduction to Digital libraries
Introduction to Digital libraries
 
DIGITAL LIBRARY ARCHITECTURE
DIGITAL LIBRARY ARCHITECTUREDIGITAL LIBRARY ARCHITECTURE
DIGITAL LIBRARY ARCHITECTURE
 
Digital library
Digital libraryDigital library
Digital library
 
Website designing company_in_delhi_digitization practices
Website designing company_in_delhi_digitization practicesWebsite designing company_in_delhi_digitization practices
Website designing company_in_delhi_digitization practices
 
The open semantic enterprise enterprise data meets web data
The open semantic enterprise   enterprise data meets web dataThe open semantic enterprise   enterprise data meets web data
The open semantic enterprise enterprise data meets web data
 
Digital Library Initiatives in Philippine Academic Libraries: the Rizal Libra...
Digital Library Initiatives in Philippine Academic Libraries: the Rizal Libra...Digital Library Initiatives in Philippine Academic Libraries: the Rizal Libra...
Digital Library Initiatives in Philippine Academic Libraries: the Rizal Libra...
 
Digital Library
Digital LibraryDigital Library
Digital Library
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies
 
Digital Libraries and the quest for information curation
Digital Libraries and the quest for information curationDigital Libraries and the quest for information curation
Digital Libraries and the quest for information curation
 
Digital Library UNIT-3
Digital Library UNIT-3Digital Library UNIT-3
Digital Library UNIT-3
 

En vedette (10)

Kumra (1)
Kumra (1)Kumra (1)
Kumra (1)
 
lib notes
lib noteslib notes
lib notes
 
Some Differences between Questionnaire Types
Some Differences between Questionnaire TypesSome Differences between Questionnaire Types
Some Differences between Questionnaire Types
 
Questionnaires
QuestionnairesQuestionnaires
Questionnaires
 
Questionnaire designing in a research process
Questionnaire designing in a research processQuestionnaire designing in a research process
Questionnaire designing in a research process
 
Questionnaire Design
Questionnaire DesignQuestionnaire Design
Questionnaire Design
 
steps in Questionnaire design
steps in Questionnaire designsteps in Questionnaire design
steps in Questionnaire design
 
Presentation On Questionnaire
Presentation On QuestionnairePresentation On Questionnaire
Presentation On Questionnaire
 
Questionnaire
QuestionnaireQuestionnaire
Questionnaire
 
Questionnaire Design
Questionnaire DesignQuestionnaire Design
Questionnaire Design
 

Similaire à Dapsys08 dl on_grid

Developments in Access to Art Information: EnCompass Digital Portal. 2003
Developments in Access to Art Information: EnCompass Digital Portal. 2003Developments in Access to Art Information: EnCompass Digital Portal. 2003
Developments in Access to Art Information: EnCompass Digital Portal. 2003
Rose Holley
 
20080903arsenalsofnemesis 04
20080903arsenalsofnemesis 0420080903arsenalsofnemesis 04
20080903arsenalsofnemesis 04
Richard Ovenden
 

Similaire à Dapsys08 dl on_grid (20)

Aggregation as tactic sm new
Aggregation as tactic sm newAggregation as tactic sm new
Aggregation as tactic sm new
 
Aggregation as Tactic
Aggregation as TacticAggregation as Tactic
Aggregation as Tactic
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital Libraries
 
Consortium on Digitization of Indian Agricultural Library Resources
Consortium on Digitization of Indian Agricultural Library  ResourcesConsortium on Digitization of Indian Agricultural Library  Resources
Consortium on Digitization of Indian Agricultural Library Resources
 
Building Heterogeneous Networks of Digital Libraries on the Semantic Web
Building Heterogeneous Networks of Digital Libraries on the Semantic WebBuilding Heterogeneous Networks of Digital Libraries on the Semantic Web
Building Heterogeneous Networks of Digital Libraries on the Semantic Web
 
Digital Libraries of the Future: Use of Semantic Web and Social Bookmarking t...
Digital Libraries of the Future: Use of Semantic Web and Social Bookmarking t...Digital Libraries of the Future: Use of Semantic Web and Social Bookmarking t...
Digital Libraries of the Future: Use of Semantic Web and Social Bookmarking t...
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projects
 
Digital Libraries of the Future
Digital Libraries of the Future
Digital Libraries of the Future
Digital Libraries of the Future
 
Developments in Access to Art Information: EnCompass Digital Portal. 2003
Developments in Access to Art Information: EnCompass Digital Portal. 2003Developments in Access to Art Information: EnCompass Digital Portal. 2003
Developments in Access to Art Information: EnCompass Digital Portal. 2003
 
Project management report-on Digital Libraries
Project management report-on Digital LibrariesProject management report-on Digital Libraries
Project management report-on Digital Libraries
 
20080903arsenalsofnemesis 04
20080903arsenalsofnemesis 0420080903arsenalsofnemesis 04
20080903arsenalsofnemesis 04
 
JeromeDL Tutorial
JeromeDL TutorialJeromeDL Tutorial
JeromeDL Tutorial
 
Edinburgh DataShare - DSpace for Data
Edinburgh DataShare - DSpace for DataEdinburgh DataShare - DSpace for Data
Edinburgh DataShare - DSpace for Data
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
A Cultural Heritage Repository as Source for Learning Materials
A Cultural Heritage Repository as Source for Learning MaterialsA Cultural Heritage Repository as Source for Learning Materials
A Cultural Heritage Repository as Source for Learning Materials
 
Geo-annotations in Semantic Digital Libraries
Geo-annotations in Semantic Digital Libraries Geo-annotations in Semantic Digital Libraries
Geo-annotations in Semantic Digital Libraries
 
Saving Queries
Saving QueriesSaving Queries
Saving Queries
 
Ppls mvm2
Ppls mvm2Ppls mvm2
Ppls mvm2
 
Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
 
Open Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and ExchangeOpen Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and Exchange
 

Plus de madhuvardhan

Plus de madhuvardhan (20)

Mdld show-all
Mdld show-allMdld show-all
Mdld show-all
 
Dspace madhu s
Dspace madhu sDspace madhu s
Dspace madhu s
 
Ecdl2004
Ecdl2004Ecdl2004
Ecdl2004
 
Class 5-introto dl
Class 5-introto dlClass 5-introto dl
Class 5-introto dl
 
E learning
E learningE learning
E learning
 
ugc net
ugc netugc net
ugc net
 
Print net
Print netPrint net
Print net
 
Binary true ppt
Binary true pptBinary true ppt
Binary true ppt
 
Open access
Open accessOpen access
Open access
 
Research methidology
Research methidologyResearch methidology
Research methidology
 
Style manual assingment (1)
Style manual assingment (1)Style manual assingment (1)
Style manual assingment (1)
 
Open access (1)
Open access (1)Open access (1)
Open access (1)
 
Mc computer glossary new
Mc   computer glossary  newMc   computer glossary  new
Mc computer glossary new
 
Binding standards ms
Binding standards msBinding standards ms
Binding standards ms
 
madhu
madhumadhu
madhu
 
553 what are digital libraries
553 what are digital libraries553 what are digital libraries
553 what are digital libraries
 
Dapsys08 dl on_grid
Dapsys08 dl on_gridDapsys08 dl on_grid
Dapsys08 dl on_grid
 
Class 5-introto dl
Class 5-introto dlClass 5-introto dl
Class 5-introto dl
 
553 what are digital libraries
553 what are digital libraries553 what are digital libraries
553 what are digital libraries
 
research methodology
research methodologyresearch methodology
research methodology
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Dapsys08 dl on_grid

  • 1. Cluj Napoca, 28 August 2008 2008 IEEE International Conference on Intelligent Computer Communication and Processing Digital Libraries Workshop Towards a GRID-Based Digital Library Management System. Gheorghe Sebestyén-Pál1 , Doina Banciu2 , Tünde Bálint1 , Bogdan Moscaiuc1 , and Ágnes Sebestyén-Pál1 1- Technical University of Cluj-Napoca 2 - ICI Bucharest
  • 2. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS Content  Classical vs. Digital Libraries  Recent research on Digital Libraries (DL)  Main issues and requirements for DLs  An ontology-based DL model  Grid-enabled DL  Implementation considerations of a pilot DL  Experiments  Conclusions
  • 3. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS Classical vs. Digital Libraries  Classical library  a repository of knowledge organized mainly on paper  Digital library  Not only a digitized version of a classical library  A new set of functionalities and services are added (e.g. access control, resources management and allocation, complex search and processing services, etc.)  A data exchange and cooperation environment  DLs are becoming digital content management systems  Incorporates a wide variety of formats and data types ( text, audio, video, multi-document complex digital objects)  Uses a variety of communication and data-exchange protocols and standards
  • 4. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS IT and Communication technologies involved in the implementation of digital libraries http://mapageweb.umontreal.ca/turner/meta/english/metamap.html
  • 5. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS Goals for modern DLs  DELOS project’s vision –  “to enable any person to access all human knowledge anytime and anywhere, in a friendly, multi-modal, efficient, and effective way, by overcoming barriers of distance, language, and culture and by using multiple Internet-connected devices”  DL - a knowledge repository and an information exchange infrastructure that allows:  data generation,  processing and  seamless access to relevant information, regardless of the geographic distribution of hardware resources, databases or persons.
  • 6. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS Research in digital libraries  Delos Network of Excellence –  Goals: to define and implement digital libraries on new computing and communication technologies  Achievements: definition of functional and architectural requirements for DL implementation  BRICKS project  Goals: to design a user and service-oriented space to share knowledge and resources in a multi-cultural heritage.  Achievements:  Definition of a digital library architecture for a very broad and heterogeneous user community; automatic indexing and annotation functionalities  OpenDlib project  Goal: development of a software toolkit for dedicated DLs generation  Achievements: tools for content harvesting form existing resources  Fedora, DSpace – open source software for DLs  Lucene – open source Search engines
  • 7. Debrecen, 3-5 September 2008, DAPSYS’08 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS Research in digital libraries (cont.)  Diligent project (part of EGEE project)  Goal: the use of GRID infrastructure for DL implementation  Achievements: a new vision about the DL concept:  DL = a dynamic digital content repository and management system dedicated for a purpose (e.g. a project, an art collection, an academic course)  Definition of generic DL services mapped on GRID services  DLs dedicated for different domains – with powerful processing capabilities  SINRED project – National Excellency project  Goal: development of a national framework for DLs specialized on technical sciences and research  Achievements: evaluation of requirements, evaluation of existing software, infrastructure development, DL model definition, implementation of a pilot DL  SIPADOC project – National research program  Goal: reevaluation of the national patrimony through DLs  Achievements: evaluation of digitizing tools
  • 8. Key issues in DL implementation. Architectural issues: distributed nature of storage, processing and access resources; scalability, flexibility, interoperability. Functional requirements: core functions (storage, indexing and annotation, data search, content retrieval, user management); content organization should reflect semantic connections. Processing facilities: data processing services specialized for different fields; pattern search and recognition. QoS issues: limited time to obtain relevant information; reasonable time for complex data processing. User and access control management: virtual organizations; role-based access.
  • 9. DL = Essence & Metadata Management. [Diagram; recoverable labels: text, audio and video content; digital content generation and harvesting; management of essence; automatic feature (metadata) extraction; metadata management; cataloging, indexing, annotation; access and visualization; cataloging information system.]
  • 10. An ontology-based Digital Library approach. Ontology: concepts and relations together with a reasoning engine. Ontology for technical and scientific domains. Main concepts: digital objects (an association of content, metadata and procedures; examples: articles, technical reports, prospects, PhD theses, patents); digital collections (a set of digital objects structured for a given goal or purpose, or based on a given criterion; examples: articles of an author, documents of a domain); events (conferences, workshops, seminars); processes (projects, courses); virtual organizations (roles, users).
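The concepts on this slide can be pictured as a small data model. The following is only a minimal sketch under stated assumptions: the class names, field names, the `store://` locations and the sample records are all hypothetical and are not taken from the pilot system. It shows a digital object as an association of content, metadata and procedures, and a collection built from a criterion.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Content plus metadata plus procedures (hypothetical field names)."""
    identifier: str
    content_uri: str                                  # assumed content location
    metadata: dict = field(default_factory=dict)      # descriptive fields
    procedures: dict = field(default_factory=dict)    # name -> callable


@dataclass
class Collection:
    """A set of digital objects grouped for a purpose or by a criterion."""
    name: str
    objects: list = field(default_factory=list)

    @classmethod
    def from_criterion(cls, name, objects, criterion):
        # Keep only the objects for which the criterion returns True.
        return cls(name, [obj for obj in objects if criterion(obj)])


# Illustrative records only; none of these come from the source.
library = [
    DigitalObject("obj-1", "store://a.pdf", {"creator": "Author A", "type": "article"}),
    DigitalObject("obj-2", "store://b.pdf", {"creator": "Author B", "type": "patent"}),
    DigitalObject("obj-3", "store://c.pdf", {"creator": "Author A", "type": "technical report"}),
]

# "Articles of an author" style collection, expressed as a criterion.
by_author_a = Collection.from_criterion(
    "Works of Author A",
    library,
    lambda obj: obj.metadata.get("creator") == "Author A",
)
print([obj.identifier for obj in by_author_a.objects])
```

A real ontology would also carry typed relations and a reasoning engine, which this sketch leaves out.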
  • 11. Grid-enabled digital library services. Why DLs on GRID infrastructure? Huge volume of documents and digital objects; concurrent access and multiple search engines (see Google); multimedia streaming; automatic indexing and annotation; complex processing requires prohibitive time; user management through virtual organizations; job distribution facilities offered by GRID.
  • 12. DL functions mapped on GRID services. [Diagram; recoverable labels: Digital Library; GRID Services; computing, storage and communication resources; collections management; catalog and metadata management; digital objects management; users’ management; data visualization; virtual organizations management; resource management; task distribution; processing; data distribution and replication; data processing.]
  • 13. Experiments. Two approaches. (1) DL implementation on Alchemi GRID (Microsoft .NET-based): job distribution at thread level; explicit GRID programming; experiments with multimedia streaming (multimedia content distribution). (2) DL implementation on Condor GRID (open source): job distribution at task level; job and data distribution is transparent to the DL application (distribution is done through separate scripts); experiments with keyword search over the whole DL content. The execution time decreased as the number of executor computers increased; for more than 5 executors, the scheduling and communication time becomes comparable with the execution time.
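The task-level keyword search experiment can be illustrated with a small sketch. This is not the authors' Condor setup: local threads stand in for executor nodes, and the function names and the `(id, text)` document layout are assumptions made for the example. The idea shown is only the general pattern of splitting the content into partitions, searching each partition as an independent task, and merging the hits.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, keyword):
    """Return ids of the documents in one partition that contain the keyword."""
    needle = keyword.lower()
    return [doc_id for doc_id, text in partition if needle in text.lower()]


def split(documents, n_parts):
    """Deal documents round-robin into n_parts partitions."""
    return [documents[i::n_parts] for i in range(n_parts)]


def distributed_search(documents, keyword, n_executors):
    """Search each partition as a separate task, then merge the results."""
    partitions = split(documents, n_executors)
    # Threads stand in for grid executor nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_executors) as pool:
        results = list(pool.map(search_partition, partitions,
                                [keyword] * n_executors))
    # Merge per-partition hits into one sorted list of document ids.
    return sorted(hit for part in results for hit in part)


# Illustrative documents only.
docs = [
    ("d1", "Grid computing"),
    ("d2", "Digital libraries"),
    ("d3", "A grid-based library"),
    ("d4", "Ontology"),
]
print(distributed_search(docs, "grid", 3))
```

On a real grid each task also pays for scheduling and data transfer, which is why adding executors stops helping once that overhead is comparable with the search time itself.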
  • 14. A pilot implementation of a Digital Library framework developed with GRID support. Goal: implementation of a digital content storage and retrieval system dedicated to educational and scientific activities (courses, projects, etc.). Main requirements: a DL adaptable to a given purpose or goal; access controlled and restricted through virtual organizations; ontology-based approach (concepts, relations, semantic search); advanced search procedures; GRID-enabled full-text search services, for better response time; access through Internet browsers. The result: a distributed digital library application, which allows management of digital objects (upload, storage, indexing, metadata creation), management of collections, and management of users and virtual organizations.
  • 15. Pilot DL details (www.bib-dig.utcluj.ro). Management of digital objects: upload of digital documents; annotation and metadata generation according to Dublin Core; distributed storage of data. Management of collections: define a new collection; attach new documents to an existing collection; associate access rights with a collection. Management of users and virtual organizations: define new users and new virtual organizations; define roles; associate roles with users and collections.
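The association of roles with users and collections can be sketched as a simple lookup. The classes, role names, action names and sample users below are hypothetical and chosen only for illustration; the slides do not describe how the pilot stores or evaluates access rights.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganization:
    name: str
    # user name -> role name within this virtual organization
    members: dict = field(default_factory=dict)


@dataclass
class ProtectedCollection:
    name: str
    # role name -> set of actions that role may perform on the collection
    access_rights: dict = field(default_factory=dict)


def is_allowed(user, action, collection, vo):
    """True if the user's role in the VO grants the action on the collection."""
    role = vo.members.get(user)
    if role is None:
        return False          # not a member of the virtual organization
    return action in collection.access_rights.get(role, set())


# Illustrative setup: a course-oriented virtual organization.
course_vo = VirtualOrganization("course-vo", {"alice": "teacher", "bob": "student"})
lecture_notes = ProtectedCollection(
    "Lecture notes",
    {"teacher": {"read", "upload"}, "student": {"read"}},
)

print(is_allowed("alice", "upload", lecture_notes, course_vo))
print(is_allowed("bob", "upload", lecture_notes, course_vo))
```

Keeping rights on the collection and roles in the virtual organization means a user's permissions change simply by changing their role, without touching each collection.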
  • 16. Snapshots of the DL application’s interface (bib-dig.utcluj.ro)
  • 17. Snapshots of the DL application’s interface
  • 18. Search techniques in DLs. Keyword or index search: database techniques. Semantic information retrieval: a semantic graph of documents and concepts. Non-semantic information retrieval: Naive Bayes algorithm (a probabilistic approach, based on probabilistic similarity between documents); Topic-Based Vector Space Model algorithm.
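As an illustration of the probabilistic approach named above, here is a minimal multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is a textbook sketch, not the implementation used in the pilot DL; the function names, the whitespace tokenization and the tiny training set are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """labeled_docs: iterable of (label, text). Returns the fitted model."""
    class_counts = Counter()            # documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for label, text in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)          # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue                               # ignore unseen words
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny illustrative training set (not data from the experiments).
model = train([
    ("grid", "grid job scheduling executor nodes"),
    ("grid", "condor grid task distribution"),
    ("library", "digital library metadata catalog"),
    ("library", "library collection metadata indexing"),
])
print(classify(model, "metadata for a library collection"))
```

Working in log space keeps the product of many small word probabilities from underflowing on longer documents.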
  • 19. Experimental results. [Chart: execution time vs. number of executor nodes. X axis: Nodes (1 to 5); Y axis: Time (s), 0 to 8000. Series: search execution time; scheduling and communication time (case 1); scheduling and communication time (case 2); total time (case 1); total time (case 2).]
  • 20. Experiments
  • 21. Conclusions. DLs are complex content management systems that extend the functionalities of classical libraries: semantic organization of a wide variety of information formats; multiple search and data retrieval techniques (keyword full-text search, semantic search, statistical and probabilistic retrieval and classification); access control to distributed and remote data. DLs are data exchange and cooperation environments, useful for remote and cooperative work. DLs must include powerful search and data retrieval engines. GRID infrastructures may be a feasible support for the implementation of DLs, enabling more efficient parallel search, classification or automatic annotation.
  • 22. Cluj Napoca, 28 August 2008, 2008 IEEE International Conference on Intelligent Computer Communication and Processing, Digital Libraries Workshop. Thank you for your attention. Questions?