PhD Defense
Nikos Palavitsinis
Supervisors
Dr. Salvador Sanchez-Alonso, UAH
Dr. Nikolaos Manouselis, Agro-Know
Departamento de Ciencias de la Computación
 Introduction
◦ Problem Definition
◦ Research Questions
 Problem Solution
 Experiments/Results
 Conclusions
◦ Research Contributions
◦ Future Directions of Research
 Metadata lies at the centre of every repository project
◦ Defines and drives the description of content
◦ Allows storage, management & retrieval of content, as well as its preservation
 Content, content, content
◦ Global Learning Objects Brokered Exchange (GLOBE):
>1.000.000 learning resources
◦ Europeana: > 30.000.000 cultural resources
◦ VOA3R: >2.500.000 research resources
 Quality issues can be identified both in
the digital content as well as in the metadata
describing it
Learning Object
◦ Actual Digital Content
◦ Metadata
 Significant problems with the quality of the
metadata that describe content of various types
◦ Lack of completeness,
◦ Redundant metadata,
◦ Lack of clarity,
◦ Incorrect use of elements,
◦ Structural inconsistency,
◦ Inaccurate representation,
◦ …
[Diagram: MQACP (Metadata Quality Assurance Certification Process) applied across the repository lifecycle phases]
 The involvement of domain experts and creators of
digital resources in metadata design & annotation is
crucial. Documentation that supports the metadata
annotation process is essential
 Metadata quality frameworks suggested so far have not proven their adaptability
 The cost of producing quality metadata for digital
objects is significant for any digital repository
project
Description
 Base Metadata Schema
 Metadata Understanding Questionnaire
 Metadata Quality Assessment Grid
 Excel Completeness Assessment Tool
 Good & Bad Metadata Practices Guide
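The kind of check behind a completeness assessment tool can be sketched as follows. This is a minimal illustration only; the record structure and element names are assumptions, not the actual Excel tool:

```python
def completeness(records, elements):
    """Percentage of records providing a non-empty value for each element."""
    counts = {e: 0 for e in elements}
    for record in records:
        for e in elements:
            value = record.get(e)
            if value not in (None, "", []):  # treat missing/empty as incomplete
                counts[e] += 1
    total = len(records)
    return {e: round(100 * counts[e] / total, 1) for e in elements}

# Hypothetical metadata records with assumed element names
records = [
    {"title": "Soil basics", "language": "en", "keyword": []},
    {"title": "Composting", "language": "", "keyword": ["organic"]},
]
print(completeness(records, ["title", "language", "keyword"]))
# {'title': 100.0, 'language': 50.0, 'keyword': 50.0}
```

The same per-element percentages are what the completeness tables in the results slides report.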
 Metadata Expert (ME)
◦ Metadata design, MQACP application, experiments, results,
supporting all other actors
 Repository Expert (RE)
◦ Tool development, technical support, data exports
 Domain Expert (DE)
◦ Metadata design, annotation & review of metadata records
 Metadata Understanding Session
◦ Workshop format
◦ Printed form, application profile
◦ Easiness, usefulness and obligation
◦ Questionnaire, Likert Scale
 Hands-on exercise on metadata annotation
◦ Sample of resources
◦ Annotated on paper
◦ Support from Metadata Expert
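One plausible reading of the questionnaire result tables that follow is that each element's mean Likert rating is binned into the intervals [0,1], (1,2], (2,3], (3,4], (4,5], and the share of elements per bin is reported. The aggregation rule sketched here is an assumption:

```python
import math

def bin_means(element_scores):
    """Share of elements whose mean Likert score (1-5) falls in each interval."""
    labels = ["[0,1]", "(1,2]", "(2,3]", "(3,4]", "(4,5]"]
    bins = {label: 0 for label in labels}
    for scores in element_scores.values():
        mean = sum(scores) / len(scores)
        # ceil(mean) picks the half-open interval (k-1, k] containing the mean
        idx = 0 if mean <= 1 else math.ceil(mean) - 1
        bins[labels[idx]] += 1
    total = len(element_scores)
    return {label: round(100 * n / total, 1) for label, n in bins.items()}

# Hypothetical ratings for three elements by three experts
ratings = {"Title": [5, 4, 5], "Coverage": [3, 2, 3], "Keyword": [4, 4, 3]}
print(bin_means(ratings))
# {'[0,1]': 0.0, '(1,2]': 0.0, '(2,3]': 33.3, '(3,4]': 33.3, '(4,5]': 33.3}
```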
Metadata Design Phase
 Hands-on annotation experiment
◦ Domain experts go digital
◦ Testing both metadata and system
 Review of metadata records
◦ Metadata experts test the AP
◦ Refining the definitions of elements wherever needed, judging from real usage data
◦ Core metadata quality criteria
Testing Phase
 Peer-review experiment
◦ Large scale
◦ Representative sample
◦ Everyone reviews everyone
◦ Metadata Peer Review Grid
◦ Online process
◦ Accuracy; Completeness; Consistency; Objectiveness; Appropriateness; Correctness; Overall
Calibration Phase
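The single-percentage aggregates reported per criterion in the results slides correspond to the share of reviews rating that criterion 4 or 5 on the five-point grid. A minimal sketch, with an assumed review structure:

```python
def high_score_share(reviews):
    """For each quality criterion, the percentage of reviews rating it 4 or 5."""
    criteria = reviews[0].keys()
    return {
        c: round(100 * sum(1 for r in reviews if r[c] >= 4) / len(reviews), 1)
        for c in criteria
    }

# Hypothetical peer reviews, trimmed to two criteria for brevity
reviews = [
    {"Completeness": 5, "Accuracy": 4},
    {"Completeness": 4, "Accuracy": 3},
    {"Completeness": 2, "Accuracy": 5},
]
print(high_score_share(reviews))
# {'Completeness': 66.7, 'Accuracy': 66.7}
```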
 Analysis of usage data from tools
◦ Mainly completeness here
 Metadata Quality Certification concept
◦ Concept of validation as a quality mark
Critical Mass Phase
 Regular Completeness Analysis
◦ Monthly
◦ Automated
 Online Peer-Review of records
◦ Fixed
 Informal Mechanisms
◦ Prizes
◦ Awards
◦ Social
Regular Operation Phase
Testing
 Initiative: Organic.Edunet
 Topics: Organic Agriculture, Agroecology,
Environmental Education
 Resources: >10.000
 Content Providers: 8
 Languages: 10
 Content Type: Mainly image & text with some videos
Duration: 2 hours
Date: January 2009
Annotated Objects: Not applicable
Involved people: 20 ME & DE
Results
Question [0,1] (1,2] (2,3] (3,4] (4,5]
Is the element easy for you to understand? 0% 0% 8.8% 71.9% 19.3%
Is this element useful for describing educational resources? 0% 0% 26.3% 61.4% 12.3%
Metadata Design Phase
Mandatory Recommended Optional
Question Before After Before After Before After
Should this element be mandatory, recommended or optional? 6 11 26 17 24 29
Percentile change in overall number of mandatory / recommended / optional elements +83% -34% +21%
Metadata Design Phase
Duration: 1 week
Date: April 2009
Annotated Objects: 500 objects (≈5% of total)
Resources Reviewed: 60 (30 per expert)
Involved people: 2 ME
 Outcomes were documented in the Good & Bad
Metadata Practices document that was drafted
after the peer-review process.
 The Grid was not used
Testing Phase
Duration: 3 weeks
Date: June 2009
Annotated Objects: 1.000 objects (≈10% of total)
Resources Reviewed: 105 resources (≈5 per expert)
Involved people: 20 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 40% 51% 50% 69% 41% 69% 40%
4 45% 32% 28% 21% 33% 21% 37%
3 5% 10% 15% 6% 18% 9% 19%
2 9% 3% 1% 2% 5% 0% 0%
1 1% 1% 0% 0% 2% 1% 1%
no 1% 3% 6% 3% 1% 1% 3%
Calibration Phase
% scoring 4 or 5: 85% 83% 78% 90% 74% 90% 77%
Good & Bad Metadata Practices Guide
Duration: 1 week
Date: September 2009
Annotated Objects: 6.653 objects (≈60% of total)
Resources Analyzed: 6.653
Involved people: 1 ME & 1 RE
No Mandatory Elements %
1 General / Title 99.8%
2 …/ Language 93.9%
3 …/ Description 94.8%
4 Rights / Cost 15.7%
5 …/ Copyright And Other Restrictions 16.0%
Critical Mass Phase
No Recommended Elements %
6 General / Keyword 12.8%
7 LifeCycle / Contribute / Role 11.5%
8 Educational / Intended End User Role 12.8%
9 …/ Context 10.2%
10 …/ Typical Age Range 3.8%
11 Rights / Description 7.7%
12 Classification 11.8%
No Optional Elements %
13 General / Coverage 0.2%
14 …/ Structure 7.9%
15 LifeCycle / Status 0.3%
16 Educational / Interactivity Type 0.3%
17 …/ Interactivity Level 0.3%
18 …/ Semantic Density 0.2%
19 …/ Difficulty 0.1%
20 …/ Typical Learning Time 0%
21 …/ Language 0.3%
22 …/ Description 1.5%
Critical Mass Phase
Duration: 1 week
Date: September 2010
Annotated Objects: 11.000 objects (100%)
Resources Analyzed: 11.000
Involved people: 1 ME
No Mandatory Elements Sep 2009 Sep 2010 Diff.
1 General / Title 99.8% 100% +0.2%
2 …/ Language 93.9% 99.9% +6%
3 …/ Description 94.8% 100% +5.2%
4 Rights / Cost 15.7% 82.4% +66.7%
5 …/ Copyright & Other Restrictions 16.0% 99.9% +82.4%
Regular Operation Phase
No Recommended Elements Sep 2009 Sep 2010 Diff.
6 General / Keyword 12.8% 99.9% +87.1%
7 LifeCycle / Contribute / Role 11.5% 77.2% +65.7%
8 Educational / Intended End User Role 12.8% 82.4% +69.6%
9 …/ Context 10.2% 81% +70.8%
10 …/ Typical Age Range 3.8% 63.9% +60.1%
11 Rights / Description 7.7% 92.4% +84.7%
12 Classification 11.8% 73.6% +61.8%
No Optional Elements Sep 2009 Sep 2010 Diff.
13 General / Coverage 0.2% 82.6% +82.4%
14 …/ Structure 7.9% 82.5% +74.6%
15 LifeCycle / Status 0.3% 39.7% +39.4%
16 Educational / Interactivity Type 0.3% 36.9% +36.6%
17 …/ Interactivity Level 0.3% 37.1% +36.8%
18 …/ Semantic Density 0.2% 37% +36.8%
19 …/ Difficulty 0.1% 37.1% +37%
20 …/ Typical Learning Time 0% 0.4% +0.4%
21 …/ Language 0.3% 52.3% +52%
22 …/ Description 1.5% 14.7% +13.2%
Regular Operation Phase
 49 people (w/duplicates)
 5 QA methods
◦ 1 MUS; 2 Peer-reviews; 2 Usage Data Analyses
 165 records reviewed (unique)
 17.653 records analyzed (w/duplicates)
 134.7 hours in total
Validation
 Initiative: Natural Europe
 Topics: Minerals, Fossils, Plants, etc.
 Resources: >15.000
 Content Providers: 6
 Languages: 10
 Content Type: Mainly image & text with a significant
number of videos
Duration: 2 hours
Date: March 2011
Annotated Objects: Not applicable
Involved people: 11 DE & 1 ME
Results
Question [0,1] (1,2] (2,3] (3,4] (4,5]
Is the element easy for you to understand? 0% 0% 17.2% 34.5% 48.3%
Is this element useful for describing cultural resources? 0% 0% 3.4% 37.9% 58.6%
Metadata Design Phase
Mandatory Recommended Optional
Question Before After Before After Before After
Should this element be mandatory, recommended or optional? 13 21 6 2 10 6
Percentile change in overall number of mandatory / recommended / optional elements +62% -66% -40%
Metadata Design Phase
Duration: 1 week
Date: August 2011
Annotated Objects: 1000 objects (≈10% of total)
Resources Reviewed: 100 (50 per expert)
Involved people: 2 ME
 Outcomes were documented in the Good & Bad
Metadata Practices document that was drafted
after the peer-review process.
 Significant changes in the AP came out
Testing Phase
Duration: 3 weeks
Date: March 2012
Annotated Objects: 3.000 objects (≈20% of total)
Resources Reviewed: 99 resources (≈10 per expert)
Involved people: 10 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 16% 22% 36% 73% 32% 62% 26%
4 42% 43% 40% 13% 29% 22% 38%
3 19% 13% 19% 4% 13% 13% 30%
2 6% 1% 3% 6% 14% 1% 3%
1 13% 9% 1% 4% 11% 1% 2%
no 3% 11% 0% 0% 0% 1% 0%
Calibration Phase
% scoring 4 or 5: 58% 65% 76% 86% 61% 84% 64%
Good & Bad Metadata Practices Guide
Duration: 1 week
Date: May 2012
Annotated Objects: 3.417 objects (≈20% of total)
Resources Analyzed: 3.417
Involved people: 1 ME & 1 RE
No Mandatory Elements %
1 CHO Titles 99.9%
2 CHO Keywords 96.8%
3 Object Titles 100.0%
4 Object Descriptions 89.7%
5 Object URL 99.7%
6 Object Thumbnail URL 97.1%
7 Object Content Type 100.0%
8 Copyrights 100.0%
9 Access 60.4%
Calibration Phase
No Recommended Elements %
10 CHO Types 78.1%
11 Object Creators 88.1%
12 Object Languages 32.4%
No Optional Elements %
13 Scientific Name 0.3%
14 Classification 0.3%
15 Common Names 0.0%
16 CHO Significant Dates 71.8%
17 CHO Temporal Coverage 21.2%
18 CHO Spatial Coverage 74.7%
19 CHO Mediums 6.6%
20 CHO Creators 0.0%
21 Object Creation Dates 16.4%
22 Object Identifiers 61.8%
23 Object Context URL 16.2%
24 Related Objects 52.6%
25 Object Formats 89.5%
26 Object Extents 58.2%
Calibration Phase
Duration: 1 week
Date: September 2012
Annotated Objects: 9.402 objects (≈62% of total)
Resources Analyzed: 9.402
Involved people: 1 ME
No Mandatory Elements May 2012 Sep 2012
1 CHO Titles 99.9% 98.4%
2 CHO Keywords 96.8% 97.2%
3 Object Titles 100% 100%
4 Object Descriptions 89.7% 99.2%
5 Object URL 99.7% 99.4%
6 Object Thumbnail URL 97.1% 88.2%
7 Object Content Type 100.0% 100%
8 Copyrights 100.0% 100%
9 Access 60.4% 100%
Critical Mass Phase
No Recommended Elements May 2012 Sep 2012
10 CHO Types 78.1% 78.3%
11 Object Creators 88.1% 86.1%
12 Object Languages 32.4% 61.7%
No Optional Elements May 2012 Sep 2012
13 Scientific Name 0.3% 59.9%
14 Classification 0.3% 39.5%
15 Common Names 0.0% 41.8%
16 CHO Significant Dates 71.8% 46.1%
17 CHO Temporal Coverage 21.2% 18%
18 CHO Spatial Coverage 74.7% 70.1%
19 CHO Mediums 6.6% 3.3%
20 CHO Creators 0.0% 49.2%
21 Object Creation Dates 16.4% 47.5%
22 Object Identifiers 61.8% 63.8%
23 Object Context URL 16.2% 36.4%
24 Related Objects 52.6% 88.2%
25 Object Formats 89.5% 100%
26 Object Extents 58.2% 47.2%
Critical Mass Phase
Duration: 3 weeks
Date: September 2012
Annotated Objects: 9.402 objects (≈62% of total)
Resources Reviewed: 34 resources (≈9 per expert)
Involved people: 4 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 38% 59% 68% 79% 47% 79% 53%
4 29% 21% 12% 6% 29% 9% 29%
3 26% 9% 12% 6% 15% 12% 9%
2 6% 12% 9% 9% 6% 0% 9%
1 0% 0% 0% 0% 3% 0% 0%
no 0% 0% 0% 0% 0% 0% 0%
Critical Mass Phase
% scoring 4 or 5: 67% 80% 80% 85% 76% 88% 84%
Good & Bad Metadata Practices Guide
% scoring 4 or 5 in the previous phase: 58% 65% 76% 86% 61% 84% 64%
Duration: 1 week
Date: May 2013
Annotated Objects: 11.375 objects (≈80% of total)
Resources Analyzed: 11.375
Involved people: 1 ME & 1 RE
No Mandatory Elements Sep ‘12 May ‘13 Diff
1 CHO Titles 99.9% 99.5% -0.4%
2 CHO Keywords 96.8% 99.1% +2.3%
3 Object Titles 100% 100% 0%
4 Object Descriptions 89.7% 98.9% +9.2%
5 Object URL 99.7% 99.9% +0.2%
6 Object Thumbnail URL 97.1% 97.2% +0.1%
7 Object Content Type 100% 100% 0%
8 Copyrights 100% 100% 0%
9 Access 60.3% 100% +39.7%
Regular Operation Phase
No Recommended Elements Sep ‘12 May ’13 Diff
10 CHO Types 78.1% 74.9% -3.2%
11 Object Creators 88.1% 87.3% -0.8%
12 Object Languages 32.4% 78.5% +46.1%
No Optional Elements Sep ‘12 May ‘13 Diff
13 Scientific Name 0.3% 65.1% +64.8%
14 Classification 0.3% 56.7% +56.4%
15 Common Names 0% 49.8% +49.8%
16 CHO Significant Dates 71.8% 48.1% -23.7%
17 CHO Temporal Coverage 21.2% 18.5% -2.7%
18 CHO Spatial Coverage 74.7% 71.4% -3.3%
19 CHO Mediums 6.6% 17.9% +13.3%
20 CHO Creators 0% 58.1% +58.1%
21 Object Creation Dates 16.4% 66.5% +50.1%
22 Object Identifiers 61.8% 68.4% +6.6%
23 Object Context URL 16.2% 42.1% +25.9%
24 Related Objects 52.6% 47.1% -5.5%
25 Object Formats 89.5% 94.3% +4.8%
26 Object Extents 58.2% 89.1% +30.9%
Duration: 3 weeks
Date: May 2013
Annotated Objects: 9.402 objects (≈62% of total)
Resources Reviewed: 90 resources (≈9 per expert)
Involved people: 10 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 40% 60% 74% 89% 61% 74% 52%
4 26% 24% 10% 6% 24% 13% 30%
3 18% 8% 7% 1% 4% 9% 10%
2 13% 3% 4% 1% 6% 0% 3%
1 3% 4% 4% 3% 4% 3% 5%
no 0% 0% 0% 0% 0% 0% 0%
Regular Operation Phase
% scoring 4 or 5: 66% 84% 84% 95% 85% 87% 82%
Good & Bad Metadata Practices Guide
% scoring 4 or 5 in the previous phase: 67% 80% 80% 85% 76% 88% 84%
 45 people (w/duplicates)
 8 QA methods
◦ 1 MUS; 4 Peer-reviews; 3 Usage Data Analyses
 298 records reviewed (unique)
 24.194 records analyzed (w/duplicates)
 166.5 hours in total
Validation
 Initiative: VOA3R
 Topics: Agriculture, Veterinary, Horticulture,
Aquaculture, Viticulture, etc.
 Resources: >71.316 (2.500.000)
 Content Providers: 9
 Languages: 9
 Content Type: Mainly text with some images
Duration: 2 hours
Date: June 2011
Annotated Objects: Not applicable
Involved people: 16 DE
Results
Question [0,1] (1,2] (2,3] (3,4] (4,5]
Is the element easy for you to understand? 0% 0% 11.4% 25.7% 62.9%
Is this element useful for describing research resources? 0% 0% 0% 48.6% 51.4%
Metadata Design Phase
Mandatory Recommended Optional
Question Before After Before After Before After
Should this element be mandatory, recommended or optional? 13 12 14 6 8 17
Percentile change in overall number of mandatory / recommended / optional elements -8% -57% +112%
Metadata Design Phase
Duration: 1 week
Date: August 2011
Annotated Objects: 25.000 objects (≈35% of total)
Resources Reviewed: 65 (≈7 per expert)
Involved people: 9 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 20% 31% 26% 60% 14% 65% 18%
4 11% 43% 42% 20% 17% 22% 31%
3 6% 9% 12% 5% 14% 6% 14%
2 20% 2% 0% 3% 22% 2% 18%
1 40% 6% 2% 3% 26% 2% 15%
no 3% 9% 18% 9% 8% 5% 3%
Testing Phase
% scoring 4 or 5: 31% 74% 68% 80% 31% 87% 49%
Good & Bad Metadata Practices Guide
Duration: 1 week
Date: December 2011
Annotated Objects: 50.000 objects (≈70% of total)
Resources Reviewed: 61 (≈7 per expert)
Involved people: 9 DE
Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
5 8% 51% 26% 67% 30% 74% 25%
4 28% 31% 46% 26% 48% 21% 44%
3 26% 10% 23% 3% 8% 0% 28%
2 31% 3% 5% 3% 11% 0% 3%
1 7% 2% 0% 0% 3% 2% 0%
no 0% 3% 0% 0% 0% 3% 0%
Calibration Phase
Good & Bad Metadata Practices Guide
% scoring 4 or 5: 36% 82% 72% 93% 78% 95% 69%
% scoring 4 or 5 in the previous phase: 31% 74% 68% 80% 31% 87% 49%
Duration: 1 week
Date: April 2012
Annotated Objects: 51.057 (≈70% of total)
Resources Analyzed: 51.057
Involved people: 2 ME
Critical Mass Phase
No Element Name %
1 dc.identifier 99.8%
2 dc.title 100%
3 dc.language 85.2%
4 dc.description 94.2%
5 dc.subject 90.9%
6 dc.coverage 41.3%
7 dc.type 97.6%
8 dc.date 82.5%
9 dc.creator 95.6%
10 dc.contributor 43.0%
11 dc.publisher 62.8%
12 dc.format 92.6%
13 dc.rights 47.8%
14 dc.relation 90.0%
15 dc.source 47.3%
Duration: 2 hours
Date: May 2012
Annotated Objects: Not applicable
Involved people: 13 DE
Results
Question [0,1] (1,2] (2,3] (3,4] (4,5]
Is the element easy for you to understand? 0% 0% 16.7% 64.6% 18.8%
Is this element useful for describing research resources? 0% 0% 16.7% 56.3% 31.3%
Critical Mass Phase
Duration: 1 week
Date: June 2013
Annotated Objects: 74.379 objects (100% of total)
Resources Analyzed: 74.379
Involved people: 2 ME & 1 RE
Regular Operation Phase
No Element Name May 2012 June 2013 Diff
1 dc.identifier 99.8% 100% +0.03%
2 dc.title 100% 100% +0.01%
3 dc.language 85.2% 93.3% +8.1%
4 dc.description 94.2% 96.8% +2.6%
5 dc.subject 90.9% 100% +9.1%
6 dc.coverage 41.3% 97.8% +56.5%
7 dc.type 97.6% 91.8% -5.8%
8 dc.date 82.5% 96.3% +13.8%
9 dc.creator 95.6% 100% +4.4%
10 dc.contributor 43% 51.8% +7.2%
11 dc.publisher 62.8% 41.2% -21.6%
12 dc.format 92.6% 81.7% -10.9%
13 dc.rights 47.8% 71.3% +23.5%
14 dc.relation 90% 75.7% -14.3%
15 dc.source 47.3% 100% +52.7%
 58 people (w/duplicates)
 6 QA methods
◦ 2 MUS; 2 Peer-reviews; 2 Usage Data Analyses
 126 records reviewed (unique)
 125.436 records analyzed (w/duplicates)
 154.8 hours in total
 Different domains have different metadata needs
◦ Mandatory
 Learning 11, Research 12, Culture 21 (!)
◦ Recommended
 Learning 17, Research 6, Culture 2
◦ Optional
 Learning 29 (!), Research 17, Culture 6
Metadata Design Phase
 Metadata is easy to understand but not always useful
◦ Learning:
 76% Easy,
 51% Useful
◦ Research:
 72% Easy,
 79% Useful
◦ Cultural:
 80% Easy,
 88% Useful
Metadata Design Phase
Case Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
Learning 85% 83% 78% 90% 74% 90% 77%
Culture 58% 65% 76% 86% 61% 84% 64%
Research 36% 82% 72% 93% 78% 95% 69%
>72%
>86%
>84%
Calibration Phase
 Completeness vs Easiness vs Usefulness
[Charts comparing completeness with easiness and usefulness, one slide per case]
Regular Operation Phase
1. Can we set up quality assurance methods for
ensuring metadata record quality that will have a
positive effect on the resulting quality of the
metadata records of a given federation of
repositories?
2. What metrics can be used to effectively assess metadata record quality?
◦ Completeness, accuracy, objectiveness, correctness,
consistency & appropriateness
3. At what levels of the metadata record quality
metrics is a repository considered to have a
satisfying metadata record quality?
◦ All quality metrics being ranked with 4 or above on 75% of
the reviews
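The rule above can be expressed as a simple check over the per-criterion shares of reviews scoring 4 or above; the input reuses the Learning case figures from the Calibration phase slide:

```python
def meets_quality_bar(high_score_share, threshold=75.0):
    """True when every quality metric was rated 4 or above on at least
    `threshold` percent of the reviews (the RQ3 rule)."""
    return all(share >= threshold for share in high_score_share.values())

# Learning case, Calibration phase: % of reviews scoring each metric 4 or 5
learning = {"Completeness": 85, "Accuracy": 83, "Consistency": 78,
            "Objectiveness": 90, "Appropriateness": 74, "Correctness": 90,
            "Overall": 77}
print(meets_quality_bar(learning))  # False: Appropriateness sits at 74%
```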
Case Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score
Learning 85% 83% 78% 90% 74% 90% 77%
Culture 58% 65% 76% 86% 61% 84% 64%
Research 36% 82% 72% 93% 78% 95% 69%
4. Can we introduce a quality assurance process that
is transferable through different application
contexts and types of repositories?
5. What are the specific adjustments that have to be
made to apply a quality assurance process in other
contexts?
◦ Variable use of QA method
◦ No major adjustments needed
6. Are the results of the application of the same
metadata quality assurance process in different
repositories comparable in terms of the resulting
metadata quality?
Case Average Median Range Elements
Learning 67% 79.1% 99.6% 22
Cultural 74.1% 76.7% 82.1% 26
Research 85.2% 96.3% 58.8% 15
7. Is the cost involved in the application of a metadata
quality assurance process comparable in terms of
magnitude for different repositories?
Case People Methods Resources Hours
Learning 49 5 11.000 134.7
Cultural 45 8 15.000 166.5
Research 58 6 75.000 154.8
8. Is the improvement in metadata record quality
comparable with the cost of the quality assurance
method?
Case Average improvement for elements (<60%) Elements in AP Hours % of average improvement per hour spent
Learning +55.7% 22 134.7 0.43%
Cultural +28.8% 26 166.5 0.17%
Research +23.7% 15 154.8 0.15%
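The last column divides the average completeness improvement by the hours spent, giving a rough efficiency figure per case; as a one-line sketch:

```python
def improvement_per_hour(avg_improvement_pct, hours):
    """Average completeness improvement (for elements starting below 60%)
    per hour spent on quality assurance."""
    return round(avg_improvement_pct / hours, 2)

# Cultural case: +28.8% average improvement over 166.5 hours
print(improvement_per_hour(28.8, 166.5))  # 0.17
```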
 Proposed a Metadata Quality Assurance
Certification Process
 Applied the proposed MQACP to a learning
federation
 Validated its effectiveness in two new cases
 Identified future areas of research
 Palavitsinis, N., Manouselis, N. (2009). "A Survey of Knowledge Organization Systems in Environmental Sciences", in I.N. Athanasiadis, P.A. Mitkas, A.E. Rizzoli & J. Marx-Gómez (eds.), Information Technologies in Environmental Engineering, Proceedings of the 4th International ICSC Symposium, Springer Berlin Heidelberg.
 Palavitsinis, N., Kastrantas, K. and Manouselis, N. (2009a). "Interoperable metadata for a federation of learning repositories on organic agriculture and agroecology", in Proc. of the Joint International Agricultural Conference 2009 (JIAC 2009), Wageningen, The Netherlands, July 2009.
 Palavitsinis, N., Manouselis, N. and Sanchez, S. (2010). "Preliminary Discussion on a Digital Curation Framework for Learning Repositories", in Massart, D. & Shulman, E. (Eds.), Proc. of Workshop on Search and Exchange of e-le@rning Materials (SE@M'10), Barcelona, Spain, CEUR 681, September 2010.
 Palavitsinis, N. and Manouselis, N. (2013). "Agricultural Knowledge Organisation Systems: An Analysis of an Indicative Sample", in Sicilia, M.-A. (Ed.), Handbook of Metadata, Semantics and Ontologies, World Scientific Publishing Co.
 Palavitsinis, N., Ebner, H., Manouselis, N., Sanchez, S. and Naeve, A. (2009b). "Evaluating Metadata Application Profiles Based on Usage Data: The Case of a Metadata Application Profile for Agricultural Learning Resources", in Proc. of the International Conference on Digital Libraries and the Semantic Web (ICSD 2009), Trento, Italy, September 2009.
 Palavitsinis, N., Manouselis, N. and Sanchez, S. (2009c). "Evaluation of a Metadata Application Profile for Learning Resources on Organic Agriculture", in Proc. of 3rd International Conference on Metadata and Semantics Research (MTSR09), Milan, Italy, October 2009.
 Palavitsinis, N., Manouselis, N. and Sanchez, S. (2011). "Metadata quality in learning repositories: Issues and considerations", in Proc. of the World Conference on Educational Multimedia, Hypermedia & Telecommunications (ED-MEDIA 2011), Lisbon, Portugal.
 Palavitsinis, N., Manouselis, N. & Sanchez, S. (in press). Metadata Quality in Learning Object Repositories: A Case Study, The Electronic Library.
 Palavitsinis, N., Manouselis, N. & Sanchez, S. (in press). Metadata Quality in Digital Repositories: Empirical Results from the Cross-Domain Transfer of a Quality Assurance Process, Journal of the American Society for Information Science and Technology.
 e-Conference (2010, www.e-agriculture.org)
◦ Learning resources creation: What constitutes a quality
learning resource?
◦ Providing quality metadata: Is the gain worth the effort?
◦ Populating a repository with resources and metadata: The
quality versus quantity dilemma
◦ Managing a portal with thousands of resources and users:
Are communities “attracted” to quality, like bees to honey?
Topic Posts Participants Answers Views
Topic 1. Phase I 23 16 50 957
Topic 2. Phase I 18 13 27 568
Topic 1. Phase II 19 12 17 531
Topic 2. Phase II 12 9 16 393
TOTAL 72 50 110 2.449
 Joint Technology Enhanced Learning Summer
School (2010)
◦ WHY Metadata for MY Resources workshop
◦ http://www.prolearn-academy.org/Events/summer-school-201
 Special Track on Metadata & Semantics for
Learning Infrastructures (2011, Izmir, Turkey)
◦ http://ieru.org/org/mtsr2011/MSLI/
 International Journal of Metadata, Semantics and Ontologies (2013, Vol. 8, No. 1)
◦ Special Issue on Advances in Metadata and Semantics for
Learning Infrastructures: Part 1
◦ Guest Editors: Nikos Palavitsinis, Dr Joris Klerkx and Professor Xavier Ochoa
◦ http://www.inderscience.com/info/inarticletoc.php?jcode=ijms
 The organization of Metadata Understanding
Sessions showed:
◦ The need for a better explanation of metadata elements
◦ The need for domain experts to be heavily involved in the
entire process
 In Palavitsinis et al. (2013) we presented a Creative Metadata Understanding Session that has already been deployed and tested
 2.5 hours instead of the 2 hours of a typical MUS
 Worked well with a focus group of environmental experts
 More research is needed
 Tsiflidou & Manouselis built upon the work carried
out in this Thesis to examine how different
metadata quality metrics can be extracted through
automated methods.
 Completeness, element frequency, entropy,
vocabulary values distribution, metadata
multilinguality
 Used to examine a first set of 2.500 records
 One of the first studies on metadata quality costs
 Combined with annotation costs can provide a
holistic view on the costs associated with populating
a repository with resources
 Complementary to existing library approaches
No Cost Category Included in MQACP Explanation
1 Metadata Design Yes Designing an application profile based on user needs
2 Resource Selection No Selecting the resources to be populated in the repository
3 Metadata Annotation No Adding metadata to the resources from scratch
4 Metadata Enrichment No Enriching/correcting problematic metadata fields
5 Peer Review of Metadata Yes Checking the quality of metadata through peer review processes
6 Automated Metadata Review Yes Checking the quality of metadata through automated means
7 Development of Training Material Yes Developing guidelines and manuals for the content providers
8 Training on Metadata No Holding training sessions on metadata annotation / spending time reading supportive material
 Examination board
 Supervisors
◦ Salvador Sanchez-Alonso; Nikos Manouselis
 Information Engineering Research Unit Personnel
◦ Miguel Refusta; Alberto Abian;…
 Agro-Know colleagues
◦ Effie Tsiflidou; Charalampos Thanopoulos; Vassilis
Protonotarios;…
 Family & Friends
Metadata Quality Issues in Learning Repositories
Creating an outcomes framework for your organisationMark Planigale
 
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web TestingThe Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web TestingPerfecto by Perforce
 
How to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical TrialsHow to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical TrialsDavid Talby
 
313 IDS _Course_Introduction_PPT.pptx
313 IDS _Course_Introduction_PPT.pptx313 IDS _Course_Introduction_PPT.pptx
313 IDS _Course_Introduction_PPT.pptxsameernsn1
 
B.Sc_.CSIT-8th-sem-syllabus.pdf
B.Sc_.CSIT-8th-sem-syllabus.pdfB.Sc_.CSIT-8th-sem-syllabus.pdf
B.Sc_.CSIT-8th-sem-syllabus.pdfSudarshanSharma43
 

Similaire à Metadata Quality Issues in Learning Repositories (20)

A Hero’s Journey Through Metadata Quality
A Hero’s Journey Through Metadata QualityA Hero’s Journey Through Metadata Quality
A Hero’s Journey Through Metadata Quality
 
8th sem (1)
8th sem (1)8th sem (1)
8th sem (1)
 
Project Tools and Techniques
Project Tools and TechniquesProject Tools and Techniques
Project Tools and Techniques
 
Curation-Friendly Tools for the Scientific Researcher
Curation-Friendly Tools for the Scientific ResearcherCuration-Friendly Tools for the Scientific Researcher
Curation-Friendly Tools for the Scientific Researcher
 
PERICLES presentation on Appraisal - IDCC15 workshop
PERICLES presentation on Appraisal - IDCC15 workshopPERICLES presentation on Appraisal - IDCC15 workshop
PERICLES presentation on Appraisal - IDCC15 workshop
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniques
 
Analytics - Presentation in DkIT
Analytics - Presentation in DkITAnalytics - Presentation in DkIT
Analytics - Presentation in DkIT
 
Measuring and Comparing the Reliability of the Structured Walkthrough Evaluat...
Measuring and Comparing the Reliability of the Structured Walkthrough Evaluat...Measuring and Comparing the Reliability of the Structured Walkthrough Evaluat...
Measuring and Comparing the Reliability of the Structured Walkthrough Evaluat...
 
Howe Street Basic Project Approach
Howe Street Basic Project ApproachHowe Street Basic Project Approach
Howe Street Basic Project Approach
 
ISTQB Foundation - Chapter 3
ISTQB Foundation - Chapter 3ISTQB Foundation - Chapter 3
ISTQB Foundation - Chapter 3
 
Creating an outcomes framework for your organisation
Creating an outcomes framework for your organisationCreating an outcomes framework for your organisation
Creating an outcomes framework for your organisation
 
Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2
 
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web TestingThe Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
 
Evaluation of eLearning
Evaluation of eLearningEvaluation of eLearning
Evaluation of eLearning
 
How to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical TrialsHow to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical Trials
 
Rkfl Problem Solving
Rkfl Problem SolvingRkfl Problem Solving
Rkfl Problem Solving
 
313 IDS _Course_Introduction_PPT.pptx
313 IDS _Course_Introduction_PPT.pptx313 IDS _Course_Introduction_PPT.pptx
313 IDS _Course_Introduction_PPT.pptx
 
classmar2.ppt
classmar2.pptclassmar2.ppt
classmar2.ppt
 
B.Sc_.CSIT-8th-sem-syllabus.pdf
B.Sc_.CSIT-8th-sem-syllabus.pdfB.Sc_.CSIT-8th-sem-syllabus.pdf
B.Sc_.CSIT-8th-sem-syllabus.pdf
 

Plus de Nikos Palavitsinis, PhD

Σχολείο ΑΕΠ (Συνάντηση 2)
Σχολείο ΑΕΠ (Συνάντηση 2)Σχολείο ΑΕΠ (Συνάντηση 2)
Σχολείο ΑΕΠ (Συνάντηση 2)Nikos Palavitsinis, PhD
 
Σχολείο ΑΕΠ (Συνάντηση 1)
Σχολείο ΑΕΠ (Συνάντηση 1)Σχολείο ΑΕΠ (Συνάντηση 1)
Σχολείο ΑΕΠ (Συνάντηση 1)Nikos Palavitsinis, PhD
 
Digital Educational Content Quality Assurance Process
Digital Educational Content Quality Assurance ProcessDigital Educational Content Quality Assurance Process
Digital Educational Content Quality Assurance ProcessNikos Palavitsinis, PhD
 
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]Nikos Palavitsinis, PhD
 
MetadataTheory: Quality for Learning Resources (11th of 10)
MetadataTheory: Quality for Learning Resources (11th of 10)MetadataTheory: Quality for Learning Resources (11th of 10)
MetadataTheory: Quality for Learning Resources (11th of 10)Nikos Palavitsinis, PhD
 
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)Nikos Palavitsinis, PhD
 
[Lean 101] Solution and Unique Value Proposition
[Lean 101] Solution and Unique Value Proposition[Lean 101] Solution and Unique Value Proposition
[Lean 101] Solution and Unique Value PropositionNikos Palavitsinis, PhD
 
[Lean 101] Channels & Metrics - Reaching and Measuring
[Lean 101]  Channels & Metrics - Reaching and Measuring[Lean 101]  Channels & Metrics - Reaching and Measuring
[Lean 101] Channels & Metrics - Reaching and MeasuringNikos Palavitsinis, PhD
 
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???[Lean 101] Costs & Revenues - Breaking even or Breaking bad???
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???Nikos Palavitsinis, PhD
 
[Lean 101] Bootstrapping & Getting Out of the Building
[Lean 101] Bootstrapping & Getting Out of the Building[Lean 101] Bootstrapping & Getting Out of the Building
[Lean 101] Bootstrapping & Getting Out of the BuildingNikos Palavitsinis, PhD
 
[Lean 101] Introduction to Lean - Preparing a Lean Canvas
[Lean 101] Introduction to Lean - Preparing a Lean Canvas[Lean 101] Introduction to Lean - Preparing a Lean Canvas
[Lean 101] Introduction to Lean - Preparing a Lean CanvasNikos Palavitsinis, PhD
 
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...Nikos Palavitsinis, PhD
 
MetadataTheory: Repository Operational Models (10th of 10)
MetadataTheory: Repository Operational Models (10th of 10)MetadataTheory: Repository Operational Models (10th of 10)
MetadataTheory: Repository Operational Models (10th of 10)Nikos Palavitsinis, PhD
 
MetadataTheory: Learning Repositories Technologies (9th of 10)
MetadataTheory: Learning Repositories Technologies (9th of 10)MetadataTheory: Learning Repositories Technologies (9th of 10)
MetadataTheory: Learning Repositories Technologies (9th of 10)Nikos Palavitsinis, PhD
 
MetadataTheory: Introduction to Repositories (8th of 10)
MetadataTheory: Introduction to Repositories (8th of 10)MetadataTheory: Introduction to Repositories (8th of 10)
MetadataTheory: Introduction to Repositories (8th of 10)Nikos Palavitsinis, PhD
 

Plus de Nikos Palavitsinis, PhD (20)

Metadata Mapping & Crosswalks
Metadata Mapping & CrosswalksMetadata Mapping & Crosswalks
Metadata Mapping & Crosswalks
 
Σχολείο ΑΕΠ (Συνάντηση 2)
Σχολείο ΑΕΠ (Συνάντηση 2)Σχολείο ΑΕΠ (Συνάντηση 2)
Σχολείο ΑΕΠ (Συνάντηση 2)
 
Σχολείο ΑΕΠ (Συνάντηση 1)
Σχολείο ΑΕΠ (Συνάντηση 1)Σχολείο ΑΕΠ (Συνάντηση 1)
Σχολείο ΑΕΠ (Συνάντηση 1)
 
Making Sense of ISO/IEC 19788
Making Sense of ISO/IEC 19788Making Sense of ISO/IEC 19788
Making Sense of ISO/IEC 19788
 
Digital Educational Content Quality Assurance Process
Digital Educational Content Quality Assurance ProcessDigital Educational Content Quality Assurance Process
Digital Educational Content Quality Assurance Process
 
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]
Αξιολόγηση Μαθησιακών Αντικειμένων και Σφραγίδες Ποιότητας [Εργαστήρια ΕΕΛ/ΛΑΚ]
 
MetadataTheory: Quality for Learning Resources (11th of 10)
MetadataTheory: Quality for Learning Resources (11th of 10)MetadataTheory: Quality for Learning Resources (11th of 10)
MetadataTheory: Quality for Learning Resources (11th of 10)
 
The OER Game!
The OER Game!The OER Game!
The OER Game!
 
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)
Παιχνίδι Ανοικτών Εκπαιδευτικών Πόρων (25/11/2015)
 
[Lean 101] Solution and Unique Value Proposition
[Lean 101] Solution and Unique Value Proposition[Lean 101] Solution and Unique Value Proposition
[Lean 101] Solution and Unique Value Proposition
 
[Lean 101] Channels & Metrics - Reaching and Measuring
[Lean 101]  Channels & Metrics - Reaching and Measuring[Lean 101]  Channels & Metrics - Reaching and Measuring
[Lean 101] Channels & Metrics - Reaching and Measuring
 
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???[Lean 101] Costs & Revenues - Breaking even or Breaking bad???
[Lean 101] Costs & Revenues - Breaking even or Breaking bad???
 
[Lean 101] Learn, Adapt & Pivot
[Lean 101] Learn, Adapt & Pivot[Lean 101] Learn, Adapt & Pivot
[Lean 101] Learn, Adapt & Pivot
 
[Lean 101] Bootstrapping & Getting Out of the Building
[Lean 101] Bootstrapping & Getting Out of the Building[Lean 101] Bootstrapping & Getting Out of the Building
[Lean 101] Bootstrapping & Getting Out of the Building
 
[Lean 101] Introduction to Lean - Preparing a Lean Canvas
[Lean 101] Introduction to Lean - Preparing a Lean Canvas[Lean 101] Introduction to Lean - Preparing a Lean Canvas
[Lean 101] Introduction to Lean - Preparing a Lean Canvas
 
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...
Quality of Learning Resources & Metadata through Quality Seals, Badges, Marks...
 
Presentation of my MSc thesis (Greek)
Presentation of my MSc thesis (Greek)Presentation of my MSc thesis (Greek)
Presentation of my MSc thesis (Greek)
 
MetadataTheory: Repository Operational Models (10th of 10)
MetadataTheory: Repository Operational Models (10th of 10)MetadataTheory: Repository Operational Models (10th of 10)
MetadataTheory: Repository Operational Models (10th of 10)
 
MetadataTheory: Learning Repositories Technologies (9th of 10)
MetadataTheory: Learning Repositories Technologies (9th of 10)MetadataTheory: Learning Repositories Technologies (9th of 10)
MetadataTheory: Learning Repositories Technologies (9th of 10)
 
MetadataTheory: Introduction to Repositories (8th of 10)
MetadataTheory: Introduction to Repositories (8th of 10)MetadataTheory: Introduction to Repositories (8th of 10)
MetadataTheory: Introduction to Repositories (8th of 10)
 

Dernier

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Dernier (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Metadata Quality Issues in Learning Repositories

  • 1. PhD Defense Nikos Palavitsinis Supervisors: Dr. Salvador-Sanchez Alonso, UAH; Dr. Nikolaos Manouselis, Agro-Know Departamento de Ciencias de la Computación
  • 2.  Introduction ◦ Problem Definition ◦ Research Questions  Problem Solution  Experiments/Results  Conclusions ◦ Research Contributions ◦ Future Directions of Research
  • 3.  Metadata lies at the centre of every repository project ◦ Defines and drives the description of content ◦ Allows storage, management & retrieval of content as well as its preservation  Content, content, content ◦ Global Learning Objects Brokered Exchange (GLOBE): >1.000.000 learning resources ◦ Europeana: >30.000.000 cultural resources ◦ VOA3R: >2.500.000 research resources
  • 4.  Quality issues can be identified both in the digital content as well as in the metadata describing it  Learning Object: Actual Digital Content + Metadata
  • 5.  Significant problems with the quality of the metadata that describe content of various types ◦ Lack of completeness, ◦ Redundant metadata, ◦ Lack of clarity, ◦ Incorrect use of elements, ◦ Structural inconsistency, ◦ Inaccurate representation, ◦ …
  • 7. MQACP
  • 8.  The involvement of domain experts and creators of digital resources in metadata design & annotation is crucial. Documentation that supports the metadata annotation process is essential  Metadata quality frameworks suggested already have not proven their adaptability  The cost of producing quality metadata for digital objects is significant for any digital repository project
  • 10.
  • 11.
  • 12.  Base Metadata Schema  Metadata Understanding Questionnaire  Metadata Quality Assessment Grid  Excel Completeness Assessment Tool  Good & Bad Metadata Practices Guide
  • 13.  Metadata Expert (ME) ◦ Metadata design, MQACP application, experiments, results, supporting all other actors  Repository Expert (RE) ◦ Tool development, technical support, data exports  Domain Expert (DE) ◦ Metadata design, annotation & review metadata records
  • 14.  Metadata Understanding Session ◦ Workshop format ◦ Printed form, application profile ◦ Easiness, usefulness and obligation ◦ Questionnaire, Likert Scale  Hands-on exercise on metadata annotation ◦ Sample of resources ◦ Annotated on paper ◦ Support from Metadata Expert  Metadata Design Phase
  • 15.  Hands-on annotation experiment ◦ Domain experts go digital ◦ Testing both metadata and system  Review of metadata records ◦ Metadata experts test the AP ◦ Refining the definitions of elements wherever needed, judging from real usage data ◦ Core metadata quality criteria  Testing Phase
  • 16.  Peer-review experiment ◦ Large scale ◦ Representative sample ◦ Everyone reviews everyone ◦ Metadata Peer Review Grid ◦ Online process ◦ Accuracy; Completeness; Consistency; Objectiveness; Appropriateness; Correctness; Overall  Calibration Phase
  • 17.  Analysis of usage data from tools ◦ Mainly completeness here  Metadata Quality Certification concept ◦ Concept of validation as a quality mark  Critical Mass Phase
  • 18.  Regular Completeness Analysis ◦ Monthly ◦ Automated  Online Peer-Review of records ◦ Fixed  Informal Mechanisms ◦ Prizes ◦ Awards ◦ Social  Regular Operation Phase
  • 20.  Initiative: Organic.Edunet  Topics: Organic Agriculture, Agroecology, Environmental Education  Resources: >10.000  Content Providers: 8  Languages: 10  Content Type: Mainly image & text with some videos
  • 21. Duration: 2 hours Date: January 2009 Annotated Objects: Not applicable Involved people: 20 ME & DE Results — Question [0,1] (1,2] (2,3] (3,4] (4,5] Is the element easy for you to understand? 0% 0% 8.8% 71.9% 19.3% Is this element useful for describing educational resources? 0% 0% 26.3% 61.4% 12.3%  Metadata Design Phase
  • 22. Mandatory Recommended Optional — Question Before After Before After Before After Should this element be mandatory, recommended or optional? 6 11 26 17 24 29 Percentage change in overall number of mandatory / recommended / optional elements +83% -34% +21%  Metadata Design Phase
  • 23. Duration: 1 week Date: April 2009 Annotated Objects: 500 objects (≈5% of total) Resources Reviewed: 60 (30 per expert) Involved people: 2 ME  Outcomes were documented in the Good & Bad Metadata Practices document that was drafted after the peer-review process.  The Grid was not used  Testing Phase
  • 24. Duration: 3 weeks Date: June 2009 Annotated Objects: 1.000 objects (≈10% of total) Resources Reviewed: 105 resources (≈5 per expert) Involved people: 20 DE — Score Completeness Accuracy Consistency Objectiveness Appropriateness Correctness Overall score 5 40% 51% 50% 69% 41% 69% 40% 4 45% 32% 28% 21% 33% 21% 37% 3 5% 10% 15% 6% 18% 9% 19% 2 9% 3% 1% 2% 5% 0% 0% 1 1% 1% 0% 0% 2% 1% 1% no 1% 3% 6% 3% 1% 1% 3% Scores 4 and 5 combined: 85% 83% 78% 90% 74% 90% 77%  Calibration Phase  G&B Metadata Practices Guide
  • 25. Duration: 1 week Date: September 2009 Annotated Objects: 6.653 objects (≈60% of total) Resources Analyzed: 6.653 Involved people: 1 ME & 1 RE — No Mandatory Elements % 1 General / Title 99.8% 2 …/ Language 93.9% 3 …/ Description 94.8% 4 Rights / Cost 15.7% 5 Rights / Copyright And Other Restrictions 16.0%  Critical Mass Phase
  • 26. No Recommended Elements % 6 General / Keyword 12.8% 7 LifeCycle / Contribute / Role 11.5% 8 Educational / Intended End User Role 12.8% 9 …/ Context 10.2% 10 …/ Typical Age Range 3.8% 11 Rights / Description 7.7% 12 Classification 11.8% — No Optional Elements % 13 General / Coverage 0.2% 14 …/ Structure 7.9% 15 LifeCycle / Status 0.3% 16 Educational / Interactivity Type 0.3% 17 …/ Interactivity Level 0.3% 18 …/ Semantic Density 0.2% 19 …/ Difficulty 0.1% 20 …/ Typical Learning Time 0% 21 …/ Language 0.3% 22 …/ Description 1.5%  Critical Mass Phase
  • 27. Duration: 1 week Date: September 2010 Annotated Objects: 11.000 objects (100%) Resources Analyzed: 11.000 Involved people: 1 ME — No Mandatory Elements Sep 2009 Sep 2010 Diff. 1 General / Title 99.8% 100% +0.2% 2 …/ Language 93.9% 99.9% +6% 3 …/ Description 94.8% 100% +5.2% 4 Rights / Cost 15.7% 82.4% +66.7% 5 …/ Copyright & Other Restrictions 16.0% 99.9% +83.9%  Regular Operation Phase
  • 28. No Recommended Elements Sep 2009 Sep 2010 Diff. 6 General / Keyword 12.8% 99.9% +87.1% 7 LifeCycle / Contribute / Role 11.5% 77.2% +65.7% 8 Educational / Intended End User Role 12.8% 82.4% +69.6% 9 …/ Context 10.2% 81% +70.8% 10 …/ Typical Age Range 3.8% 63.9% +60.1% 11 Rights / Description 7.7% 92.4% +84.7% 12 Classification 11.8% 73.6% +61.8% — No Optional Elements Sep 2009 Sep 2010 Diff. 13 General / Coverage 0.2% 82.6% +82.4% 14 …/ Structure 7.9% 82.5% +74.6% 15 LifeCycle / Status 0.3% 39.7% +39.4% 16 Educational / Interactivity Type 0.3% 36.9% +36.6% 17 …/ Interactivity Level 0.3% 37.1% +36.8% 18 …/ Semantic Density 0.2% 37% +36.8% 19 …/ Difficulty 0.1% 37.1% +37% 20 …/ Typical Learning Time 0% 0.4% +0.4% 21 …/ Language 0.3% 52.3% +52% 22 …/ Description 1.5% 14.7% +13.2%  Regular Operation Phase
  • 29.  49 people (w/duplicates)  5 QA methods ◦ 1 MUS; 2 Peer-reviews; 2 Usage Data Analyses  165 records reviewed (unique)  17,653 records analyzed (w/duplicates)  134.7 hours in total
  • 31.  Initiative: Natural Europe  Topics: Minerals, Fossils, Plants, etc.  Resources: >15,000  Content Providers: 6  Languages: 10  Content Type: Mainly image & text with a significant number of videos
  • 32. Metadata Design Phase
  Duration: 2 hours | Date: March 2011 | Annotated Objects: Not applicable | Involved people: 11 DE & 1 ME
  Results
  Question | [0,1] | (1,2] | (2,3] | (3,4] | (4,5]
  Is the element easy for you to understand? | 0% | 0% | 17.2% | 34.5% | 48.3%
  Is this element useful for describing cultural resources? | 0% | 0% | 3.4% | 37.9% | 58.6%
  • 33. Metadata Design Phase
  Question: Should this element be mandatory, recommended or optional?
  | Mandatory | Recommended | Optional
  Before | 13 | 6 | 10
  After | 21 | 2 | 6
  Percentage change in overall number of mandatory / recommended / optional elements: +62% / -66% / -40%
  • 34. Testing Phase
  Duration: 1 week | Date: August 2011 | Annotated Objects: 1,000 objects (≈10% of total) | Resources Reviewed: 100 (50 per expert) | Involved people: 2 ME
   Outcomes were documented in the Good & Bad Metadata Practices document that was drafted after the peer-review process
   Significant changes to the AP resulted from the review
  • 35. Calibration Phase
  Duration: 3 weeks | Date: March 2012 | Annotated Objects: 3,000 objects (≈20% of total) | Resources Reviewed: 99 resources (≈10 per expert) | Involved people: 10 DE
  Score | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  5 | 16% | 22% | 36% | 73% | 32% | 62% | 26%
  4 | 42% | 43% | 40% | 13% | 29% | 22% | 38%
  3 | 19% | 13% | 19% | 4% | 13% | 13% | 30%
  2 | 6% | 1% | 3% | 6% | 14% | 1% | 3%
  1 | 13% | 9% | 1% | 4% | 11% | 1% | 2%
  no | 3% | 11% | 0% | 0% | 0% | 1% | 0%
  Scored 4 or 5 | 58% | 65% | 76% | 86% | 61% | 84% | 64%
  → G&B Metadata Practices Guide
  • 36. Calibration Phase
  Duration: 1 week | Date: May 2012 | Annotated Objects: 3,417 objects (≈20% of total) | Resources Analyzed: 3,417 | Involved people: 1 ME & 1 RE
  No | Mandatory Elements | %
  1 | CHO Titles | 99.9%
  2 | CHO Keywords | 96.8%
  3 | Object Titles | 100.0%
  4 | Object Descriptions | 89.7%
  5 | Object URL | 99.7%
  6 | Object Thumbnail URL | 97.1%
  7 | Object Content Type | 100.0%
  8 | Copyrights | 100.0%
  9 | Access | 60.4%
  • 37. Calibration Phase
  No | Recommended Elements | %
  10 | CHO Types | 78.1%
  11 | Object Creators | 88.1%
  12 | Object Languages | 32.4%
  No | Optional Elements | %
  13 | Scientific Name | 0.3%
  14 | Classification | 0.3%
  15 | Common Names | 0.0%
  16 | CHO Significant Dates | 71.8%
  17 | CHO Temporal Coverage | 21.2%
  18 | CHO Spatial Coverage | 74.7%
  19 | CHO Mediums | 6.6%
  20 | CHO Creators | 0.0%
  21 | Object Creation Dates | 16.4%
  22 | Object Identifiers | 61.8%
  23 | Object Context URL | 16.2%
  24 | Related Objects | 52.6%
  25 | Object Formats | 89.5%
  26 | Object Extents | 58.2%
  • 38. Critical Mass Phase
  Duration: 1 week | Date: September 2012 | Annotated Objects: 9,402 objects (≈62% of total) | Resources Analyzed: 9,402 | Involved people: 1 ME
  No | Mandatory Elements | May 2012 | Sep 2012
  1 | CHO Titles | 99.9% | 98.4%
  2 | CHO Keywords | 96.8% | 97.2%
  3 | Object Titles | 100% | 100%
  4 | Object Descriptions | 89.7% | 99.2%
  5 | Object URL | 99.7% | 99.4%
  6 | Object Thumbnail URL | 97.1% | 88.2%
  7 | Object Content Type | 100.0% | 100%
  8 | Copyrights | 100.0% | 100%
  9 | Access | 60.4% | 100%
  • 39. Critical Mass Phase
  No | Recommended Elements | May 2012 | Sep 2012
  10 | CHO Types | 78.1% | 78.3%
  11 | Object Creators | 88.1% | 86.1%
  12 | Object Languages | 32.4% | 61.7%
  No | Optional Elements | May 2012 | Sep 2012
  13 | Scientific Name | 0.3% | 59.9%
  14 | Classification | 0.3% | 39.5%
  15 | Common Names | 0.0% | 41.8%
  16 | CHO Significant Dates | 71.8% | 46.1%
  17 | CHO Temporal Coverage | 21.2% | 18%
  18 | CHO Spatial Coverage | 74.7% | 70.1%
  19 | CHO Mediums | 6.6% | 3.3%
  20 | CHO Creators | 0.0% | 49.2%
  21 | Object Creation Dates | 16.4% | 47.5%
  22 | Object Identifiers | 61.8% | 63.8%
  23 | Object Context URL | 16.2% | 36.4%
  24 | Related Objects | 52.6% | 88.2%
  25 | Object Formats | 89.5% | 100%
  26 | Object Extents | 58.2% | 47.2%
  • 40. Critical Mass Phase
  Duration: 3 weeks | Date: September 2012 | Annotated Objects: 9,402 objects (≈62% of total) | Resources Reviewed: 34 resources (≈9 per expert) | Involved people: 4 DE
  Score | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  5 | 38% | 59% | 68% | 79% | 47% | 79% | 53%
  4 | 29% | 21% | 12% | 6% | 29% | 9% | 29%
  3 | 26% | 9% | 12% | 6% | 15% | 12% | 9%
  2 | 6% | 12% | 9% | 9% | 6% | 0% | 9%
  1 | 0% | 0% | 0% | 0% | 3% | 0% | 0%
  no | 0% | 0% | 0% | 0% | 0% | 0% | 0%
  Scored 4 or 5 | 67% | 80% | 80% | 85% | 76% | 88% | 84%
  Previous phase (Calibration) | 58% | 65% | 76% | 86% | 61% | 84% | 64%
  → G&B Metadata Practices Guide
  • 41. Regular Operation Phase
  Duration: 1 week | Date: May 2013 | Annotated Objects: 11,375 objects (≈80% of total) | Resources Analyzed: 11,375 | Involved people: 1 ME & 1 RE
  No | Mandatory Elements | Sep ’12 | May ’13 | Diff
  1 | CHO Titles | 99.9% | 99.5% | -0.4%
  2 | CHO Keywords | 96.8% | 99.1% | +2.3%
  3 | Object Titles | 100% | 100% | 0%
  4 | Object Descriptions | 89.7% | 98.9% | +9.2%
  5 | Object URL | 99.7% | 99.9% | +0.2%
  6 | Object Thumbnail URL | 97.1% | 97.2% | +0.1%
  7 | Object Content Type | 100% | 100% | 0%
  8 | Copyrights | 100% | 100% | 0%
  9 | Access | 60.3% | 100% | +39.7%
  • 42. Regular Operation Phase
  No | Recommended Elements | Sep ’12 | May ’13 | Diff
  10 | CHO Types | 78.1% | 74.9% | -3.2%
  11 | Object Creators | 88.1% | 87.3% | -0.8%
  12 | Object Languages | 32.4% | 78.5% | +46.1%
  No | Optional Elements | Sep ’12 | May ’13 | Diff
  13 | Scientific Name | 0.3% | 65.1% | +64.8%
  14 | Classification | 0.3% | 56.7% | +56.4%
  15 | Common Names | 0% | 49.8% | +49.8%
  16 | CHO Significant Dates | 71.8% | 48.1% | -23.7%
  17 | CHO Temporal Coverage | 21.2% | 18.5% | -2.7%
  18 | CHO Spatial Coverage | 74.7% | 71.4% | -3.3%
  19 | CHO Mediums | 6.6% | 17.9% | +13.3%
  20 | CHO Creators | 0% | 58.1% | +58.1%
  21 | Object Creation Dates | 16.4% | 66.5% | +50.1%
  22 | Object Identifiers | 61.8% | 68.4% | +6.6%
  23 | Object Context URL | 16.2% | 42.1% | +25.9%
  24 | Related Objects | 52.6% | 47.1% | -5.5%
  25 | Object Formats | 89.5% | 94.3% | +4.8%
  26 | Object Extents | 58.2% | 89.1% | +30.9%
  • 43. Regular Operation Phase
  Duration: 3 weeks | Date: May 2013 | Annotated Objects: 9,402 objects (≈62% of total) | Resources Reviewed: 90 resources (≈9 per expert) | Involved people: 10 DE
  Score | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  5 | 40% | 60% | 74% | 89% | 61% | 74% | 52%
  4 | 26% | 24% | 10% | 6% | 24% | 13% | 30%
  3 | 18% | 8% | 7% | 1% | 4% | 9% | 10%
  2 | 13% | 3% | 4% | 1% | 6% | 0% | 3%
  1 | 3% | 4% | 4% | 3% | 4% | 3% | 5%
  no | 0% | 0% | 0% | 0% | 0% | 0% | 0%
  Scored 4 or 5 | 66% | 84% | 84% | 95% | 85% | 87% | 82%
  Previous phase (Critical Mass) | 67% | 80% | 80% | 85% | 76% | 88% | 84%
  → G&B Metadata Practices Guide
  • 44.  45 people (w/duplicates)  8 QA methods ◦ 1 MUS; 4 Peer-reviews; 3 Usage Data Analyses  298 records reviewed (unique)  24,194 records analyzed (w/duplicates)  166.5 hours in total
  • 46.  Initiative: VOA3R  Topics: Agriculture, Veterinary, Horticulture, Aquaculture, Viticulture, etc.  Resources: >71,316 (2,500,000)  Content Providers: 9  Languages: 9  Content Type: Mainly text with some images
  • 47. Metadata Design Phase
  Duration: 2 hours | Date: June 2011 | Annotated Objects: Not applicable | Involved people: 16 DE
  Results
  Question | [0,1] | (1,2] | (2,3] | (3,4] | (4,5]
  Is the element easy for you to understand? | 0% | 0% | 11.4% | 25.7% | 62.9%
  Is this element useful for describing research resources? | 0% | 0% | 0% | 48.6% | 51.4%
  • 48. Metadata Design Phase
  Question: Should this element be mandatory, recommended or optional?
  | Mandatory | Recommended | Optional
  Before | 13 | 14 | 8
  After | 12 | 6 | 17
  Percentage change in overall number of mandatory / recommended / optional elements: -8% / -57% / +112%
  • 49. Testing Phase
  Duration: 1 week | Date: August 2011 | Annotated Objects: 25,000 objects (≈35% of total) | Resources Reviewed: 65 (≈7 per expert) | Involved people: 9 DE
  Score | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  5 | 20% | 31% | 26% | 60% | 14% | 65% | 18%
  4 | 11% | 43% | 42% | 20% | 17% | 22% | 31%
  3 | 6% | 9% | 12% | 5% | 14% | 6% | 14%
  2 | 20% | 2% | 0% | 3% | 22% | 2% | 18%
  1 | 40% | 6% | 2% | 3% | 26% | 2% | 15%
  no | 3% | 9% | 18% | 9% | 8% | 5% | 3%
  Scored 4 or 5 | 31% | 74% | 68% | 80% | 31% | 87% | 49%
  → G&B Metadata Practices Guide
  • 50. Calibration Phase
  Duration: 1 week | Date: December 2011 | Annotated Objects: 50,000 objects (≈70% of total) | Resources Reviewed: 61 (≈7 per expert) | Involved people: 9 DE
  Score | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  5 | 8% | 51% | 26% | 67% | 30% | 74% | 25%
  4 | 28% | 31% | 46% | 26% | 48% | 21% | 44%
  3 | 26% | 10% | 23% | 3% | 8% | 0% | 28%
  2 | 31% | 3% | 5% | 3% | 11% | 0% | 3%
  1 | 7% | 2% | 0% | 0% | 3% | 2% | 0%
  no | 0% | 3% | 0% | 0% | 0% | 3% | 0%
  Scored 4 or 5 | 36% | 82% | 72% | 93% | 78% | 95% | 69%
  Previous phase (Testing) | 31% | 74% | 68% | 80% | 31% | 87% | 49%
  → G&B Metadata Practices Guide
  • 51. Critical Mass Phase
  Duration: 1 week | Date: April 2012 | Annotated Objects: 51,057 (≈70% of total) | Resources Analyzed: 51,057 | Involved people: 2 ME
  No | Element Name | %
  1 | dc.identifier | 99.8%
  2 | dc.title | 100%
  3 | dc.language | 85.2%
  4 | dc.description | 94.2%
  5 | dc.subject | 90.9%
  6 | dc.coverage | 41.3%
  7 | dc.type | 97.6%
  8 | dc.date | 82.5%
  9 | dc.creator | 95.6%
  10 | dc.contributor | 43.0%
  11 | dc.publisher | 62.8%
  12 | dc.format | 92.6%
  13 | dc.rights | 47.8%
  14 | dc.relation | 90.0%
  15 | dc.source | 47.3%
  • 52. Critical Mass Phase
  Duration: 2 hours | Date: May 2012 | Annotated Objects: Not applicable | Involved people: 13 DE
  Results
  Question | [0,1] | (1,2] | (2,3] | (3,4] | (4,5]
  Is the element easy for you to understand? | 0% | 0% | 16.7% | 64.6% | 18.8%
  Is this element useful for describing research resources? | 0% | 0% | 16.7% | 56.3% | 31.3%
  • 53. Regular Operation Phase
  Duration: 1 week | Date: June 2013 | Annotated Objects: 74,379 objects (100% of total) | Resources Analyzed: 74,379 | Involved people: 2 ME & 1 RE
  No | Element Name | May 2012 | June 2013 | Diff
  1 | dc.identifier | 99.8% | 100% | +0.03%
  2 | dc.title | 100% | 100% | +0.01%
  3 | dc.language | 85.2% | 93.3% | +8.1%
  4 | dc.description | 94.2% | 96.8% | +2.6%
  5 | dc.subject | 90.9% | 100% | +9.1%
  6 | dc.coverage | 41.3% | 97.8% | +56.5%
  7 | dc.type | 97.6% | 91.8% | -5.8%
  8 | dc.date | 82.5% | 96.3% | +13.8%
  9 | dc.creator | 95.6% | 100% | +4.4%
  10 | dc.contributor | 43% | 51.8% | +7.2%
  11 | dc.publisher | 62.8% | 41.2% | -21.6%
  12 | dc.format | 92.6% | 81.7% | -10.9%
  13 | dc.rights | 47.8% | 71.3% | +23.5%
  14 | dc.relation | 90% | 75.7% | -14.3%
  15 | dc.source | 47.3% | 100% | +52.7%
  • 54.  58 people (w/duplicates)  6 QA methods ◦ 2 MUS; 2 Peer-reviews; 2 Usage Data Analyses  126 records reviewed (unique)  125,436 records analyzed (w/duplicates)  154.8 hours in total
  • 56. Metadata Design Phase
   Different domains have different metadata needs ◦ Mandatory: Learning 11, Research 12, Culture 21 (!) ◦ Recommended: Learning 17, Research 6, Culture 2 ◦ Optional: Learning 29 (!), Research 17, Culture 6
  • 57. Metadata Design Phase
   Metadata is easy to understand but not always useful ◦ Learning: 76% Easy, 51% Useful ◦ Research: 72% Easy, 79% Useful ◦ Cultural: 80% Easy, 88% Useful
  • 58. Calibration Phase
  Case | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  Learning | 85% | 83% | 78% | 90% | 74% | 90% | 77%
  Culture | 58% | 65% | 76% | 86% | 61% | 84% | 64%
  Research | 36% | 82% | 72% | 93% | 78% | 95% | 69%
  Across all cases: Consistency >72%, Objectiveness >86%, Correctness >84%
  • 59. Regular Operation Phase:  Completeness vs. Easiness vs. Usefulness
  • 60. Regular Operation Phase:  Completeness vs. Easiness vs. Usefulness
  • 61. Regular Operation Phase:  Completeness vs. Easiness vs. Usefulness
  • 62. 1. Can we set up quality assurance methods for ensuring metadata record quality that will have a positive effect on the resulting quality of the metadata records of a given federation of repositories? 2. Which are the metrics that can be used to effectively assess the metadata record quality? ◦ Completeness, accuracy, objectiveness, correctness, consistency & appropriateness
  • 63. 3. At what levels of the metadata record quality metrics is a repository considered to have a satisfying metadata record quality? ◦ All quality metrics being ranked with 4 or above on 75% of the reviews
  Case | Completeness | Accuracy | Consistency | Objectiveness | Appropriateness | Correctness | Overall score
  Learning | 85% | 83% | 78% | 90% | 74% | 90% | 77%
  Culture | 58% | 65% | 76% | 86% | 61% | 84% | 64%
  Research | 36% | 82% | 72% | 93% | 78% | 95% | 69%
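  The threshold stated above can be expressed directly as a check: a repository has satisfying metadata record quality when, for every metric, at least 75% of the peer reviews score it 4 or higher. A small sketch with made-up score distributions (not the thesis data):

```python
THRESHOLD = 0.75  # share of reviews that must score 4 or above

def metric_passes(scores, threshold=THRESHOLD):
    """True if at least `threshold` of the review scores (1-5) are 4 or 5."""
    return sum(1 for s in scores if s >= 4) / len(scores) >= threshold

# Illustrative review scores for two of the six quality metrics
reviews = {
    "completeness": [5, 4, 4, 3, 5, 4, 4, 2],  # 6/8 = 75% at 4+ -> passes
    "accuracy":     [5, 5, 3, 3, 4, 4, 2, 3],  # 4/8 = 50% at 4+ -> fails
}

satisfying = all(metric_passes(s) for s in reviews.values())
print(satisfying)  # False: accuracy falls below the 75% threshold
```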
  • 64. 4. Can we introduce a quality assurance process that is transferable through different application contexts and types of repositories? 5. What are the specific adjustments that have to be made to apply a quality assurance process in other contexts? ◦ Variable use of QA method ◦ No major adjustments needed
  • 65. 6. Are the results of the application of the same metadata quality assurance process in different repositories comparable in terms of the resulting metadata quality?
  Case | Average | Median | Range | Elements
  Learning | 67% | 79.1% | 99.6% | 22
  Cultural | 74.1% | 76.7% | 82.1% | 26
  Research | 85.2% | 96.3% | 58.8% | 15
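  The summary statistics in this table (average, median, and range of per-element completeness in each case) can be recomputed as follows; the input percentages below are dummy values, not the thesis data:

```python
from statistics import mean, median

def summarize(pcts):
    """Average, median and range of per-element completeness percentages."""
    return {
        "average": mean(pcts),
        "median": median(pcts),
        "range": max(pcts) - min(pcts),  # spread between best- and worst-filled element
    }

print(summarize([99.5, 100.0, 74.9, 48.1, 17.9, 0.4]))
```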
  • 66. 7. Is the cost involved in the application of a metadata quality assurance process comparable in terms of magnitude for different repositories?
  Case | People | Methods | Resources | Hours
  Learning | 49 | 5 | 11,000 | 134.7
  Cultural | 45 | 8 | 15,000 | 166.5
  Research | 58 | 6 | 75,000 | 154.8
  • 67. 8. Is the improvement in metadata record quality comparable with the cost of the quality assurance method?
  Case | Average improvement for elements (<60%) | Elements in AP | Hours | % of average improvement per hour spent
  Learning | +55.7% | 22 | 134.7 | 0.43%
  Cultural | +28.8% | 26 | 166.5 | 0.17%
  Research | +23.7% | 15 | 154.8 | 0.15%
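  The last column divides the average completeness improvement of the initially low-scoring elements (<60%) by the total QA hours spent in each case. Recomputing it from the rounded figures in the table (small deviations from the reported values are possible, since the slide presumably used unrounded inputs):

```python
cases = {  # figures copied from the table above
    "Learning": {"avg_improvement": 55.7, "hours": 134.7},
    "Cultural": {"avg_improvement": 28.8, "hours": 166.5},
    "Research": {"avg_improvement": 23.7, "hours": 154.8},
}

for name, c in cases.items():
    rate = c["avg_improvement"] / c["hours"]
    print(f"{name}: {rate:.2f}% average improvement per QA hour")
```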
  • 68.  Proposed a Metadata Quality Assurance Certification Process  Applied the proposed MQACP to a learning federation  Validated its effectiveness in two new cases  Identified future areas of research
  • 69.  Palavitsinis, N., Manouselis, N. (2009). “A Survey of Knowledge Organization Systems in Environmental Sciences”, in I.N. Athanasiadis, P.A. Mitkas, A.E. Rizzoli & J. Marx-Gómez (eds.), Information Technologies in Environmental Engineering, Proceedings of the 4th International ICSC Symposium, Springer Berlin Heidelberg.  Palavitsinis, N., Kastrantas, K. and Manouselis, N. (2009a). "Interoperable metadata for a federation of learning repositories on organic agriculture and agroecology", in Proc. of the Joint International Agricultural Conference 2009 (JIAC 2009), Wageningen, The Netherlands, July 2009.  Palavitsinis, N., Manouselis, N. and Sanchez, S. (2010). "Preliminary Discussion on a Digital Curation Framework for Learning Repositories", in Massart, D. & Shulman, E. (Eds.), Proc. of Workshop on Search and Exchange of e-le@rning Materials (SE@M’10), Barcelona, Spain, CEUR 681, September 2010.  Palavitsinis, N. and Manouselis, N. (2013). “Agricultural Knowledge Organisation Systems: An Analysis of an Indicative Sample”, in Sicilia, M.-A. (Ed.), Handbook of Metadata, Semantics and Ontologies, World Scientific Publishing Co.
  • 70.  Palavitsinis, N., Ebner, H., Manouselis, N., Sanchez, S. and Naeve, A. (2009b). "Evaluating Metadata Application Profiles Based on Usage Data: The Case of a Metadata Application Profile for Agricultural Learning Resources", in Proc. of the International Conference on Digital Libraries and the Semantic Web (ICSD 2009), Trento, Italy, September 2009.  Palavitsinis, N., Manouselis, N. and Sanchez, S. (2009c). "Evaluation of a Metadata Application Profile for Learning Resources on Organic Agriculture", in Proc. of 3rd International Conference on Metadata and Semantics Research (MTSR09), Milan, Italy, October 2009.  Palavitsinis, N., Manouselis, N. and Sanchez, S. (2011). "Metadata quality in learning repositories: Issues and considerations", in Proc. of the World Conference on Educational Multimedia, Hypermedia & Telecommunications (ED-MEDIA 2011), Lisbon, Portugal.  Palavitsinis, N., Manouselis, N. & Sanchez, S. (in press). Metadata Quality in Learning Object Repositories: A Case Study, The Electronic Library.
  • 71.  Palavitsinis, N., Manouselis, N. & Sanchez, S. (in press). Metadata Quality in Digital Repositories: Empirical Results from the Cross-Domain Transfer of a Quality Assurance Process, Journal of the American Society for Information Science and Technology.
  • 72.  e-Conference (2010, www.e-agriculture.org) ◦ Learning resources creation: What constitutes a quality learning resource? ◦ Providing quality metadata: Is the gain worth the effort? ◦ Populating a repository with resources and metadata: The quality versus quantity dilemma ◦ Managing a portal with thousands of resources and users: Are communities “attracted” to quality, like bees to honey?
  Topic | Posts | Participants | Answers | Views
  Topic 1, Phase I | 23 | 16 | 50 | 957
  Topic 2, Phase I | 18 | 13 | 27 | 568
  Topic 1, Phase II | 19 | 12 | 17 | 531
  Topic 2, Phase II | 12 | 9 | 16 | 393
  TOTAL | 72 | 50 | 110 | 2,449
  • 73.  Joint Technology Enhanced Learning Summer School (2010) ◦ WHY Metadata for MY Resources workshop ◦ http://www.prolearn-academy.org/Events/summer-school-201  Special Track on Metadata & Semantics for Learning Infrastructures (2011, Izmir, Turkey) ◦ http://ieru.org/org/mtsr2011/MSLI/
  • 74.  International Journal of Metadata, Semantics and Ontologies (2013, Vol. 8, No. 1) ◦ Special Issue on Advances in Metadata and Semantics for Learning Infrastructures: Part 1 ◦ Guest Editors: Nikos Palavitsinis, Dr Joris Klerkx and Professor Xavier Ochoa ◦ http://www.inderscience.com/info/inarticletoc.php?jcode=ijms
  • 76.  The organization of Metadata Understanding Sessions showed: ◦ The need for a better explanation of metadata elements ◦ The need for domain experts to be heavily involved in the entire process  In Palavitsinis et al. (2013) we presented a Creative Metadata Understanding Session that has already been deployed and tested
  • 77.  Lasts 2.5 hours instead of the two hours of a typical MUS  Worked well with a focus group of environmental experts  More research is needed
  • 78.  Tsiflidou & Manouselis built upon the work carried out in this Thesis to examine how different metadata quality metrics can be extracted through automated methods  Completeness, element frequency, entropy, vocabulary value distribution, metadata multilinguality  Used to examine a first set of 2,500 records
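  Of the automated metrics listed, entropy is the least self-explanatory: the Shannon entropy of an element's value distribution across records can flag elements that are filled with near-identical values. An illustrative sketch of the idea (not the actual Tsiflidou & Manouselis implementation):

```python
import math
from collections import Counter

def value_entropy(values):
    """Shannon entropy (in bits) of an element's value distribution."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Four content types used equally: maximum diversity for four values
print(round(value_entropy(["text", "image", "video", "audio"]), 2))  # 2.0
# One value dominating: entropy close to zero, a possible quality flag
print(round(value_entropy(["text"] * 99 + ["image"]), 2))            # 0.08
```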
  • 79.  One of the first studies on metadata quality costs  Combined with annotation costs, it can provide a holistic view of the costs associated with populating a repository with resources  Complementary to existing library approaches
  • 80.
  No | Cost Category | Included in MQACP | Explanation
  1 | Metadata Design | Yes | Designing an application profile based on user needs
  2 | Resource Selection | No | Selecting the resources to be populated in the repository
  3 | Metadata Annotation | No | Adding metadata to the resources from scratch
  4 | Metadata Enrichment | No | Enriching/correcting problematic metadata fields
  5 | Peer Review of Metadata | Yes | Checking the quality of metadata through peer-review processes
  6 | Automated Metadata Review | Yes | Checking the quality of metadata through automated means
  7 | Development of Training Material | Yes | Developing guidelines and manuals for the content providers
  8 | Training on Metadata | No | Holding training sessions on metadata annotation / spending time reading supportive material
  • 81.  Examination board  Supervisors ◦ Salvador Sanchez-Alonso; Nikos Manouselis  Information Engineering Research Unit Personnel ◦ Miguel Refusta; Alberto Abian;…  Agro-Know colleagues ◦ Effie Tsiflidou; Charalampos Thanopoulos; Vassilis Protonotarios;…  Family & Friends

Editor's notes

  1. One major problem…
  2. Research questions: 1. Can we set up quality assurance methods for ensuring metadata record quality that will have a positive effect on the resulting quality of the metadata records of a given federation of repositories? 2. Which are the metrics that can be used to effectively assess the metadata record quality? 3. At what levels of the metadata record quality metrics is a repository considered to have a satisfying metadata record quality? 4. Can we introduce a quality assurance process that is transferable through different application contexts and types of repositories? 5. What are the specific adjustments that have to be made to apply a quality assurance process in other contexts? 6. Are the results of the application of the same metadata quality assurance process in different repositories comparable in terms of the resulting metadata quality? 7. Is the cost involved in the application of a metadata quality assurance process comparable in terms of magnitude for different repositories? 8. Is the improvement in metadata record quality comparable with the cost of the quality assurance method?
  3. We have specific phases that cover the lifecycle of a repository. During these phases, specific steps take place, deploying quality assurance methods that are carried out through the use of specific tools and instruments, with the assistance of human actors that deploy them.