Selected Talk by Allan Hanbury, at the European Data Forum 2013, 10 April 2013 in Dublin, Ireland: Algorithm any good? A Cloud-based Infrastructure for Evaluation on Big Data
1. Algorithm any good?
A Cloud-based
Infrastructure for
Evaluation on Big Data
Allan Hanbury
Vienna University of Technology
The research leading to these results has received funding from the European Union Seventh
Framework Programme (FP7/2007-2013) under grant agreement n° 318068 (VISCERAL).
2. Evaluation
Evaluation campaigns / Challenges /
Benchmarks / Competitions / ...
Makes economic sense
“for every $1 that NIST and its partners invested in
TREC, at least $3.35 to $5.07 in benefits accrued
to IR researchers.”
Has scientific impact
3. Evaluation Campaigns
[Diagram: the organiser provides tasks, data and ground truth to the participants.
Image: Kyle Mcdonald, http://www.flickr.com/photos/kylemcdonald/6187343093/]
4. Evaluation Campaigns
[Same diagram: tasks, data and ground truth flow from the organiser to the participants.]
5. With Big Data?
[Same diagram, now with big data: shipping the data and ground truth from the organiser to the participants becomes the bottleneck.]
6. Benchmarking Algorithms on Big Data
Distributing terabytes is hard
Sending hard disks or downloading is not feasible
Bringing algorithms to the data is necessary
Motivating participants
Tasks of general interest with few infrastructure
barriers (how to store or process terabytes ...)
Allow infrastructure to be shared
Manual ground truthing does not scale. Use:
Semi-automation (e.g. silver corpus)
Coercion (e.g. crowdsourcing)
…
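A common form of semi-automation is the "silver corpus": fuse the outputs of several algorithms and treat the consensus as (imperfect) ground truth, reserving human annotators for the disputed cases. A minimal sketch using majority voting as the fusion rule (the function names are illustrative, not part of any VISCERAL tool):

```python
from collections import Counter

def silver_label(votes):
    """Fuse one item's labels from several algorithms by majority vote."""
    label, count = Counter(votes).most_common(1)[0]
    # Accept only a strict majority; ties and disagreements are left as
    # None and would be sent to a human annotator instead.
    return label if count > len(votes) / 2 else None

def build_silver_corpus(predictions):
    """predictions: one label sequence per algorithm, all the same length."""
    return [silver_label(votes) for votes in zip(*predictions)]
```

For example, `build_silver_corpus([[1, 1, 0], [1, 0, 0], [1, 1, 1]])` yields `[1, 1, 0]`: each position takes the label that a majority of the three algorithms agree on.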
7. Evaluation on the Cloud
(http://visceral.eu)
Bring the algorithms to the data, not the data
to the algorithms
Put the data on the cloud
Participants develop their algorithms in computing
instances on the cloud
First benchmark on structure recognition in
medical images
8. Training Phase
[Diagram: on the cloud, the participant instances have access to the training data only; a registration system and an analysis system connect participants and organiser. The test data is also held on the cloud but is not accessible to participants.]
9. Evaluation Phase
[Diagram: the same cloud setup, but the participant instances are now run against the test data under the organiser's control.]
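The two phases above can be sketched as follows. All class and method names here are hypothetical stand-ins, not the actual VISCERAL API; the point is the key property of the protocol: participants only ever see the training data, while the organiser runs the same instances against the test data.

```python
# Illustrative sketch of the training/evaluation protocol; the names are
# invented for this example, not taken from VISCERAL.

class Instance:
    """Toy stand-in for a participant's cloud computing instance."""

    def __init__(self, algorithm):
        self.algorithm = algorithm          # callable: data -> output
        self.data = None
        self.participant_access = True      # participant can still log in

    def mount(self, data):
        self.data = data

    def revoke_participant_access(self):
        self.participant_access = False

    def run(self):
        return self.algorithm(self.data)


class CloudBenchmark:
    """Organiser-side view: the data stays on the cloud, algorithms come to it."""

    def __init__(self, training_data, test_data):
        self.training_data = training_data
        self.test_data = test_data
        self.instances = {}                 # participant -> Instance

    def register(self, participant, instance):
        # Training phase: the instance sees only the training data.
        instance.mount(self.training_data)
        self.instances[participant] = instance

    def evaluate(self, metric):
        # Evaluation phase: the organiser takes over each instance,
        # switches it to the unseen test data and scores its output.
        results = {}
        for participant, instance in self.instances.items():
            instance.revoke_participant_access()
            instance.mount(self.test_data)
            results[participant] = metric(instance.run(), self.test_data)
        return results
```

The design choice this illustrates: because the test data is mounted only after participant access is revoked, nobody can tune on the test set, and the terabytes never leave the cloud.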
10. Annotators (Radiologists)
[Diagram: annotators (radiologists) create the ground truth using locally installed annotation clients, coordinated by an annotation management system connected to the data on the cloud; participant instances, registration system and analysis system are as in the previous diagrams.]
11. Future Development
Dealing with private data
Does it make sense to evaluate on data that the
participant cannot see?
Does it make sense to evaluate only on extracted
features?
Moving toward eScience
Data identifiers
Algorithm identifiers?
Continuous evaluation
Modular construction of the algorithms
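Continuous evaluation built on data and algorithm identifiers could look like the following sketch (all names invented here): every score is keyed by which algorithm ran on which dataset, so new algorithms can be ranked against previously evaluated ones at any time.

```python
# Hypothetical sketch: continuous evaluation keyed by persistent data and
# algorithm identifiers, so every score is reproducibly tied to exactly
# what produced it.

class Leaderboard:
    def __init__(self):
        self.results = {}   # (algorithm_id, dataset_id) -> score

    def submit(self, algorithm_id, dataset_id, score):
        # New algorithms (or new dataset versions) can be scored at any
        # time; the identifiers make each result citable and comparable.
        self.results[(algorithm_id, dataset_id)] = score

    def ranking(self, dataset_id):
        # Rank all algorithms ever evaluated on this dataset, best first.
        rows = [(alg, score) for (alg, ds), score in self.results.items()
                if ds == dataset_id]
        return sorted(rows, key=lambda row: row[1], reverse=True)
```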
12. Challenges
Sharing components
Who should provide the cloud service?
Who pays for using it?
Transferring components to industry