Webinar by Niall Beard about the aggregated training platform TeSS, and the schema.org working group Bioschemas. The presentation describes the need for TeSS, many of its features, and the difficulties of aggregating any data online. We then turn to the solution of using schema.org as a lightweight method for structuring data. This talk was for the FAIRDOM webinar series. A live recording of it can be found here
6. Great discovery tool…
• If you know the specific name of the thing you’re looking for
(e.g. ‘EuBIC Winter School’).
• Not so great if you just want to see what proteomics-related
events are available
7. The Long Tail of Training Resources
Number of websites
Volume of materials
Large institutions and repositories
>30 training resources
Significant online presence
Smaller websites
<30 training resources
Often buried in search results
8. Quick TeSS Overview
• Aggregation and registration of training events and
materials
• Tools to filter, search, and discover
• Users can organize into packages and training workflows
• Interlinking with other ELIXIR registries
• ELIXIR Node ‘shop window’ view
• https://tess.elixir-europe.org
9. TeSS Materials index page
Filter By:
Content Provider
Scientific Topic
Tool
Standards
Policies
Target Audience
Keyword
Difficulty Level
Author
Contributor
Licence
ELIXIR Node
Search for text and
order results
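The faceted filtering on this index page can be pictured with a minimal sketch. It is not TeSS's actual implementation: the `Material` class, the facet names and the sample records are hypothetical, and the only assumption is that each material carries a set of values per facet.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Material:
    """Hypothetical record: a title plus a set of values for each facet."""
    title: str
    facets: dict[str, set[str]] = field(default_factory=dict)


def filter_materials(materials, text="", **selected):
    """Keep materials whose title contains `text` and which match every selected facet value."""
    results = []
    for material in materials:
        if text.lower() not in material.title.lower():
            continue
        if all(value in material.facets.get(name, set())
               for name, value in selected.items()):
            results.append(material)
    # Order results by title, as a stand-in for the page's ordering options.
    return sorted(results, key=lambda m: m.title)


# Hypothetical sample data, for illustration only.
materials = [
    Material("Intro to proteomics",
             {"scientific_topic": {"Proteomics"}, "difficulty_level": {"Beginner"}}),
    Material("Advanced RNA-seq",
             {"scientific_topic": {"Transcriptomics"}, "difficulty_level": {"Advanced"}}),
]

for match in filter_materials(materials, scientific_topic="Proteomics"):
    print(match.title)
```

Each extra facet narrows the result set, which is what makes browsing possible without knowing a resource's exact name.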
10. Link to other resources
(from ELIXIR registries
and other)
13. Visual Workflows
• Developing workflows to represent typical
data analyses.
• Attaching tools, training, and other
resources to each stage.
14. Registry integration
• Associate TeSS resources
with bio.tools and
Biosharing.org resources.
• Search for all training
materials about a specific
tool, standard operating
procedure, database etc.
• Tool-centric search
15. TeSS summary
•Aggregated training portal with functions:
• Search and Filter
• Training workflows
• Subscription services
• iAnn events widgets
• Integration with bio.tools and biosharing.org
• ELIXIR node views
•Upcoming:
• More curation tools
• Collaboration with BD2K's training portal
• More integrations with other information services
18. HTML scraper
• Difficult to write.
• - Every site is unique
• - Some have more treacherous HTML
• Very susceptible to change
• - Fixing is re-implementing.
• - Build up of technical debt
<p>Mon, 27 Feb 2017, 12:00 –</p>
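To make the fragility concrete, here is a minimal sketch of a site-specific scraper for the date fragment above. The pattern and the function name `scrape_start` are illustrative assumptions, not code from TeSS, and the date parsing assumes an English locale.

```python
import re
from datetime import datetime

HTML = "<p>Mon, 27 Feb 2017, 12:00 –</p>"

# Site-specific pattern: it depends on the tag, the field order and the date
# format, so any redesign of the provider's page silently breaks it.
START_PATTERN = re.compile(r"<p>(\w{3}, \d{1,2} \w{3} \d{4}, \d{2}:\d{2})")


def scrape_start(html):
    """Return the event start as a datetime, or None if the page layout is not recognised."""
    match = START_PATTERN.search(html)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%a, %d %b %Y, %H:%M")


print(scrape_start(HTML))
# The same information in a different layout is no longer found.
print(scrape_start('<div class="date">27/02/2017 12:00</div>'))
```

Every provider needs its own pattern like this one, and every layout change means writing it again.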
31. What is Bioschemas
• Developing schema.org
specifications to work for Life
sciences
• Proposing amendments and
new schemas to be able to
describe Life science
resources.
• Events and CreativeWork
(materials). Also for tools,
data, data repositories
Image: http://bioschemas.org/
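As a rough illustration of schema.org as a lightweight way of structuring data, the sketch below embeds an event as JSON-LD in a page and reads it back with the Python standard library. The event values, the example.org URL and the helper names (`JsonLdExtractor`, `extract_events`) are hypothetical. Only plain schema.org `Event` properties are used (`name`, `startDate`, `url`, `location`), not the Bioschemas-specific amendments.

```python
import json
from html.parser import HTMLParser

# Hypothetical event described with core schema.org vocabulary.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example proteomics workshop",
    "startDate": "2017-02-27T12:00",
    "url": "https://example.org/events/proteomics-workshop",
    "location": {"@type": "Place", "name": "Example University"},
}

# What a provider would embed in their page.
page = (
    '<html><head><script type="application/ld+json">'
    + json.dumps(event)
    + "</script></head><body></body></html>"
)


class JsonLdExtractor(HTMLParser):
    """Collect the contents of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_events(html):
    """Return all JSON-LD objects of type Event found in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [item for item in parser.items
            if isinstance(item, dict) and item.get("@type") == "Event"]


print(extract_events(page)[0]["startDate"])
```

The same extractor works for any site that publishes this markup, regardless of how the visible page is laid out.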
34. Special thanks to TeSS and Bioschemas’
collaborators, our guinea pigs, and community
TeSS Team
Finn Bacall
Milo Thurston
Aleksandra Nenadic
Susanna-Assunta Sansone
Teresa Attwood
Carole Goble
+ Many more
Events/Training materials Bioschemas Team
Rafael Jiminez
Martin Cook
Premysl Veselyk
Aleksandra Nenadic
Gabriella Rustici
Dominique Batista
+ Many more
https://tess.elixir-europe.org and http://bioschemas.org
Editor's notes
ELIXIR is a pan-European infrastructure for life science information, comprising 22 national nodes, plus a node for the EBI, and a special Hub node that aims to help connect and provide support for nodes. Like a hub-and-spoke system.
Each node has many institutions associated with it; there are over 100 altogether, which yields a complicated distributed architecture. Resources are being produced in each institution, or across a selection of institutions, in huge quantities. Tools, training, compute infrastructure, datasets etc. all exist on heterogeneous collections of websites: some in global repositories, some in country-wide repositories, some on institutional sites, some on personal ones. There’s humongous diversity in the landscape of these resources.
ELIXIR hopes to aggregate distributed resources and present them uniformly in order to improve their findability. The oft-cited FAIR Principles state that, under good data stewardship, data should be Findable, Accessible, Interoperable and Reusable. As such, ELIXIR attempts to improve findability through discovery portals and, as you’ll see, certainly keeps the AIR in mind.
New providers and developments of old ones - each a unique challenge
Proud to have collected so much: 500 materials, 120 upcoming events and 4 thousand archived ones. Looking forward to collecting more.
Distributed Research Infrastructure. Developing training opportunities across 20+ member nodes
Collect and harmonize. Feature-rich environment. Connected outwardly to complementary projects. Promote ELIXIR nodes and keep access to data open to be utilized
Once we have the data in TeSS we can implement novel and useful features, for example a subscription service. There are plans to extend this: input Twitter, email and other social media handles and receive notifications as and when events that meet your criteria appear.
Navigational tools. Give context to training materials in real scientific processes.
Agile development, so not complete, but we have a usable iteration.
Hosting workshop in UK before March next year. Create content and review mechanism. Details to be announced.
Implementation study for training workflows. Prioritised by training platform, waiting on tools platform. Led by Frederick Coppens from Belgium, a very capable guy.
Important to connect with other ELIXIR data.
Integration with Biotools and Biosharing can improve the usefulness of content.
Collection of schemas can be used to describe online objects
Plugin to the Bioschemas spec development activities
Driving events and training specifications. Prime example for other specifications. Slower going paving the way, but lots of good lessons for future implementations.
Early on we would write HTML scrapers if necessary. Now we are drawing a strong line under that. In last-resort situations it will be put on the issue tracker with low priority. Development resources are too expensive and maintenance is practically - repeat.