This paper describes coTag, a code-tagging plug-in for annotating code snippets in the Eclipse integrated development environment. coTag offers an easy-to-use interface for tagging and searching. Because it uses the similarity-based search engine of the open-source tool myCBR, the user can search not only for exactly matching tags, as other code-tagging extensions offer, but also for similar tags and thus for similar code snippets. coTag also lets users add new similarity links between tags and change existing ones in context, supported by myCBR’s explanation component.
Code-tagging and similarity-based retrieval with myCBR
1. CAMBRIDGE, UK, 10 DEC 2008
Code-tagging and similarity-based retrieval with myCBR
Thomas Roth-Berghofer & Daniel Bahls
Senior researcher, trb@dfki.de
German Research Centre for Artificial Intelligence DFKI GmbH
Saturday, 18 July 2009
4. Programmer's dilemma
• Where is the code fragment I used to solve a similar problem in the past?
• Is this piece of code still available?
• Is it worth the effort to search for it?
• If so, what would be the right search term?
10. Personalised approach
• Personal vocabulary: tags
• Linking tags
• Case-based retrieval
• Work context
• Social dimension: tag exchange
11. CBR cycle
Agnar Aamodt and Enric Plaza. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1):39–59, 1994.
12. CBR cycle
[Figure: CBR cycle diagram annotated with the labels "myCBR" and "CBR"]
15. Case structure
Attribute       Value type          Category
Tags            String (multiple)   Problem description
Context items   String (multiple)   Problem description
Code snippet    String              Solution
Document type   String              Provenance
Project name    String              Provenance
File path       String              Provenance
Author ID       String              Provenance
Creation date   Long                Provenance
Rating          Float               Maintenance
Rating count    Integer             Maintenance
16. Case structure
Legend: Set by user; Set by coTag
Attribute       Value type          Category
Tags            String (multiple)   Problem description
Context items   String (multiple)   Problem description
Code snippet    String              Solution
Document type   String              Provenance
Project name    String              Provenance
File path       String              Provenance
Author ID       String              Provenance
Creation date   Long                Provenance
Rating          Float               Maintenance
Rating count    Integer             Maintenance
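The case structure above could be modelled roughly as follows. This is only a sketch: the class name `SnippetCase`, the field names, the default values, and the reading of the Long creation date as a millisecond timestamp are assumptions for illustration, not coTag's actual data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnippetCase:
    """One case: a tagged code snippet (field names are illustrative)."""

    # Problem description
    tags: List[str]
    context_items: List[str]
    # Solution
    code_snippet: str
    # Provenance
    document_type: str
    project_name: str
    file_path: str
    author_id: str
    creation_date: int  # "Long" on the slide; milliseconds is an assumption
    # Maintenance
    rating: float = 0.0
    rating_count: int = 0


# Hypothetical example values, not taken from the slides.
example = SnippetCase(
    tags=["init", "Logger"],
    context_items=["Main.java"],
    code_snippet="Logger log = Logger.getLogger(\"demo\");",
    document_type="java",
    project_name="demo-project",
    file_path="src/Main.java",
    author_id="user-1",
    creation_date=1247875200000,
)
print(example.tags, example.rating_count)
```

Grouping the fields by category in the same order as the table keeps the split between problem description, solution, provenance, and maintenance visible in the code.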
22. Situations in which explanations play a role
• Instructing explanations:
  • Novice users want to know how tagging and (similarity-based) retrieval work.
• Convincing explanations:
  • Regular users want to check the results when retrieval does not meet their expectations.
• Improving explanations:
  • Regular users want to correct coTag’s behaviour.
23. Explanation of matching
• Search terms: init, logging, config
• Case tags: init, Logger
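One way to picture such an explanation is to pair every search term with its most similar case tag and report the score. The sketch below assumes this pairing and an unweighted average as the overall score; the function names and the similarity value for logging/Logger are hypothetical and do not describe how myCBR actually aggregates tag similarities.

```python
# Hypothetical similarity link between two tags (value chosen for illustration).
HYPOTHETICAL_LINKS = {("logging", "logger"): 0.75}


def table_similarity(a, b):
    """Look up a symmetric, case-insensitive similarity; identical tags score 1.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return HYPOTHETICAL_LINKS.get((a, b), HYPOTHETICAL_LINKS.get((b, a), 0.0))


def explain_match(search_terms, case_tags, similarity):
    """Return (term, best matching tag, score) rows and their mean score."""
    rows = []
    for term in search_terms:
        best_tag = max(case_tags, key=lambda tag: similarity(term, tag))
        rows.append((term, best_tag, similarity(term, best_tag)))
    overall = sum(score for _, _, score in rows) / len(rows)
    return rows, overall


rows, overall = explain_match(
    ["init", "logging", "config"], ["init", "Logger"], table_similarity
)
for term, tag, score in rows:
    print(f"{term!r} best matches {tag!r} with similarity {score:.2f}")
print(f"overall similarity: {overall:.2f}")
```

Listing the best match per search term shows the user which term pulled the overall score down, here the term config, which has no similar case tag.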
24. Graphical explanation of trigram matching
• Syntactical similarity
• Typos
• Stemming
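Trigram matching can be sketched as follows, assuming character trigrams over lower-cased tags that are compared with the Jaccard coefficient. The function names are illustrative, and myCBR's own trigram measure may pad or normalise the strings differently.

```python
def trigrams(text):
    """Return the set of character trigrams of the lower-cased text."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        # Strings shorter than three characters have no trigrams:
        # fall back to a case-insensitive equality check.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(ta & tb) / len(ta | tb)


for left, right in [("logging", "Logger"), ("logging", "loging"), ("init", "config")]:
    print(left, right, round(trigram_similarity(left, right), 3))
```

Because shared character sequences raise the score, a typo such as "loging" or a related word form such as "Logger" still matches "logging" to some degree, while unrelated tags such as "init" and "config" share no trigram.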
25. Similarity customisation
• Tag similarities:
  unsimilar 0%
  partly similar 25%
  similar 50%
  very similar 75%
  identical 100%
• Updates personal and community similarity measure
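The five similarity levels map naturally onto numeric values. The sketch below assumes that a similarity link is symmetric and case-insensitive; the class `TagSimilarityTable` and its methods are hypothetical names. It models a single table only, so the step of propagating a change to both the personal and the community measure is not shown.

```python
# Labels and percentages as listed on the slide, expressed as fractions.
LEVELS = {
    "unsimilar": 0.0,
    "partly similar": 0.25,
    "similar": 0.5,
    "very similar": 0.75,
    "identical": 1.0,
}


class TagSimilarityTable:
    """Stores user-defined similarity links between pairs of tags."""

    def __init__(self):
        self._links = {}

    @staticmethod
    def _key(a, b):
        # An unordered, lower-cased pair makes the link symmetric (assumption).
        return frozenset((a.lower(), b.lower()))

    def set_level(self, a, b, level):
        """Add a new link or overwrite an existing one."""
        self._links[self._key(a, b)] = LEVELS[level]

    def get(self, a, b):
        """Return the stored similarity, or None if no link exists."""
        return self._links.get(self._key(a, b))


personal = TagSimilarityTable()
personal.set_level("logging", "Logger", "very similar")
print(personal.get("logger", "LOGGING"))
```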
27. Three levels of similarity calculation
• Personal
• Imported
• Trigram
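A plausible reading of the three levels is a fallback chain: a personal similarity link wins, an imported one is used next, and trigram matching covers every remaining pair. The precedence order, the dictionary layout, and all names in this sketch are assumptions; the trigram part uses a simple Jaccard overlap that may differ from myCBR's measure.

```python
def trigram_similarity(a, b):
    """Jaccard overlap of lower-cased character trigrams (illustrative measure)."""
    def grams(text):
        s = text.lower()
        return {s[i:i + 3] for i in range(len(s) - 2)}

    ta, tb = grams(a), grams(b)
    if not ta or not tb:
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(ta & tb) / len(ta | tb)


def tag_similarity(a, b, personal, imported):
    """Resolve similarity by level: personal, then imported, then trigram."""
    key = frozenset((a.lower(), b.lower()))
    for table in (personal, imported):
        if key in table:
            return table[key]
    return trigram_similarity(a, b)


# Hypothetical similarity links for illustration.
personal = {frozenset(("logging", "logger")): 0.75}
imported = {
    frozenset(("logging", "logger")): 0.5,
    frozenset(("init", "setup")): 0.5,
}

print(tag_similarity("Logging", "Logger", personal, imported))
print(tag_similarity("init", "setup", personal, imported))
print(tag_similarity("config", "configuration", personal, imported))
```

With this ordering, a user's own judgement overrides imported links, and the syntactic measure only applies where nobody has defined a link.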
43. Take-home messages
• Re-finding information is quite a typical task in knowledge work.
• Tagging is a helpful and well-known technique.
• Similarity-based retrieval can improve searches.
• Explanation-aware development of applications helps you deal with the increased complexity of similarity-based retrieval.
44. Thank you!