«Tag-based Semantic
Website Recommendation
for Turkish Language»

Onur Yılmaz
mail@onuryilmaz.me
Outline


Introduction



Related Work



Problem Definition and Algorithm



Experimental Evaluation



Conclusion



Future Work



Demo
Introduction - Definitions


Tags




Non-hierarchical keyword or term

Reasons: categorizing, memorizing, archiving and sharing…
Introduction - Motivation


Dramatic increase in the number of websites on the internet

7.14
billion
pages

Difficulty in
finding and
exploring
new websites

Social
bookmarking

Recommendation
systems
Introduction – Turkish Effect


Recommendation systems search within user inputs


Users tend to use their own language on the internet



Turkey ranks 32nd among countries in English proficiency



Turkish and English are very different languages!
Introduction – What is proposed?


Tag-based recommendation system


For the Turkish language



Which is based on similarity, tag weight, tag
popularity;



Where semantic properties of tags are taken into
account
Related Work


Collaborative filtering





Widely accepted
No context!

Topic and pattern extraction


Usage of WordNet


A lexical database for the English language



Two papers were found on a Turkish WordNet, but no source is available
Related Work


Similarity calculation methods


Durao & Dolog (2009): reference paper

Tag popularity, tag representativeness and tag-user affinity

Without any semantics analysis, a 60% acceptance level was achieved
Problem Definition


Take inputs

Websites
and tags

Recommendation
System
Problem Definition


Provide personal recommendations

Websites
and tags

Recommendation
System
Problem Definition


Aim -> User satisfaction



Recommend websites that the user

wants to use in the future,

is already using and finds interesting
Problem Definition



Challenge -> Different tagging purposes and
expectations
Website          Tag                                   Potential Purpose
zaytung.com      zaytung                               Archiving
eksisozluk.com   alışkanlık (ENG: habit)               Internet usage habit
evekitap.com     ücretsiz kargo (ENG: free shipping)   Categorizing
9gag.com         eğlenceli (ENG: funny)                Definition
Data are taken from the experiment
Algorithm


Steps of the algorithm

Spell-check -> Stemming -> Semantics Analysis -> Similarity Calculation
Algorithm – Spell-Checking


Spell check on the tags


Add a single letter,



Delete a single letter,



Replace one letter and



Transpose two letters

Each candidate tag is then checked for occurrence in the Turkish National Corpus.
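
The four edit operations above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiment: the names `single_edit_candidates` and `correct_tag` are hypothetical, the `vocabulary` set stands in for a lookup against the Turkish National Corpus, and choosing the alphabetically first match is a placeholder because the slides do not say how ties between candidates are resolved.

```python
# Lowercase Turkish alphabet (29 letters), used for insertions and replacements.
TURKISH_LETTERS = "abcçdefgğhıijklmnoöprsştuüvyz"


def single_edit_candidates(tag):
    """Return every string that is exactly one edit operation away from `tag`."""
    splits = [(tag[:i], tag[i:]) for i in range(len(tag) + 1)]
    inserts = {left + ch + right for left, right in splits for ch in TURKISH_LETTERS}
    deletes = {left + right[1:] for left, right in splits if right}
    replaces = {left + ch + right[1:]
                for left, right in splits if right
                for ch in TURKISH_LETTERS}
    transposes = {left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1}
    return inserts | deletes | replaces | transposes


def correct_tag(tag, vocabulary):
    """Return `tag` if it is known, else a known candidate one edit away, else `tag`."""
    if tag in vocabulary:
        return tag
    known = sorted(single_edit_candidates(tag) & vocabulary)
    # Assumption: take the alphabetically first match; a corpus frequency
    # ranking would be a more natural choice if frequencies are available.
    return known[0] if known else tag


# Tiny stand-in vocabulary (hypothetical); the slides use the Turkish National Corpus.
vocabulary = {"haber", "gazete", "futbol"}
print(correct_tag("habr", vocabulary))
print(correct_tag("fubtol", vocabulary))
```

Tags that are already in the vocabulary, or that have no known neighbour within one edit, are returned unchanged.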
Algorithm – Spell-Checking


Correction on URLs
Original URL                                         Corrected URL
https://www.deviantart.com/                          deviantart.com
http://www.sahadan.com/Default.aspx                  sahadan.com
http://www.yemeksepeti.com/AnonymouseDefault.aspx    yemeksepeti.com
Data are taken from the experiment
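
One way to obtain the corrected URLs in the table is to keep only the host name and drop a leading "www.". The sketch below assumes exactly that rule; the function name `normalize_url` is hypothetical, and the actual rule may differ (one demo input, 1907unifeb.org/forums, keeps its path).

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to its host name without scheme, 'www.' prefix, or path."""
    # urlsplit only fills `netloc` when the string contains '//'.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host


for original in (
    "https://www.deviantart.com/",
    "http://www.sahadan.com/Default.aspx",
    "http://www.yemeksepeti.com/AnonymouseDefault.aspx",
):
    print(original, "->", normalize_url(original))
```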
Algorithm – Stemming


Stems of the tags are extracted by removing
suffixes.
Website          Original Tag                     Corrected Tag
facebook.com     arkadaşlık (ENG: friendship)     arkadaş (ENG: friend)
metu.edu.tr      mühendislik (ENG: engineering)   mühendis (ENG: engineer)
deviantart.com   eğlenceli (ENG: funny)           eğlence (ENG: fun)
Data are taken from the experiment
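
The three examples in the table can be reproduced with a naive suffix stripper. The slides do not name the stemmer that was used, so the suffix list and the name `strip_suffix` below are assumptions for illustration only; proper Turkish stemming needs morphological analysis, and this sketch will over-strip some words.

```python
# Hypothetical suffix list covering only the vowel-harmony variants of
# -lIk and -lI, which is enough for the three examples on the slide.
SUFFIXES = ("lık", "lik", "luk", "lük", "lı", "li", "lu", "lü")


def strip_suffix(tag, min_stem_length=3):
    """Remove one known suffix from `tag`, trying longer suffixes first."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem = tag[: -len(suffix)]
        # Keep the original tag if stripping would leave a very short stem.
        if tag.endswith(suffix) and len(stem) >= min_stem_length:
            return stem
    return tag


for tag in ("arkadaşlık", "mühendislik", "eğlenceli", "haber"):
    print(tag, "->", strip_suffix(tag))
```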
Algorithm – Semantics Analysis


An open source «Turkish Thesaurus» project


125,022 <Word, Synonym> pairs
Algorithm – Semantics Analysis


Algorithm applied:
for each "tag" in ALL-DATA do:
    for each "synonym" of "tag" in SYNONYM-LIST do:
        if "synonym" occurs in ALL-DATA then:
            add <user, site, "synonym"> to ALL-DATA
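
A runnable sketch of the loop above, under two assumptions that the slides do not state explicitly: ALL-DATA is modelled as a set of (user, site, tag) triples, and each <Word, Synonym> pair is treated as symmetric. The function names are hypothetical.

```python
from collections import defaultdict


def build_synonym_map(pairs):
    """Turn <word, synonym> pairs into a lookup table from word to synonyms."""
    synonym_map = defaultdict(set)
    for word, synonym in pairs:
        synonym_map[word].add(synonym)
        synonym_map[synonym].add(word)  # assumption: synonymy is symmetric
    return synonym_map


def expand_with_synonyms(all_data, synonym_map):
    """Add <user, site, synonym> for every synonym that already occurs as a tag."""
    known_tags = {tag for _, _, tag in all_data}
    added = set()
    for user, site, tag in all_data:
        for synonym in synonym_map.get(tag, ()):
            # Only synonyms that some user has already used are added.
            if synonym in known_tags:
                added.add((user, site, synonym))
    return set(all_data) | added


all_data = {
    ("User1", "milliyet.com.tr", "haber"),
    ("User2", "sabah.com.tr", "gazete"),
}
synonym_map = build_synonym_map([("haber", "gazete")])
for row in sorted(expand_with_synonyms(all_data, synonym_map)):
    print(row)
```

Restricting additions to synonyms that already occur in the data keeps the tag vocabulary from growing beyond what users actually typed.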
Algorithm – Semantics Analysis
Original data (ALL-DATA)
User    Website           Tag
User1   milliyet.com.tr   haber (ENG: news)
User2   sabah.com.tr      gazete (ENG: newspaper)

Synonym list (SYNONYM-LIST)
Word                Synonym
haber (ENG: news)   gazete (ENG: newspaper)

Data added to ALL-DATA
User    Website           Tag
User1   milliyet.com.tr   gazete (ENG: newspaper)
User2   sabah.com.tr      haber (ENG: news)
Data are taken from the experiment
Algorithm – Semantics Analysis



The result is an environment where all users provide tags together with
their potential meanings, which other people may
have already used.
Algorithm – Similarity Calculation

Website Rating = sum over all tags of (Tag Popularity x Tag Representativeness)

Tag Popularity: how often is this tag used?

Tag Representativeness: how much can a tag represent a document? The more a tag is used for a document, the more representative it is.
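
The rating can be sketched as below. The slides give only the verbal definitions, so the normalisations are assumptions: popularity is taken as a tag's share of all tag assignments, and representativeness as the share of a website's assignments that use the tag. The name `website_ratings` is hypothetical.

```python
from collections import Counter, defaultdict


def website_ratings(all_data):
    """Rate each site as the sum over its tags of popularity x representativeness."""
    records = list(all_data)  # (user, site, tag) triples
    total_assignments = len(records)
    tag_counts = Counter(tag for _, _, tag in records)
    site_tag_counts = Counter((site, tag) for _, site, tag in records)
    site_totals = Counter(site for _, site, _ in records)

    ratings = defaultdict(float)
    for (site, tag), count in site_tag_counts.items():
        # Assumption: popularity = share of all assignments that use this tag.
        popularity = tag_counts[tag] / total_assignments
        # Assumption: representativeness = share of this site's assignments
        # that use this tag.
        representativeness = count / site_totals[site]
        ratings[site] += popularity * representativeness
    return dict(ratings)


sample = [
    ("User1", "milliyet.com.tr", "haber"),
    ("User2", "milliyet.com.tr", "haber"),
    ("User3", "sahadan.com", "futbol"),
]
print(website_ratings(sample))
```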
Algorithm – Similarity Calculation

Similarity(a, b) = (Document Score(a) + Document Score(b)) x Cosine Similarity(a, b)

Tags are treated as vectors; cosine similarity is computed between the vectors.
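
A minimal sketch of this step, assuming each website is represented by a vector of tag counts and that the two document scores are summed before being multiplied by the cosine similarity. The function names are hypothetical, and the scores passed to `similarity` are plain numbers supplied by the caller.

```python
import math
from collections import Counter


def cosine_similarity(tags_a, tags_b):
    """Cosine of the angle between two tag-count vectors."""
    vec_a, vec_b = Counter(tags_a), Counter(tags_b)
    dot = sum(vec_a[tag] * vec_b[tag] for tag in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an untagged site is similar to nothing
    return dot / (norm_a * norm_b)


def similarity(score_a, score_b, tags_a, tags_b):
    """Assumed grouping: (score(a) + score(b)) x cosine similarity(a, b)."""
    return (score_a + score_b) * cosine_similarity(tags_a, tags_b)


# Tags of two sites from the demo data.
radikal = ["haber", "gündem", "güncel"]
mynet = ["portal", "genel", "haber"]
print(cosine_similarity(radikal, mynet))
```

The two demo sites share one of three tags each, so their cosine similarity is 1/3; sites with no tag in common get 0 regardless of their scores.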
Experimental Evaluation
Call for participation -> Gather websites and tags -> Find recommendations -> Ask for evaluation
Experimental Evaluation
Call for participation -> Gather websites and tags -> Find recommendations -> Ask for evaluation

www.eksiduyuru.com
Experimental Evaluation
Call for participation -> Gather websites and tags -> Find recommendations -> Ask for evaluation

25 users, 122 websites, 366 tags

bit.ly/oneri-sistemi
Experimental Evaluation
Call for participation -> Gather websites and tags -> Find recommendations -> Ask for evaluation

20 of 25 users

bit.ly/oneri-sistemi-degerlendirme
Experimental Evaluation
Expected Results

Recommendation acceptance
50%: Not acceptable
80%: Excellent (not expected)
Experimental Evaluation
Results
For top 5 recommendations

[Pie chart: accepted vs. rejected recommendations, with slices of 28% and 72%]

[Bar chart: accepted recommendations by each user (5 recommendations); x-axis: user (1-20), y-axis: accepted recommendations (0-5)]
Experimental Evaluation
Results
For top 3 recommendations

[Pie chart: accepted vs. rejected recommendations, with slices of 22% and 78%]

[Bar chart: accepted recommendations by each user (3 recommendations); x-axis: user (1-20), y-axis: accepted recommendations (0-3)]
Conclusion


What is presented?


Turkish-language tag-based recommendation system

Based on similarity, tag weight, tag popularity

Semantic properties of tags are taken into account
Conclusion


Main contribution


Combining

Well-known similarity measures and calculations

Turkish semantics analysis
Conclusion


Evaluation


An experiment with 25 people



Participants provide websites and tags



Then evaluate recommendations
Future Work


Pre-processing Stage


English inputs
Site: yandex.com
Tags: harita, e-mail, arama

Turkish inputs with English letters
Site: eksiduyuru.com
Tags: duyuru, alinik, satilik

Translation or control over them
Future Work


Semantic Analysis


Small synonym list

125,022 <word, synonym> pairs

Larger and more comprehensive thesaurus
Demo
2 users from experiment
Demo
XXX@candasdemir.com
Website                  Tags
http://candasdemir.com   pazarlama, kişisel, blog
http://radikal.com.tr    haber, gündem, güncel
http://mynet.com         portal, genel, haber
http://sahibinden.com    alışveriş, market, sahibinden
http://markafoni.com     moda, e-ticaret, alışveriş
Demo
XXX@candasdemir.com
Website                  Tags
http://candasdemir.com   pazarlama, kişisel, blog
http://radikal.com.tr    haber, gündem, güncel
http://mynet.com         portal, genel, haber
http://sahibinden.com    alışveriş, market, sahibinden
http://markafoni.com     moda, e-ticaret, alışveriş

Added after semantics analysis
Website          Tag
mynet.com        bilgi
mynet.com        gazete
radikal.com.tr   bilgi
radikal.com.tr   gazete
Demo
XXX@candasdemir.com
User Inputs
Website
candasdemir.com
radikal.com.tr
mynet.com
sahibinden.com
markafoni.com

Recommended Websites
Website              User Satisfaction
eksisozluk.com       Accepted
zaytung.com          Accepted
sabah.com.tr         Accepted
ntvmsnbc.com         Accepted
golfdunyasi.com.tr   Not Accepted
Demo
demirkolXXX@hotmail.com
Website                               Tags
http://www.sahadan.com/Default.aspx   eğlence, merak, futbol
http://www.erepublik.com/             iletişim, strateji, oyun
http://www.1907unifeb.org/forums      fenerbahçe, sohbet, eğlence
http://ligtv.com.tr/                  maç özetleri, haber, futbol
https://www.tuttur.com/               para, futbol, eğlence
Demo
demirkolXXX@hotmail.com
Website                               Tags
http://www.sahadan.com/Default.aspx   eğlence, merak, futbol
http://www.erepublik.com/             iletişim, strateji, oyun
http://www.1907unifeb.org/forums      fenerbahçe, sohbet, eğlence
http://ligtv.com.tr/                  maç özetleri, haber, futbol
https://www.tuttur.com/               para, futbol, eğlence

Added after semantics analysis
Website        Tag
ligtv.com.tr   bilgi
ligtv.com.tr   gazete
Demo
demirkolXXX@hotmail.com
User Inputs
Website
sahadan.com
erepublik.com
1907unifeb.org/forums
ligtv.com.tr
tuttur.com

Recommended Websites
Website              User Satisfaction
mackolik.com         Accepted
zaytung.com          Accepted
9gag.com             Accepted
dizi-mag.com         Accepted
galatasaray.com.tr   Not Accepted
References


Adrian, B., Sauermann, L., & Roth-Berghofer, T. (2007). ConTag: A
Semantic Tag Recommendation System. Proceedings of I-Semantics '07



Aksan, Y. et al. (2012). Construction of the Turkish National Corpus (TNC).
In Proceedings of the Eighth International Conference on Language
Resources and Evaluation (LREC 2012). İstanbul. Turkiye.
http://www.lrec-conf.org/proceedings/lrec2012/papers.html



Brill, E., & Moore, R. C. (2000). An Improved Error Model for Noisy
Channel Spelling Correction. (Microsoft Research)



Cattuto, C., Benz, D., Hotho, A., & Stumme, G. (2008). Semantic
Grounding of Tag Relatedness in Social Bookmarking Systems. In The
Semantic Web - ISWC 2008. 2008: Springer



Durao, F., & Dolog, P. (2009). A Personalized Tag-based Recommendation
in Social Web Systems. International Workshop on Adaptation and
Personalization for Web 2.0
References


Education First, (2012). EF EPI Country Rankings



Frankfurt International School, (2001). The Differences Between English
and Turkish



ISPA (Investment Support and Promotion Agency) of Turkey, (2010). Turkish
Information and Communication Technologies Industry. Deloitte



Nakamoto, R., Nakajima, S., Miyazaki, J., & Uemura, S. (2007). Tag-based Contextual Collaborative Filtering. IAENG International Journal of
Computer Science



Özbek, A. (2012). Türkçe Eşanlamlı Kelimeler Sözlüğü Projesi (Turkish
Thesaurus Project). Retrieved from http://github.com/maidis/mythes-tr
Thank you!
