WebMining Projectwork
How to suggest the query you’d like to
type next
The WebLog
AnonID  Query                         QueryTime         ItemRank  ClickURL
142     rentdirect.com                01/03/2006 07:17
142     www.prescriptionfortime.com   12/03/2006 12:31
142     staple.com                    17/03/2006 21:19
142     staple.com                    17/03/2006 21:19
142     www.newyorklawyersite.com     18/03/2006 08:02
142     www.newyorklawyersite.com     18/03/2006 08:03
142     westchester.gov               20/03/2006 03:55  1         http://www.westchestergov.com
142     space.comhttp                 24/03/2006 20:51
The WebLog is the AOL query log made available to the public in 2006
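A minimal sketch of how such a log could be read, assuming the file is tab-separated and starts with the header row shown above. The `LogRow` type and the `read_weblog` function are hypothetical names introduced here for illustration; they are not part of the original project.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional


class LogRow(NamedTuple):
    anon_id: str
    query: str
    query_time: str
    item_rank: Optional[int]  # None when the user did not click a result
    click_url: Optional[str]  # None when the user did not click a result


def read_weblog(lines) -> Iterator[LogRow]:
    """Yield one LogRow per record (assumes tab-separated fields and a header row)."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for rec in reader:
        rank = rec.get("ItemRank") or None
        url = rec.get("ClickURL") or None
        yield LogRow(
            rec["AnonID"],
            rec["Query"],
            rec["QueryTime"],
            int(rank) if rank else None,
            url,
        )


# Two records taken from the sample above: one without a click, one with a click.
sample = io.StringIO(
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "142\trentdirect.com\t01/03/2006 07:17\t\t\n"
    "142\twestchester.gov\t20/03/2006 03:55\t1\thttp://www.westchestergov.com\n"
)
rows = list(read_weblog(sample))
```

Only records with a ClickURL contribute to the query-URL relation used in the rest of the deck.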
The goal
Building a query suggestion application
that exploits the information observed in the AOL
WebLog.
Constraints:
1) The application relies on observed queries
2) The application needs to be fast!
The approach
Exploit the relation between the queries typed
and the URLs clicked by AOL users:
If two queries share “a lot of URLs”
then they are strongly related to
each other
“a lot of URLs”….
Several approaches can be followed for linking
observed queries to clicked URLs
We were inspired by the paper “Query-URL Bipartite
Based Approach to Personalized Query
Recommendation” by Li, Yang, Liu and
Kitsuregawa, Proceedings of the Twenty-Third
AAAI Conference on Artificial Intelligence (2008)
Idea 1/2
Let q(i) be the i-th query and u(k) be the k-th
URL clicked after a query is typed.
A Bipartite Graph can be built such that each q(i)
belonging to the query set is linked to every URL u(k)
clicked after q(i) was typed.
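The bipartite graph can be sketched as a mapping from each query to a counter of the URLs clicked after it. This is an illustrative sketch, not the authors' code: `build_bipartite` is a hypothetical helper, and the query "westchester county" is invented for the example.

```python
from collections import Counter, defaultdict


def build_bipartite(click_pairs):
    """Map each query to a Counter of clicked URLs (edge multiplicity = click count)."""
    graph = defaultdict(Counter)
    for query, url in click_pairs:
        if url:  # records without a click add no edge
            graph[query][url] += 1
    return graph


pairs = [
    ("westchester.gov", "http://www.westchestergov.com"),
    ("westchester county", "http://www.westchestergov.com"),  # hypothetical query
    ("westchester county", "http://www.westchestergov.com"),
    ("staple.com", None),  # query with no click
]
bipartite = build_bipartite(pairs)
```

Keeping the click counts on the edges, rather than a plain set of URLs, preserves the information needed later to weight the relation between two queries.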
Idea 2/2
Once a Bipartite Graph has been built, a relation
between any two queries belonging to the query set
can be established according to the URLs clicked
after them.
An Affinity Graph over the query set can then be defined,
where the edge between two queries has to be
weighted in order to exploit it in a suggestion
task
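One way to find which queries are connected in the affinity graph is to invert the bipartite graph (URL to queries) and pair up the queries that meet at the same URL. The sketch below assumes the bipartite graph is a dictionary of the form {query: {url: clicks}}; `related_pairs` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def related_pairs(bipartite):
    """Return the set of query pairs that share at least one clicked URL."""
    url_to_queries = defaultdict(set)
    for query, urls in bipartite.items():
        for url in urls:
            url_to_queries[url].add(query)
    pairs = set()
    for queries in url_to_queries.values():
        # sorted() gives each unordered pair a single canonical form
        for a, b in combinations(sorted(queries), 2):
            pairs.add((a, b))
    return pairs


toy = {
    "q1": {"u1": 2, "u2": 1},
    "q2": {"u2": 3},
    "q3": {"u3": 1},
}
edges = related_pairs(toy)
```

Going through the URLs avoids comparing every query with every other query, which matters on a log of this size.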
Weighting the Edges
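\[
w(i,j) = \frac{\sum_{k=1}^{U} \text{number of times URL}(k) \text{ is clicked by } q(i) \text{ and } q(j)}{\sum_{k=1}^{U} \text{number of times any URL}(k) \text{ is clicked by } q(i) + \sum_{k=1}^{U} \text{number of times any URL}(k) \text{ is clicked by } q(j)}
\]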
Let q(i) be the i-th query and u(k) be the k-th URL clicked
after a query is typed
w(i,j) is equal to 1 if the same URLs are clicked after q(i) and after q(j)
w(i,j) is equal to 0 if none of the URLs clicked after q(i) matches a URL
clicked after q(j)
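A minimal sketch of the weight computation. It assumes one particular reading of the numerator: for every URL clicked after both queries, the clicks coming from q(i) and from q(j) are added together. Under that reading the weight is 1 when both queries lead to the same URLs and 0 when they share none, as stated above. The function name `edge_weight` and the toy click counts are hypothetical.

```python
def edge_weight(clicks_i, clicks_j):
    """Affinity between two queries, given their {url: click_count} mappings."""
    shared = clicks_i.keys() & clicks_j.keys()
    # Assumed reading of the numerator: clicks from both queries on shared URLs.
    numerator = sum(clicks_i[u] + clicks_j[u] for u in shared)
    denominator = sum(clicks_i.values()) + sum(clicks_j.values())
    return numerator / denominator if denominator else 0.0


same = edge_weight({"u1": 2, "u2": 1}, {"u1": 5, "u2": 4})     # identical URL sets
disjoint = edge_weight({"u1": 2}, {"u3": 7})                   # no URL in common
partial = edge_weight({"u1": 2, "u2": 1}, {"u2": 3, "u3": 4})  # only u2 is shared
```

In the partial case the shared URL u2 collects 1 + 3 = 4 clicks out of 3 + 7 = 10 clicks overall, so the weight is 0.4.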
Managing “over-clicked URLs”
In the AOL 2006 WebLog dataset there are a number
of URLs which are over-clicked by users, regardless
of the query typed before clicking them.
[Figure: bar chart of click count per URL; axis labels: URLs, Click Count; click-count axis from 0 to 180,000]
Managing “over-clicked URLs”
Those URLs add noise to the query recommendation
algorithm. For this reason we kept only the URLs with
fewer than 1,000 clicks
[Figure: bar chart of click count per URL for the retained URLs; axis labels: URLs, Click Count; click-count axis from 0 to 1,000]
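The filtering step can be sketched as follows, assuming the click count of a URL is its total number of clicks over the whole log. The helper `drop_overclicked` and the example URLs are hypothetical; only the 1,000-click threshold comes from the slide.

```python
from collections import Counter

MAX_CLICKS = 1000  # threshold from the slide: keep URLs with fewer than 1,000 clicks


def drop_overclicked(click_pairs, max_clicks=MAX_CLICKS):
    """Keep only (query, url) pairs whose URL has fewer than max_clicks total clicks."""
    pairs = [(q, u) for q, u in click_pairs if u]
    totals = Counter(u for _, u in pairs)
    return [(q, u) for q, u in pairs if totals[u] < max_clicks]


# Tiny example with a lowered threshold so the effect is visible.
log = [("a", "http://popular.example")] * 3 + [("b", "http://rare.example")]
kept = drop_overclicked(log, max_clicks=3)
```

Filtering before the bipartite graph is built keeps the over-clicked URLs from linking otherwise unrelated queries.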
Affinity Graph Representation
Once the edge weights are computed, we build a main
dictionary in which each key is a query q(i) and each
value is an ordered dictionary.
The ordered dictionary has as keys the
queries q(j) sharing at least 1 URL with q(i) and as values
the weights w(i,j).
The main dictionary is used to feed the query
suggestion API and to provide a reliable result in
milliseconds.
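A sketch of the nested dictionary and of the lookup it enables. The slide does not say how the inner dictionary is ordered; sorting by decreasing weight is an assumption made here so that the strongest suggestions come first. `build_affinity_index` and `suggest` are hypothetical names, not the project's API.

```python
from collections import OrderedDict


def build_affinity_index(weights):
    """weights: {(q_i, q_j): w}. Returns {query: OrderedDict(neighbour -> w)}."""
    neighbours = {}
    for (qi, qj), w in weights.items():
        # The affinity graph is undirected, so store the edge in both directions.
        neighbours.setdefault(qi, {})[qj] = w
        neighbours.setdefault(qj, {})[qi] = w
    return {
        q: OrderedDict(sorted(nbrs.items(), key=lambda kv: kv[1], reverse=True))
        for q, nbrs in neighbours.items()
    }


def suggest(index, query, n=5):
    """Return up to n related queries for an observed query, strongest first."""
    return list(index.get(query, {}))[:n]


index = build_affinity_index({("q1", "q2"): 0.4, ("q1", "q3"): 0.9, ("q2", "q3"): 0.1})
top = suggest(index, "q1", n=2)
```

Because the ordering is computed once when the index is built, a suggestion request is a single dictionary lookup plus a slice, which is consistent with the millisecond response time mentioned above. A query that was never observed simply returns an empty list.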
Demo for those who can’t enjoy
the LIVE one
Thanks!
Andrea Gigli
https://about.me/andrea.gigli

Search Engine Query Suggestion Application
