Ranieri Baraglia, Carlos Castillo, Debora Donato, Franco Maria Nardini, Raffaele Perego and Fabrizio Silvestri: "The Effects of Time on Query Flow Graph-based Models for Query Suggestion". In proceedings of RIAO. Paris, France, 2010.
Slides prepared by Franco Maria Nardini
1. The Effects of Time on Query Flow Graph-based Models for Query Suggestion
Carlos Castillo, Debora Donato (Yahoo! Research Barcelona)
Ranieri Baraglia, Franco Maria Nardini, Raffaele Perego, Fabrizio Silvestri (HPC Lab, ISTI-CNR, Pisa)
Tuesday, 4 May 2010
3. Outline
• Introduction
• Aims of this Work
• The Query-Flow Graph
• Evaluating the Aging Effect
• Combating the Aging Effect
• Distributed QFG Building
• Conclusions & Future Work
7. Introduction
• Web search engines use query recommender systems to improve users' search experience;
• Query recommender systems give users hints on possibly "interesting" queries:
• relative to their information needs;
• Query recommender systems exploit the knowledge of past web search engine users:
• recorded in query logs.
10. Aims of this Work
• To show that time has negative effects on a query recommender model:
• the model becomes unable to generate good suggestions as time passes;
• bursty queries;
• To extend a state-of-the-art recommender system with a methodology for dealing efficiently with evolving data:
• to define a "good" strategy for updating the model;
• to define a distributed/parallel algorithm for updating the model.
13. The Query-Flow Graph
• The QFG [Boldi et al., CIKM'08] is a compact and powerful representation of Web search engine users' behavior;
• The QFG is a graph composed of:
1. a set of nodes, V = Q ∪ {s, t};
2. a set of directed edges, E ⊆ V × V:
• (q, q') are connected if they are consecutive at least once in at least one session;
3. a weighting function w: E → (0, 1]:
• assigning a weight w(q, q') to each edge.
[Figure: example query-flow graph around the query "barcelona". Node labels visible on the slide include "barcelona fc", "barcelona fc website", "barcelona fc fixtures", "real madrid", "barcelona hotels", "cheap barcelona hotels", "luxury barcelona hotels", "barcelona weather", "barcelona weather online" and the terminal node <T>; edge weights range from 0.011 to 0.523.]
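The definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the labels `<s>` and `<t>` for the special nodes, the function name `build_qfg`, and the choice of relative frequencies normalized over the edges leaving each node are all assumptions made for the example.

```python
from collections import Counter

START, END = "<s>", "<t>"  # hypothetical labels for the special nodes s and t


def build_qfg(sessions):
    """Build a toy query-flow graph from sessions (lists of queries).

    Returns (nodes, weights); weights maps each directed edge (q, q2) to the
    relative frequency of that transition among all transitions leaving q,
    so every weight falls in (0, 1].
    """
    pair_counts = Counter()
    out_counts = Counter()
    for session in sessions:
        path = [START] + list(session) + [END]
        # every pair of consecutive queries in a session yields an edge
        for q, q2 in zip(path, path[1:]):
            pair_counts[(q, q2)] += 1
            out_counts[q] += 1
    nodes = {q for edge in pair_counts for q in edge}
    weights = {(q, q2): c / out_counts[q] for (q, q2), c in pair_counts.items()}
    return nodes, weights


sessions = [
    ["barcelona", "barcelona hotels"],
    ["barcelona", "barcelona weather"],
    ["barcelona", "barcelona hotels", "cheap barcelona hotels"],
]
nodes, weights = build_qfg(sessions)
```

In this toy log two of the three transitions leaving "barcelona" go to "barcelona hotels", so that edge gets weight 2/3.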
16. The Query-Flow Graph
• Two weighting schemes:
• relative frequencies: counting query occurrences;
• chaining probabilities: the probability that (q, q') belong to the same chain:
• obtained by classification on a set of features (text, n-grams, session) computed over all sessions where (q, q') are consecutive;
• Noisy edges: edges with low probability are removed.
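The removal of noisy edges amounts to a threshold filter on edge weights. The sketch below assumes a dictionary of edge weights and a hypothetical helper `drop_noisy_edges`; whether an edge exactly at the threshold is kept is also an assumption.

```python
def drop_noisy_edges(weights, threshold):
    """Keep only the edges whose weight reaches the threshold."""
    return {edge: w for edge, w in weights.items() if w >= threshold}


# toy weights, invented for the example
example = {("a", "b"): 0.9, ("a", "c"): 0.05, ("b", "c"): 0.5}
kept = drop_noisy_edges(example, 0.5)
```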
19. The Query-Flow Graph
• Query recommendation:
• random walk with restart on the graph;
• the user's history is taken into account through the preference vector;
• A score is associated with each suggestion.
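A random walk with restart can be sketched with a plain power iteration. This is a generic version of the technique, not the exact procedure of [Boldi et al., CIKM'08]: the damping value `alpha=0.85`, the iteration count, the handling of nodes without outgoing edges, and the function name are assumptions.

```python
def random_walk_with_restart(weights, preference, alpha=0.85, iterations=50):
    """Score nodes by a random walk that restarts according to `preference`.

    weights:    {(q, q2): w} edge weights of the graph
    preference: {q: p} restart distribution (expected to sum to 1)
    alpha:      probability of following an edge instead of restarting
    """
    out = {}
    for (q, q2), w in weights.items():
        out.setdefault(q, []).append((q2, w))
    # normalize outgoing weights so each node's transitions sum to 1
    out = {
        q: [(q2, w / sum(x for _, x in nbrs)) for q2, w in nbrs]
        for q, nbrs in out.items()
    }

    nodes = set(preference) | {q for edge in weights for q in edge}
    scores = {q: preference.get(q, 0.0) for q in nodes}
    for _ in range(iterations):
        nxt = {q: (1 - alpha) * preference.get(q, 0.0) for q in nodes}
        for q, mass in scores.items():
            nbrs = out.get(q)
            if nbrs:
                for q2, p in nbrs:
                    nxt[q2] += alpha * mass * p
            else:
                # node without outgoing edges: return its mass to the restart nodes
                for r, p in preference.items():
                    nxt[r] += alpha * mass * p
        scores = nxt
    return scores


toy_weights = {
    ("barcelona", "barcelona hotels"): 0.6,
    ("barcelona", "barcelona weather"): 0.4,
    ("barcelona hotels", "cheap barcelona hotels"): 1.0,
}
scores = random_walk_with_restart(toy_weights, {"barcelona": 1.0})
```

Putting all the restart mass on the last query models a user with no history; spreading it over earlier queries of the session is one way to let the history influence the scores.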
29. Experimental Assumptions
• M1 and M2 are used for training:
• two different QFGs, one per training month;
• Queries in the third month are used for testing;
• We evaluate the aging effect by measuring the quality of the suggestions produced by the models built on M1 and on M2;
• If the model ages, M2 outperforms M1 in terms of quality of suggestions.

Table 1: Number of nodes and edges for the graphs corresponding to the two different training segments.
Time window   Id   Nodes       Edges
March 06      M1   3,814,748   6,129,629
April 06      M2   3,832,973   6,266,648

Paper excerpt visible behind the slide (cut off at the margins): "[...] Boldi et al. in [4]. This method uses chaining probabilities measured by means of a machine learning method. The initial step was thus to extract those features from each training log, and to store them into a compressed graph representation. In particular we extracted 25 different features (time-related, session and textual features) for each pair of queries (q, q') that are consecutive in at least one session of the query log. Table 1 shows the number of nodes and edges of the different graphs corresponding to each query log segment used for training. It is important to remark that we have not re-trained the classification model for the assignment of weights associated with QFG edges. We reuse the one that has been used [...] for segmenting users' sessions into query chains. This is another point in favor of QFG-based models. Once you train the classifier to assign weights to QFG edges, you can reuse it on different data-sets without losing effectiveness."
33. Evaluating the Aging Effect
• Two classes of test queries:
• F1: 30 queries highly frequent in M1 having a large drop in the test month (e.g., shakira);
• F3: 30 queries highly frequent in the test month having a large drop in M1 (e.g., da vinci code, mothers day gift);
• F1 and F3 contain very diverse queries.
[Figure: log-log plot with x axis from 1 to 1000 and y axis from 1 to 1e+06; series "Top 1000 queries in month 1 on month 1" and "Top 1000 queries in month 3 on month 1". The axis label is not recoverable.]
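One way to pick such query classes is to rank the most frequent queries of one month by how much their frequency drops in the other month. The slides do not state the exact selection criterion, so the ratio used below, the function name `dropping_queries` and the toy frequencies are all assumptions.

```python
from collections import Counter


def dropping_queries(freq_a, freq_b, top_n=1000, size=30):
    """Among the top_n queries of freq_a, return the `size` queries whose
    frequency drops the most (in relative terms) in freq_b."""
    top = [q for q, _ in freq_a.most_common(top_n)]
    # ratio close to 0 means the query almost disappears in freq_b
    return sorted(top, key=lambda q: freq_b.get(q, 0) / freq_a[q])[:size]


# toy frequencies, invented for the example
m1 = Counter({"shakira": 1000, "weather": 900, "maps": 800})
test_month = Counter(
    {"shakira": 50, "weather": 880, "maps": 790, "da vinci code": 700}
)

f1 = dropping_queries(m1, test_month, size=1)  # frequent in M1, dropping in the test month
f3 = dropping_queries(test_month, m1, size=1)  # frequent in the test month, rare in M1
```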
38. Evaluating the Aging Effect (II)
• When k suggestions share the same score, those are useless;
• Same suggestion score:
• same probability on the graph;
• the model is not able to give a priority to recommendations;
• Confirmed by a user study on F1 and F3.
[Figure: chart whose axis ticks and legend are not recoverable. Number pairs visible on the slide: 3742 / 2652, 2162 / 2615, 2001 / 2341, 1913 / 2341, 1913 / 2341.]
41. Evaluating the Aging Effect (III)
• Working hypothesis:
• useful recommendations do not share the same recommendation score;
• Automatic evaluation:
• 400 highly frequent queries in the test month;
• evaluating the number of useful recommendations;
• k = 3.
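The automatic evaluation can be sketched as a tie count over suggestion scores. The sketch reads "k = 3" as the number of suggestions that must share a score before they are all considered useless; that reading, like the function name `count_useful` and the toy scores, is an assumption.

```python
from collections import Counter


def count_useful(scored_suggestions, k=3):
    """Count suggestions whose score is shared by fewer than k suggestions.

    scored_suggestions: list of (suggestion, score) pairs for one query.
    """
    ties = Counter(score for _, score in scored_suggestions)
    return sum(1 for _, score in scored_suggestions if ties[score] < k)


# three suggestions share the score 0.2, so only "a" counts as useful
tied = [("a", 0.4), ("b", 0.2), ("c", 0.2), ("d", 0.2)]
# only two suggestions share a score, which stays below k = 3
mostly_distinct = [("a", 0.4), ("b", 0.2), ("c", 0.2)]
```

Averaging this count over the test queries gives the "average number of useful suggestions" used to compare models.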
44. Evaluating the Aging Effect (IV)
• Results:

Table 4: Recommendation statistics obtained by using the automatic evaluation method on a set of 400 queries drawn from the most frequent in the third month.
Filtering threshold   Avg. useful suggestions on M1   Avg. useful suggestions on M2
0                     2.84                            2.91
0.5                   5.85                            6.23
0.65                  5.85                            6.23
0.75                  5.85                            6.18

• The average number of useful suggestions is greater in M2 than in M1;
• The filtering process helps a lot: with a threshold of 0.5 or more the average roughly doubles (from 2.84 to 5.85 on M1, from 2.91 to 6.23 on M2).

Paper excerpt visible behind the slide (cut off at the margins): "[...] recommendations with their assigned relative scores. [...] reduces the "noise" on the data and generates more precise knowledge on which recommendations are computed. Furthermore, the increase is quite independent from the threshold level, i.e. by increasing the threshold from 0.5 to 0.75 the overall quality is, roughly, constant. [...] We further break down the overall results shown in Table 4 to show the number of queries on which the QFG-based [...]"
47. Evaluating the Aging Effect (V)
• On a histogram (cumulative distribution):
[Figure: cumulative histogram for M1 and M2; x axis: number of useful recommendations (0 to 18); y axis: number of queries (0 to 400).]
• Results on M2 are always better than those on M1:
• fewer queries without suggestions.
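The cumulative histogram can be computed directly from the per-query counts of useful recommendations. The sketch assumes each bucket n holds the number of queries with at least n useful recommendations; the function name and the toy counts are hypothetical.

```python
def at_least_histogram(useful_counts, max_n=18):
    """hist[n] = number of queries with at least n useful recommendations."""
    return [sum(1 for c in useful_counts if c >= n) for n in range(max_n + 1)]


# toy counts for four queries, invented for the example
hist = at_least_histogram([0, 2, 2, 5], max_n=5)
queries_without_suggestions = hist[0] - hist[1]
```

With this definition the number of queries without any useful suggestion is the gap between the first two buckets.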
50. Combating the Aging Effect
• QFG recommender models age:
• average recommendation quality degrades;
• Recommendations should not be influenced by time;
• Updating the model vs. rebuilding it "from scratch".
53. Combating the Aging Effect (II)
• Solution: incremental update of M1 by means of "fresh data" in M2;
• Graph algebra [Bordino et al., 2008];
• Some measures on the two different approaches:

Table 5: Time needed to build a Query Flow Graph from scratch and using our "incremental" approach (merging two QFGs, each representing one half of the data).
Dataset                 "From scratch" strategy [min.]   "Incremental" strategy [min.]
M1 (March 2006)         21                               14
M2 (April 2006)         22                               15
M12 (March and April)   44                               32

• Incremental updates take roughly 2/3 of the "from scratch" build time (14 vs. 21, 15 vs. 22 and 32 vs. 44 minutes);
• Evaluation on the same set of 400 queries.

Paper excerpt visible behind the slide (cut off at the margins): "[...] QFGs. Suppose the model used to generate recommendations consists of a portion of data representing one month (for M1 and M2) or two months (for M12) of the query log. The model is being updated every 15 days (for M1 and M2) or every 30 days (for M12). By using the first approach, we pay 22 (44) minutes every 15 (30) days to rebuild the new model from scratch on a new set of data obtained from the last two months of the query log. Instead, by using the second approach, we need to pay only 15 (32) minutes for updating the one-month (two-months) QFG."
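The idea of an incremental update can be sketched by keeping raw transition counts per time window and summing them. This is a simplification and not necessarily the graph algebra of [Bordino et al., 2008]; the function names and the toy sessions are assumptions.

```python
from collections import Counter


def count_transitions(sessions):
    """Raw counts of consecutive query pairs in a list of sessions."""
    counts = Counter()
    for session in sessions:
        for q, q2 in zip(session, session[1:]):
            counts[(q, q2)] += 1
    return counts


def merge_counts(old, fresh):
    """Incremental update: add the counts of the fresh window to the old graph."""
    return old + fresh


def normalize(counts):
    """Turn raw counts into relative frequencies over each node's out-edges."""
    out = Counter()
    for (q, _), c in counts.items():
        out[q] += c
    return {(q, q2): c / out[q] for (q, q2), c in counts.items()}


# toy sessions, invented for the example
march = [["a", "b"], ["a", "c"]]
april = [["a", "b"], ["b", "c"]]

incremental = merge_counts(count_transitions(march), count_transitions(april))
from_scratch = count_transitions(march + april)
```

Because counts are additive, the merged graph equals the one rebuilt from the whole log, while only the fresh window has to be parsed again.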
56. Combating the Aging Effect (III)
• Results:

Table 8: Recommendation statistics obtained by using the automatic evaluation method on a relatively large set of 400 queries drawn from the most frequent in the third month.
Filtering threshold   Avg. useful suggestions on M2   Avg. useful suggestions on M12
0                     2.91                            3.64
0.5                   6.23                            7.95
0.65                  6.23                            7.94
0.75                  6.18                            7.9

• The average number of useful suggestions is greater in M12 than in M2 or in M1.

Paper excerpt visible behind the slide (cut off at the margins): Table 7: Some examples of recommendations generated on different QFG models. Queries used to generate recommendations are taken from different query sets. Rows visible for the query "shakira" (score, suggestion): 3698 shakira video; 3135 shakira nude; 3099 shakira wallpaper; 3020 shakira biography; 3018 shakira aol music; 2015 free video downloads. "[...] investigated the main reasons why we obtain such an improvement."
59. Combating the Aging Effect (IV)
• On a histogram (cumulative distribution):
[Figure 5: Histogram showing the total number of queries (on the y axis, 0 to 400) having at least a certain number of useful recommendations (on the x axis, 0 to 18), for M1, M2 and M12. For instance the third bucket shows how many queries [...]]
• Results on M12 are always better than those on M1 and M2;
• large improvement in the number of queries with at least four good suggestions.
[Also visible behind the slide: Figure 4: Histogram showing the number of queries (on the y axis) having a certain number of useful recommendations (on the x axis), for M1, M2 and M12. Results are evaluated automatically.]
61. Distributed QFG Building
• A parallel way to update QFGs, following a divide-and-conquer approach:
• the query log is split in m parts;
• parallel extraction of the features;
• compressing step;
• merging graphs;
• final operations (normalization, PageRank, etc.).
[Figure 6: Example of the building of a two mo[...]. Tree-shaped diagram with rows of four, four, two and one boxes; the box labels are not recoverable.]

Paper excerpt visible behind the slide (cut off at the margins): "4. using the graph algebra described in [8], each partial graph is iteratively merged. Each iteration is done in parallel on the different available nodes of the cloud. 5. the final resulting data-graph is now processed [...] other steps [4] (normalization, chain extraction, random walk) to obtain the complete and usable QFG."
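The divide-and-conquer steps listed on this slide can be sketched as a parallel map followed by pairwise merging. This is a toy version under several assumptions: the log is split round-robin rather than by time window, partial graphs are plain transition counters, merging is a sum of counts, and threads stand in for the nodes of a cluster (in CPython they illustrate the structure rather than give a real speed-up). Feature extraction, compression and the final normalization are left out.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(log, m):
    """Split the list of sessions into m parts (round-robin, for simplicity)."""
    return [log[i::m] for i in range(m)]


def partial_graph(sessions):
    """Build the partial graph (raw transition counts) of one log part."""
    counts = Counter()
    for session in sessions:
        for q, q2 in zip(session, session[1:]):
            counts[(q, q2)] += 1
    return counts


def merge_pair(pair):
    """Merge two partial graphs by summing their transition counts."""
    left, right = pair
    return left + right


def build_parallel(log, m=4, workers=4):
    """Build partial graphs in parallel, then merge them pairwise until one is left."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        graphs = list(pool.map(partial_graph, split(log, m)))
        while len(graphs) > 1:
            pairs = [(graphs[i], graphs[i + 1]) for i in range(0, len(graphs) - 1, 2)]
            leftover = graphs[len(pairs) * 2:]  # odd graph out waits for the next round
            graphs = list(pool.map(merge_pair, pairs)) + leftover
    return graphs[0]


# toy log, invented for the example
log = [["a", "b"], ["a", "b", "c"], ["b", "c"], ["a", "c"], ["a", "b"]]
graph = build_parallel(log, m=4)
```

Each merging round halves the number of graphs, so m parts need about log2(m) rounds.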
64. Conclusions
• We study the effects of time on QFG-based query recommender systems:
• we build different QFGs from the AOL query log;
• we analyze the quality of the recommendations;
• we show that recommendation models age;
• we introduce an "incremental" algorithm for updating the model;
• we propose a parallel/distributed way of building QFGs.
68. Future Work
• To define a strategy for merging graphs that assigns different weights to each subgraph:
• more importance to "fresh" data;
• To compare the robustness of QFG recommender systems with that of other query recommenders with respect to aging;
• To design a MapReduce algorithm to efficiently build and update QFG-based recommender systems.
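As an illustration of the first item only, one conceivable way to give more importance to "fresh" data is to scale the counts of the newer subgraph before summing. The slides leave this strategy as future work, so the function `weighted_merge`, the factor `fresh_weight` and the toy counts are purely hypothetical.

```python
def weighted_merge(old, fresh, fresh_weight=2.0):
    """Merge two count graphs, multiplying the fresh counts by fresh_weight."""
    merged = {}
    for edge, count in old.items():
        merged[edge] = merged.get(edge, 0.0) + count
    for edge, count in fresh.items():
        merged[edge] = merged.get(edge, 0.0) + fresh_weight * count
    return merged


# toy counts, invented for the example
old_counts = {("a", "b"): 4}
fresh_counts = {("a", "b"): 1, ("a", "c"): 3}
merged = weighted_merge(old_counts, fresh_counts, fresh_weight=2.0)
```

With a factor of 2 the transition seen only in the fresh window ends up as heavy as the one that dominated the old window.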
69. Questions?
Thank you for your attention!
70. References
• [Boldi et al., CIKM'08]: The Query Flow Graph: model and applications. Boldi, Bonchi, Castillo, Donato, Gionis, Vigna. CIKM'08.
• [Boldi et al., WSCD'09]: Query Suggestions using Query-Flow Graphs. Boldi, Bonchi, Castillo, Donato, Vigna. WSCD'09.
• [Bordino et al., 2008]: Algebra for the joint mining of query log graphs, 2008.