Evaluation Measures for Learning to Rank

Evaluation measures covered
• DCG
• NDCG
• MAP
• Kendall's Tau (Kendall's rank correlation coefficient)

DCG (Discounted Cumulative Gain)
p.17
• DCG measures the goodness of the ranking list with the labels
4 EVALUATION
Evaluation on the performance of a ranking model is carried out by comparison between the ranking lists output by the model and the ranking lists given as ground truth. Several evaluation measures are widely used in IR and other fields. These include NDCG (Normalized Discounted Cumulative Gain), DCG (Discounted Cumulative Gain) [53], MAP (Mean Average Precision) [101], WTA (Winners Take All), MRR (Mean Reciprocal Rank), and Kendall's Tau.
Given query $q_i$ and associated documents $D_i$, suppose that $\pi_i$ is the ranking list (permutation) on $D_i$ and $y_i$ is the set of labels (grades) of $D_i$. DCG measures the goodness of the ranking list with the labels. Specifically, DCG at position $k$ for $q_i$ is defined:
$$\mathrm{DCG}(k) = \sum_{j:\pi_i(j) \le k} G(j)\, D(\pi_i(j)),$$
where $G(\cdot)$ is a gain function and $D(\cdot)$ is a position discount function. Note that $\pi_i(j)$ denotes the position of $d_{i,j}$ in $\pi_i$. Therefore, the summation is taken over the top $k$ positions in ranking list $\pi_i$. DCG represents the cumulative gain of accessing the information from position one to position $k$ [...]
Here, the definition of NDCG (or DCG) is formulated based on the indices of documents. It is also possible to define NDCG (or DCG) based on the indices of positions.
DCG evaluates the ranking from the top position down to position k.

The gain G(j)  p.18
• G(j) is the gain of document d_(i,j) in π_i: it indicates how relevant that document is
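In a perfect ranking, the documents with higher grades are ranked higher. [...] multiple perfect rankings for a query and associated documents. The gain function is normally defined as an exponential function of grade. That is to say, satisfaction of accessing information exponentially increases when grade of relevance of information increases.
$$G(j) = 2^{y_{i,j}} - 1,$$
where $y_{i,j}$ is the label (grade) of $d_{i,j}$ in ranking list $\pi_i$. The discount function is normally defined as a logarithmic function of position. That is to say, satisfaction of accessing information [...]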
Here, y_(i,j) is the label given to document d_(i,j).
The higher the label value, the higher G(j).
For documents d_i = (d_(i,1), d_(i,2), d_(i,3)),
the label set is given as, for example, Y_i = (3, 3, 2).
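As a minimal sketch of the gain function, the snippet below applies G(j) = 2^y - 1 to the example labels Y_i = (3, 3, 2). The function name `gain` is only an illustrative choice and does not come from the slides.

```python
def gain(label: int) -> int:
    """Exponential gain G(j) = 2**y - 1 for a graded relevance label y."""
    return 2 ** label - 1


labels = [3, 3, 2]                 # Y_i from the example above
gains = [gain(y) for y in labels]  # one gain value per document
print(gains)                       # [7, 7, 3]
```

A label one grade higher roughly doubles the gain, which is why highly relevant documents dominate the score.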

The position discount D(π_i(j))  p.19
• D(π_i(j)) gets smaller the lower d_(i,j) is ranked
π_i(j) is the rank (position) of document d_(i,j).
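[...] decreases when position of information access increases.
$$D(\pi_i(j)) = \frac{1}{\log_2(1 + \pi_i(j))},$$
where $\pi_i(j)$ is the position of $d_{i,j}$ in ranking list $\pi_i$.
Hence, DCG and NDCG at position $k$ for $q_i$ become
$$\mathrm{DCG}(k) = \sum_{j:\pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))},$$
$$\mathrm{NDCG}(k) = \mathrm{DCG}_{\max}^{-1}(k) \sum_{j:\pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))}.$$
DCG and NDCG of the whole ranking list for $q_i$ become [...]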
D(π_i(j)) is largest when π_i(j) = 1, that is, when document d_(i,j) is ranked first.
The lower document d_(i,j) is ranked, the larger the denominator becomes, so D(π_i(j)) gets smaller.
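The behaviour of the discount can be checked with a small sketch. It assumes 1-based positions, as in the definition D(π_i(j)) = 1 / log2(1 + π_i(j)). The function name `discount` is illustrative only.

```python
import math


def discount(position: int) -> float:
    """Position discount D(pi) = 1 / log2(1 + pi), with positions starting at 1."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return 1.0 / math.log2(1 + position)


# The discount is 1.0 at the top position and shrinks as the rank gets worse.
for pos in (1, 2, 3, 10):
    print(pos, round(discount(pos), 4))
```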

Back to DCG (p.17)
• G(j) is higher the more relevant the document is
• D(π_i(j)) is lower the lower d_(i,j) is ranked
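DCG becomes large when the G(j) values are all high and the D(π_i(j)) values are all high.
In other words, DCG is large when the documents within the top k positions carry high labels and the documents with high labels sit near the top of the list π.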
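A minimal sketch of DCG at position k follows. It assumes the labels are passed already sorted by rank (the "indices of positions" formulation), so `labels_by_rank[0]` is the label of the top-ranked document. The function and variable names are illustrative and not taken from the slides.

```python
import math
from typing import Sequence


def dcg_at_k(labels_by_rank: Sequence[int], k: int) -> float:
    """DCG(k): sum of gain * discount over the top-k positions.

    labels_by_rank[r] is the label of the document at position r + 1.
    """
    total = 0.0
    for position, label in enumerate(labels_by_rank[:k], start=1):
        gain = 2 ** label - 1                     # G(j) = 2^y - 1
        discount = 1.0 / math.log2(1 + position)  # D(pi) = 1 / log2(1 + pi)
        total += gain * discount
    return total


good = [3, 3, 2]  # high labels at the top
bad = [2, 3, 3]   # same documents, best ones pushed down
print(dcg_at_k(good, 3) > dcg_at_k(bad, 3))  # True
```

Both lists contain the same labels, so the difference in score comes only from where the highly labelled documents are placed.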

NDCG (p.19)
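$$D(\pi_i(j)) = \frac{1}{\log_2(1 + \pi_i(j))},$$
where $\pi_i(j)$ is the position of $d_{i,j}$ in ranking list $\pi_i$.
Hence, DCG and NDCG at position $k$ for $q_i$ become
$$\mathrm{DCG}(k) = \sum_{j:\pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))},$$
$$\mathrm{NDCG}(k) = \mathrm{DCG}_{\max}^{-1}(k) \sum_{j:\pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))}.$$
DCG and NDCG of the whole ranking list for $q_i$ become
$$\mathrm{DCG} = \sum_{j:\pi_i(j) \le n_i} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))},$$
$$\mathrm{NDCG} = \mathrm{DCG}_{\max}^{-1} \sum_{j:\pi_i(j) \le n_i} \frac{2^{y_{i,j}} - 1}{\log_2(1 + \pi_i(j))}.$$
DCG and NDCG values are further averaged over queries ($i = 1, \cdots, m$).
Table 2.4 gives examples of calculating NDCG values of two ranking lists. NDCG (DCG) has the effect of giving high scores to the ranking lists in which relevant documents are ranked high [...]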
NDCG normalizes DCG by multiplying it by the reciprocal of DCG_max, the DCG of a perfect ranking.
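The normalization can be sketched as follows. This is an illustration under two assumptions: labels are given in rank order, and DCG_max is obtained by sorting the same labels in descending order (a perfect ranking). Returning 0.0 when no document has a positive label is a convention chosen here, not something the slides specify.

```python
import math
from typing import Sequence


def dcg_at_k(labels_by_rank: Sequence[int], k: int) -> float:
    """DCG(k) with gain 2^y - 1 and discount 1 / log2(1 + position)."""
    return sum(
        (2 ** label - 1) / math.log2(1 + position)
        for position, label in enumerate(labels_by_rank[:k], start=1)
    )


def ndcg_at_k(labels_by_rank: Sequence[int], k: int) -> float:
    """NDCG(k) = DCG(k) / DCG_max(k), where DCG_max comes from a perfect ranking."""
    ideal = sorted(labels_by_rank, reverse=True)  # assumed perfect ranking
    dcg_max = dcg_at_k(ideal, k)
    if dcg_max == 0.0:
        return 0.0  # assumed convention: no relevant documents at all
    return dcg_at_k(labels_by_rank, k) / dcg_max


print(ndcg_at_k([3, 3, 2], 3))        # 1.0 for a perfect ranking
print(ndcg_at_k([2, 3, 3], 3) < 1.0)  # True for an imperfect ranking
```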

MAP (Mean Average Precision) (p.20)
[...] the examples in Table 2.4. For the perfect rankings, the NDCG value at each position is always one, while for imperfect rankings, the NDCG values are less than one.
MAP is another measure widely used in IR. In MAP, it is assumed that the grades of relevance are at two levels: 1 and 0. Given query $q_i$, associated documents $D_i$, ranking list $\pi_i$ on $D_i$, and labels $y_i$ of $D_i$, Average Precision for $q_i$ is defined:
$$\mathrm{AP} = \frac{\sum_{j=1}^{n_i} P(j) \cdot y_{i,j}}{\sum_{j=1}^{n_i} y_{i,j}},$$
where $y_{i,j}$ is the label (grade) of $d_{i,j}$ and takes on 1 or 0 as value, representing being relevant or irrelevant. $P(j)$ for query $q_i$ is defined:
$$P(j) = \frac{\sum_{k:\pi_i(k) \le \pi_i(j)} y_{i,k}}{\pi_i(j)},$$
where $\pi_i(j)$ is the position of $d_{i,j}$ in $\pi_i$. $P(j)$ represents the precision until the position of $d_{i,j}$ [...]
Note that labels are either 1 or 0, and thus precision (i.e., ratio of label 1) can be defined. Average Precision represents averaged precision over all the positions of documents with label 1 for query [...]
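Average Precision returns the average precision of a ranking list.
The key characteristic of MAP is that labels are only 0 and 1.
y_(i,j) is either 0 or 1.
Because j runs from 1 to n_i, the ranking of all documents is evaluated.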
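A minimal sketch of Average Precision and MAP follows. It assumes binary labels given in rank order, so the position of each document is simply its index plus one. Returning 0.0 for a query with no relevant documents is a convention chosen here, and the function names are illustrative.

```python
from typing import Sequence


def average_precision(labels_by_rank: Sequence[int]) -> float:
    """AP: mean of P(j) over the positions of documents with label 1."""
    hits = 0
    precision_sum = 0.0
    for position, label in enumerate(labels_by_rank, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / position  # P(j) at a relevant document
    return precision_sum / hits if hits else 0.0  # assumed convention for no hits


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """MAP: AP averaged over queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


print(average_precision([1, 1, 0]))                 # 1.0
print(mean_average_precision([[1, 1, 0], [0, 1]]))  # 0.75
```

In the second query the only relevant document is at position 2, so its AP is 1/2 and the mean over both queries is 0.75.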

MAP (Mean Average Precision), continued (p.20)
$$P(j) = \frac{\sum_{k:\pi_i(k) \le \pi_i(j)} y_{i,k}}{\pi_i(j)}$$
• Denominator π_i(j): the rank of document j in π_i
• Numerator: the sum of the {0,1} labels of the documents ranked at or above document j
P(j) = (number of relevant documents ranked at or above document j) / (rank of document j)
The more documents with label 1 (relevant) in that range, the higher P(j).
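As a worked illustration, take three documents that are already listed in rank order, so that $\pi_i(j) = j$, with hypothetical labels $(y_{i,1}, y_{i,2}, y_{i,3}) = (1, 0, 1)$. These labels are made up for this example and do not come from the slides.

$$P(1) = \frac{1}{1} = 1, \qquad P(2) = \frac{1 + 0}{2} = \frac{1}{2}, \qquad P(3) = \frac{1 + 0 + 1}{3} = \frac{2}{3}$$

$$\mathrm{AP} = \frac{P(1) \cdot 1 + P(2) \cdot 0 + P(3) \cdot 1}{1 + 0 + 1} = \frac{1 + \tfrac{2}{3}}{2} = \frac{5}{6} \approx 0.833$$

Only the positions holding a relevant document contribute to the numerator, so the irrelevant document at rank 2 lowers AP only through the smaller $P(3)$.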

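Kendall's Tau (Kendall's rank correlation coefficient)  p.20
Kendall's Tau evaluates how well the relative order of item pairs agrees between two lists
(here, the gold list and the list produced by the system).
Its value ranges from -1 to +1. The closer to +1, the more the two orderings agree; the closer to -1, the more they are reversed.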
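Average Precision values are further averaged over queries to become Mean Average Precision [...] example of calculating the AP value of one ranking list.
Kendall's Tau is a measure proposed in statistics. It is defined on two ranking lists: one is the [...] model, and the other is by the ground truth. Kendall's Tau of ranking list $\pi_i$ with respect to ground truth $\pi_i^*$ is defined:
$$T_i = \frac{2 c_i}{\frac{1}{2} n_i (n_i - 1)} - 1,$$
where $c_i$ denotes the number of concordant pairs between the two lists, and $n_i$ denotes the length of the two lists. For example, Kendall's Tau between two ranking lists (A,B,C) and (C,A,B) is as follows:
$$T_i = \frac{2 \times 1}{3} - 1 = -\frac{1}{3}.$$
Kendall's Tau has values between -1 and +1. If the two ranking lists are exactly the same, then it is +1. If one ranking list is in reverse order of the other, then it is -1. It is easy to verify Kendall's [...]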
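n_i: number of items
c_i: number of item pairs whose order agrees in both lists
The denominator is simply (n_i)C2 written out
For example, for (A,B,C) and (C,A,B):
n_i: number of items = 3
The only item pair whose order agrees is (A,B), so c_i = 1
The denominator, i.e. the number of possible item pairs, is 3C2 = 3
So T_i = 2×1/3 - 1 = -0.3333..., meaning the two orderings agree poorly (only one of the three pairs matches)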
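The calculation above can be sketched by counting concordant pairs directly. This assumes both lists contain the same distinct items with no ties. The function name `kendall_tau` is illustrative and is not a library API.

```python
from itertools import combinations
from typing import Hashable, Sequence


def kendall_tau(ranking: Sequence[Hashable], truth: Sequence[Hashable]) -> float:
    """T = 2c / (n(n-1)/2) - 1, where c is the number of concordant pairs."""
    n = len(ranking)
    same_items = len(truth) == n and set(ranking) == set(truth)
    if n < 2 or len(set(ranking)) != n or not same_items:
        raise ValueError("need two lists over the same distinct items, n >= 2")
    pos_truth = {item: r for r, item in enumerate(truth)}
    # combinations() yields (a, b) with a ranked before b in `ranking`;
    # the pair is concordant when a is also before b in `truth`.
    concordant = sum(
        1 for a, b in combinations(ranking, 2) if pos_truth[a] < pos_truth[b]
    )
    total_pairs = n * (n - 1) / 2
    return 2 * concordant / total_pairs - 1


print(round(kendall_tau(["A", "B", "C"], ["C", "A", "B"]), 4))  # -0.3333
```

Identical lists give +1 and a fully reversed list gives -1, matching the range described in the excerpt.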
Reference: http://d.hatena.ne.jp/sleepy_yoshi/20110326/p1
