1. Response Summarizer:
An Automatic Summarization System
of Call Center Conversation
September 28 and 29, 2016
Keio University, Hagiwara Laboratory
Masahiro Yamamoto
Mentors: Nishitoba-san, Iwata-san
PFI Internship Final Report
2. Task: Call Center Conversation Summarization
[Figure: a customer tells the operator "My air conditioner was broken";
the operator uses past cases for the response, retrieving past documents
from the conversational data with document retrieval (Answer Finder).]
■ Present: only the beginning of each retrieved document is displayed
→ Difficult to understand the contents quickly
Document id: 001
Customer: Hello?
Operator: This is ○○, in charge of your case.
Customer: My air conditioner is broken...
……
■ Ideal: a summary is displayed
→ Possible to understand the contents quickly
Document id: 001
Problem: Air conditioner failure
Response: Please try unplugging and re-plugging the power cord.
When did it break down?
3. 1. Background
■ Operators of a call center
・Must respond to a variety of questions
・It is difficult to answer correctly, especially for beginners
→ Response support by showing summarized past data
■ Task description
・Extract the following items:
1. Problem report of the customer: 1 sentence
(e.g.) 「エアコンがつきません」 (The air conditioner won't turn on)
2. Response of the operator: within a predetermined
number of characters
(e.g.) 「ブレーカーの上げ下げを試してくれませんか」 (Could you try switching the breaker off and on?)
4. 2. Proposed System
■ Approach
1. Analysis of conversational data: understand the tendencies
of important utterances
2. Scoring words: using the above knowledge
3. Summarization of the customer's utterances: extract one
sentence from the customer's utterances
4. Summarization of the operator's utterances: extract words
using dynamic programming
5. 2. Proposed System
■ Analysis of conversational data
・Customer's utterances
- Topic words are mentioned first: important → knowledge (1)
- Talk about personal information: unimportant → knowledge (2)
・Operator's utterances
- Talk about personal information: unimportant → knowledge (3)
- Sentences ending in "○○か" or "○○かね": important → knowledge (4)
- Turning point of a topic: important → knowledge (5)
→ Score words using this knowledge
6. 2. Proposed System
■ Summarization of customer’s utterance
・Score "new information" according to its "citation degree":
knowledge (1)
- New information: a noun mentioned for the first time in the conversation
- Citation degree: the degree to which the new information is cited afterward
-- If the new information is cited many times, the citation degree is high
※ Proper nouns and numbers are stop words: knowledge (2)
・The score of a sentence is the total sum of the citation degrees
(weighted by TF-IDF) of its new-information words
・The sentence with the highest score is extracted as the problem report
7. 2. Proposed System
■ Summarization of customer’s utterance
・Example (c = customer, o = operator; green in the slide = new information)
S1: エアコンの電源がつきません。(c) "The air conditioner won't power on."
S2: エアコンですね。コンセントの抜き差しを試してみてください。(o) "An air conditioner, I see. Please try unplugging and re-plugging it."
S3: うまくいきません。(c) "That didn't work."
S4: エアコンを使い始めたのはいつからですか。(o) "Since when have you been using the air conditioner?"
・Citation degree of a new-information word w first appearing in sentence S_i:
C_w = 1 + Σ_{k=1}^{n-1} k·x_k,  where x_k = 1 if w is cited in the k-th following sentence, 0 otherwise
C_エアコン = 1 + (1·1 + 2·0 + 3·1) = 5  (cited again in S2 and S4)
C_電源 = 1 + (1·0 + 2·0 + 3·0) = 1  (never cited again)
・Sentence scores:
Score(S1) = C_エアコン · TFIDF_エアコン + C_電源 · TFIDF_電源
Score(S3) = 0  (S3 contains no new information)
→ The sentence with the highest score, S1, is extracted as the problem report
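The citation-degree computation on this example can be sketched in Python (a minimal illustration, not the authors' code; substring matching over pre-tokenized sentences stands in for real citation detection):

```python
# Citation degree for the customer-side summary: a word's degree starts at 1
# and grows by k each time the word reappears in the k-th following sentence.
# Minimal sketch with substring matching; not the authors' implementation.

def citation_degree(word, sentences, first_idx):
    """first_idx: index of the sentence where `word` first appears."""
    degree = 1
    for k, sent in enumerate(sentences[first_idx + 1:], start=1):
        if word in sent:
            degree += k
    return degree

# S1-S4 from the example above (content words only, for brevity)
sentences = [
    "エアコン 電源 つきません",      # S1 (customer)
    "エアコン コンセント 抜き差し",  # S2 (operator)
    "うまくいきません",              # S3 (customer)
    "エアコン 使い始めた いつ",      # S4 (operator)
]

print(citation_degree("エアコン", sentences, 0))  # 1 + (1*1 + 2*0 + 3*1) = 5
print(citation_degree("電源", sentences, 0))      # 1 + 0 = 1
```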
8. 2. Proposed System
■ Scoring each word j in sentence s_i of the operator's utterances:

W_ij = TFIDF_j + R_i + C_ij + TS_i

- TFIDF_j: TF-IDF of word j. Content words only; the words used for
confirming personal information are stop words: knowledge (3)
- R_i: rule-based score: knowledge (4)
  R_i = 1 if the end of sentence s_i is "か" or "かね", 0 otherwise
- C_ij: total sum of the citation degree of word j if it is
new information: knowledge (5)
- TS_i: score based on text segmentation: knowledge (5)
  TS_i = 1 if s_i is the first sentence of a segment, 0 otherwise
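The additive score above can be sketched as follows (unit weights for the four terms are an assumption; the TF-IDF and citation-degree values are supplied as precomputed inputs):

```python
# Word score for the operator-side summary, combining the four
# knowledge-based terms additively: W_ij = TFIDF_j + R_i + C_ij + TS_i.
# Illustrative sketch, not the authors' implementation.

def word_score(tfidf_j, sentence, is_segment_start, citation_degree):
    r = 1.0 if sentence.endswith(("か", "かね")) else 0.0  # knowledge (4)
    ts = 1.0 if is_segment_start else 0.0                  # knowledge (5)
    return tfidf_j + r + citation_degree + ts

# A segment-opening question sentence: every term contributes.
print(word_score(0.5, "いつ故障しましたか", True, 1.0))  # 0.5 + 1 + 1 + 1 = 3.5
```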
9. 2. Proposed System
■ Text Segmentation of conversational data [1]
・Extension of HDP-LDA: a latent variable l_{c,t} is introduced
to detect topic shifts

[1] Nguyen et al., "SITS: A Hierarchical Nonparametric Model using Speaker
Identity for Topic Segmentation in Multiparty Conversations", ACL 2012.
10. 2. Proposed System
■ Summarization of operator’s utterance
・Extract words using dynamic programming:

maximize  Σ_{i}^{n} Σ_{j=1}^{m_i} w_ij · x_ij
s.t.  constraint on the summary length
      constraint on the number of chunks
      constraint of the syntax tree

- w_ij: the score of word j in sentence i
- x_ij: decision variable ∈ {0, 1}
- m_i: the number of words in sentence i
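The selection step can be approximated as a 0/1 knapsack solved by dynamic programming (a sketch: only the summary-length constraint from the slide is enforced, the chunk and syntax-tree constraints are omitted, and the sample units and scores are invented):

```python
# 0/1 knapsack by dynamic programming: pick units (words or chunks) that
# maximize the total score within a character budget. Sketch only; the
# chunk-count and syntax-tree constraints of the full method are omitted.

def knapsack_summary(units, budget):
    """units: list of (text, score); budget: max total characters."""
    best = {0: (0.0, [])}  # used characters -> (best score, chosen texts)
    for text, score in units:
        cost = len(text)
        for used in sorted(best, reverse=True):  # descending: 0/1 semantics
            total = used + cost
            if total <= budget:
                cand = (best[used][0] + score, best[used][1] + [text])
                if total not in best or cand[0] > best[total][0]:
                    best[total] = cand
    return max(best.values())[1]  # selection with the highest score

units = [
    ("コンセントの抜き差し", 3.0),  # hypothetical scores for illustration
    ("担当の確認", 0.5),
    ("ブレーカーの上げ下げ", 2.5),
]
print(knapsack_summary(units, 20))  # best-scoring units within 20 characters
```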
11. 3. Experiment 1
■ Purpose
・To test the accuracy of summarization of customer’s utterances
■ Method
・Evaluate the concordance rate between the correct data and the system's output
・The correct data were created by a human (21 instances)
■ Comparison method
・Baseline: the score of a sentence is the total sum of the TF-IDF
of the content words in the sentence
■ Result

            Concordance rate
Baseline    0.24 (5/21)
Proposed    0.81 (17/21)

→ The effectiveness of the proposed system is confirmed
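The concordance rate reduces to an exact-match accuracy, which can be sketched as:

```python
# Concordance rate as used in Experiment 1: the fraction of instances where
# the system's extracted sentence exactly matches the human-made correct data.

def concordance_rate(references, outputs):
    matches = sum(1 for ref, out in zip(references, outputs) if ref == out)
    return matches / len(references)

# Toy data: 2 of 3 outputs match their references.
print(concordance_rate(["a", "b", "c"], ["a", "x", "c"]))  # 2/3
```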
12. 3. Experiment 2
■ Purpose
・To test the accuracy of summarization of operator’s utterances
■ Method
・Evaluation index: ROUGE-N
- The ratio of N-grams of the reference summary that also appear in the output
・The reference summaries were created by a human
(15 instances; average length: 178.0 characters)
■ Comparison methods
・Summarization method
- COMPRESSION: extraction of words by dynamic programming (slide 10)
- KNAPSACK: extraction of sentences (knapsack problem)
・Scoring method
- Combinations of the scoring methods explained on slide 8
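The ROUGE-N score as described here can be sketched directly (a recall-style simplification without the clipping and multiset counting of the official ROUGE toolkit):

```python
# ROUGE-N: the ratio of reference N-grams that also appear in the output.

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(reference, output, n):
    ref = ngrams(reference, n)
    out = set(ngrams(output, n))
    if not ref:
        return 0.0
    return sum(1 for g in ref if g in out) / len(ref)

# Toy data (token lists; a real system would tokenize Japanese first):
ref = ["please", "try", "re-plugging", "the", "power", "cord"]
out = ["try", "re-plugging", "the", "cord"]
print(rouge_n(ref, out, 1))  # 4 of 6 reference unigrams appear -> 0.666...
print(rouge_n(ref, out, 2))  # 2 of 5 reference bigrams appear -> 0.4
```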
13. 3. Experiment 2
■ Result (COMPRESSION)

                     ROUGE-1   ROUGE-2
Baseline             0.55      0.37
+S                   0.58      0.41
+R                   0.63      0.48
+C                   0.61      0.43
+TS                  0.57      0.41
+S & R               0.62      0.46
+S & C               0.61      0.45
+S & TS              0.58      0.41
+S & R & C           0.63      0.47
+S & R & TS          0.62      0.46
+S & R & C & TS      0.64      0.49
14. 3. Experiment 2
■ Result (KNAPSACK)

                     ROUGE-1   ROUGE-2
Baseline             0.51      0.35
+S                   0.56      0.40
+R                   0.58      0.43
+C                   0.52      0.33
+TS                  0.51      0.36
+S & R               0.60      0.44
+S & R & TS          0.61      0.46
+S & R & C & TS      0.59      0.42

・Baseline: score all words with TF-IDF
・Stop words (S): score words excluding stop words
・Rule (R): higher score to sentences ending in 「〇〇か」 or 「〇〇かね」
・Citation (C): higher score to new information
・Text Segmentation (TS): higher score to the first sentence of each segment
15. 3. Experiment 2
■ Discussion
・Extraction vs. compression
- Compression: highest ROUGE, but some output sentences carry
little information
- Extraction: lower ROUGE, but every extracted sentence carries
more information
・Scoring methods
- Stop words and Rule: clear improvement in ROUGE
- Citation degree and Text Segmentation: slight improvement
・Which method is best?
- Highest ROUGE: compression with scoring by "S", "R", "C" and "TS"
- Practical use: extraction with scoring by "S" and "R"
16. 4. Conclusion & Future work
■ Call Center Conversation Summarization
・Analysis of conversational data
・Summarization using the above knowledge
・Highest ROUGE: compression with scoring by "Stop words", "Rule",
"Citation degree" and "Text Segmentation"
・Practical use: extraction with scoring by "Stop words" and "Rule"
■ Future work
・Apply this method to other domains
- Another company: a customer may ask multiple questions
→ The method must be extended to handle such complicated data