Nonparametric Bayesian Word Discovery
for Symbol Emergence in Robotics
Tadahiro Taniguchi
College of Information Science & Engineering
Ritsumeikan University
Invited talk @ Workshop on Machine Learning Methods for High-Level Cognitive
Capabilities in Robotics 2016 (ML-HLCR2016), in IROS2016, Daejeon, Korea, 13/10/2016
@tanichu
Contents
1. Introduction
2. Word segmentation and discovery
3. Nonparametric Bayesian double articulation
analyzer (NPB-DAA)
4. Conclusion
Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi,
Tetsuya Ogata, and Hideki Asoh
Symbol Emergence in Robotics: A Survey
Advanced Robotics (2016). DOI: 10.1080/01691864.2016.1164622
Unsupervised machine learning for language
acquisition by a robot
Without any pre-existing knowledge of phonemes and vocabularies (like human infants).
[D. Roy 2002] [N. Iwahashi 2003]
D. K. Roy and A. P. Pentland, “Learning words from sights and sounds: a computational model,” Cogn. Sci., vol. 26, no. 1, pp. 113–146, 2002.
N. Iwahashi, “Language acquisition through a human–robot interface by combining speech, visual, and behavioral information,” vol. 156, pp. 109–121, 2003.
Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata,
and Hideki Asoh, Symbol Emergence in Robotics: A Survey
Advanced Robotics (2016). DOI: 10.1080/01691864.2016.1164622
Symbol emergence system
Word discovery for
symbol emergence in robotics
• To enable a robot to obtain many words
provided by its user through human-robot
interaction, word discovery is a critical task.
[Figure: example words to be discovered: can, apple, you, this, ...]
Word segmentation
in language acquisition
• When parents speak to their children, they rarely use “isolated words,” but use continuous word sequences, i.e. sentences.
• Word segmentation is a primary task of language acquisition.
• The child has to perform word segmentation without pre-existing knowledge of vocabulary because children do not know lists of words before they learn.
THISISANAPPLE
APPLEISSOSWEET
HEYLOOKATTHIS
OHYOUARESOCUTE
?????
Unsupervised word segmentation
• Word segmentation problem
– The task is to segment sentences into words (morphemes).
– Example:
• Thisisanapple (ðɪsɪzənæpl) -> This (ðɪs) is (ɪz) an (ən) apple (æpl)
• WATASHIWATANAKADESU (わたしはたなかです, "I am Tanaka") -> WATASHI (わたし) WA (は) TANAKA (たなか) DESU (です)
– Conventionally, this has required a preexisting language model, i.e., a dictionary.
• Unsupervised word segmentation
– No preexisting dictionaries are used.
– A nonparametric Bayesian framework for word segmentation [Goldwater+ 09]
– An unsupervised word segmentation method based on the nested Pitman–Yor language model (NPYLM) [Mochihashi+ 09]
S. Goldwater, T. L. Griffiths, and M. Johnson, “A Bayesian framework for word segmentation: exploring the effects of context,” Cognition, vol. 112, no. 1, pp. 21–54, 2009.
D. Mochihashi, T. Yamada, and N. Ueda, “Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling,” ACL-IJCNLP 2009, pp. 100–108, 2009.
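For contrast with the unsupervised setting, here is a minimal sketch of the conventional, dictionary-based case: given a known unigram word model, the most probable segmentation can be found by dynamic programming. The toy dictionary, its probabilities, and the function name `segment` are assumptions for illustration, not material from the talk. Unsupervised methods such as [Goldwater+ 09] and NPYLM have to infer the dictionary itself.

```python
import math

def segment(text, word_prob, max_len=8):
    """Most probable segmentation of `text` under a known unigram word model."""
    n = len(text)
    best = [-math.inf] * (n + 1)   # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_prob and best[start] > -math.inf:
                score = best[start] + math.log(word_prob[word])
                if score > best[end]:
                    best[end], back[end] = score, start
    if best[n] == -math.inf:
        return None                # the dictionary cannot cover the text
    words, end = [], n
    while end > 0:                 # follow back-pointers from the end
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]

# Hypothetical dictionary with made-up probabilities.
TOY_DICT = {"this": 0.2, "is": 0.2, "an": 0.1, "apple": 0.1,
            "a": 0.1, "i": 0.05, "san": 0.05}
print(segment("thisisanapple", TOY_DICT))
```

Without the dictionary the same search has nothing to score, which is why the unsupervised formulation needs a prior over possible words.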
NPYLM [Mochihashi ‘09]
(Nested Pitman-Yor Language Model)
• Mochihashi et al. proposed NPYLM for unsupervised word segmentation.
• NPYLM has a word n-gram model and a letter n-gram model. Each is a hierarchical Pitman-Yor language model.
• It is a Bayesian nonparametric model.
• Inference uses an efficient blocked Gibbs sampler.
[Figure: NPYLM learning alternates between word segmentation and updating the language model (vocabulary). From Mochihashi’s presentation slide: http://chasen.org/~daiti-m/paper/jfssa2009segment.pdf]
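The Pitman-Yor process behind NPYLM is often described through a Chinese restaurant process with a discount and a strength parameter. The sketch below simulates only that seating scheme. The parameter values and function names are assumptions for illustration, and it is not the NPYLM sampler itself, where a new table draws its word from a lower-order model.

```python
import random

def pitman_yor_crp(n_customers, discount, strength, rng):
    """Simulate table sizes under a Pitman-Yor Chinese restaurant process."""
    tables = []  # tables[k] = number of customers seated at table k
    for _ in range(n_customers):
        # Unnormalised weights: each existing table, then one new table.
        weights = [count - discount for count in tables]
        weights.append(strength + discount * len(tables))
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)       # open a new table
        else:
            tables[k] += 1         # join an existing table
    return tables

def new_table_prob(tables, discount, strength):
    """Probability that the next customer opens a new table."""
    return (strength + discount * len(tables)) / (strength + sum(tables))

tables = pitman_yor_crp(200, discount=0.5, strength=1.0, rng=random.Random(0))
print(len(tables), "tables for", sum(tables), "customers")
```

A larger discount makes new tables more likely as the number of tables grows, which is the property that suits heavy-tailed word frequencies.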
Analysis
Problems with Unsupervised Word
Segmentation in Word Discovery Tasks
• NPYLM presumes that the target document (sentences) is transcribed without errors.
– If the input contains phoneme recognition errors, its performance degrades dramatically.
– How to mitigate the effect of phoneme recognition errors in word discovery is an important issue in real-world language acquisition.
[Saffran 1996]
Problem and approach
[Figure: a single nonparametric Bayesian probabilistic generative model, trained by unsupervised learning on continuous speech signals, integrates an acoustic model (single Gaussian emission distribution with duration distribution) and a language model (word dictionary and bigram language model).]
1. Tadahiro Taniguchi, Shogo Nagasaka, Ryo Nakashima
Nonparametric Bayesian Double Articulation Analyzer for Direct Language
Acquisition from Continuous Speech Signals
IEEE Transactions on Cognitive and Developmental Systems (2016)
2. Tadahiro Taniguchi, Ryo Nakashima, Hailong Liu and Shogo Nagasaka
Double Articulation Analyzer with Deep Sparse Autoencoder for Unsupervised
Word Discovery from Speech Signals
Advanced Robotics, Vol. 30 (11-12), pp. 770-783 (2016)
Double articulation
structure in semiotic data
• Semiotic time-series data often has double articulation
– A speech signal is a continuous, high-dimensional time series.
– A spoken sentence is considered a sequence of phonemes.
– Phonemes are grouped into words, and people give them meanings.
Word (semantic, meaningful): How much is this? = [h a u] [m ʌ́ tʃ] [i z] [ð í s]
Phoneme (meaningless): h a u m ʌ́ tʃ i z ð í s
Speech signal: unsegmented
Double Articulation Structure in Human Behavior
Working hypothesis: Does the human brain have a special capability to analyze double articulation structures embedded in time-series data?
Domains: speech, motion, driving
Example: W H A T I S T H I S T H I S I S A P E N -> [WHAT] [IS] [THIS] [THIS] [IS] [A] [PEN]
Double Articulation Analyzer (DAA) and its
application to non-speech time series data
Double Articulation Analyzer = sticky HDP-HMM + NPYLM
• sticky HDP-HMM [Fox ’07] = nonparametric Bayesian HMM
• NPYLM [Mochihashi ’09] = nonparametric Bayesian language model for unsupervised morphological analysis
Tadahiro Taniguchi, Shogo Nagasaka, Double Articulation Analyzer for Unsegmented Human Motion using
Pitman-Yor Language model and Infinite Hidden Markov Model, 2011 IEEE/SICE SII.(2011)
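One way to read "sticky HDP-HMM + NPYLM" is as a two-stage pipeline: the HMM assigns a hidden-state label to every frame, runs of identical labels are collapsed into a letter sequence, and that sequence is then chunked into words by the language model. The sketch below shows only the collapsing step, assuming this reading; the label values and the function name `collapse` are made up for illustration.

```python
from itertools import groupby

def collapse(frame_labels):
    """Run-length collapse: one (letter, duration) pair per run of equal labels."""
    return [(label, len(list(run))) for label, run in groupby(frame_labels)]

# Hypothetical frame-wise hidden-state labels from an HMM.
frames = [3, 3, 3, 7, 7, 1, 1, 1, 1, 3, 3]
runs = collapse(frames)
letters = [label for label, _ in runs]   # input to the word segmentation stage
print(runs)
print(letters)
```

Because the two stages run separately in this reading, a labeling error in the first stage is passed on unchanged to word segmentation.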
Human motion
• Imitation learning, motion segmentation [Taniguchi ’11]
Driving behavior
• Extracting driving chunks [Nagasaka ’12]
• Detecting intentional changing points [Takenaka ’12]
• Prediction [Taniguchi ’12]
• Video summarization [Takenaka ’12]
• Topic modeling [Bando ’13]
Simultaneous acquisition of
phoneme and language models
• [Nakamura+ 2014] used a pre-existing phoneme model and did not make a robot learn a phoneme model.
• There are still few studies about unsupervised simultaneous learning of phoneme and language models from speech signals [Kamper+ 15, Lee+ 15].
• Does the analysis of double articulation structure embedded in speech signals enable a robot to obtain phoneme and language models simultaneously?
[Figure: cues for word discovery: prosodic cues, distributional cues, co-occurrence cues]
H. Kamper, A. Jansen, and S. Goldwater, “Fully Unsupervised Small-Vocabulary Speech Recognition Using a Segmental Bayesian Model,” in INTERSPEECH 2015, 2015.
C.-y. Lee, T. J. O’Donnell, and J. Glass, “Unsupervised Lexicon Discovery from Acoustic Input,” Transactions of the Association for Computational Linguistics, vol. 3, pp. 389–403, 2015.
Making full use of the distributional cues directly from speech signals
Hierarchical Dirichlet process hidden language model
(HDP-HLM) [Taniguchi+ 16]
A probabilistic generative model for time-series data having double articulation structure
[Figure: graphical model of HDP-HLM.
– Language model (word bigram model with letter bigram model): generates the word sequence, i.e. the latent words z_1, ..., z_S (super-state sequence), where each word w_i (i = 1, ..., ∞) is a sequence of latent letters l_s1, ..., l_sL (phoneme sequence).
– Each latent letter l_sk has a duration D_sk.
– Acoustic model (phoneme model), j = 1, ..., ∞: generates the observations y_1, ..., y_T from the hidden states x_1, ..., x_T.
– Other symbols in the figure: γ^LM, α^LM, β^LM, π^LM_i (language model); γ^WM, α^WM, β^WM, π^WM_j (word model); ω_j, θ_j, G, H (acoustic model).]
Tadahiro Taniguchi, Shogo Nagasaka, Ryo Nakashima, Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals, IEEE Transactions on Cognitive and Developmental Systems (2016)
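The layered generative story (words from a bigram language model, letters within each word, a duration and a Gaussian emission for each letter) can be illustrated with a small sampler. This is only a sketch under strong simplifying assumptions: the dictionary, bigram table, emission parameters and uniform duration distribution are fixed toy values, whereas HDP-HLM places nonparametric priors on these components and infers them from data.

```python
import random

# Toy, finite stand-ins for the nonparametric components (all values are assumptions).
WORDS = [["a", "o"], ["i", "e"], ["u", "o"]]                   # word = sequence of letters
BIGRAM = [[0.1, 0.6, 0.3], [0.5, 0.1, 0.4], [0.4, 0.5, 0.1]]   # p(next word | word)
LETTERS = {"a": (0.0, 0.3, 4), "i": (1.0, 0.3, 4), "u": (2.0, 0.3, 4),
           "e": (3.0, 0.3, 4), "o": (4.0, 0.3, 4)}             # (mean, std, max duration)

def generate(n_words, rng):
    """Sample observations, frame-wise letter labels, and the word sequence."""
    obs, frame_letters, word_seq = [], [], []
    word = rng.randrange(len(WORDS))                 # first word: uniform (assumption)
    for s in range(n_words):
        if s > 0:                                    # word bigram transition
            word = rng.choices(range(len(WORDS)), weights=BIGRAM[word])[0]
        word_seq.append(word)
        for letter in WORDS[word]:
            mean, std, max_dur = LETTERS[letter]
            duration = rng.randint(1, max_dur)       # stand-in duration distribution
            for _ in range(duration):
                obs.append(rng.gauss(mean, std))     # single Gaussian emission
                frame_letters.append(letter)
    return obs, frame_letters, word_seq

obs, frame_letters, word_seq = generate(3, random.Random(0))
print(word_seq, len(obs))
```

Inference runs this story in reverse: only `obs` is observed, and the words, letters, durations and all parameters have to be estimated.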
HDP-HLM as an extension of HDP-HSMM
• HDP-HLM can be regarded as an extension of the HDP-HSMM (hierarchical Dirichlet process hidden semi-Markov model) [Johnson ’13].
• This property helps us derive an efficient inference procedure.
Matthew J. Johnson and Alan S. Willsky. Bayesian nonparametric hidden semi-Markov models. The Journal of Machine Learning Research, Vol. 14, No. 1, pp. 673–701, 2013.
Inference (Blocked Gibbs sampler)
• A blocked Gibbs sampler can be derived by extending HDP-HMM’s backward filtering-forward sampling algorithm.
• Each iteration consists of backward filtering, forward sampling, and a parameter update.
• The computation is very heavy.
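Backward filtering-forward sampling is easiest to see on a plain finite HMM. The sketch below draws one state sequence from the posterior given per-frame observation likelihoods. It is not the NPB-DAA sampler, which works on word-level super-states with durations; the function name and all inputs are assumptions for illustration.

```python
import random

def sample_states(obs_lik, trans, init, rng):
    """Draw one state sequence from p(x_1..T | y_1..T) for a finite HMM.

    obs_lik[t][j] = p(y_t | x_t = j); trans[i][j] = p(x_{t+1} = j | x_t = i).
    """
    T, K = len(obs_lik), len(init)
    # Backward filtering: beta[t][i] is proportional to p(y_{t+1..T} | x_t = i).
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(trans[i][j] * obs_lik[t + 1][j] * beta[t + 1][j]
                             for j in range(K))
        scale = sum(beta[t])                 # rescale to avoid underflow
        beta[t] = [b / scale for b in beta[t]]
    # Forward sampling: x_t ~ p(x_t | x_{t-1}, y_{t..T}).
    states = []
    for t in range(T):
        prior = init if t == 0 else trans[states[-1]]
        weights = [prior[j] * obs_lik[t][j] * beta[t][j] for j in range(K)]
        states.append(rng.choices(range(K), weights=weights)[0])
    return states

# Toy example: likelihoods that pin each frame to one state.
lik = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(sample_states(lik, uniform, [0.5, 0.5], random.Random(0)))
```

The backward pass costs O(T K^2) here. With super-states and explicit durations the same recursion has to sum over segment lengths as well, which is one source of the heavy computation.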
Evaluation experiment using
artificial two- or three-word sentences built from
the five Japanese vowels
• Five artificial words {aioi, aue, ao, ie, uo} were prepared by connecting the five Japanese vowels.
• 30 sentences (25 two-word and 5 three-word sentences) were prepared, and each sentence was recorded twice by four Japanese speakers.
• Features: MFCC (frame size = 25 ms, shift = 10 ms, frame rate = 100 Hz)
Example sentence: aioi ao
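The frame settings and dataset size can be checked with a little arithmetic. The sketch below counts full analysis frames for a given duration, assuming no padding at the edges (the talk does not say how edges were handled), and counts utterances from the numbers on the slide. The function name is an assumption.

```python
def num_frames(duration_ms, frame_ms=25, shift_ms=10):
    """Number of full analysis frames that fit in a signal of duration_ms."""
    if duration_ms < frame_ms:
        return 0
    return 1 + (duration_ms - frame_ms) // shift_ms

utterances_per_speaker = 30 * 2                  # 30 sentences, each recorded twice
total_utterances = utterances_per_speaker * 4    # four speakers

print(num_frames(1000))        # frames in one second of speech
print(utterances_per_speaker, total_utterances)
```

A 10 ms shift gives roughly 100 frames per second, which matches the stated 100 Hz frame rate.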
* An HDP-HLM is trained separately for each speaker.
Sample of results
• Compared with the conventional DAA, NPB-DAA discovered latent words accurately.
• The inference procedure gradually estimated the boundaries of words and phonemes.
Example: ao-ie-ao
Nonparametric Bayesian Double Articulation Analyzer
(NPB-DAA) based on HDP-HLM [Taniguchi ’16]
• The method could estimate language and acoustic/phoneme models simultaneously.
• The methods were compared using the adjusted Rand index (ARI), treating evaluation as a frame-based clustering task.
• It even outperformed a method based on an off-the-shelf speech recognition system in a word discovery task.
[Table: compared conditions: unsupervised word discovery; unsupervised word discovery with trained phoneme recognizer; speech recognition with off-the-shelf ASR system]
Tadahiro Taniguchi, Shogo Nagasaka, Ryo Nakashima, Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals, IEEE Transactions on Cognitive and Developmental Systems (2016)
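The adjusted Rand index compares two labelings of the same frames while ignoring how the clusters are named, which suits unsupervised word discovery because discovered word IDs are arbitrary. Below is a minimal pure-Python sketch of the standard pair-counting formula; the example labels are made up and are not results from the talk.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    total = comb(len(labels_true), 2)
    if total == 0:
        return 1.0                       # fewer than two items: trivially identical
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0                       # degenerate case, e.g. one cluster on both sides
    return (sum_cells - expected) / (max_index - expected)

# Frame-wise word labels: same clustering, different cluster names.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A value of 1 means the two clusterings agree exactly, and values near 0 mean agreement no better than chance.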
Double Articulation Analyzer with Deep Sparse Autoencoder
for Unsupervised Word Discovery from Speech Signals
• A deep learning-based feature extraction method, the deep sparse autoencoder (DSAE), was employed to improve the performance of NPB-DAA.
• DSAE can be trained in an unsupervised manner. Therefore, the total learning system is still an unsupervised learning system.
Experimental results
The NPB-DAA with DSAE even outperformed an MFCC-based off-the-shelf speech recognition system.
Tadahiro Taniguchi, Ryo Nakashima, Hailong Liu and Shogo Nagasaka, Double Articulation Analyzer with Deep Sparse Autoencoder for Unsupervised Word Discovery from Speech Signals, Advanced Robotics (2016)
Contents
1. Introduction
• Symbol emergence in robotics
2. Word discovery with multimodal categorization
3. Direct word discovery from speech signals
4. Conclusion
Conclusion
• Symbol Emergence in Robotics (SER) was introduced. SER is a synthetic approach toward developmental mental systems involving language acquisition and symbol emergence systems.
• An unsupervised machine learning method for word discovery by robots, NPB-DAA, was introduced. It is based on a nonparametric Bayesian approach.
Current problems and future challenges
• Current Problems
– Computational cost
• Analyzing only 60 sentences requires more than one hour.
– Speaker dependency
• Unsupervised learning from multi-speaker speech signals is currently difficult because acoustic features differ from speaker to speaker.
• Future Challenges
– Efficient and Fast Algorithm
• Inventing more efficient inference methods and using more computational resources are our future directions.
– Unsupervised Speaker Adaptation
• Developing an unsupervised speaker adaptation method for language acquisition from multi-speaker speech signals.
– Mutual learning of words, phonemes and objects
• Phoneme acquisition performance is also expected to improve when phonemes are learned simultaneously with objects.
Future challenge
Word discovery for symbol emergence in robotics
[Figure: roadmap (2005-2015, 2016-) linking NPB-DAA (phonemes and words) with multimodal object categorization, SLAM, motion primitives, affordance learning, and syntax learning through probabilistic information.]
Information
email: taniguchi@ci.ritsumei.ac.jp
Special Thanks
• Ritsumeikan University
• R. Nakashima, S. Nagasaka, A.
Taniguchi, K. Hayashi
• DENSO co.
• T. Bando, K. Takenaka, K. Hitomi
• Okayama Pref. Univ.
• N. Iwahashi
Visit http://www.tanichu.com/
Facebook: please search me
Twitter: @tanichu
Acknowledgement
[Github] NPB-DAA
https://github.com/EmergentSystemLabStudent/NPB_DAA

More Related Content

What's hot

Statistical Semantic入門 ~分布仮説からword2vecまで~
Statistical Semantic入門 ~分布仮説からword2vecまで~Statistical Semantic入門 ~分布仮説からword2vecまで~
Statistical Semantic入門 ~分布仮説からword2vecまで~
Yuya Unno
 
Phrase Structure Identification and Classification of Sentences using Deep Le...
Phrase Structure Identification and Classification of Sentences using Deep Le...Phrase Structure Identification and Classification of Sentences using Deep Le...
Phrase Structure Identification and Classification of Sentences using Deep Le...
ijtsrd
 
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
kevig
 

What's hot (20)

Statistical Semantic入門 ~分布仮説からword2vecまで~
Statistical Semantic入門 ~分布仮説からword2vecまで~Statistical Semantic入門 ~分布仮説からword2vecまで~
Statistical Semantic入門 ~分布仮説からword2vecまで~
 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH
 
Phrase Structure Identification and Classification of Sentences using Deep Le...
Phrase Structure Identification and Classification of Sentences using Deep Le...Phrase Structure Identification and Classification of Sentences using Deep Le...
Phrase Structure Identification and Classification of Sentences using Deep Le...
 
Frontiers of Natural Language Processing
Frontiers of Natural Language ProcessingFrontiers of Natural Language Processing
Frontiers of Natural Language Processing
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Deep learning for nlp
Deep learning for nlpDeep learning for nlp
Deep learning for nlp
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...
 
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
A N H YBRID A PPROACH TO W ORD S ENSE D ISAMBIGUATION W ITH A ND W ITH...
A N H YBRID  A PPROACH TO  W ORD  S ENSE  D ISAMBIGUATION  W ITH  A ND  W ITH...A N H YBRID  A PPROACH TO  W ORD  S ENSE  D ISAMBIGUATION  W ITH  A ND  W ITH...
A N H YBRID A PPROACH TO W ORD S ENSE D ISAMBIGUATION W ITH A ND W ITH...
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Lecture 2: Computational Semantics
Lecture 2: Computational SemanticsLecture 2: Computational Semantics
Lecture 2: Computational Semantics
 
Nlp presentation
Nlp presentationNlp presentation
Nlp presentation
 
NL Context Understanding 23(6)
NL Context Understanding 23(6)NL Context Understanding 23(6)
NL Context Understanding 23(6)
 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories
 
NLP and its Use in Education
NLP and its Use in EducationNLP and its Use in Education
NLP and its Use in Education
 

Viewers also liked

Symbol emergence in robotics @ Shonan meeting 2013/11/13
Symbol emergence in robotics @ Shonan meeting 2013/11/13 Symbol emergence in robotics @ Shonan meeting 2013/11/13
Symbol emergence in robotics @ Shonan meeting 2013/11/13
Tadahiro Taniguchi
 

Viewers also liked (20)

Sci13 招待講演
Sci13 招待講演Sci13 招待講演
Sci13 招待講演
 
ビブリオバトルにおける コミュニティ形成のダイナミクス
ビブリオバトルにおける コミュニティ形成のダイナミクスビブリオバトルにおける コミュニティ形成のダイナミクス
ビブリオバトルにおける コミュニティ形成のダイナミクス
 
人工知能概論 4
人工知能概論 4人工知能概論 4
人工知能概論 4
 
人工知能概論 15
人工知能概論 15人工知能概論 15
人工知能概論 15
 
Os 12 記号創発ロボティクス / OS趣旨説明@JSAI2015
Os 12 記号創発ロボティクス / OS趣旨説明@JSAI2015 Os 12 記号創発ロボティクス / OS趣旨説明@JSAI2015
Os 12 記号創発ロボティクス / OS趣旨説明@JSAI2015
 
Stochastic Variational Inference
Stochastic Variational InferenceStochastic Variational Inference
Stochastic Variational Inference
 
記号創発ロボティクスの狙い
記号創発ロボティクスの狙い 記号創発ロボティクスの狙い
記号創発ロボティクスの狙い
 
Symbol emergence in robotics @ Shonan meeting 2013/11/13
Symbol emergence in robotics @ Shonan meeting 2013/11/13 Symbol emergence in robotics @ Shonan meeting 2013/11/13
Symbol emergence in robotics @ Shonan meeting 2013/11/13
 
人工知能概論 14
人工知能概論 14人工知能概論 14
人工知能概論 14
 
人工知能概論 8
人工知能概論 8人工知能概論 8
人工知能概論 8
 
人工知能概論 6
人工知能概論 6人工知能概論 6
人工知能概論 6
 
人工知能概論 5
人工知能概論 5人工知能概論 5
人工知能概論 5
 
人工知能概論 7
人工知能概論 7人工知能概論 7
人工知能概論 7
 
コミュニケーション場のメカニズムデザイン 自律性を活かす記号過程のための制度設計
コミュニケーション場のメカニズムデザイン 自律性を活かす記号過程のための制度設計コミュニケーション場のメカニズムデザイン 自律性を活かす記号過程のための制度設計
コミュニケーション場のメカニズムデザイン 自律性を活かす記号過程のための制度設計
 
人工知能概論 10
人工知能概論 10人工知能概論 10
人工知能概論 10
 
イラストで学ぶ人工知能概論 9
イラストで学ぶ人工知能概論 9イラストで学ぶ人工知能概論 9
イラストで学ぶ人工知能概論 9
 
人工知能概論 3
人工知能概論 3人工知能概論 3
人工知能概論 3
 
人工知能概論 13
人工知能概論 13人工知能概論 13
人工知能概論 13
 
人工知能概論 2
人工知能概論 2人工知能概論 2
人工知能概論 2
 
Composing graphical models with neural networks for structured representation...
Composing graphical models with neural networks for structured representation...Composing graphical models with neural networks for structured representation...
Composing graphical models with neural networks for structured representation...
 

Similar to Nonparametric Bayesian Word Discovery for Symbol Emergence in Robotics

LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptxLiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
VishnuRajuV
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
Theodore J. LaGrow
 

Similar to Nonparametric Bayesian Word Discovery for Symbol Emergence in Robotics (20)

Wei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelWei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI Panel
 
D3 dhanalakshmi
D3 dhanalakshmiD3 dhanalakshmi
D3 dhanalakshmi
 
Pos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil TextsPos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil Texts
 
dialogue act modeling for automatic tagging and recognition
 dialogue act modeling for automatic tagging and recognition dialogue act modeling for automatic tagging and recognition
dialogue act modeling for automatic tagging and recognition
 
Jarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language ProcessingJarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language Processing
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
Speech-Recognition.pptx
Speech-Recognition.pptxSpeech-Recognition.pptx
Speech-Recognition.pptx
 
Word Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented LanguagesWord Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented Languages
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
Nlp
NlpNlp
Nlp
 
Nlp (1)
Nlp (1)Nlp (1)
Nlp (1)
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing
 
Fmri of bilingual brain atl reveals language independent representations
Fmri of bilingual brain atl reveals language independent representations Fmri of bilingual brain atl reveals language independent representations
Fmri of bilingual brain atl reveals language independent representations
 
Intro to Auto Speech Recognition -- How ML Learns Speech-to-Text
Intro to Auto Speech Recognition -- How ML Learns Speech-to-TextIntro to Auto Speech Recognition -- How ML Learns Speech-to-Text
Intro to Auto Speech Recognition -- How ML Learns Speech-to-Text
 
10
1010
10
 
Cognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithmsCognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithms
 
LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptxLiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
LiDeng-BerlinOct2015-ASR-GenDisc-4by3.pptx
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
 
Hidden markov model based part of speech tagger for sinhala language
Hidden markov model based part of speech tagger for sinhala languageHidden markov model based part of speech tagger for sinhala language
Hidden markov model based part of speech tagger for sinhala language
 

More from Tadahiro Taniguchi

Designing wisdom through the web
Designing wisdom through the webDesigning wisdom through the web
Designing wisdom through the web
Tadahiro Taniguchi
 
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
Tadahiro Taniguchi
 

More from Tadahiro Taniguchi (11)

「知」の循環と拡張を加速する対話空間のメカニズムデザイン(JST未来社会創造事業)
「知」の循環と拡張を加速する対話空間のメカニズムデザイン(JST未来社会創造事業)「知」の循環と拡張を加速する対話空間のメカニズムデザイン(JST未来社会創造事業)
「知」の循環と拡張を加速する対話空間のメカニズムデザイン(JST未来社会創造事業)
 
人工知能概論 12
人工知能概論 12人工知能概論 12
人工知能概論 12
 
人工知能概論 11
人工知能概論 11人工知能概論 11
人工知能概論 11
 
人工知能概論 1
人工知能概論 1人工知能概論 1
人工知能概論 1
 
電子情報通信学会 2012年総合大会 電力問題へのさまざまなアプローチ「人工知能的アプローチ」 講演資料
電子情報通信学会 2012年総合大会 電力問題へのさまざまなアプローチ「人工知能的アプローチ」 講演資料電子情報通信学会 2012年総合大会 電力問題へのさまざまなアプローチ「人工知能的アプローチ」 講演資料
電子情報通信学会 2012年総合大会 電力問題へのさまざまなアプローチ「人工知能的アプローチ」 講演資料
 
2013年度 創発システム研究室 3回生配属ガイダンス資料
2013年度 創発システム研究室 3回生配属ガイダンス資料2013年度 創発システム研究室 3回生配属ガイダンス資料
2013年度 創発システム研究室 3回生配属ガイダンス資料
 
ビブリオバトル2013 普及四年目のアレグレット
ビブリオバトル2013 普及四年目のアレグレットビブリオバトル2013 普及四年目のアレグレット
ビブリオバトル2013 普及四年目のアレグレット
 
「ビブリオバトルのすすめかた」@教員向け言語能力向上研修会(書評合戦)
「ビブリオバトルのすすめかた」@教員向け言語能力向上研修会(書評合戦)「ビブリオバトルのすすめかた」@教員向け言語能力向上研修会(書評合戦)
「ビブリオバトルのすすめかた」@教員向け言語能力向上研修会(書評合戦)
 
Designing wisdom through the web
Designing wisdom through the webDesigning wisdom through the web
Designing wisdom through the web
 
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
記号を用いたコミュニケーションを実現するために何が必要か?― 記号創発ロボティクスの 視点から ―
 
AML-dynamics ライスボールセミナー
AML-dynamics ライスボールセミナーAML-dynamics ライスボールセミナー
AML-dynamics ライスボールセミナー
 

Recently uploaded

VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Recently uploaded (20)

data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
UNIT-III FMM. DIMENSIONAL ANALYSIS

Nonparametric Bayesian Word Discovery for Symbol Emergence in Robotics

  • 1. Nonparametric Bayesian Word Discovery for Symbol Emergence in Robotics. Tadahiro Taniguchi, College of Information Science & Engineering, Ritsumeikan University. Invited talk @ Workshop on Machine Learning Methods for High-Level Cognitive Capabilities in Robotics 2016 (ML-HLCR2016), in IROS2016, Daejeon, Korea, 13/10/2016. @tanichu
  • 2. Contents 1. Introduction 2. Word segmentation and discovery 3. Nonparametric Bayesian double articulation analyzer (NPB-DAA) 4. Conclusion. Reference: Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, and Hideki Asoh, “Symbol Emergence in Robotics: A Survey,” Advanced Robotics (2016). DOI: 10.1080/01691864.2016.1164622
  • 3. Unsupervised machine learning for language acquisition by a robot, without any pre-existing knowledge of phonemes and vocabularies (like human infants) [D. Roy 2002] [N. Iwahashi 2003]. D. K. Roy and A. P. Pentland, “Learning words from sights and sounds: a computational model,” Cogn. Sci., vol. 26, no. 1, pp. 113–146, 2002. N. Iwahashi, “Language acquisition through a human-robot interface by combining speech, visual, and behavioral information,” vol. 156, pp. 109–121, 2003.
  • 4. Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, and Hideki Asoh, “Symbol Emergence in Robotics: A Survey,” Advanced Robotics (2016). DOI: 10.1080/01691864.2016.1164622
  • 6. Word discovery for symbol emergence in robotics • To enable a robot to obtain the many words provided by its user through human-robot interaction, word discovery is a critical task. [Figure: example words such as “can,” “apple,” “you,” “this”]
  • 7. Contents 1. Introduction 2. Word segmentation and discovery 3. Nonparametric Bayesian double articulation analyzer (NPB-DAA) 4. Conclusion @tanichu
  • 8. Word segmentation in language acquisition • When parents speak to their children, they rarely use isolated words; they use continuous word sequences, i.e., sentences. • Word segmentation is a primary task of language acquisition. • Children must perform word segmentation without pre-existing knowledge of vocabulary, because they do not have a list of words before they learn. Example unsegmented input: THISISANAPPLE, APPLEISSOSWEET, HEYLOOKATTHIS, OHYOUARESOCUTE.
  • 9. Unsupervised word segmentation • Word segmentation problem – Example: Thisisanapple (ðɪsɪzənæpl) -> This (ðɪs) is (ɪz) an (ən) apple (æpl); WATASHIWATANAKADESU (わたしはたなかです) -> WATASHI (わたし) WA (は) TANAKA (たなか) DESU (です) – The task is to segment sentences into words (morphemes). – Conventionally, this required a preexisting language model, i.e., a dictionary. • Unsupervised word segmentation – No preexisting dictionaries are used. – A nonparametric Bayesian framework for word segmentation [Goldwater+ 09]. – An unsupervised word segmentation method based on the nested Pitman–Yor language model (NPYLM) [Mochihashi+ 09]. References: S. Goldwater, T. L. Griffiths, and M. Johnson, “A Bayesian framework for word segmentation: exploring the effects of context,” Cognition, vol. 112, no. 1, pp. 21–54, 2009. Daichi Mochihashi, Takeshi Yamada, and Naonori Ueda, “Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling,” ACL-IJCNLP 2009, pp. 100–108, 2009.
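To make concrete why conventional segmentation depends on a dictionary, here is a minimal sketch of dictionary-based segmentation by dynamic programming (a unigram Viterbi search). The lexicon, its probabilities, and the function name `segment` are illustrative assumptions, not taken from the talk; the unsupervised methods cited above exist precisely to remove the need for such a lexicon.

```python
import math

# Hypothetical toy lexicon with made-up unigram probabilities (assumption).
LEXICON = {
    "this": 0.2, "is": 0.2, "an": 0.1, "apple": 0.1,
    "watashi": 0.1, "wa": 0.1, "tanaka": 0.1, "desu": 0.1,
}

def segment(text, lexicon):
    """Return the most probable split of `text` into lexicon words, or None."""
    n = len(text)
    max_len = max(len(w) for w in lexicon)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in lexicon and best[start] > -math.inf:
                score = best[start] + math.log(lexicon[word])
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        return None  # some part of the input is not covered by the lexicon
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]

print(segment("thisisanapple", LEXICON))
print(segment("watashiwatanakadesu", LEXICON))
print(segment("thisisapen", LEXICON))  # fails: "a" and "pen" are unknown words
```

The last call returns `None` because the toy lexicon lacks the needed words, which is the situation an infant or a robot without a preexisting vocabulary is in for every utterance.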
  • 10. NPYLM (Nested Pitman-Yor Language Model) [Mochihashi ’09] • Mochihashi et al. proposed NPYLM for unsupervised word segmentation. • NPYLM has a word n-gram model and a letter n-gram model, each of which is a hierarchical Pitman-Yor language model. • It is a Bayesian nonparametric model. • It is trained with an efficient blocked Gibbs sampler. [Figure: loop between word segmentation and updating the language model (vocabulary)] Daichi Mochihashi, Takeshi Yamada, and Naonori Ueda, “Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling,” ACL-IJCNLP 2009, pp. 100–108, 2009.
  • 11. Analysis [from Mochihashi’s presentation slide]: http://chasen.org/~daiti-m/paper/jfssa2009segment.pdf
  • 12. Problems with Unsupervised Word Segmentation in Word Discovery Tasks • NPYLM presumes that the target document (sentences) is transcribed without errors. – If there are phoneme recognition errors, its performance degrades dramatically. – Mitigating the effect of phoneme recognition errors in word discovery is an important issue in real-world language acquisition. [Saffran 1996]
  • 13. Problem and approach: unsupervised learning from continuous speech signals with a single nonparametric Bayesian probabilistic generative model that combines an acoustic model (single Gaussian emission distribution with duration distribution) and a language model (word dictionary and bigram language model).
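As background for the Pitman-Yor language models mentioned above, the following is a small sketch of the Chinese restaurant process view of a single Pitman-Yor process. It is not NPYLM itself (no n-gram hierarchy and no nesting); the function name and the parameter values are assumptions chosen for illustration, and the sketch assumes `0 <= discount < 1` and `strength > 0`.

```python
import random

def pitman_yor_crp(n_customers, discount, strength, rng):
    """Seat customers one at a time and return the list of table sizes.

    An existing table of size n_k is chosen with weight (n_k - discount);
    a new table is opened with weight (strength + discount * K),
    where K is the current number of tables.
    """
    tables = []
    for _ in range(n_customers):
        weights = [size - discount for size in tables]
        weights.append(strength + discount * len(tables))
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)   # open a new table
        else:
            tables[k] += 1     # join an existing table
    return tables

sizes = pitman_yor_crp(1000, discount=0.5, strength=1.0, rng=random.Random(0))
print(len(sizes), sorted(sizes, reverse=True)[:5])
```

A positive discount tends to produce many small tables alongside a few large ones, which is the power-law behavior that makes Pitman-Yor processes a natural prior for word frequencies.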
  • 14. Contents 1. Introduction 2. Word segmentation and discovery 3. Nonparametric Bayesian double articulation analyzer (NPB-DAA) 4. Conclusion. References: 1. Tadahiro Taniguchi, Shogo Nagasaka, and Ryo Nakashima, “Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals,” IEEE Transactions on Cognitive and Developmental Systems (2016). 2. Tadahiro Taniguchi, Ryo Nakashima, Hailong Liu, and Shogo Nagasaka, “Double Articulation Analyzer with Deep Sparse Autoencoder for Unsupervised Word Discovery from Speech Signals,” Advanced Robotics, Vol. 30 (11-12), pp. 770-783 (2016).
  • 15. Double articulation structure in semiotic data • Semiotic time-series data often has double articulation. – A speech signal is a continuous, high-dimensional time series. – A spoken sentence is considered a sequence of phonemes. – Phonemes are grouped into words, and people give them meanings. Example: “How much is this?” Words (semantic, meaningful): [h a u] [m ʌ́ tʃ] [i z] [ð í s]. Phonemes (meaningless): h a u m ʌ́ tʃ i z ð í s. Speech signal: unsegmented. Does the human brain have a special capability to analyze double articulation structures embedded in time-series data?
  • 16. Double Articulation Structure in Human Behavior (working hypothesis). Example: the letter sequence W H A T I S T H I S T H I S I S A P E N is chunked into [WHAT] [IS] [THIS] [THIS] [IS] [A] [PEN]. Domains shown: speech, motion, driving.
  • 17. Double Articulation Analyzer (DAA) and its application to non-speech time-series data. Double Articulation Analyzer = sticky HDP-HMM + NPYLM • sticky HDP-HMM [Fox ’07] = nonparametric Bayesian HMM • NPYLM [Mochihashi ’09] = nonparametric Bayesian language model for unsupervised morphological analysis. Tadahiro Taniguchi and Shogo Nagasaka, “Double Articulation Analyzer for Unsegmented Human Motion using Pitman-Yor Language model and Infinite Hidden Markov Model,” 2011 IEEE/SICE SII (2011). Applications to human motion and driving behavior: imitation learning; motion segmentation [Taniguchi ’11]; extracting driving chunks [Nagasaka ’12]; detecting intentional changing points [Takenaka ’12]; prediction [Taniguchi ’12]; video summarization [Takenaka ’12]; topic modeling [Bando ’13].
  • 18. Simultaneous acquisition of phoneme and language models • [Nakamura+ 2014] used a pre-existing phoneme model and did not make a robot learn a phoneme model. • There are still few studies on unsupervised simultaneous learning of phoneme and language models from speech signals [Kamper+ 15, Lee+ 15]. • Does the analysis of double articulation structure embedded in speech signals enable a robot to obtain phoneme and language models simultaneously? [Figure labels: prosodic cues, distributional cues, co-occurrence cues] Making full use of the directly from speech signals. H. Kamper, A. Jansen, and S. Goldwater, “Fully Unsupervised Small-Vocabulary Speech Recognition Using a Segmental Bayesian Model,” in INTERSPEECH 2015, 2015. C.-y. Lee, T. J. O’Donnell, and J. Glass, “Unsupervised Lexicon Discovery from Acoustic Input,” Transactions of the Association for Computational Linguistics, vol. 3, pp. 389-403, 2015.
  • 19. Hierarchical Dirichlet process hidden language model (HDP-HLM) [Taniguchi+ 16]: a probabilistic generative model for time-series data having double articulation structure. [Figure: graphical model of HDP-HLM. Components: language model (word bigram; γLM, αLM, βLM, πLM_i), word model (letter bigram; γWM, αWM, βWM, πWM_j, words w_i), latent words z_1, …, z_S (super-state sequence), latent letters l_s1, …, l_sL, durations D_s1, …, D_sL, acoustic model (ω_j, θ_j, G, H), frame-level variables x_1, …, x_T, and observations y_1, …, y_T. The language model is a word bigram model with a letter bigram model and generates the word sequence; the acoustic model is the phoneme model and corresponds to the phoneme sequence.] Tadahiro Taniguchi, Shogo Nagasaka, and Ryo Nakashima, “Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals,” IEEE Transactions on Cognitive and Developmental Systems (2016).
  • 20. HDP-HLM as an extension of HDP-HSMM • HDP-HLM can be regarded as an extension of HDP-HSMM [Johnson ’13]. • This property helps us derive an efficient inference procedure. Matthew J. Johnson and Alan S. Willsky, “Bayesian nonparametric hidden semi-Markov models,” The Journal of Machine Learning Research, Vol. 14, No. 1, pp. 673–701, 2013. HDP-HSMM (hierarchical Dirichlet process hidden semi-Markov model) corresponds..
  • 21. Inference (blocked Gibbs sampler) • A blocked Gibbs sampler can be derived by extending the HDP-HMM’s backward filtering-forward sampling algorithm. Each iteration consists of backward filtering, forward sampling, and a parameter update. The procedure is computationally very heavy.
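The backward filtering-forward sampling idea can be sketched on a much simpler model. The code below samples a state sequence from the posterior of a plain finite HMM; it omits durations, the word/letter hierarchy, and the nonparametric priors of HDP-HLM, so it is only an illustration of the core recursion, not the NPB-DAA sampler. All names and the toy numbers are assumptions; `obs_lik[t][k]` stands for the likelihood of observation t under state k.

```python
import random

def backward_messages(obs_lik, trans):
    """beta[t][k] is proportional to p(y_{t+1:T} | z_t = k) (rescaled per step)."""
    T, K = len(obs_lik), len(trans)
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        raw = [
            sum(trans[i][j] * obs_lik[t + 1][j] * beta[t + 1][j] for j in range(K))
            for i in range(K)
        ]
        norm = sum(raw)  # assumes at least one state path has nonzero likelihood
        beta[t] = [r / norm for r in raw]
    return beta

def forward_sample(init, trans, obs_lik, beta, rng):
    """Sample z_1..z_T from the posterior, one step at a time, left to right."""
    K = len(trans)
    states = []
    prior = init
    for t in range(len(obs_lik)):
        weights = [prior[k] * obs_lik[t][k] * beta[t][k] for k in range(K)]
        z = rng.choices(range(K), weights=weights)[0]
        states.append(z)
        prior = trans[z]  # transition row of the state just sampled
    return states

init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
obs_lik = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # unambiguous toy data
beta = backward_messages(obs_lik, trans)
print(forward_sample(init, trans, obs_lik, beta, random.Random(0)))
```

Because the whole state sequence is drawn jointly given the parameters, the sampler updates a block of latent variables at once, which is what "blocked" refers to; the parameter update step would then resample `trans` and the emission parameters given the sampled states.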
  • 22. Evaluation experiment using artificial two- and three-word sentences built from the five Japanese vowels • Five artificial words {aioi, aue, ao, ie, uo} were prepared by concatenating the five Japanese vowels. • 30 sentences (25 two-word and 5 three-word sentences) were prepared, and each sentence was recorded twice by four Japanese speakers. • Features: MFCC (frame size = 25 ms, shift = 10 ms, frame rate = 100 Hz). Example sentence: aioi ao. [Figure: the HDP-HLM graphical model from slide 19] * An HDP-HLM was trained separately for each speaker.
  • 23. Sample of results • Compared to the conventional DAA, NPB-DAA could discover latent words accurately. • The inference procedure could gradually estimate the boundaries of words and phonemes. Example: ao-ie-ao
  • 24. Unsupervised word discovery with trained phoneme recognizer. Nonparametric Bayesian Double Articulation Analyzer (NPB-DAA) based on HDP-HLM [Taniguchi ’16] • The method could estimate language and acoustic/phoneme models simultaneously. • The methods were compared using the adjusted Rand index (ARI), treating word discovery as a frame-based clustering task. • It even outperformed an off-the-shelf speech recognition system-based method in a word discovery task. [Figure labels: unsupervised word discovery; speech recognition with off-the-shelf ASR system] Tadahiro Taniguchi, Shogo Nagasaka, and Ryo Nakashima, “Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals,” IEEE Transactions on Cognitive and Developmental Systems (2016).
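For readers unfamiliar with the metric, here is a small self-contained sketch of the adjusted Rand index applied to frame-level labels. The label sequences are made-up toy data, not results from the talk, and the function assumes at least two frames.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same frames (1.0 = identical partitions)."""
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))  # contingency table cells
    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)  # chance-level agreement
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_cells - expected) / (max_index - expected)

# Toy frame labels: ground-truth word IDs vs. discovered cluster IDs.
truth = [0, 0, 1, 1, 2, 2]
found = [5, 5, 3, 3, 9, 9]   # same partition, different label names
print(adjusted_rand_index(truth, found))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

ARI depends only on which frames are grouped together, not on the label names, so it suits unsupervised word discovery, where discovered word IDs have no fixed correspondence to the ground-truth vocabulary.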
  • 25. Double Articulation Analyzer with Deep Sparse Autoencoder for Unsupervised Word Discovery from Speech Signals • A deep learning-based feature extraction method, the deep sparse autoencoder (DSAE), was employed to improve the performance of NPB-DAA. • DSAE can be trained in an unsupervised manner, so the overall system remains an unsupervised learning system.
  • 26. Experimental results: NPB-DAA with DSAE even outperformed an MFCC-based off-the-shelf speech recognition system. Tadahiro Taniguchi, Ryo Nakashima, Hailong Liu, and Shogo Nagasaka, “Double Articulation Analyzer with Deep Sparse Autoencoder for Unsupervised Word Discovery from Speech Signals,” Advanced Robotics (2016).
  • 27. Contents 1. Introduction: Symbol emergence in robotics 2. Word discovery with multimodal categorization 3. Direct word discovery from speech signals 4. Conclusion
  • 28. Conclusion • Symbol Emergence in Robotics (SER) was introduced. SER is a synthetic approach towards developmental mental system involving language acquisition and symbol emergence systems. • An unsupervised machine learning method for word discovery by robots, NPB-DAA, was introduced. It is based on a nonparametric Bayesian approach.
  • 29. Current problems and future challenges • Current problems – Computational cost • Analyzing only 60 sentences requires more than one hour. – Speaker dependency • Unsupervised learning from multi-speaker speech signals is currently difficult because acoustic features differ from speaker to speaker. • Future challenges – Efficient and fast algorithms • Inventing more efficient inference methods and using more computational resources are our future directions. – Unsupervised speaker adaptation • Developing an unsupervised speaker adaptation method for language acquisition from multi-speaker speech signals. – Mutual learning of words, phonemes, and objects • Phoneme acquisition performance is also expected to improve when phonemes are learned simultaneously with objects.
  • 30. Future challenge: word discovery for symbol emergence in robotics. [Figure: NPB-DAA (phonemes & words) connected through probabilistic information with multimodal object categorization, SLAM, motion primitives, affordance learning, and syntax learning]
  • 32. Information • email: taniguchi@ci.ritsumei.ac.jp • Visit http://www.tanichu.com/ • Facebook: please search me • Twitter: @tanichu • [Github] NPB-DAA: https://github.com/EmergentSystemLabStudent/NPB_DAA Acknowledgement (special thanks): Ritsumeikan University: R. Nakashima, S. Nagasaka, A. Taniguchi, K. Hayashi; DENSO co.: T. Bando, K. Takenaka, K. Hitomi; Okayama Pref. Univ.: N. Iwahashi