Hybrid Multichannel Signal Separation Using
Supervised Nonnegative Matrix Factorization
Daichi Kitamura (The University of Tokyo, Japan)
Hiroshi Saruwatari (The University of Tokyo, Japan)
Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
Yu Takahashi (Yamaha Corporation, Japan)
Kazunobu Kondo (Yamaha Corporation, Japan)
Hirokazu Kameoka (The University of Tokyo, Japan)
The University of Tokyo, YAMAHA
Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Multichannel NMF
• 3. Proposed method
– SNMF with spectrogram restoration and its Hybrid method
• 4. Experiments
– Closed data experiment
– Open data experiment
• 5. Conclusions
Research background
• Signal separation has received much attention.
• Music signal separation based on nonnegative matrix
factorization (NMF) is a very active research area.
• Supervised NMF (SNMF) achieves the highest
separation performance.
• To improve its performance further, an SNMF-based
multichannel signal separation method is required.
Applications
• Automatic music transcription
• 3D audio system, etc.
Separate the target signal from multichannel
signals with high accuracy.
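NMF [Lee, et al., 2001]
• NMF can extract significant spectral patterns.
– The basis matrix holds the spectral patterns that appear frequently
in the observed matrix.
• Decomposition: observed matrix (Ω × 𝑇) ≈ basis matrix (Ω × 𝐾) × activation matrix (𝐾 × 𝑇)
– Observed matrix: spectrogram
– Basis matrix: spectral patterns
– Activation matrix: time-varying gains
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
[Figure: a spectrogram (frequency versus time) decomposed into bases (amplitude versus frequency) and activations (amplitude versus time)]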
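The decomposition above can be sketched with the multiplicative update rules of Lee and Seung for the generalized KL divergence. This is a minimal illustration on a toy matrix, not the configuration used in the slides; the names `nmf_kl`, `kl_div`, `F`, and `G` are assumptions introduced here.

```python
import numpy as np

def nmf_kl(Y, K, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix Y (frequency x time) as F @ G with K bases."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, K)) + eps   # basis matrix (spectral patterns)
    G = rng.random((K, n_time)) + eps   # activation matrix (time-varying gains)
    ones = np.ones_like(Y)
    for _ in range(n_iter):
        # Multiplicative updates keep F and G nonnegative.
        F *= ((Y / (F @ G + eps)) @ G.T) / (ones @ G.T + eps)
        G *= (F.T @ (Y / (F @ G + eps))) / (F.T @ ones + eps)
    return F, G

def kl_div(Y, X, eps=1e-12):
    """Generalized KL divergence between Y and X, summed over all entries."""
    return float(np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X))

# Toy "spectrogram" built from two spectral patterns (hypothetical data).
rng = np.random.default_rng(1)
Y = rng.random((16, 2)) @ rng.random((2, 40))
F, G = nmf_kl(Y, K=2)
print(F.shape, G.shape, kl_div(Y, F @ G))
```

Each column of `F` plays the role of one spectral pattern, and the matching row of `G` tells how strongly that pattern is active in each time frame.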
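Supervised NMF [Smaragdis, et al., 2007]
• SNMF
– Supervised spectral separation method
• Training process
– A supervised basis matrix (spectral dictionary) is trained from sample sounds of the target signal.
• Separation process
– The mixed signal is decomposed into the target signal and the other signal.
– The supervised basis matrix is fixed; the remaining matrices are optimized.
[Figure: sample sound → supervised basis matrix; mixed signal → target signal and other signal]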
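The two processes can be sketched as follows, assuming the KL divergence and multiplicative updates. The function names (`train_bases`, `separate`) and the symbols `F`, `G`, `H`, `U` are assumptions for illustration: `F` is the fixed supervised basis matrix, `G` its activations, and `H`, `U` the bases and activations that absorb the other signal.

```python
import numpy as np

EPS = 1e-12

def train_bases(Y_train, K, n_iter=200, seed=0):
    """Training process: learn a basis matrix F from sample sounds of the target."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], K)) + EPS
    G = rng.random((K, Y_train.shape[1])) + EPS
    ones = np.ones_like(Y_train)
    for _ in range(n_iter):
        F *= ((Y_train / (F @ G + EPS)) @ G.T) / (ones @ G.T + EPS)
        G *= (F.T @ (Y_train / (F @ G + EPS))) / (F.T @ ones + EPS)
    return F

def separate(Y, F, L, n_iter=200, seed=1):
    """Separation process: fit Y ~ F @ G + H @ U with F fixed; optimize G, H, U."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS
    H = rng.random((Y.shape[0], L)) + EPS
    U = rng.random((L, Y.shape[1])) + EPS
    ones = np.ones_like(Y)
    for _ in range(n_iter):
        R = Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ ones + EPS)
        R = Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (ones @ U.T + EPS)
        R = Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ ones + EPS)
    return F @ G, H @ U   # target estimate, other-signal estimate

# Toy data (hypothetical): the target shares its spectral patterns with the samples.
rng = np.random.default_rng(2)
F_true = rng.random((20, 3))
Y_train = F_true @ rng.random((3, 50))
Y_mix = F_true @ rng.random((3, 30)) + rng.random((20, 2)) @ rng.random((2, 30))

F = train_bases(Y_train, K=3)
target_est, other_est = separate(Y_mix, F, L=2)
print(target_est.shape, other_est.shape)
```

Because `F` never changes during separation, the target estimate is restricted to combinations of the trained spectral patterns, which is what makes the method supervised.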
Problems of SNMF
• SNMF applies only to single-channel signals.
– For a multichannel signal, SNMF cannot use the information
between channels.
• When many interference sources exist, the separation
performance of SNMF degrades markedly.
[Figure: separation result with residual components]
Multichannel NMF [Sawada, et al., 2013]
• Multichannel NMF
– is a natural extension of NMF to multichannel signals
– uses spatial information to cluster the bases and thereby
achieve the unsupervised separation task.
Problems:
Multichannel NMF depends strongly on its initial values
and lacks robustness.
[Figure: microphone array]
Motivation and strategy
• Sawada’s multichannel NMF
– is a unified method that solves the spatial and spectral separations together.
– maximizes a likelihood of the observed spectrograms with respect to the spatial direction of the target signal and the source components of all signals (target and other).
– In the supervised situation, the target spectral patterns are given.
– It is too difficult to solve (lacks robustness).
– It is computationally inefficient (long computation time).
Motivation and strategy
• Proposed hybrid method
– divides the problem into two stages, as an approximation of the unified method:
unsupervised spatial separation (classical D.O.A. estimation), then
supervised spectral separation (SNMF-based method).
– The spatial separation should be carried out with classical
D.O.A. estimation methods.
• These methods are very efficient and stable.
– Divide-and-conquer method
Directional clustering [Araki, et al., 2007]
• Directional clustering
– Unsupervised spatial separation method
– k-means clustering (fast and stable)
• Problems
– Artificial distortion arises owing to the binary masking.
[Figure: input signal (stereo) with left, center, and right sources → binary masking → separated signal]
Example (rows: frequency, columns: time; C = center, L = left, R = right)
Spectrogram (cluster of each time-frequency grid):
C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C
Binary mask for the center cluster:
1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1
The separated cluster is the entry-wise product of the spectrogram and the binary mask.
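The masking step can be sketched on the 5 × 6 example from this slide. The clustering feature used here, a panning angle computed from the left and right magnitudes and clustered with a small 1-D k-means, is a simplifying assumption and not necessarily the feature of [Araki, et al., 2007]; the names `pan_angle` and `kmeans_1d` and the magnitude values are hypothetical.

```python
import numpy as np

def pan_angle(L_mag, R_mag):
    """Panning angle per time-frequency grid: 0 = hard left, pi/2 = hard right."""
    return np.arctan2(R_mag, L_mag)

def kmeans_1d(x, centers, n_iter=50):
    """Plain k-means on scalar features; returns labels and final centers."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[..., None] - centers), axis=-1)
        for c in range(len(centers)):
            if np.any(labels == c):
                centers[c] = x[labels == c].mean()
    return labels, centers

# Cluster layout from the slide (rows: frequency, columns: time).
rows = ["CCCRLR", "CLLLRR", "CCCCRR", "CRRLLL", "CCCCCC"]
grid = np.array([list(r) for r in rows])

# Hypothetical stereo magnitudes that realize this layout by amplitude panning.
L_mag = np.where(grid == "R", 0.1, 1.0)
R_mag = np.where(grid == "L", 0.1, 1.0)

labels, centers = kmeans_1d(pan_angle(L_mag, R_mag), centers=[0.0, np.pi / 4, np.pi / 2])
mask = (labels == 1)                 # cluster 1 = center direction
spectrogram = L_mag + R_mag          # toy mono magnitude
separated = mask * spectrogram       # entry-wise product; zeros are the holes
print(mask.astype(int))
```

The zeros of `mask` are exactly the grids that binary masking discards, which is the source of the artificial distortion mentioned above.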
Proposed method: hybrid separation
• Hybrid separation method
– Input stereo signal (L, R)
→ Spatial separation method (Directional clustering)
→ SNMF-based separation method (SNMF with spectrogram restoration)
→ Separated signal
SNMF with spectrogram restoration
• The separated cluster contains spectral holes (lost components).
• The proposed SNMF treats these holes as unseen observations.
• The fittest bases of the supervised basis (dictionary of the target signal)
are extrapolated to fix up the holes.
[Figure: separated cluster (frequency versus time) with the holes marked]
SNMF with spectrogram restoration
[Figure: frequency of source components versus direction (left, center, right), with the target marked, in three panels: (a) input signal; (b) after directional clustering (binary masking); (c) after superresolution-based SNMF, with the extrapolated components.]
[Figure: processing flow on time-frequency plots: observed spectrogram (target and interference) → directional clustering → separated cluster → SNMF with spectrogram restoration (extrapolation with the supervised spectral bases) → reconstructed data.]
Decomposition model and cost function
• The divergence is defined at all grids except for the
holes by using the binary masking matrix.
Decomposition model: [Equation] with the supervised bases fixed.
Cost function: [Equation] consisting of
– the main divergence term, with a binary index to exclude the holes
– a regularization term
– a penalty term [Kitamura, et al. 2014]
Symbols defined on the slide: the entries of the matrices, the weighting parameters, the binary complement, the Frobenius norm, and the binary masking matrix obtained from directional clustering.
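As a sketch of how a cost with these three terms can be assembled, the block below writes one form that matches the annotations on this slide. Every symbol is an assumption introduced here rather than a transcription of the slide: $\boldsymbol{Y}$ is the masked observation, $\boldsymbol{F}$ the fixed supervised bases, $\boldsymbol{G}$ their activations, $\boldsymbol{H}$ and $\boldsymbol{U}$ the bases and activations of the remaining components, $i_{\omega,t}$ an entry of the binary masking matrix with complement $\bar{i}_{\omega,t}$, and $\lambda$, $\mu$ the weighting parameters. The exact form of the regularization and penalty terms is also assumed.

```latex
\begin{align}
\boldsymbol{Y} &\approx \boldsymbol{F}\boldsymbol{G} + \boldsymbol{H}\boldsymbol{U} \\
\mathcal{J} &= \sum_{\omega,t} i_{\omega,t}\,
   \mathcal{D}\!\left( y_{\omega,t} \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t}
   + \sum_{l} h_{\omega,l}\, u_{l,t} \right) \nonumber \\
 &\quad + \lambda \sum_{\omega,t} \bar{i}_{\omega,t}\,
   \mathcal{D}\!\left( 0 \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \right)
   + \mu \left\| \boldsymbol{F}^{\mathsf{T}} \boldsymbol{H} \right\|_{\mathrm{Fr}}^{2}
\end{align}
```

In this reading, the first term fits the model only where observations survive the binary mask, the second keeps the extrapolated target components in the holes from growing without bound, and the third discourages the free bases from duplicating the supervised ones.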
Generalized divergence: β-divergence
• β-divergence [Eguchi, et al., 2001]
– EUC-distance (β = 2)
– KL-divergence (β = 1): the best criterion for
signal separation [Kitamura, et al., 2014]
– IS-divergence (β = 0)
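A minimal sketch of the β-divergence and its three special cases, using the common definition in which β = 2 gives half the squared Euclidean distance. The function name `beta_divergence` is an assumption, and the KL and IS branches require strictly positive inputs.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Beta-divergence between y and x, summed over all entries.

    beta = 2: half squared Euclidean distance
    beta = 1: generalized Kullback-Leibler divergence (limit case)
    beta = 0: Itakura-Saito divergence (limit case)
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:
        return float(np.sum(y * np.log(y / x) - y + x))
    if beta == 0:
        return float(np.sum(y / x - np.log(y / x) - 1.0))
    num = y**beta + (beta - 1.0) * x**beta - beta * y * x**(beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))

for b in (0, 1, 2):
    print(b, beta_divergence([2.0, 3.0], [1.0, 1.0], b))
```

The divergence is zero when `y` equals `x`, and the two limit cases connect continuously to the general formula as β approaches 1 or 0.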
Decomposition model and cost function
• We used two β-divergences with separate β values, one for the main cost and
one for the regularization cost.
Decomposition model: [Equation] with the supervised bases fixed.
Cost function: [Equation]
Update rules
• We can obtain the update rules for the optimization of
the three variable matrices.
Update rules: [Equations]
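One way such updates can look is sketched below. This is not a transcription of the update rules on the slide: it assumes the model Y ≈ FG + HU with F fixed, a KL-divergence main term restricted to the non-hole grids by a binary mask `I`, a regularization term on FG inside the holes, and a penalty on the Frobenius norm of FᵀH. The updates are heuristic multiplicative rules obtained by splitting the gradient into positive and negative parts, and all names (`restore_snmf`, `cost`, `lam`, `mu`, `beta_reg`) are assumptions.

```python
import numpy as np

EPS = 1e-12

def cost(Y, I, F, G, H, U, lam, mu, beta_reg):
    """Masked KL main term + hole regularization + penalty on F^T H."""
    X = F @ G + H @ U + EPS
    main = np.sum(I * (Y * np.log((Y + EPS) / X) - Y + X))
    reg = np.sum((1.0 - I) * (F @ G) ** beta_reg) / beta_reg
    pen = np.sum((F.T @ H) ** 2)
    return float(main + lam * reg + mu * pen)

def restore_snmf(Y, I, F, L, lam=0.1, mu=0.0, beta_reg=1.0, n_iter=100, seed=0):
    """Optimize G, H, U with F fixed; F @ G also fills the holes (I == 0)."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    G = rng.random((F.shape[1], n_time)) + EPS
    H = rng.random((n_freq, L)) + EPS
    U = rng.random((L, n_time)) + EPS
    Ic = 1.0 - I                                   # binary complement (holes)
    for _ in range(n_iter):
        X = F @ G + H @ U + EPS
        A = F @ G + EPS
        G *= (F.T @ (I * Y / X)) / (
            F.T @ I + lam * (F.T @ (Ic * A ** (beta_reg - 1.0))) + EPS)
        X = F @ G + H @ U + EPS
        H *= ((I * Y / X) @ U.T) / (I @ U.T + 2.0 * mu * (F @ (F.T @ H)) + EPS)
        X = F @ G + H @ U + EPS
        U *= (H.T @ (I * Y / X)) / (H.T @ I + EPS)
    return G, H, U

# Toy data (hypothetical): a target built from F plus interference, then masked.
rng = np.random.default_rng(0)
F = rng.random((12, 3)) + 0.1
Y_full = F @ rng.random((3, 30)) + rng.random((12, 2)) @ rng.random((2, 30))
I = np.ones_like(Y_full)
I[::3, ::2] = 0.0                                  # deterministic holes
Y = I * Y_full

G, H, U = restore_snmf(Y, I, F, L=2, lam=0.1, mu=1e-3)
restored = F @ G                                   # defined in the holes as well
print(restored.shape, cost(Y, I, F, G, H, U, 0.1, 1e-3, 1.0))
```

Because the main term ignores the holes, the values of `restored` inside them are determined only by the supervised bases and the activations fitted on the surviving grids, which is the extrapolation behavior the slides describe.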
Experimental condition
• Mixed signal includes four melodies (sources).
• Three compositions of instruments
– We evaluated the average score of 36 patterns.
• Supervision signal: 24 notes that cover all the notes in the target melody
[Figure: spatial arrangement of the four sources (1 to 4) between left, center, and right, with the target source marked]
Dataset  Melody 1  Melody 2  Midrange     Bass
No. 1    Oboe      Flute     Piano        Trombone
No. 2    Trumpet   Violin    Harpsichord  Fagotto
No. 3    Horn      Clarinet  Piano        Cello
Experimental result: closed data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of
separation and absence of artificial distortion.
[Figure: SDR [dB] (0 to 14; higher is better) versus β_NMF (0 to 4) for the proposed hybrid method, with KL-divergence and EUC-distance marked, compared with conventional SNMF (single-channel SNMF), directional clustering, and supervised multichannel NMF [Sawada]]
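For intuition about the SDR scale, a simplified energy-ratio version can be computed as below. This is an assumption for illustration: the slides do not state which SDR implementation was used, and the widely used BSS Eval definition additionally decomposes the error by projections. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(distortion**2))

# Toy example: a 440 Hz tone whose estimate is 10 % too loud.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
s = np.sin(2.0 * np.pi * 440.0 * t)
print(simple_sdr(s, 1.1 * s))   # error energy is 1 % of the signal energy
```

An error with 1 % of the signal energy corresponds to 20 dB, so the gaps of a few dB between methods in the plot are substantial differences in distortion energy.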
SNMF with spectrogram restoration
• SNMF with spectrogram restoration has two tasks:
source separation and basis extrapolation.
• The optimal divergence for source separation is the
KL-divergence (β = 1).
• In contrast, a divergence with a higher β value is
suitable for the basis extrapolation.
Trade-off: separation and restoration
• The optimal divergence for SNMF with spectrogram
restoration and its hybrid method is determined by the
trade-off between separation and restoration abilities.
[Figure: two example spectra, amplitude [dB] (−10 to 0) versus frequency [kHz] (0 to 5), one with strong sparseness and one with weak sparseness]
[Figure: performance versus β (0 to 4), showing separation, restoration, and the total performance of the hybrid method]
Experimental condition
• Open data experiment
– used different tone generators for the training and test signals
• Supervision signal: 24 notes that cover all the notes in the target melody,
provided by tone generator A
• Test signal: provided by tone generator B (more realistic sound)
+ background noise (SNR = 10 dB)
[Figure: spatial arrangement of the four sources (1 to 4) between left, center, and right, with the target source marked]
Experimental result: open data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of
separation and absence of artificial distortion.
[Figure: SDR [dB] (−4 to 10; higher is better) versus β_NMF (0 to 4) for the proposed hybrid method, with KL-divergence and EUC-distance marked, compared with conventional SNMF (single-channel SNMF), directional clustering, and supervised multichannel NMF [Sawada]]
Conclusions
• We proposed a hybrid multichannel signal separation
method combining directional clustering and SNMF
with spectrogram restoration.
• There is a trade-off between separation and
restoration abilities.
Thank you for your attention!
Demonstration
is available!

Hybrid NMF APSIPA2014 invited

  • 1. Hybrid Multichannel Signal Separation Using Supervised Nonnegative Matrix Factorization Daichi Kitamura, (The University of Tokyo, Japan) Hiroshi Saruwatari, (The University of Tokyo, Japan) Satoshi Nakamura, (Nara Institute of Science and Technology, Japan) Yu Takahashi, (Yamaha Corporation, Japan) Kazunobu Kondo, (Yamaha Corporation, Japan) Hirokazu Kameoka, (The University of Tokyo, Japan) 東京大学,YAMAHA
  • 2. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 2
  • 3. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 3
  • 4. Research background • Signal separation has received much attention. • Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area. • Supervised NMF (SNMF) achieves high separation performance. • To improve its performance, an SNMF-based multichannel signal separation method is required. • Applications: automatic music transcription, 3D audio systems, etc. • Goal: separate the target signal from multichannel signals with high accuracy.
  • 5. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 5
  • 6. NMF [Lee, et al., 2001] • NMF can extract significant spectral patterns. – The basis matrix F holds the frequently appearing spectral patterns in the observed matrix Y. • Decomposition: observed matrix Y (spectrogram, frequency × time) ≈ basis matrix F (spectral patterns) × activation matrix G (time-varying gains). • Ω: number of frequency bins, T: number of time frames, K: number of bases.
  • 7. Supervised NMF [Smaragdis, et al., 2007] • SNMF is a supervised spectral separation method. • Training process: sample sounds of the target signal are decomposed to obtain the supervised basis matrix (spectral dictionary). • Separation process: the mixed signal is decomposed into the target signal and the other signals; the supervised basis matrix is fixed and the remaining matrices are optimized.
  • 8. Problems of SNMF • SNMF handles only single-channel signals. – For a multichannel signal, SNMF cannot use the information between channels. • When many interference sources exist, the separation performance of SNMF degrades markedly. (Figure: residual components remain after separation.)
  • 9. Multichannel NMF [Sawada, et al., 2013] • Multichannel NMF – is a natural extension of NMF for multichannel signals (recorded with a microphone array). – uses spatial information for the clustering of bases to achieve the unsupervised separation task. • Problems: multichannel NMF depends strongly on the initial values and lacks robustness.
  • 10. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – Motivation and strategy – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 10
  • 11. Motivation and strategy • Sawada’s multichannel NMF – is a unified method that solves the spatial and spectral separations together. – maximizes a likelihood of the observed spectrograms with respect to the spatial direction of the target signal and the source components of all signals (target and other). – In the supervised situation, the target spectral patterns are given. – Too difficult to solve (lacks robustness) – Computationally inefficient (requires much computational time)
  • 12. Motivation and strategy • Proposed hybrid method – approximately divides the problem into unsupervised spatial separation (classical D.O.A. estimation) and supervised spectral separation (SNMF-based method). – The spatial separation should be carried out with classical D.O.A. estimation methods. • These methods are very efficient and stable. – Divide-and-conquer method
  • 13. Directional clustering [Araki, et al., 2007] • Directional clustering – Unsupervised spatial separation method – k-means clustering (fast and stable) • Problems – Artificial distortion arises owing to the binary masking. (Figure: each time-frequency grid of the stereo input spectrogram is labeled left (L), center (C), or right (R); the binary mask for a direction is 1 at the grids assigned to that direction and 0 elsewhere, and the separated signal is the entry-wise product of the spectrogram and the binary mask.)
  • 14. Proposed method: hybrid separation • Hybrid separation method: input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (SNMF with spectrogram restoration) → separated signal
  • 15. SNMF with spectrogram restoration • The cluster separated by directional clustering has spectral holes (lost components) in the time-frequency domain. • The proposed SNMF treats these holes as unseen observations. • The fittest bases are extrapolated from the supervised basis (the dictionary of the target signal) to fix up the holes.
  • 16. SNMF with spectrogram restoration (Figure: distribution of the source components over the left, center, and right directions: (a) input signal, with the target in the center; (b) after directional clustering (binary masking); (c) after super-resolution-based SNMF, showing the extrapolated components. Flow: observed spectrogram (target and interference) → directional clustering → separated cluster → SNMF with spectrogram restoration using the supervised spectral bases → reconstructed data.)
  • 17. Decomposition model and cost function • Decomposition model: Y ≈ FG + HU, where the supervised bases F are fixed. • Cost function: the divergence is defined at all grids except for the holes by using the binary mask matrix obtained from directional clustering. • The cost function also involves weighting parameters, the binary complement of the mask, and the Frobenius norm.
  • 18. Decomposition model and cost function • Decomposition model: Y ≈ FG + HU, where the supervised bases F are fixed. • Cost function: the divergence is defined at all grids except for the holes by using the binary mask matrix obtained from directional clustering. • Highlighted: the binary index i, which excludes the holes.
  • 19. Decomposition model and cost function • Decomposition model: Y ≈ FG + HU, where the supervised bases F are fixed. • Cost function: the divergence is defined at all grids except for the holes by using the binary mask matrix obtained from directional clustering. • Highlighted: the binary index i, which excludes the holes, and the regularization term.
  • 20. Decomposition model and cost function • Decomposition model: Y ≈ FG + HU, where the supervised bases F are fixed. • Cost function: the divergence is defined at all grids except for the holes by using the binary mask matrix obtained from directional clustering. • Highlighted: the binary index i, which excludes the holes, the regularization term, and the penalty term [Kitamura, et al. 2014].
  • 21. Generalized divergence: β-divergence [Eguchi, et al., 2001] • β = 2: EUC-distance • β = 1: KL-divergence, the best criterion for signal separation [Kitamura, et al., 2014] • β = 0: IS-divergence
  • 22. Decomposition model and cost function • We used two β-divergences, one for the main cost and one for the regularization cost, with parameters β_NMF and β_reg. • Decomposition model: Y ≈ FG + HU, where the supervised bases F are fixed.
  • 23. Update rules • We can obtain the update rules for the optimization of the variable matrices G, H, and U.
  • 24. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 24
  • 25. Experimental condition • The mixed signal includes four melodies (sources). (Figure: four sources, numbered 1 to 4, arranged between left and right, with the target source in the center.) • Three compositions of instruments – We evaluated the average score of 36 patterns. • Supervision signal: 24 notes that cover all the notes in the target melody. • Dataset (melody 1, melody 2, midrange, bass): No. 1: oboe, flute, piano, trombone; No. 2: trumpet, violin, harpsichord, fagotto; No. 3: horn, clarinet, piano, cello.
  • 26. Experimental result: closed data • Signal-to-distortion ratio (SDR) – total quality of the separation, which includes the degree of separation and absence of artificial distortion. (Figure: SDR [dB], axis from 0 to 14, higher is better, plotted against β_NMF from 0 to 4, with the KL-divergence and EUC-distance points marked, for directional clustering, supervised multichannel NMF [Sawada], conventional single-channel SNMF, and the proposed hybrid method.)
  • 27. SNMF with spectrogram restoration • SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation. • The optimal divergence for source separation is the KL-divergence (β = 1). • In contrast, a divergence with a higher β value is suitable for the basis extrapolation.
  • 28. Trade-off: separation and restoration • The optimal divergence for SNMF with spectrogram restoration and its hybrid method is based on the trade-off between separation and restoration abilities. (Figures: two basis spectra, amplitude [dB] from -10 to 0 versus frequency from 0 to 5 kHz, one with strong sparseness and one with weak sparseness; a conceptual plot of performance over the range 0 to 4 showing the separation ability, the restoration ability, and the total performance of the hybrid method.)
  • 29. Experimental condition • Open data experiment – used different tone generators for the training and test signals • Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A • Test signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB) (Figure: four sources, numbered 1 to 4, arranged between left and right, with the target source in the center.)
  • 30. Experimental result: open data • Signal-to-distortion ratio (SDR) – total quality of the separation, which includes the degree of separation and absence of artificial distortion. (Figure: SDR [dB], axis from -4 to 10, higher is better, plotted against β_NMF from 0 to 4, with the KL-divergence and EUC-distance points marked, for directional clustering, supervised multichannel NMF [Sawada], conventional single-channel SNMF, and the proposed hybrid method.)
  • 31. Conclusions • We proposed a hybrid multichannel signal separation method combining directional clustering and SNMF with spectrogram restoration. • There is a trade-off between separation and restoration abilities. Thank you for your attention! Demonstration is available!
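
The restoration idea in slides 15 to 23 can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the update rules from the slides (those did not survive extraction): both divergences are fixed to the squared Euclidean distance, the regularization term is assumed to be the squared magnitude of FG at the hole grids, the penalty term between F and H is omitted, and standard multiplicative updates are used. The function name `snmf_restore` and the parameters `n_other`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np


def snmf_restore(Y, mask, F, n_other=4, lam=0.1, n_iter=200, seed=0):
    """Masked supervised NMF, Y ~ F G + H U, with F fixed (illustrative sketch).

    Y    : nonnegative spectrogram of the separated cluster (freq x time)
    mask : binary matrix, 1 = observed grid, 0 = spectral hole
    F    : supervised bases (freq x K), kept fixed
    Assumed cost (Euclidean case only):
        sum(mask * (Y - (FG + HU))**2) + lam * sum((1 - mask) * (FG)**2)
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    n_freq, n_time = Y.shape
    comp = 1.0 - mask                      # binary complement: marks the holes
    G = rng.random((F.shape[1], n_time)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_time)) + eps
    MY = mask * Y                          # observed grids only
    costs = []
    for _ in range(n_iter):
        FG = F @ G
        X = FG + H @ U
        # Cost before this iteration's updates; holes are excluded from the fit.
        costs.append(np.sum(mask * (Y - X) ** 2) + lam * np.sum(comp * FG ** 2))
        # Multiplicative updates; the regularizer only affects G.
        G *= (F.T @ MY) / (F.T @ (mask * X) + lam * (F.T @ (comp * FG)) + eps)
        X = F @ G + H @ U
        H *= (MY @ U.T) / ((mask * X) @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ MY) / (H.T @ (mask * X) + eps)
    # F @ G is defined on every grid, so the holes are filled by the fixed bases.
    return F @ G, H @ U, costs
```

Because `F @ G` is evaluated on the full time-frequency grid, the first return value is the restored target spectrogram, including the grids that the binary mask removed. The weight `lam` limits how much energy the fixed bases may place in the holes.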

Editor's notes

  1. This is the outline of my talk.
  2. This is the outline of my talk.
  3. Recently, signal separation technologies have received much attention. These technologies are useful in many applications. Nonnegative matrix factorization, NMF for short, has been a very active area of signal separation research. In particular, supervised NMF (SNMF) achieves good separation performance. However, SNMF can be used only for single-channel signals. To improve its performance, an SNMF-based multichannel signal separation method is required.
  4. This is the outline of my talk.
  5. Before explaining supervised NMF, I will explain the basics of simple NMF. NMF is a powerful method for extracting significant features from a spectrogram. This method decomposes the input spectrogram Y into the product of a basis matrix F and an activation matrix G, where the basis matrix F has frequently appearing spectral patterns as its basis vectors, like this, and the activation matrix G has the time-varying gains of each spectral pattern.
  6. To separate the target signal with NMF, supervised NMF has been proposed. In SNMF, we first train on sample sounds of the target signal, such as a musical scale, and construct the supervised basis F. This is a spectral dictionary of the target sound. Next, we separate the mixed signal using the supervised basis F, as FG + HU. The target signal is then obtained as FG, and the other signals are reconstructed by HU.
  7. The problem with SNMF is that it handles only single-channel signals, so we cannot use any information between channels. But almost all music signals are in stereo format, so we should extend simple SNMF to a multichannel SNMF. In addition, when many interfering sources exist, the separation performance of SNMF markedly degrades.
  8. As another means of multichannel signal separation, multichannel NMF has also been proposed by Sawada. This is a natural extension of NMF and uses spatial information for the clustering of bases to achieve unsupervised separation. However, this method is a very difficult optimization problem mathematically, so it strongly depends on the initial values.
  9. Sawada’s multichannel NMF is a unified method that solves the spatial and spectral separations simultaneously. This method maximizes a likelihood like this, where theta is the spatial direction of the target signal, F, G, H, and U are the source components of the target and other signals, and Y is the observed spectrogram of both channels. In the supervised situation, the target spectral patterns F are given, like this. However, even if F is given, this optimization is too difficult to solve, so it lacks robustness. It also requires much computational time.
  10. Our proposed method approximately divides the problem into unsupervised spatial separation and supervised spectral separation, because we can use classical D.O.A. estimation methods for the spatial separation. These methods are very efficient and stable. Then SNMF is applied to the spectral separation problem. Therefore, this method can be considered a divide-and-conquer method: the most suitable method is applied to each separation.
  11. For the spatial separation, we used directional clustering because it is very fast and stable. This method uses the level difference between the left and right channels as a clustering cue, so we can separate the sources direction-wise. This is equivalent to binary masking in the spectrogram domain. We get the binary mask from the result of the clustering and calculate an entry-wise product. Finally, we obtain the separated direction. However, the separated direction has an artificial distortion owing to the binary masking.
  12. So we proposed a new SNMF-based method named SNMF with spectrogram restoration. This is the concept of our proposed hybrid method. First, the target direction is separated. Then, the target signal is extracted by this new SNMF.
  13. Here, the signal separated by directional clustering has many spectral holes owing to the binary masking. This spectrum is an example. However, the proposed SNMF treats these holes as unseen observations, like this. We exclude these components from the cost function. Then, the target bases are extrapolated using the fittest spectral pattern from the supervised bases F. As a result, the lost components are restored by the supervised basis extrapolation.
  14. This figure shows the directional distribution of the input stereo signal. The target source is in the center direction, and the other interfering sources are distributed like this. After directional clustering, left and right source components leak into the center cluster, and the center sources lose some of their components. These lost components correspond to the spectral holes. After SNMF with spectrogram restoration, the target components are separated and restored using the supervised bases. In other words, the resolution of the target spectrogram is recovered.
  15. This is the decomposition model of SNMF with spectrogram restoration. It is the same as that of simple SNMF. J is the cost function of the proposed SNMF. In this cost function,
  16. We introduce the binary index i, which excludes the holes from the total cost. This index is obtained from the binary mask matrix. Therefore, the divergence is defined at all spectrogram grids except for the spectral holes.
  17. For the grids of the holes, we impose a regularization term to avoid the extrapolation error.
  18. The third term is a penalty term to avoid sharing the same basis between F and H. This penalty improves the separation performance in SNMF.
  19. For the divergence measure, we propose to use the beta-divergence. This is a generalized distance function that includes the EUC-distance, KL-divergence, and IS-divergence when beta = 2, 1, and 0, respectively. In SNMF, it is reported that the KL-divergence is the best criterion for signal separation.
  20. We used two beta-divergences, for the main cost and the regularization cost, as beta_NMF and beta_reg.
  21. From the minimization of the cost function, we can obtain the update rules for the optimization of the variable matrices G, H, and U.
  22. This is the outline of my talk.
  23. This is the experimental condition. The mixed signal includes four melodies. Each sound source is located as in this figure, where the target source is always located in the center direction together with another interfering source. We prepared 3 compositions of instruments and evaluated the average score of 36 patterns. In addition, the supervision signal has 24 notes, like this score, which cover all the notes in the target melody.
  24. This is the result of the experiment. We show the average SDR score, where SDR indicates the total quality of the separation. Directional clustering cannot separate sources in the same direction, so its result was not good. Multichannel NMF strongly depends on the initial values, so its average score is poor. The hybrid method outperforms the conventional SNMF. The conventional SNMF achieves its highest score when beta equals 1, the KL-divergence. However, surprisingly, the EUC-distance is preferable for the proposed hybrid method.
  25. This is because SNMF with spectrogram restoration has two tasks, namely, separation of the target signal and basis extrapolation for the restoration of the spectrogram. It is reported that the KL-divergence is suitable for source separation. In contrast, a divergence with a higher beta value is suitable for the basis extrapolation. This fact is shown experimentally in our paper.
  26. The reason is that if we use a smaller beta value, such as the KL-divergence, the obtained basis becomes sparse. (pointing at the figure) On the other hand, if we use a higher beta value, the sparseness of the basis becomes weak. A sparse basis is not suitable for basis extrapolation using only the observable data. Therefore, the optimal divergence for the hybrid method is around the EUC-distance because of the trade-off between separation and restoration abilities, like this graph. The optimal beta is shifted from 1 to 2.
  27. We also conducted an open data experiment. Here we used different MIDI tone generators for the training and test signals. Therefore, the waveforms are not the same, but similar. In addition, we added background noise to the test signals at SNR = 10 dB.
  28. This is the result. Even if we use a different training sound, we can achieve good results. Sawada’s multichannel NMF does not work because this method cannot reduce the diffuse noise.
  29. These are the conclusions of my talk. Thank you for your attention.
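
The beta-divergence mentioned in note 19 can be written out as a short function. The sketch below uses the standard textbook definition, which is an assumption here because the formula on the slide did not survive extraction; the name `beta_divergence` is hypothetical. With this normalization, beta = 2 gives half the squared Euclidean distance, beta = 1 the generalized KL-divergence, and beta = 0 the IS-divergence.

```python
import numpy as np


def beta_divergence(y, x, beta):
    """Sum of entry-wise beta-divergences D_beta(y | x) (illustrative sketch).

    Both y and x must be strictly positive so that the logarithm and the
    ratio are defined.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 0:
        # Itakura-Saito divergence
        r = y / x
        return float(np.sum(r - np.log(r) - 1.0))
    if beta == 1:
        # Generalized Kullback-Leibler divergence
        return float(np.sum(y * np.log(y / x) - y + x))
    # General case; beta = 2 reduces to 0.5 * (y - x)**2
    num = y ** beta + (beta - 1.0) * x ** beta - beta * y * x ** (beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))
```

In the notation of note 20, the main cost would call such a function with beta_NMF and the regularization cost with beta_reg.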