Robust FIR System Identification for Super-Gaussian Noise Based on Hyperbolic Secant Distribution (ISPACS2018)
1. Robust FIR System Identification for Super-Gaussian Noise Based on Hyperbolic Secant Distribution
H. Tanji, T. Murakami, H. Kamata
School of Science and Technology, Meiji University, Japan
ISPACS2018, Nov. 2018.
2. Outline
FIR System Identification
Overview
Statistical perspective
Proposed Method
Statistical model using super-Gaussian distribution
Optimization algorithm
Simulations in Noisy Environments
Conclusions
3. FIR System Identification
Estimation of characteristics of an unknown system
FIR model
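[Block diagram: the known reference signal drives the unknown system, and noise corrupts its output to give the observed signal.]
FIR (Finite Impulse Response): the model requires no feedback path.
Optimization problem
is to obtain the FIR coefficients minimizing the error between the observed signal and the output of the FIR model.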
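The FIR model and the optimization problem on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the names `x` (reference signal), `h_true` (unknown FIR coefficients), `y` (observed signal), and the helper `regression_matrix` are assumptions introduced here, and plain least squares stands in for the generic error minimization.

```python
import numpy as np

def regression_matrix(x, order):
    """Stack delayed copies of the reference signal as columns (zero initial state)."""
    n = len(x)
    X = np.zeros((n, order))
    for k in range(order):
        X[k:, k] = x[:n - k]          # column k holds x delayed by k samples
    return X

rng = np.random.default_rng(0)
order = 4                              # hypothetical FIR order for the demo
h_true = np.array([1.0, -0.5, 0.25, 0.1])
x = rng.standard_normal(500)           # known reference signal
X = regression_matrix(x, order)
noise = 0.01 * rng.standard_normal(500)
y = X @ h_true + noise                 # observed signal: FIR output plus noise

# Least-squares estimate: minimizes the squared error between y and X @ h.
h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(h_hat, 3))
```

Because the model has no feedback path, the output is linear in the coefficients, which is why the estimate reduces to a linear regression on delayed copies of the reference signal.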
4. Identification in Actual Environments
We develop a robust algorithm
towards acoustic signal processing applications.
The observed signal is corrupted by super-Gaussian noise.
Speech
Music
Impulsive noise
[Block diagram: the known reference signal drives the unknown system, and noise corrupts its output to give the observed signal.]
5. FIR System Identification
Optimization problem
is to obtain the FIR coefficients minimizing the error between the observed signal and the output of the FIR model.
The cost function measures the dispersion of the error.
Requirements for the cost function:
it is nonnegative for any error,
it is zero if and only if the error is zero.
6. FIR System Identification
Examples of cost functions that measure the dispersion of the error:
L2 norm
L1 norm
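The two example cost functions can be written down directly. The sketch below assumes the L2 cost is the sum of squared errors and the L1 cost is the sum of absolute errors (the formulas on the slide are not in the transcript), and the function names and error vectors are hypothetical. It shows how a single outlier affects each cost.

```python
import numpy as np

def l2_cost(e):
    """Sum of squared errors: nonnegative, zero only if every error is zero."""
    return float(np.sum(np.asarray(e, dtype=float) ** 2))

def l1_cost(e):
    """Sum of absolute errors: nonnegative, zero only if every error is zero."""
    return float(np.sum(np.abs(np.asarray(e, dtype=float))))

e_clean = np.array([0.1, -0.2, 0.1, 0.0])
e_outlier = np.array([0.1, -0.2, 0.1, 5.0])   # last sample hit by an impulse

# How much does one impulsive sample inflate each cost?
l2_ratio = l2_cost(e_outlier) / l2_cost(e_clean)
l1_ratio = l1_cost(e_outlier) / l1_cost(e_clean)
print(l2_ratio, l1_ratio)
```

The L2 cost grows far more than the L1 cost when an outlier appears, which is the usual motivation for moving away from the Gaussian assumption under impulsive noise.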
7. A Statistical Perspective
Assume a noise distribution.
Maximize the log-likelihood function with respect to the FIR coefficients.
8. A Statistical Perspective
Assume a noise distribution.
Maximize the log-likelihood function with respect to the FIR coefficients.
For Gaussian noise:
Assuming the Gaussian distribution is equivalent to L2 norm optimization [Gustafsson].
For super-Gaussian noise (speech, music, etc.):
Assuming the Laplace distribution is equivalent to L1 norm optimization [Song 2014, Bottegal 2015].
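The equivalence between the assumed noise distribution and the norm being minimized follows from writing out the log-likelihood. The derivation below is a sketch under stated assumptions: the errors e_t (observed signal minus FIR model output) are independent, N is the number of samples, and the symbols \sigma (Gaussian standard deviation) and b (Laplace scale) are introduced here rather than taken from the slides.

```latex
\begin{align}
\text{Gaussian:}\quad
\log \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi\sigma^{2}}}
  \exp\!\left(-\frac{e_t^{2}}{2\sigma^{2}}\right)
  &= -\frac{N}{2}\log\!\left(2\pi\sigma^{2}\right)
     - \frac{1}{2\sigma^{2}} \sum_{t=1}^{N} e_t^{2}, \\
\text{Laplace:}\quad
\log \prod_{t=1}^{N} \frac{1}{2b}
  \exp\!\left(-\frac{|e_t|}{b}\right)
  &= -N\log(2b) - \frac{1}{b} \sum_{t=1}^{N} |e_t| .
\end{align}
```

For fixed \sigma or b, maximizing the first expression over the FIR coefficients minimizes the sum of squared errors, and maximizing the second minimizes the sum of absolute errors.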
9. Proposed Statistical Model
We assume that the noise follows
the hyperbolic secant (sech) distribution [Baten 1937].
Why do we use the sech distribution?
The nonlinear function for a super-Gaussian source in ICA
is related to the derivative of the sech distribution.
We draw on work in Independent Component Analysis (ICA)
as statistical acoustic modeling [Bell 1995, Benesty].
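The link between the sech distribution and the ICA nonlinearity can be made explicit. The formulas below are a sketch: the zero-mean parametrization with scale \sigma (variance \sigma^2) is an assumption, since the density shown on the slide is not in the transcript.

```latex
\begin{align}
p(n) &= \frac{1}{2\sigma}\,
        \operatorname{sech}\!\left(\frac{\pi n}{2\sigma}\right), \\
\frac{\partial}{\partial n}\log p(n)
     &= -\frac{\pi}{2\sigma}\,
        \tanh\!\left(\frac{\pi n}{2\sigma}\right).
\end{align}
```

Up to sign and scaling, the derivative of the log-density is the hyperbolic tangent, the nonlinear function commonly used for super-Gaussian sources in ICA.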
10. Comparison of distributions
[Figure, two panels, each comparing the Gauss, Laplace, and sech distributions.]
Compared with the Gaussian distribution, the sech distribution
has a heavy tail
is concentrated at its center
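The two properties listed on this slide can be checked numerically. The sketch below assumes zero-mean, unit-variance versions of the three densities; the function names and the evaluation points `x_peak` and `x_tail` are hypothetical choices made for this illustration.

```python
import numpy as np

def gauss_pdf(x):
    """Zero-mean, unit-variance Gaussian density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def laplace_pdf(x):
    """Zero-mean Laplace density with scale b = 1/sqrt(2), so the variance is 1."""
    b = 1.0 / np.sqrt(2.0)
    return np.exp(-np.abs(x) / b) / (2.0 * b)

def sech_pdf(x):
    """Zero-mean, unit-variance hyperbolic secant density."""
    return 0.5 / np.cosh(0.5 * np.pi * x)

x_peak, x_tail = 0.0, 4.0
for name, pdf in (("Gauss", gauss_pdf), ("Laplace", laplace_pdf), ("Sech", sech_pdf)):
    print(f"{name:8s} p({x_peak}) = {pdf(x_peak):.4f}   p({x_tail}) = {pdf(x_tail):.2e}")
```

At equal variance the sech density is higher than the Gaussian both at the center and far out in the tail, and it sits between the Gaussian and the Laplace density at both points.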
11. Parameter Estimation
Cost function based on the sech distribution
(it contains the inverse of the sech distribution)
Which parameter should we estimate?
Gauss, Laplace: linear
Sech: nonlinear
Answer: the FIR coefficients and the scale parameter
12. Parameter Estimation | MM algorithm
Majorization-Minimization (MM) algorithm
minimizes an upper bound of the cost function; the bound depends on an auxiliary variable and is never smaller than the cost function,
updates the parameters and the auxiliary variable alternately.
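The general MM principle can be stated compactly. In the sketch below, J is the cost function, \theta collects the parameters, \xi is the auxiliary variable, and Q is the upper bound; these symbols are assumptions introduced here, since the slide's notation is not in the transcript.

```latex
\begin{gather}
J(\theta) \le Q(\theta, \xi) \quad \text{for all } \theta,\ \xi,
\qquad
J(\theta) = \min_{\xi} Q(\theta, \xi), \\
\xi^{(k+1)} = \arg\min_{\xi} Q\!\left(\theta^{(k)}, \xi\right),
\qquad
\theta^{(k+1)} = \arg\min_{\theta} Q\!\left(\theta, \xi^{(k+1)}\right), \\
J\!\left(\theta^{(k+1)}\right)
  \le Q\!\left(\theta^{(k+1)}, \xi^{(k+1)}\right)
  \le Q\!\left(\theta^{(k)}, \xi^{(k+1)}\right)
  = J\!\left(\theta^{(k)}\right).
\end{gather}
```

The last line shows why the alternating updates never increase the cost function, without any step size.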
13. Parameter Estimation | MM algorithm
Majorization-Minimization (MM) algorithm for the sech-based cost function
minimizes an upper bound of the cost function,
obtained from an inequality for the log-hyperbolic cosine [Ono 2010],
updates the FIR coefficients, the scale parameter, and the auxiliary variable using the solutions of the minimization of the upper bound.
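A quadratic upper bound for the log-hyperbolic cosine can be verified numerically. The sketch below assumes the bound takes the standard form log cosh(r) <= log cosh(r0) + tanh(r0) / (2 r0) * (r^2 - r0^2), with equality at r = r0 or r = -r0; whether the slide states the inequality in exactly this form is an assumption, and the function names are hypothetical.

```python
import numpy as np

def log_cosh(r):
    """Numerically stable log(cosh(r))."""
    r = np.abs(r)
    return r + np.log1p(np.exp(-2.0 * r)) - np.log(2.0)

def quadratic_bound(r, r0):
    """Quadratic majorizer of log cosh, tangent at |r| = |r0| (requires r0 != 0)."""
    weight = np.tanh(r0) / r0
    return log_cosh(r0) + 0.5 * weight * (r**2 - r0**2)

r = np.linspace(-10.0, 10.0, 2001)
min_gaps = {}
for r0 in (0.1, 1.0, 3.0):
    gap = quadratic_bound(r, r0) - log_cosh(r)   # should never be negative
    min_gaps[r0] = float(np.min(gap))
    print(f"r0 = {r0}: smallest gap = {min_gaps[r0]:.2e}")
```

Because the bound is quadratic in r, minimizing it over the FIR coefficients becomes a weighted least-squares problem, with weights tanh(r0) / r0 that shrink for large errors.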
14. Parameter Estimation | Update rules
The update rules require NO step size parameter.
An estimate of the scale parameter is needed,
because it is not canceled in the update rules.
Majorization-Minimization (MM) algorithm for the sech-based cost function
The nonnegativity of the scale parameter estimate
is guaranteed
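One way such step-size-free updates can look is sketched below. This is not the authors' update rule, which is not in the transcript: it is an MM iteration derived here from the quadratic log-cosh bound, assuming the noise density sech(pi*n/(2*sigma)) / (2*sigma). All names (`mm_sech_fit`, `sech_cost`, `sample_sech`) and the demo settings are hypothetical.

```python
import numpy as np

def log_cosh(r):
    r = np.abs(r)
    return r + np.log1p(np.exp(-2.0 * r)) - np.log(2.0)

def sech_cost(X, y, h, sigma):
    """Negative log-likelihood under the assumed sech noise model."""
    e = y - X @ h
    return len(y) * np.log(2.0 * sigma) + np.sum(log_cosh(0.5 * np.pi * e / sigma))

def mm_sech_fit(X, y, n_iter=50):
    """Alternate auxiliary-variable, coefficient, and scale updates (no step size)."""
    h, *_ = np.linalg.lstsq(X, y, rcond=None)      # least-squares initialization
    sigma = np.std(y - X @ h) + 1e-12
    costs = [sech_cost(X, y, h, sigma)]
    for _ in range(n_iter):
        # Auxiliary variable: normalized error at the current estimate.
        r0 = 0.5 * np.pi * (y - X @ h) / sigma
        # Weights tanh(r0) / r0, using the limit 1 when r0 is numerically zero.
        small = np.abs(r0) < 1e-8
        safe = np.where(small, 1.0, r0)
        w = np.where(small, 1.0, np.tanh(safe) / safe)
        # Coefficient update: weighted least squares minimizes the bound in h.
        Xw = X * w[:, None]
        h = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        # Scale update: closed-form minimizer of the bound, a square root of a
        # nonnegative sum (assumes the residuals are not all exactly zero).
        e = y - X @ h
        sigma = np.sqrt(np.pi**2 * np.sum(w * e**2) / (4.0 * len(y)))
        costs.append(sech_cost(X, y, h, sigma))
    return h, sigma, costs

def sample_sech(rng, sigma, size):
    """Inverse-CDF sampling from the assumed sech density."""
    u = rng.uniform(1e-12, 1.0 - 1e-12, size=size)
    return (2.0 * sigma / np.pi) * np.log(np.tan(0.5 * np.pi * u))

rng = np.random.default_rng(0)
n, order, sigma_true = 2000, 8, 0.5
x = rng.standard_normal(n)
X = np.column_stack([np.concatenate([np.zeros(k), x[:n - k]]) for k in range(order)])
h_true = rng.standard_normal(order)
y = X @ h_true + sample_sech(rng, sigma_true, n)

h_hat, sigma_hat, costs = mm_sech_fit(X, y)
print(np.max(np.abs(h_hat - h_true)), sigma_hat)
```

In this sketch the scale estimate stays inside the weights and the normalized errors, so it has to be updated together with the coefficients, and each iteration can only lower the cost.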
15. Simulations | Overview
We estimate an unknown FIR system in noisy environments.
Length of the output: 32000
Order of FIR model: 64
Evaluation index: Normalized Mean Squared Error (NMSE) [Morgan 1998],
averaged over 50 random trials
Noise:
(a) Sech distributed noise (Kurtosis = 2.00)
(b) Factory noise (Kurtosis = 6.09)
(c) Speech + 5 dB of Gaussian noise (Kurtosis = 1.10)
(d) Speech + 10 dB of Gaussian noise (Kurtosis = 1.64)
Competitors:
Least squares solution (Gaussian-based method) [Gustafsson]
Expectation-Maximization method based on the Laplace distribution [Song 2014]
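The evaluation index can be computed in a few lines. The sketch below assumes the common definition of NMSE as the squared coefficient error normalized by the squared norm of the true impulse response, expressed in dB; the exact formula used in the simulations is not in the transcript, and the function name `nmse_db` is hypothetical.

```python
import numpy as np

def nmse_db(h_true, h_est):
    """Normalized squared coefficient error in dB (assumed definition)."""
    h_true = np.asarray(h_true, dtype=float)
    h_est = np.asarray(h_est, dtype=float)
    return 10.0 * np.log10(np.sum((h_true - h_est) ** 2) / np.sum(h_true ** 2))

h_true = np.array([1.0, 0.0, 0.0, 0.0])
h_est = np.array([0.9, 0.0, 0.0, 0.0])   # 10 % error on the only nonzero tap
value = nmse_db(h_true, h_est)
print(round(value, 2))
```

Lower values are better. Averaging over random trials, as in the simulations, would mean repeating the estimation with fresh noise and averaging the resulting values.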
16. Simulations | Noise
[Figure, four panels:]
(a) Sech distributed noise (Kurtosis = 2.00)
(b) Factory noise (Kurtosis = 6.09)
(c) Speech + 5 dB of Gaussian noise (Kurtosis = 1.10)
(d) Speech + 10 dB of Gaussian noise (Kurtosis = 1.64)
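The kurtosis of 2.00 quoted for sech distributed noise matches the excess kurtosis of the hyperbolic secant distribution, which can be confirmed by numerical integration. The sketch below assumes unit-variance densities and uses a plain Riemann sum; the helper `excess_kurtosis` and the grid settings are hypothetical. The values for the recorded noises (factory, speech) cannot be reproduced without the data.

```python
import numpy as np

def excess_kurtosis(pdf, lo=-60.0, hi=60.0, n=240001):
    """Excess kurtosis (fourth standardized moment minus 3) by a Riemann sum."""
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    p = pdf(x)
    p = p / (np.sum(p) * dx)                 # renormalize on the finite grid
    mean = np.sum(x * p) * dx
    var = np.sum((x - mean) ** 2 * p) * dx
    m4 = np.sum((x - mean) ** 4 * p) * dx
    return m4 / var**2 - 3.0

def gauss_pdf(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def laplace_pdf(x):
    b = 1.0 / np.sqrt(2.0)
    return np.exp(-np.abs(x) / b) / (2.0 * b)

def sech_pdf(x):
    return 0.5 / np.cosh(0.5 * np.pi * x)

k_gauss = excess_kurtosis(gauss_pdf)
k_laplace = excess_kurtosis(laplace_pdf)
k_sech = excess_kurtosis(sech_pdf)
print(round(k_gauss, 3), round(k_laplace, 3), round(k_sech, 3))
```

A positive excess kurtosis is what marks a distribution as super-Gaussian, so all four noise conditions in the simulations fall into that class.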
17. Simulations | NMSE
The proposed method shows the best performance in (a), (b), and (c).
[Figure, four panels of NMSE results: (a) Sech distributed noise, (b) Factory noise, (c) Speech + 5 dB of Gaussian noise, (d) Speech + 10 dB of Gaussian noise.]
18. Simulations | Behavior of convergence
The proposed method converges
significantly faster than the Laplace-based method.
Termination condition:
[Figure: averaged convergence curve of NMSE for (d) Speech + 10 dB of Gaussian noise, at 20 dB input SNR.]
Proposed method: terminated at the 16th update (max.)
Laplace-based method: terminated at the 500th update (min.)
19. Conclusions
FIR System Identification
Based on Super-Gaussian Distribution
We have introduced a statistical model based on the sech distribution.
The relation between
the sech distribution and the nonlinear function in ICA has been discovered.
The proposed method has shown
fast convergence and favorable estimation performance
in super-Gaussian noise environments.
Future Direction
Developing an adaptive algorithm based on the sech distribution
20. References I
[Gustafsson]
F. Gustafsson, Adaptive Filtering and Change Detection, Wiley, 2000.
[Song 2014]
W. Song, W. Yao, and Y. Xing, “Robust mixture regression model fitting by Laplace
distribution,” Computational Statistics & Data Analysis, vol. 71, no. Supplement C, pp.
128–137, 2014.
[Bottegal 2015]
G. Bottegal, H. Hjalmarsson, A.Y. Aravkin, and G. Pillonetto, “Outlier robust kernel-based
system identification using l1-Laplace techniques,” in Proc. 54th IEEE Conference
on Decision and Control (CDC), Dec. 2015, pp. 2109–2114.
[Baten 1937]
W.D. Baten, “The probability law for the sum of independent variables, each subject
to the law ”, Bulletin of the American Mathematical Society,
40(4), pp.284-290, Apr. 1937.
21. References II
[Bell 1995]
A.J. Bell and T.J. Sejnowski, “An information-maximization approach to blind separation
and blind deconvolution,” Neural Computation, vol. 7, no. 6, pp. 1129–1159, 1995.
[Benesty]
J. Benesty and Y. Huang, Adaptive signal processing: applications to real-world problems,
Springer, 2003.
[Morgan 1998]
D.R. Morgan, J. Benesty, and M.M. Sondhi, “On the evaluation of estimated impulse
responses,” IEEE Signal Processing Letters, vol. 5, no. 7, pp. 174–176, 1998.
[Ono 2010]
N. Ono and S. Miyabe, “Auxiliary-function-based independent component analysis for
super-Gaussian sources,” in Proc. 9th International Conference on Latent Variable
Analysis and Signal Separation (LVA/ICA), 2010, pp. 165–172.