Joint Noise Reduction and lp-Norm Minimization for Enhancing Time Delay Estimation in Colored Noise


1. Introduction

Estimating the time delay between two signals received at spatially separated sensors is an important problem in signal processing [1]. It has many practical applications, such as multichannel speech enhancement, echo cancellation, and wireless communications. The basic problem of time delay estimation (TDE) is to estimate the time delay accurately while excluding the influence of noise and interference. Many TDE approaches have been proposed. They mainly include the generalized correlation method [2], the statistical method [3], the parametric estimation method [4], the adaptive estimation method [5] [6], the combinational estimation method [7], and the $l_p$-norm minimization-based estimation method [8]. Among them, the $l_p$-norm minimization-based method finds the time delay by minimizing an $l_p$-norm objective function. It was reported in [8] that this method is more robust against impulse noise than several other conventional approaches. However, these conventional TDE approaches do not explicitly account for the influence of noise, especially under low signal-to-noise ratio (SNR) conditions. Moreover, they are in general effective only in white noise.

Speech enhancement techniques have been applied in speech recognition and voice communication. They recover the clean speech signal from the noisy signal by noise reduction. Speech enhancement algorithms can be classified as single-channel or multichannel algorithms. Multichannel speech enhancement algorithms usually mix multiple noisy signals for noise reduction. By contrast, single-channel speech enhancement algorithms use only one noisy signal and thus do not change the time delay of the noisy signal at each channel. Single-channel speech enhancement algorithms therefore have the potential to improve the performance of a TDE algorithm. At present, single-channel speech enhancement algorithms mainly include the spectral subtraction-based methods [9], the Kalman filtering-based parametric method [10], the statistic-based approach [11] [12], and the subspace-based method [13]. These conventional algorithms are suitable for white noise reduction. To deal with colored noise, one conventional approach is to multiply the noisy speech signal by the square root of the inverse of the noise covariance matrix [14]. Another conventional approach is to prewhiten the signal using the covariance matrix of the colored noise. These prewhitening approaches all require the covariance matrix of the colored noise to be estimated in advance. Recently, to avoid the disadvantage of estimating the covariance matrix of the colored noise, an improved subspace method was proposed in [15]. It was reported that the improved subspace method outperforms conventional speech enhancement methods in colored noise reduction.

In this paper, a new method for enhancing time delay estimation (TDE) in colored noise is presented by joining noise reduction and $l_p$-norm minimization. We first apply the improved subspace method to enhance the signals corrupted by colored noise, and then we use the $l_p$-norm minimization-based TDE method to estimate the time delay from the enhanced signals. Experimental results show that the proposed joint algorithm obtains more accurate TDE than several conventional algorithms in colored noise, especially at low signal-to-noise ratios.

2. TDE Signal Model and Estimation

2.1. Signal Model

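Consider the following TDE signal model:

$$x_1(n) = s(n) + v_1(n), \qquad x_2(n) = \beta\, s(n - D) + v_2(n), \qquad n = 1, \ldots, N \tag{1}$$

where $s(n)$ is the unknown random source signal, $\beta$ is the attenuation factor, $D$ is the time delay to be estimated, $v_1(n)$ and $v_2(n)$ are uncorrelated noise observations which are independent of $s(n)$, and $N$ is the sampling length of the noisy signals. The goal of TDE is to estimate the delay $D$ from the noisy observations $x_1(n)$ and $x_2(n)$.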
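To make the roles of the quantities in the signal model concrete, the following minimal Python sketch generates a pair of observations in which the second channel is an attenuated, delayed copy of the source. The function name `simulate_observations`, the restriction to an integer delay, the white Gaussian noise, and the zero padding before the first sample are all illustrative assumptions; the experiments in this paper use colored noise taken from recorded corpora.

```python
import random


def simulate_observations(s, delay, beta, noise_std, seed=0):
    """Return (x1, x2): x1[n] = s[n] + v1[n], x2[n] = beta * s[n - delay] + v2[n].

    Assumptions (illustrative only): integer delay, white Gaussian noise,
    and s[n] = 0 outside the recorded range.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for n in range(len(s)):
        k = n - delay
        delayed = s[k] if 0 <= k < len(s) else 0.0
        x1.append(s[n] + rng.gauss(0.0, noise_std))
        x2.append(beta * delayed + rng.gauss(0.0, noise_std))
    return x1, x2
```

With `noise_std=0.0` the second output is exactly the scaled, shifted source, which is a convenient sanity check before noise is added.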

2.2. l_{p}-Norm Minimization-Based Estimation

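For robust TDE, Ma and Nikias [16] introduced the following $l_p$-norm cost function of the delay $D$ and the attenuation factor $\beta$:

$$J_p(D, \beta) = \sum_{n=M+1}^{N-M} \left| x_2(n) - \beta \sum_{i=-M}^{M} \mathrm{sinc}(i - D)\, x_1(n - i) \right|^p \tag{2}$$

where $1 \le p \le 2$, $M$ is the order parameter and $\mathrm{sinc}(t) = \sin(\pi t)/(\pi t)$ is the sinc function. (2) can be written in a matrix form as:

$$J_p(D, \beta) = \left\| \mathbf{x}_2 - \beta\, \mathbf{X}_1 \mathbf{c}(D) \right\|_p^p \tag{3}$$

where $\|\cdot\|_p$ denotes the $l_p$-norm, $\mathbf{x}_2 = [x_2(M+1), \ldots, x_2(N-M)]^T$, $\mathbf{c}(D) = [\mathrm{sinc}(-M - D), \ldots, \mathrm{sinc}(M - D)]^T$, and $\mathbf{X}_1$ is the $(N-2M) \times (2M+1)$ matrix whose row for time index $n$ is $[x_1(n+M), \ldots, x_1(n-M)]$.

To minimize $J_p(D, \beta)$, Zeng et al. presented an efficient two-step procedure [8]. In the first step, the global optimum $\hat{\beta}(D)$ is estimated for each given $D$. Writing $\mathbf{y} = \mathbf{X}_1 \mathbf{c}(D)$, the estimation of the global optimum has the following three cases.

Case 1: $p = 2$. The cost function is a one-dimensional quadratic function of $\beta$. Its optimal solution is given by

$$\hat{\beta}(D) = \frac{\mathbf{y}^T \mathbf{x}_2}{\mathbf{y}^T \mathbf{y}} \tag{4}$$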
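The quadratic case can be sketched in a few lines of Python. The sketch assumes that the first channel is delayed by truncated sinc interpolation of order M and that only samples for which all interpolation indices are valid are kept; the helper names `sinc`, `delayed_signal`, and `beta_least_squares` are hypothetical and chosen only for this illustration.

```python
import math


def sinc(t):
    """Normalized sinc, sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def delayed_signal(x1, delay, order):
    """y[n] = sum_{i=-M..M} sinc(i - D) * x1[n - i], for n with all indices valid."""
    taps = [sinc(i - delay) for i in range(-order, order + 1)]
    return [
        sum(taps[i + order] * x1[n - i] for i in range(-order, order + 1))
        for n in range(order, len(x1) - order)
    ]


def beta_least_squares(x2_trim, y):
    """Closed-form minimizer of sum (x2 - beta * y)^2 over beta."""
    return sum(a * b for a, b in zip(y, x2_trim)) / sum(b * b for b in y)
```

Here `x2_trim` is the second channel restricted to the same index range as `y`, for example `x2[order:len(x2) - order]`.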

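Case 2: $p = 1$. The cost function is the least absolute deviation function:

$$J_1(D, \beta) = \sum_{n} \left| x_{2,n} - \beta y_n \right| = \sum_{n} |y_n| \left| \frac{x_{2,n}}{y_n} - \beta \right| \tag{5}$$

where $x_{2,n}$ and $y_n$ are the $n$th elements of $\mathbf{x}_2$ and of the sinc-interpolated, delayed first-channel signal $\mathbf{y} = \mathbf{X}_1 \mathbf{c}(D)$, respectively. Let $z_n = x_{2,n}/y_n$; then the optimal solution of the cost function is the weighted median of the sequence $\{z_n\}$ with the weights $\{|y_n|\}$. The procedure for computing $\hat{\beta}(D)$ is listed in Algorithm 1.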
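A weighted median can be computed by sorting the candidate values and accumulating their weights until half of the total weight is reached. The Python sketch below follows that idea. It is not a transcription of Algorithm 1; the function names are hypothetical, and samples where the delayed signal is exactly zero are skipped on the assumption that they contribute only a constant to the cost.

```python
def weighted_median(values, weights):
    """Return a minimizer of sum_n weights[n] * |values[n] - b| over b."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("weighted_median needs at least one value")


def beta_least_absolute(x2_trim, y):
    """Minimizer of sum |x2 - beta * y| as a weighted median of the ratios x2 / y."""
    ratios = [a / b for a, b in zip(x2_trim, y) if b != 0.0]
    weights = [abs(b) for b in y if b != 0.0]
    return weighted_median(ratios, weights)
```

Because the estimate is one of the observed ratios, a single very large outlier in the second channel does not move it, which is the source of the robustness of the $p = 1$ case.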

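Case 3: $1 < p < 2$. The cost function is differentiable with respect to $\beta$. Thus the following fixed-point iteration is used to find the optimal solution:

$$\beta^{(k+1)} = \frac{\sum_{n} w_n^{(k)} y_n x_{2,n}}{\sum_{n} w_n^{(k)} y_n^2} \tag{6}$$

where $\beta^{(k)}$ is the estimate of $\beta$ in the $k$th iteration, and

$$w_n^{(k)} = \left| x_{2,n} - \beta^{(k)} y_n \right|^{p-2}$$

where $x_{2,n}$ and $y_n$ are the $n$th elements of $\mathbf{x}_2$ and $\mathbf{y} = \mathbf{X}_1 \mathbf{c}(D)$, respectively.

In the second step, a search range $[D_{\min}, D_{\max}]$ and a step size $\Delta$ are first determined. The value of $D$ then increases from $D_{\min}$ to $D_{\max}$ in steps of $\Delta$, and the delay profile $J_p(D, \hat{\beta}(D))$ is computed by substituting each given $D$ and the corresponding $\hat{\beta}(D)$ into the cost function. Finally, the minimizer of the delay profile is taken as the estimate of $D$.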
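The two-step procedure for $1 < p < 2$ can be sketched as follows: for each candidate delay on a grid, the attenuation factor is refined by a reweighted least-squares fixed-point iteration, and the delay with the smallest cost is returned. This is a minimal sketch under stated assumptions, not the implementation of [8]: the residual floor, the iteration limit, the stopping tolerance, the least-squares starting point, and all function names are choices made for this illustration.

```python
import math


def sinc(t):
    """Normalized sinc, sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def delayed_signal(x1, delay, order):
    """Truncated sinc interpolation of x1 delayed by `delay` samples."""
    taps = [sinc(i - delay) for i in range(-order, order + 1)]
    return [
        sum(taps[i + order] * x1[n - i] for i in range(-order, order + 1))
        for n in range(order, len(x1) - order)
    ]


def beta_irls(x2_trim, y, p, iterations=50, floor=1e-8):
    """Fixed-point (reweighted least-squares) estimate of beta for 1 < p < 2."""
    # Start from the least-squares solution (the p = 2 case).
    beta = sum(a * b for a, b in zip(y, x2_trim)) / sum(b * b for b in y)
    for _ in range(iterations):
        # Residuals are floored so the weights |r|^(p - 2) stay finite.
        w = [max(abs(a - beta * b), floor) ** (p - 2.0) for a, b in zip(x2_trim, y)]
        num = sum(wi * b * a for wi, a, b in zip(w, x2_trim, y))
        den = sum(wi * b * b for wi, b in zip(w, y))
        new_beta = num / den
        if abs(new_beta - beta) < 1e-12:
            return new_beta
        beta = new_beta
    return beta


def estimate_delay(x1, x2, p, order, d_min, d_max, step):
    """Grid search over D; returns the (delay, beta) pair with the smallest l_p cost."""
    x2_trim = x2[order:len(x2) - order]
    best = None
    for k in range(int(round((d_max - d_min) / step)) + 1):
        d = d_min + k * step
        y = delayed_signal(x1, d, order)
        beta = beta_irls(x2_trim, y, p)
        cost = sum(abs(a - beta * b) ** p for a, b in zip(x2_trim, y))
        if best is None or cost < best[0]:
            best = (cost, d, beta)
    return best[1], best[2]
```

The grid resolution limits the achievable accuracy, so the step size trades computation against delay resolution.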

3. Proposed TDE Algorithm

3.1. Improved Subspace Method for Colored Noise Reduction

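Without loss of generality, we consider the following noisy signal model:

$$x(n) = s(n) + v(n) \tag{7}$$

where $s(n)$ is the clean signal and $v(n)$ is the additive colored noise. Our goal is to restore $s(n)$ from $x(n)$ by colored noise reduction.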

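Recently, an improved subspace method for colored noise reduction was presented in [15]. Let the colored noise be modeled as a $p$th-order autoregressive (AR) process

$$v(n) = \sum_{i=1}^{p} a_i v(n - i) + e(n) \tag{8}$$

where $a_1, \ldots, a_p$ are the AR noise model parameters and $e(n)$ is the driving noise, which is assumed to be white with variance $\sigma_e^2$. Let $K$ denote the length of one signal frame, and let $\mathbf{x} = [x(1), \ldots, x(K)]^T$, $\mathbf{s} = [s(1), \ldots, s(K)]^T$, and $\mathbf{v} = [v(1), \ldots, v(K)]^T$. Then (7) can be written in a vector form:

$$\mathbf{x} = \mathbf{s} + \mathbf{v} \tag{9}$$

and (8) can be written in a vector form:

$$\mathbf{H} \mathbf{v} = \mathbf{e}$$

where $\mathbf{e} = [e(1), \ldots, e(K)]^T$ and $\mathbf{H}$ is the $K \times K$ whitening matrix, which is lower triangular and Toeplitz with ones on the main diagonal and $-a_i$ on the $i$th subdiagonal:

$$\mathbf{H} = \begin{bmatrix}
1 & 0 & \cdots & & & 0 \\
-a_1 & 1 & 0 & & & \vdots \\
\vdots & \ddots & \ddots & \ddots & & \\
-a_p & \cdots & -a_1 & 1 & \ddots & \\
 & \ddots & & \ddots & \ddots & 0 \\
0 & & -a_p & \cdots & -a_1 & 1
\end{bmatrix}$$

Multiplying (9) by $\mathbf{H}$, we have

$$\tilde{\mathbf{x}} = \tilde{\mathbf{s}} + \mathbf{e} \tag{10}$$

where $\tilde{\mathbf{x}} = \mathbf{H}\mathbf{x}$ and $\tilde{\mathbf{s}} = \mathbf{H}\mathbf{s}$. Since $\mathbf{e}$ is white noise, the conventional subspace method for white noise reduction can be directly used to estimate $\tilde{\mathbf{s}}$, given as

$$\hat{\tilde{\mathbf{s}}} = \mathbf{U} \boldsymbol{\Lambda} \left( \boldsymbol{\Lambda} + \mu \sigma_e^2 \mathbf{I} \right)^{-1} \mathbf{U}^T \tilde{\mathbf{x}} \tag{11}$$

where $\mathbf{I}$ is the identity matrix, $\mu$ is the Lagrangian multiplier, $\mathbf{R}_{\tilde{s}} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T$ is the covariance matrix of the whitened signal vector, and $\mathbf{U}$ and $\boldsymbol{\Lambda}$ consist of the eigenvectors and eigenvalues of $\mathbf{R}_{\tilde{s}}$, respectively. Then the clean speech signal can be estimated by

$$\hat{\mathbf{s}} = \mathbf{H}^{-1} \hat{\tilde{\mathbf{s}}} \tag{12}$$

For each signal frame, the improved subspace algorithm (denoted as ISS) thus consists of three steps: whitening the noisy frame as in (10), applying the subspace estimator (11), and recovering the speech estimate via (12).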
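The whitening and de-whitening steps that surround the subspace estimator can be illustrated with a short Python sketch. It assumes the AR sign convention in which each noise sample equals a weighted sum of past samples plus the driving noise, and it assumes the AR coefficients are already known; estimating them and the eigen-decomposition used by the subspace estimator are deliberately left out. The function names are hypothetical.

```python
def whitening_matrix(ar_coeffs, frame_len):
    """Unit lower-triangular Toeplitz matrix with -a_i on the i-th subdiagonal."""
    h = [[0.0] * frame_len for _ in range(frame_len)]
    for r in range(frame_len):
        h[r][r] = 1.0
        for i, a in enumerate(ar_coeffs, start=1):
            if r - i >= 0:
                h[r][r - i] = -a
    return h


def whiten(h, frame):
    """Matrix-vector product H @ frame."""
    return [sum(hrc * x for hrc, x in zip(row, frame)) for row in h]


def unwhiten(h, whitened):
    """Solve H s = whitened by forward substitution (H has a unit diagonal)."""
    s = []
    for r, row in enumerate(h):
        s.append(whitened[r] - sum(row[c] * s[c] for c in range(r)))
    return s
```

Because the matrix is unit lower triangular it is always invertible, so the de-whitening step needs no explicit matrix inverse.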

3.2. Proposed TDE Algorithm

In this section, we introduce a new method for enhancing time delay estimation (TDE) in colored noise, based on joint noise reduction and $l_p$-norm minimization. The improved subspace method for colored noise reduction is first applied to each noisy signal. The time delay is then estimated from the enhanced signals by $l_p$-norm minimization. The proposed TDE algorithm is listed in Algorithm 3. Compared with conventional TDE algorithms, the proposed TDE algorithm can greatly reduce the interference of colored noise, so that the TDE accuracy is enhanced.

4. Experimental Results

In this section, we conduct numerical simulations to demonstrate the effectiveness of the proposed algorithm. We compare the proposed TDE algorithm with the TDE algorithm based on $l_p$-norm minimization without noise reduction. We also compare it with two other TDE algorithms based on $l_p$-norm minimization with noise reduction, where the minimum mean square error (MMSE) and maximum a posteriori (MAP) estimators of the magnitude-squared spectrum (denoted as MMSE-MSS and MAP-MSS) are used for noise reduction, respectively. The source signals and the noise signals are taken from the NOIZEUS [17] and NOISEX [18] corpora, respectively. We randomly select twenty different speech sentences from the NOIZEUS corpus. Babble and factory noises are selected from the NOISEX corpus. Each speech sentence is corrupted by these two noises at different input SNRs. The true delay is set to, the attenuation factor is, the approximation order parameter is, the delay search range is, and the search step size is. To evaluate the performance of the proposed method, we use the root mean square error (RMSE), which is defined as:

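$$\mathrm{RMSE} = \sqrt{\frac{1}{N_s} \sum_{m=1}^{N_s} \left( \hat{D}_m - D \right)^2} \tag{13}$$

where $N_s$ is the number of speech sentences, $\hat{D}_m$ is the delay estimate of the $m$th speech sentence, and $D$ is the true delay. By the $l_p$-norm minimization method we see that and are the best choice in babble noise and factory noise, respectively.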
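The RMSE over a set of sentences is straightforward to compute. The short Python sketch below is only an illustration; the function name `rmse` and its argument names are hypothetical.

```python
import math


def rmse(delay_estimates, true_delay):
    """Root mean square error of per-sentence delay estimates against the true delay."""
    n = len(delay_estimates)
    return math.sqrt(sum((d - true_delay) ** 2 for d in delay_estimates) / n)
```

For example, two estimates that are each off by one sample give an RMSE of exactly one sample.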

In the first test, we run the four algorithms for different values of the input SNR. Figure 1 and Figure 2 display the RMSE results of the four algorithms for input SNRs from 0 dB to 10 dB in factory noise and babble noise, respectively. From the two figures, we first see that all four algorithms obtain a higher RMSE as the input SNR decreases. This indicates that noise degrades the performance of TDE. Second, we see that the proposed algorithm outperforms the TDE algorithm based on $l_p$-norm minimization without noise reduction for all input SNRs in terms of RMSE. Third, the proposed TDE algorithm achieves a lower RMSE than the other two TDE algorithms with noise reduction, based on the MMSE-MSS and MAP-MSS estimators, respectively. This also indicates that the proposed algorithm obtains the most accurate TDE in terms of RMSE.

Figure 1. RMSE of TDE for the four algorithms at different input SNRs in factory noise.

Figure 2. RMSE of TDE for the four algorithms at different input SNRs in babble noise.

In the second test, we run the four algorithms at an input SNR of 5 dB for different values of $p$. Figure 3 and Figure 4 display the RMSE results of the four algorithms in factory noise and babble noise, respectively. From the two figures, we first see that the proposed algorithm outperforms TDE without speech enhancement for every value of $p$. Second, the proposed TDE algorithm achieves a lower RMSE than the other two TDE algorithms with noise reduction, based on the MMSE-MSS and MAP-MSS estimators, respectively. This indicates that the proposed algorithm obtains the most accurate TDE for every value of $p$.

Figure 3. RMSE of TDE for the four algorithms for different values of $p$ at an input SNR of 5 dB in factory noise.

Figure 4. RMSE of TDE for the four algorithms for different values of $p$ at an input SNR of 5 dB in babble noise.

References

[1] Xing, H.Y. and Tang, J. (2008) Analysis and Survey of Algorithms for Time-Delay Estimation. Technical Acoustics, 27, 110-114.

[2] Knapp, C.H. and Carter, G.C. (1976) The Generalized Correlation Method for Estimation of Time Delay. IEEE Trans. Acoust. Speech Signal Process., 24, 320-327. http://dx.doi.org/10.1109/TASSP.1976.1162830

[3] Torrieri, D.J. (1984) Statistical Theory of Passive Location Systems. IEEE Transactions on Aerospace and Electronic Systems, 20, 183-197. http://dx.doi.org/10.1109/TAES.1984.310439

[4] Lui, K.W.K., Chan, F.K.W. and So, H.C. (2009) Accurate Time Delay Estimation Based Passive Localization. Signal Processing, 89, 1835-1838. http://dx.doi.org/10.1016/j.sigpro.2009.03.009

[5] Palanisamy, P. and Kalyanasundaram, N. (2010) Sonar Target Detection by Modified Adaptive Noise Cancellation Using Correlating Filter. International Journal of Electronics, 98, 41-60. http://dx.doi.org/10.1080/00207217.2010.497670

[6] Widrow, B. and Stearns, S.D. (1993) Adaptive Signal Processing. Prentice-Hall, Englewood Cliffs, NJ.

[7] Lotf, S., Mourad, T. and Sabeur, A.A.C. (2013) Performance of Wavelet Analysis and Neural Networks for Pathological Voices Identification. International Journal of Electronics, 98, 1129-1140.

[8] Zeng, W.J., So, H.C. and Zoubir, A.M. (2013) An $l_p$-Norm Minimization Approach to Time Delay Estimation in Impulsive Noise. Digital Signal Processing, 23, 1247-1254. http://dx.doi.org/10.1016/j.dsp.2013.03.013

[9] Boll, S. (1979) Suppression of Acoustic Noise in Speech Using Spectral Subtraction. IEEE Trans. Acoust. Speech Signal Process., 27, 113-120. http://dx.doi.org/10.1109/TASSP.1979.1163209

[10] So, S. and Paliwal, K.K. (2011) Suppressing the Influence of Additive Noise on the Kalman Gain for Low Residual Noise Speech Enhancement. Speech Communication, 53, 355-378. http://dx.doi.org/10.1016/j.specom.2010.10.006

[11] Lu, Y. and Loizou, P.C. (2011) Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty. IEEE Transactions on Audio, Speech and Language Processing, 19, 1123-1137. http://dx.doi.org/10.1109/TASL.2010.2082531

[12] Martin, R. (2005) Speech Enhancement Based on Minimum Mean-Square Error Estimation and Supergaussian Priors. IEEE Transactions on Speech and Audio Processing, 13, 845-856. http://dx.doi.org/10.1109/TSA.2005.851927

[13] Ephraim, Y. and Van Trees, H.L. (1995) A Signal Subspace Approach for Speech Enhancement. IEEE Transactions on Speech and Audio Processing, 3, 251-266. http://dx.doi.org/10.1109/89.397090

[14] Lev-Ari, H. and Ephraim, Y. (2003) Extension of the Signal Subspace Speech Enhancement Approach to Colored Noise. IEEE Signal Processing Letters, 10, 104-106. http://dx.doi.org/10.1109/LSP.2003.808544

[15] Wei, Q. and Xia, Y.S. (2013) A Novel Prewhitening Subspace Method for Enhancing Speech Corrupted by Colored Noise. Proc. IEEE CISP, 3, 1282-1286. http://dx.doi.org/10.1109/cisp.2013.6743870

[16] Ma, X. and Nikias, C.L. (1996) Joint Estimation of Time Delay and Frequency Delay in Impulsive Noise Using Fractional Lower Order Statistics. IEEE Trans. Signal Process., 44, 2669-2687. http://dx.doi.org/10.1109/78.542175

[17] Loizou, P.C. (2007) Speech Enhancement: Theory and Practice. CRC, Boca Raton, FL, USA.

[18] Varga, A. and Steeneken, H.J.M. (1993) Assessment for Automatic Speech Recognition: II. NOISEX-92: A Database and an Experiment to Study the Effect of Additive Noise on Speech Recognition Systems. Speech Communication, 12, 247-251. http://dx.doi.org/10.1016/0167-6393(93)90095-3