[EBOOK] Speech Enhancement Methods Based On Casa Incorporating Spectral Correlation PDF Download

Signal processing

Speech Enhancement Methods Based on CASA Incorporating Spectral Correlation

Book Details:

Author : Feng Bao
Publisher :
Release : 2018
ISBN :
Pages : 141 pages

Download or read book Speech Enhancement Methods Based on CASA Incorporating Spectral Correlation written by Feng Bao. This book was released on 2018 with total page 141 pages. Available in PDF, EPUB and Kindle. Book excerpt: Computational auditory scene analysis (CASA) has shown great potential for speech enhancement compared to some statistical model-based methods. A challenge for CASA is how to estimate the binary mask or ratio mask effectively in each time-frequency (T-F) unit. In this thesis, four speech enhancement methods with binary mask or ratio mask estimation are proposed based on the spectral relationship among noisy speech, pure noise and clean speech. The common use of fixed thresholds in the conventional CASA method constrains segregation and T-F unit labeling, affecting the performance of de-noising. Thus, an adaptive factor is first derived from the power spectra of noisy speech and estimated noise to replace those fixed thresholds. As a result, noise reduction is achieved with improved pitch contour and T-F unit labeling. A new binary mask estimation method is proposed based on convex optimization to reduce artifacts and temporal discontinuity caused by the inaccuracy of binary mask estimation. Signal segregation and pitch estimation are not needed in this method; only speech power is considered as a key cue for labeling the binary mask. The cross-correlation between the noisy speech and estimated noise power spectra in each channel is employed to build the objective function. The T-F units of speech and noise are labeled with a decision factor derived from the powers of noisy speech, estimated speech, and pre-estimated noise respectively. Erroneous local masks are refined by time-frequency unit smoothing. As a consequence, noise is effectively reduced and the perceptual quality of the enhanced speech is improved.
A new estimation method of ratio mask in terms of Wiener filtering is proposed in order to further increase the temporal continuity of reconstructed speech. In this method, the speech power of each T-F unit is obtained by a convex optimization method. The objective function depends also on the cross-correlation between the noisy speech and estimated noise power spectra. To improve the accuracy of estimation, the estimated ratio mask is further modified based on its adjacent time-frequency units and then smoothed by interpolating with the estimated binary masks. The results confirmed that the performances related to noise reduction, speech quality, and speech intelligibility are all improved. A novel ratio mask representation by exploiting the inter-channel correlation (ICC) among the noisy speech, pure noise and clean speech spectra is proposed to further improve enhancement performance. In this way, the power ratio of speech and noise is reallocated adaptively during the construction of ratio mask, so that more speech components are retained and more noise components are masked. In this method, the channel-weight contour based on the equal loudness hearing attribute is adopted to revise the ratio mask in each T-F unit. The developed ratio mask is utilized to train a five-layer Deep Neural Network (DNN) with other features. Experiments show significant improvements in speech quality and intelligibility compared to DNN-based methods without ICC.
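
The two mask types named in the excerpt can be illustrated with a minimal sketch. It assumes the speech and noise powers in each T-F unit are already known (an oracle setting), whereas the thesis estimates them; the function names, the 0 dB local criterion, and the toy values are assumptions made for illustration, not the author's method.

```python
import math

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Label each T-F unit 1 (speech-dominant) or 0 (noise-dominant).

    lc_db is a local SNR criterion in dB; 0 dB is an assumed default.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10((s + EPS) / (n + EPS))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def ratio_mask(speech_power, noise_power):
    """Wiener-style soft mask: speech power over total power, in [0, 1]."""
    return [[s / (s + n + EPS) for s, n in zip(s_row, n_row)]
            for s_row, n_row in zip(speech_power, noise_power)]


# Toy powers for 2 channels x 3 frames (made-up values).
speech = [[4.0, 0.1, 1.0], [0.5, 2.0, 0.0]]
noise = [[1.0, 1.0, 1.0], [1.0, 0.5, 1.0]]
binary = ideal_binary_mask(speech, noise)
soft = ratio_mask(speech, noise)
```

The binary mask makes a hard keep-or-discard decision per unit, while the ratio mask attenuates each unit in proportion to its estimated speech content, which is one reason soft masks tend to give smoother reconstructed speech.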

Computers

Speech Enhancement

Book Details:

Author : Shoji Makino
Publisher : Springer Science & Business Media
Release : 2005-03-17
ISBN : 9783540240396
Pages : 432 pages

Download or read book Speech Enhancement written by Shoji Makino and published by Springer Science & Business Media. This book was released on 2005-03-17 with total page 432 pages. Available in PDF, EPUB and Kindle. Book excerpt: We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise reduction but also dereverberation and separation of independent signals. These topics are also covered in this book. However, the general emphasis is on noise reduction because of the large number of applications that can benefit from this technology. The goal of this book is to provide a strong reference for researchers, engineers, and graduate students who are interested in the problem of signal and speech enhancement. To do so, we invited well-known experts to contribute chapters covering the state of the art in this focused field.

Technology & Engineering

Canonical Correlation Analysis in Speech Enhancement

Book Details:

Author : Jacob Benesty
Publisher : Springer
Release : 2017-08-31
ISBN : 3319670204
Pages : 124 pages

Download or read book Canonical Correlation Analysis in Speech Enhancement written by Jacob Benesty and published by Springer. This book was released on 2017-08-31 with total page 124 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book focuses on the application of canonical correlation analysis (CCA) to speech enhancement using the filtering approach. The authors explain how to derive different classes of time-domain and time-frequency-domain noise reduction filters, which are optimal from the CCA perspective for both single-channel and multichannel speech enhancement. Enhancement of noisy speech has been a challenging problem for many researchers over the past few decades and remains an active research area. Typically, speech enhancement algorithms operate in the short-time Fourier transform (STFT) domain, where the clean speech spectral coefficients are estimated using a multiplicative gain function. A filtering approach, which can be performed in the time domain or in the subband domain, obtains an estimate of the clean speech sample at every time instant or time-frequency bin by applying a filtering vector to the noisy speech vector. Compared to the multiplicative gain approach, the filtering approach more naturally takes into account the correlation of the speech signal in adjacent time frames. In this study, the authors pursue the filtering approach and show how to apply CCA to the speech enhancement problem. They also address the problem of adaptive beamforming from the CCA perspective, and show that the well-known Wiener and minimum variance distortionless response (MVDR) beamformers are particular cases of a general class of CCA-based adaptive beamformers.

Technology & Engineering

Speech Enhancement

Book Details:

Author : Philipos C. Loizou
Publisher : CRC Press
Release : 2007-06-07
ISBN :
Pages : 640 pages

Download or read book Speech Enhancement written by Philipos C. Loizou and published by CRC Press. This book was released on 2007-06-07 with total page 640 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book covers traditional speech enhancement algorithms, such as spectral subtraction and Wiener filtering algorithms as well as state-of-the-art algorithms including minimum mean-squared error algorithms that incorporate signal-presence uncertainty and subspace algorithms that incorporate psychoacoustic models. The coverage includes objective and subjective measures used to evaluate speech quality and intelligibility. Divided into three parts, the book presents the digital-signal processing and speech signal fundamentals needed to understand speech enhancement algorithms, the various classes of speech enhancement algorithms proposed over the last two decades, and the methods and measures used to evaluate the performance of speech enhancement algorithms.
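
As a rough illustration of the two traditional algorithm families the excerpt names, the sketch below computes per-bin gains for power spectral subtraction and for a Wiener filter. It is a textbook-style simplification under stated assumptions: the noise power is taken as known and positive, the a priori SNR is approximated by `max(y / n - 1, 0)`, and the `floor` parameter and function names are hypothetical choices, not taken from the book.

```python
def spectral_subtraction_gain(noisy_power, noise_power, floor=0.01):
    """Per-bin amplitude gain from power spectral subtraction.

    floor is an assumed spectral floor that keeps the gain above zero
    when the noise estimate meets or exceeds the noisy power.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if y <= 0.0:
            gains.append(floor ** 0.5)
            continue
        # Estimated clean power is y - n, so the power gain is 1 - n / y.
        gains.append(max(1.0 - n / y, floor) ** 0.5)
    return gains


def wiener_gain(noisy_power, noise_power):
    """Wiener gain xi / (1 + xi); assumes every noise power is > 0."""
    gains = []
    for y, n in zip(noisy_power, noise_power):
        xi = max(y / n - 1.0, 0.0)  # crude a priori SNR estimate
        gains.append(xi / (1.0 + xi))
    return gains


# Toy power spectra for three frequency bins (made-up values).
noisy = [4.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0]
ss_gain = spectral_subtraction_gain(noisy, noise)
w_gain = wiener_gain(noisy, noise)
```

With this particular SNR estimate, the Wiener gain in a bin equals the square of the unfloored spectral subtraction gain, which shows how closely the two approaches are related.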

Adaptive signal processing

Spectral Refinements to Speech Enhancement

Book Details:

Author : Werayuth Charoenruengkit
Publisher :
Release : 2009
ISBN :
Pages : 248 pages

Download or read book Spectral Refinements to Speech Enhancement written by Werayuth Charoenruengkit. This book was released on 2009 with total page 248 pages. Available in PDF, EPUB and Kindle. Book excerpt: The goal of a speech enhancement algorithm is to remove noise and recover the original signal with as little distortion and residual noise as possible. Most successful real-time algorithms operate in the frequency domain, where the frequency amplitude of clean speech is estimated per short-time frame of the noisy signal. State-of-the-art short-time spectral amplitude estimators estimate the clean spectral amplitude in terms of the power spectral density (PSD) function of the noisy signal. In principle, the PSD has to be computed from a large ensemble of signal realizations. In practice, however, it can only be estimated from a finite-length sample of a single realization of the signal. Estimation errors introduced by these limitations cause the solution to deviate from the optimum. Various spectral estimation techniques, many with added spectral smoothing, have been investigated for decades to reduce the estimation errors. These algorithms do not significantly address the quality of speech as perceived by a human. This dissertation presents analysis and techniques that offer spectral refinements toward speech enhancement. We present an analytical framework for the effect of spectral estimate variance on the performance of speech enhancement. We use the variance quality factor (VQF) as a quantitative measure of estimated spectra. We show that reducing the spectral estimator VQF significantly reduces the VQF of the enhanced speech. The Autoregressive Multitaper (ARMT) spectral estimate is proposed as a low-VQF spectral estimator for use in speech enhancement algorithms.
An innovative method of incorporating a speech production model using multiband excitation is also presented as a technique to emphasize the harmonic components of the glottal speech input. The preconditioning of the noisy estimates by exploiting other avenues of information, such as pitch estimation and the speech production model, effectively increases the localized narrow-band signal-to-noise ratio (SNR) of the noisy signal, which is subsequently denoised by the amplitude gain. Combined with voicing structure enhancement, the ARMT spectral estimate delivers enhanced speech with sound clarity desirable to human listeners. The resulting improvements in enhanced speech are observed to be significant in both objective and subjective measurements.

Technology & Engineering

Speech Enhancement

Book Details:

Author : Jacob Benesty
Publisher : Elsevier
Release : 2014-01-04
ISBN : 0128002530
Pages : 143 pages

Download or read book Speech Enhancement written by Jacob Benesty and published by Elsevier. This book was released on 2014-01-04 with total page 143 pages. Available in PDF, EPUB and Kindle. Book excerpt: Speech enhancement is a classical problem in signal processing, yet still largely unsolved. Two of the conventional approaches for solving this problem are linear filtering, like the classical Wiener filter, and subspace methods. These approaches have traditionally been treated as different classes of methods and have been introduced in somewhat different contexts. Linear filtering methods originate in stochastic processes, while subspace methods have largely been based on developments in numerical linear algebra and matrix approximation theory. This book bridges the gap between these two classes of methods by showing how the ideas behind subspace methods can be incorporated into traditional linear filtering. In the context of subspace methods, the enhancement problem can then be seen as a classical linear filter design problem. This means that various solutions can more easily be compared and their performance bounded and assessed in terms of noise reduction and speech distortion. The book shows how various filter designs can be obtained in this framework, including the maximum SNR, Wiener, LCMV, and MVDR filters, and how these can be applied in various contexts, like in single-channel and multichannel speech enhancement, and in both the time and frequency domains.
- First short book treating subspace approaches in a unified way for time and frequency domains, single-channel, multichannel, as well as binaural, speech enhancement
- Bridges the gap between optimal filtering methods and subspace approaches
- Includes original presentation of subspace methods from different perspectives

Technology & Engineering

Speech Enhancement

Book Details:

Author : Philipos C. Loizou
Publisher : CRC Press
Release : 2013-02-25
ISBN : 1466504218
Pages : 715 pages

Download or read book Speech Enhancement written by Philipos C. Loizou and published by CRC Press. This book was released on 2013-02-25 with total page 715 pages. Available in PDF, EPUB and Kindle. Book excerpt: With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at improving speech intelligibility.
Fundamentals, Algorithms, Evaluation, and Future Steps: Organized into four parts, the book begins with a review of the fundamentals needed to understand and design better speech enhancement algorithms. The second part describes all the major enhancement algorithms and, because these require an estimate of the noise spectrum, also covers noise estimation algorithms. The third part of the book looks at the measures used to assess the performance, in terms of speech quality and intelligibility, of speech enhancement methods. It also evaluates and compares several of the algorithms. The fourth part presents binary mask algorithms for improving speech intelligibility under ideal conditions. In addition, it suggests steps that can be taken to realize the full potential of these algorithms under realistic conditions.
What’s New in This Edition:
- Updates in every chapter
- A new chapter on objective speech intelligibility measures
- A new chapter on algorithms for improving speech intelligibility
- Real-world noise recordings (on accompanying CD)
- MATLAB® code for the implementation of intelligibility measures (on accompanying CD)
- MATLAB and C/C++ code for the implementation of algorithms to improve speech intelligibility (on accompanying CD)
Valuable Insights from a Pioneer in Speech Enhancement: Clear and concise, this book explores how human listeners compensate for acoustic noise in noisy environments. Written by a pioneer in speech enhancement and noise reduction in cochlear implants, it is an essential resource for anyone who wants to implement or incorporate the latest speech enhancement algorithms to improve the quality and intelligibility of speech degraded by noise.
Includes a CD with Code and Recordings: The accompanying CD provides MATLAB implementations of representative speech enhancement algorithms as well as speech and noise databases for the evaluation of enhancement algorithms.

Technology & Engineering

A Perspective on Single Channel Frequency Domain Speech Enhancement

Book Details:

Author : Jacob Benesty
Publisher : Morgan & Claypool Publishers
Release : 2011-03-01
ISBN : 1608456994
Pages : 111 pages

Download or read book A Perspective on Single Channel Frequency Domain Speech Enhancement written by Jacob Benesty and published by Morgan & Claypool Publishers. This book was released on 2011-03-01 with total page 111 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book focuses on a class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Even though many popular algorithms have been proposed through more than four decades of continuous research, there are a number of critical areas where our understanding and capabilities still remain quite rudimentary, especially with respect to the relationship between noise reduction and speech distortion. All existing frequency-domain algorithms, no matter how they are developed, have one feature in common: the solution is eventually expressed as a gain function applied to the STFT of the noisy signal only in the current frame. As a result, the narrowband signal-to-noise ratio (SNR) cannot be improved, and any gains achieved in noise reduction on the fullband basis come with a price to pay, which is speech distortion. In this book, we present a new perspective on the problem by exploiting the difference between speech and typical noise in circularity and interframe self-correlation, which were ignored in the past. By gathering the STFT of the microphone signal of the current frame, its complex conjugate, and the STFTs in the previous frames, we construct several new, multiple-observation signal models similar to a microphone array system: there are multiple noisy speech observations, and their speech components are correlated but not completely coherent while their noise components are presumably uncorrelated. 
Therefore, the multichannel Wiener filter and the minimum variance distortionless response (MVDR) filter that were usually associated with microphone arrays will be developed for single-channel noise reduction in this book. This might instigate a paradigm shift geared toward speech distortionless noise reduction techniques.
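
The MVDR idea mentioned in the excerpt can be sketched for the special case the text describes, where the noise components of the observations are uncorrelated, so the noise covariance matrix is diagonal. The vector `d` (how the desired speech appears in each observation, relative to the first), the variance values, and the function names are hypothetical; this is the standard closed-form MVDR solution, not code from the book.

```python
def mvdr_uncorrelated_noise(d, noise_var):
    """MVDR weights h = Phi^{-1} d / (d^H Phi^{-1} d) for diagonal Phi.

    d: complex vector describing the speech component in each observation.
    noise_var: positive noise variances (the diagonal of Phi).
    """
    weighted = [di / v for di, v in zip(d, noise_var)]  # Phi^{-1} d
    denom = sum((di.conjugate() * wi).real
                for di, wi in zip(d, weighted))          # d^H Phi^{-1} d
    return [wi / denom for wi in weighted]


def apply_filter(h, y):
    """Filter output z = h^H y."""
    return sum(hi.conjugate() * yi for hi, yi in zip(h, y))


# Three observations of one T-F bin (made-up values).
d = [1.0 + 0.0j, 0.8 + 0.2j, 0.5 - 0.3j]
noise_var = [1.0, 2.0, 4.0]

h = mvdr_uncorrelated_noise(d, noise_var)
response = apply_filter(h, d)  # gain applied to the speech component
noise_out = sum(abs(hi) ** 2 * v for hi, v in zip(h, noise_var))
```

The filter passes the speech component with unit gain (the distortionless constraint) while the residual noise power drops below that of the best single observation, which is the sense in which such filters aim at noise reduction without speech distortion.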

Computers

Phase based Speech Processing

Book Details:

Author : Parham Aarabi
Publisher : World Scientific
Release : 2006
ISBN : 9812566120
Pages : 153 pages

Download or read book Phase based Speech Processing written by Parham Aarabi and published by World Scientific. This book was released on 2006 with total page 153 pages. Available in PDF, EPUB and Kindle. Book excerpt: This is the first book that takes a detailed look at the importance of phase in the design of speech processing systems. Phase, in comparison with amplitude, is often ignored for speech recognition applications. Thus, this book highlights some of the important ways in which the phase of speech signals can be utilized for sound localization, enhancement, and recognition. This book also discusses the state-of-the-art research in phase-based speech processing, starting from the basics of signal processing and recording, to single microphone speech recognition, the recognition of speech and the processing of speech by humans, as well as the importance of phase in human speech recognition and multi-microphone phase-based speech processing.

Technology & Engineering

Fundamentals of Speech Enhancement

Book Details:

Author : Jacob Benesty
Publisher : Springer
Release : 2018-02-09
ISBN : 3319745247
Pages : 112 pages

Download or read book Fundamentals of Speech Enhancement written by Jacob Benesty and published by Springer. This book was released on 2018-02-09 with total page 112 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book presents and develops several important concepts of speech enhancement in a simple but rigorous way. Many of the ideas are new; not only do they shed light on this old problem but they also offer valuable tips on how to improve on some well-known conventional approaches. The book unifies all aspects of speech enhancement, from single channel, multichannel, beamforming, time domain, frequency domain and time–frequency domain, to binaural in a clear and flexible framework. It starts with an exhaustive discussion on the fundamental best (linear and nonlinear) estimators, showing how they are connected to various important measures such as the coefficient of determination, the correlation coefficient, the conditional correlation coefficient, and the signal-to-noise ratio (SNR). It then goes on to show how to exploit these measures in order to derive all kinds of noise reduction algorithms that can offer an accurate and versatile compromise between noise reduction and speech distortion.

Speech Enhancement for Non stationary Noise Based on Spectral Processing

Book Details:

Author : Mads Helle
Publisher :
Release :
ISBN :
Pages :

Technology & Engineering

Speech Enhancement

Book Details:

Author : Jacob Benesty
Publisher : Springer Science & Business Media
Release : 2006-03-30
ISBN : 3540274898
Pages : 416 pages

Download or read book Speech Enhancement written by Jacob Benesty and published by Springer Science & Business Media. This book was released on 2006-03-30 with total page 416 pages. Available in PDF, EPUB and Kindle. Book excerpt: A strong reference on the problem of signal and speech enhancement, describing the newest developments in this exciting field. The general emphasis is on noise reduction, because of the large number of applications that can benefit from this technology.

Audiology

Feature based Speech Enhancement Techniques Based on Spectral Subtraction and Wiener Filtering

Book Details:

Author : Mike V. Chan
Publisher :
Release : 1999
ISBN :
Pages : 508 pages

Signal processing

A Method of Speech Enhancement Based Upon Across frequency Envelope Correlation

Book Details:

Author : James Timothy Fuerstnau
Publisher :
Release : 2001
ISBN :
Pages : 236 pages

Multi channel Speech Enhancement by Regularized Optimization

Book Details:

Author : Meng Yu
Publisher :
Release : 2012
ISBN : 9781267454515
Pages : 123 pages

Download or read book Multi channel Speech Enhancement by Regularized Optimization written by Meng Yu. This book was released on 2012 with total page 123 pages. Available in PDF, EPUB and Kindle. Book excerpt: Speech enhancement aims to eliminate noise and unexpected interferences that degrade speech quality and intelligibility in realistic listening situations. It is an indispensable technique in telecommunication and assistive listening devices such as hands-free mobile phones and hearing aids. Though a lot of research has been done in this area, only a limited number of methods can be effective in both real-time and real-world conditions. Difficulties include noise types (incoherent, coherent, diffuse), an a priori unknown number of noise sources, mobility of source locations, room reverberations, and non-stationarity. In this thesis, we focus on speech enhancement by suppressing coherent noise and reverberation. Classical speech enhancement methods rely on data from a single microphone. Spectral estimation methods, such as spectral subtraction, Wiener filtering and the subspace method, are most widely used. In recent years, microphone array techniques have been developed and recognized as more powerful and promising solutions. Cross-channel cancellation is incorporated in the thesis to resolve the spatial difference between channels, which helps to blindly identify channel impulse responses and forms the constraints between channels as well. An L1 regularized minimization framework is applied to speech signal processing, with the regularization applied on channel impulse responses and speech spectrogram, respectively. The over-fitting problem in the filter and spectrogram estimation is overcome by the sparsity regularization. The split Bregman method is used to derive the updating rules for speech enhancement in the time domain, while in the spectral domain non-negativity is applied on the spectrogram magnitude of speech signal and impulse response.
Therefore, the proposed speech dereverberation method is solved under a constrained non-negative matrix factorization (NMF) framework in the spectrogram magnitude domain. The thesis is organized as follows. In chapter 1, the mathematical frameworks on L1 minimization and NMF are introduced, respectively. Under the L1 minimization framework, chapters 2, 3 and 4 present the convex speech enhancement model, musical noise reduction and overlapping speech detection method, respectively. The multichannel speech dereverberation method is presented in chapter 5 under a constrained NMF framework. The thesis is concluded in chapter 6.
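
The excerpt does not give the thesis's update rules, but L1-regularized solvers of the split Bregman type generally rely on one building block: the soft-thresholding (shrinkage) operator, which solves the scalar problem of minimizing 0.5 * (x - v)^2 + t * |x|. The sketch below shows only that operator; the function name and sample values are assumptions for illustration.

```python
def soft_threshold(values, threshold):
    """Shrink each value toward zero by threshold; small values become 0.

    This is the proximal operator of threshold * |x|, the step that
    produces sparsity in L1-regularized minimization.
    """
    out = []
    for v in values:
        mag = abs(v) - threshold
        if mag <= 0.0:
            out.append(0.0)  # coefficient is zeroed out
        else:
            out.append(mag if v > 0 else -mag)
    return out


# Made-up filter coefficients; entries with magnitude <= 1.0 are removed.
shrunk = soft_threshold([3.0, -0.5, 0.2, -2.0], 1.0)
```

Applied to an estimated impulse response or spectrogram, this step zeroes small coefficients and keeps the dominant ones, which is how a sparsity penalty counters over-fitting.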

LPC Analysis synthesis of Noisy Speech Based on Spectral Segments

Book Details:

Author : Sanguoon Chung
Publisher :
Release : 1987
ISBN :
Pages : 408 pages

Speech Enhancement in Modulation Domain Using Codebook based Speech and Noise Estimation

Book Details:

Author : Vidhyasagar Mani
Publisher :
Release : 2016
ISBN :
Pages :

Download or read book Speech Enhancement in Modulation Domain Using Codebook based Speech and Noise Estimation written by Vidhyasagar Mani. This book was released on 2016. Available in PDF, EPUB and Kindle. Book excerpt: "Conventional single-channel speech enhancement methods implement the analysis-modification-synthesis (AMS) framework in the acoustic frequency domain. Recently, it has been shown that the extension of this framework to the modulation domain may result in better noise suppression. However, this conclusion has been reached by relying on a minimum statistics approach for the required noise power spectral density (PSD) estimation. Various noise estimation algorithms have been proposed over the years in the speech and audio processing literature. Among these, the widely used minimum statistics approach is known to introduce a time frame lag in the estimated noise spectrum. This can lead to highly inaccurate PSD estimates when the noise behaviour rapidly changes with time, i.e., non-stationary noise. Speech enhancement methods which employ these inaccurate noise PSD estimates tend to perform poorly in the noise suppression task, and in worst cases, may end up deteriorating the noisy speech signal even further.
Noise PSD estimation algorithms using a priori information about the noise statistics have been shown to track non-stationary noise better than the conventional algorithms which rely on the minimum statistics approach. In this thesis, we perform noise suppression in the modulation domain with the noise and speech PSD derived from an estimation scheme which employs the a priori information of various speech and noise types. Specifically, codebooks of gain normalized linear prediction coefficients obtained from training on various speech and noise files are used as the a priori information while performing the estimation of the desired PSD. The PSD estimates derived from this codebook approach are used to obtain a minimum mean square error (MMSE) estimate of the clean speech modulation magnitude spectrum, which is then combined with the phase spectrum of the noisy speech to recover the enhanced speech signal. The enhanced speech signal is subjected to various objective experiments for evaluation. Results of these evaluations indicate improvement in noise suppression with the proposed codebook-based modulation domain approach over competing approaches, particularly in cases of non-stationary noise."