Data as of 10 December 2024



ISBN 978-3-8439-4914-9

€72.00 incl. VAT, plus shipping


978-3-8439-4914-9, series Informationstechnik

Christian Hümmer
A Bayesian Network Approach to Selected Problems in Speech Signal Processing

172 pages, dissertation, Universität Erlangen-Nürnberg (2019), softcover, A5

Abstract

The application of machine learning techniques to signal processing tasks has attracted increasing interest in recent years. In particular, directed graphical models, known as Bayesian networks, have been shown to provide a powerful framework for deriving links between existing and new algorithms from a generalized point of view. This motivates the systematic Bayesian network approach pursued in this thesis, which can be described as follows. A sequence of real-world observations is modeled as being produced by a set of latent random variables with unknown statistics. The underlying process producing the observations is described by a probabilistic model and its Bayesian network representation. This is the basis for acquiring information about the latent random variables by applying the steps of inference and decision. This machine learning methodology is used consistently to address two distinct speech signal processing tasks from a unifying Bayesian network perspective. First, the problem of single-channel Nonlinear Acoustic Echo Cancellation (NAEC) is considered, with the goal of removing the acoustic coupling between a loudspeaker and a microphone. This leads to the derivation of the NLMS algorithms with fixed and optimum adaptive step-size values as special cases of the Kalman filter. Furthermore, the Elitist Particle Filter based on Evolutionary Strategies (EPFES) is introduced as a new algorithm to estimate the parameters of a nonlinear acoustic echo path model. As a second application, the task of environmentally robust Automatic Speech Recognition (ASR) is addressed by modeling acoustic features as random variables instead of deterministic point estimates. This model is taken into account by modifying the acoustic-model scoring during the recognition phase. To this end, both a well-known and a new uncertainty decoding strategy are derived from a unifying Bayesian network perspective.
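
For readers unfamiliar with the NLMS algorithm named in the abstract, the following is a minimal sketch of a textbook NLMS update with a fixed step size, applied to a linear echo path. It is not taken from the thesis: the function name `nlms_echo_canceller`, its parameters (`num_taps`, `mu`, `eps`) and the linear FIR echo model are assumptions made purely for illustration, and the sketch covers neither the nonlinear echo path model nor the Kalman-filter derivation that the thesis addresses.

```python
import numpy as np


def nlms_echo_canceller(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Textbook NLMS with a fixed step size (illustrative sketch only).

    x: loudspeaker (far-end) signal
    d: microphone signal containing the echo of x
    Returns the estimated echo-path coefficients and the error signal.
    All names and defaults here are hypothetical, not from the thesis.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(num_taps)    # current estimate of the echo path
    buf = np.zeros(num_taps)  # most recent input samples, newest first
    e = np.zeros(len(x))      # echo-cancelled (error) signal

    for n in range(len(x)):
        # Shift the newest loudspeaker sample into the buffer.
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        # Subtract the estimated echo from the microphone sample.
        e[n] = d[n] - w @ buf
        # Gradient step normalized by the input energy (eps avoids 0/0).
        w = w + mu * e[n] * buf / (buf @ buf + eps)
    return w, e
```

In a noise-free simulation with a white input and a short FIR echo path, the estimated coefficients `w` should approach the true echo path, and the error signal `e` should decay toward zero.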