Catalog data as of 15 November 2024
ISBN 978-3-8439-4914-9, series Informationstechnik (Information Technology)
Christian Hümmer: A Bayesian Network Approach to Selected Problems in Speech Signal Processing
172 pages, dissertation, Universität Erlangen-Nürnberg (2019), softcover, A5
The application of machine learning techniques to signal processing tasks has attracted increasing interest in recent years. In particular, directed graphical models, known as Bayesian networks, have been shown to provide a powerful framework for deriving links between existing and new algorithms from a generalized point of view. This motivates the systematic Bayesian network approach taken in this thesis, which is described as follows. A sequence of real-world observations is modeled as being produced by a set of latent random variables with unknown statistics. The underlying process producing the observations is described by a probabilistic model and its Bayesian network representation. This is the basis for acquiring information about the latent random variables by applying the steps of inference and decision. This machine learning methodology is used consistently to address two distinct speech signal processing tasks from a unifying Bayesian network perspective.

First, the problem of single-channel Nonlinear Acoustic Echo Cancellation (NAEC) is considered, with the goal of removing the acoustic coupling between a loudspeaker and a microphone. This leads to the derivation of the NLMS algorithms with fixed and optimum adaptive step-size values as special cases of the Kalman filter. Furthermore, the Elitist Particle Filter based on Evolutionary Strategies (EPFES) is introduced as a new algorithm to estimate the parameters of a nonlinear acoustic echo path model.

As a second application, the task of environmentally robust Automatic Speech Recognition (ASR) is addressed by modeling acoustic features as random variables instead of deterministic point estimates. This model is taken into account by modifying the acoustic-model scoring during the recognition phase. To this end, both a well-known and a new uncertainty decoding strategy are derived from a unifying Bayesian network perspective.
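
For readers unfamiliar with the NLMS algorithm named in the abstract, the following is a minimal sketch of a textbook NLMS adaptive filter with a fixed step size, applied to a simulated echo. It is not taken from the thesis and does not reproduce its Kalman-filter derivation or its nonlinear echo path model. The function names, the four-tap echo path, the step size of 0.5, and the noise-free linear simulation are all illustrative assumptions.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step_size=0.5, eps=1e-8):
    """Textbook NLMS with a fixed step size (illustrative, not the thesis's code).

    far_end: loudspeaker samples; mic: microphone samples containing the echo.
    Returns the residual error signal and the final filter weights.
    """
    weights = [0.0] * num_taps
    buffer = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        buffer = [x] + buffer[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buffer))
        e = d - echo_estimate  # residual after removing the estimated echo
        # Normalize the update by the input energy; eps avoids division by zero.
        norm = sum(b * b for b in buffer) + eps
        gain = step_size * e / norm
        weights = [w + gain * b for w, b in zip(weights, buffer)]
        errors.append(e)
    return errors, weights


def simulate_echo(far_end, echo_path):
    """Hypothetical linear, noise-free echo: far_end convolved with echo_path."""
    return [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_end))
    ]


# Assumed toy setup: white Gaussian far-end signal and a short echo path.
rng = random.Random(0)
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_path = [0.6, -0.3, 0.1, 0.05]
mic = simulate_echo(far_end, true_path)
residual, weights = nlms_echo_canceller(far_end, mic, num_taps=4)
```

In this idealized setting the estimated weights approach the simulated echo path and the residual decays toward zero. A real acoustic echo path is much longer, and near-end speech and noise would require the step-size control that the thesis addresses.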