Catalog data as of 12 November 2024

Updated on 12 November 2024

ISBN 978-3-8439-1986-9

€84.00 incl. VAT, plus shipping


Series: Informationstechnik

Jürgen Thomas Geiger
Robust Methods for Content Analysis of Auditory Scenes

184 pages, dissertation, Technische Universität München (2014), softcover, A5

Abstract

Continuing progress in audio analysis methods opens up possibilities for new applications. At the same time, recent improvements bring the established approaches ever closer to their performance limits, which are set by disturbing factors such as overlapping speech, noise, and reverberation. This thesis addresses both aspects. First, it proposes ideas for a system for the classification of acoustic scenes and a method for acoustic gait-based person identification, both of which are relatively new audio recognition tasks. Second, it presents improvements for two established tasks, speaker diarization and robust speech recognition. To improve speaker diarization, different approaches for detecting overlapping speech are proposed. To increase the robustness of a speech recognition system against noise and reverberation, an approach using memory-enhanced acoustic modelling is employed. Together, the proposed modules form a complete system for auditory scene analysis: starting from a coarse classification of the scene as a whole, persons can be identified by their step sounds or voice, followed by a transcription of the spoken content. Experimental evaluations on publicly available databases or within public research challenges demonstrate the efficiency of the proposed methods.