Data as of 10 December 2024



ISBN 978-3-8439-5280-4

39.00 € incl. VAT, plus shipping


Series: Ingenieurwissenschaften (Engineering Sciences)

Andreas Spruck
Techniques for Preprocessing, Augmenting, and Synthesizing Image and Video Data for Object Classification Using Neural Networks

168 pages, dissertation, Universität Erlangen-Nürnberg (2023), softcover, A5

Abstract

In recent years, neural networks have become very prominent in the field of image processing. However, the effect of the input data, and of the methods applied to it, has not yet been studied sufficiently. This thesis investigates techniques for preprocessing, augmenting, and synthesizing image and video data for object classification using neural networks. The developed techniques are evaluated for different object classification applications, ranging from single-object classification to optical character recognition. Preprocessing is first examined in the context of laser triangulation-based inspection systems, where different size adaptation strategies are compared for a classifier assessing a sample item. Furthermore, an effective method for reconstructing corrupted video data is introduced. The method is based on estimating three-dimensional frequency models and introduces a motion-compensated spatial weighting function. The thesis then turns to the augmentation of training data. First, keypoint-agnostic frequency-selective mesh-to-grid resampling is used as the resampling step within geometric data augmentation operations, which leads to better classification results. Second, a rendering-based data augmentation technique is introduced and evaluated for optical character recognition applications. This method can generate several novel viewing angles and illumination scenarios from a training sample, and using it improves the recognition rate significantly. Finally, a rendering-based pipeline is introduced that synthesizes entirely new training samples. The pipeline generates and annotates synthetic and partly real data in an automated procedure, and it also aids the acquisition of real data. It is evaluated for license plate recognition, where training data synthesized by the pipeline clearly improves the recognition rate.