Data as of 15 November 2024



ISBN 978-3-8439-5370-2

45.00 € incl. VAT, plus shipping


978-3-8439-5370-2, Engineering Sciences series

Kristian Fischer
Image and Video Coding for Machines

214 pages, dissertation, Universität Erlangen-Nürnberg (2023), softcover, A5

Abstract

Technological progress in recent years has led to a steadily increasing number of applications in which digital communication occurs not only between humans but also between machines. In this machine-to-machine communication scenario, large amounts of data are captured, shared, and analyzed. Especially for high-dimensional image and video data, reliable transmission is usually not feasible without compressing the data first. For this type of data, well-known, standardized lossy codecs exist to reduce the bitrate. However, those codecs are optimized to deliver the best possible visual quality for a human observing the decompressed data.

This gives rise to the first research question of this thesis: how do image and video codecs intended for human consumption behave when deployed in coding scenarios for machines? To answer it, the thesis first develops an evaluation framework that compares the coding performance for humans and for machines in a given setup. Deep convolutional neural networks are considered as the machines, since they have achieved benchmark-setting performance in analyzing images and videos. The investigations reveal that the new coding tools and optimizations proposed for human viewers do not necessarily improve compression performance when coding for machines.

Consequently, the follow-up research question is how to develop more efficient machine-optimized image and video coding frameworks and tools based on the well-known codecs designed for humans. To this end, the thesis develops several concepts that aim to reduce the bitrate while preserving the task accuracy at the decoder side. The concepts are tailored to two codec types: traditional hybrid codecs and learned, neural-network-based codecs. The presented methods reduce the required bitrate by up to 75% at the same task accuracy.