Catalog data as of 15 November 2024
ISBN 978-3-8439-5370-2, Engineering Sciences series
Kristian Fischer: Image and Video Coding for Machines
214 pages, dissertation, Universität Erlangen-Nürnberg (2023), softcover, A5
Technological progress in recent years has steadily increased the number of applications in which digital communication takes place not only between humans but also between machines. In this machine-to-machine communication scenario, large amounts of data are captured, shared, and analyzed. Especially for high-dimensional image and video data, reliable transmission is usually not feasible without compressing the data first. For this type of data, well-known standardized lossy codecs exist to reduce the bitrate. However, those codecs are optimized to deliver the best possible visual quality for a human observing the decompressed data.
This gives rise to the first research question of this thesis: how do image and video codecs intended for human consumption behave when deployed in coding scenarios for machines? To answer it, the thesis first develops an evaluation framework that compares the coding performance for humans and for machines in a given setup. Because deep convolutional neural networks have achieved benchmark-setting performance in analyzing images and videos, the thesis considers them as the machines. The investigations reveal that new coding tools and optimizations proposed for human viewers do not necessarily improve compression performance when coding for machines.
Consequently, the follow-up research question is how to develop more efficient machine-optimized image and video coding frameworks and tools based on the well-known codecs designed for humans. To this end, the thesis develops various concepts aimed at reducing the bitrate while preserving the task accuracy at the decoder side. The concepts are tailored to two codec types: traditional hybrid codecs and learned, neural-network-based codecs. The presented methods reduce the required bitrate by up to 75% at the same task accuracy.
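To make the phrase "bitrate reduction at the same task accuracy" concrete, the following Python sketch shows one simple way such a figure could be computed from two rate-accuracy curves. It is an illustration only and not the evaluation framework of the thesis: the function names, the log-domain linear interpolation, and all rate and accuracy values are hypothetical assumptions made for this example.

```python
import math


def rate_at_accuracy(points, target):
    """Estimate the bitrate needed to reach `target` task accuracy.

    `points` is a list of (bitrate, accuracy) pairs. Accuracy is assumed
    to increase with bitrate. Interpolation is linear in log-bitrate,
    which is an assumption of this sketch.
    """
    pts = sorted(points)  # sort by bitrate
    for (r0, a0), (r1, a1) in zip(pts, pts[1:]):
        if a0 <= target <= a1:
            if a1 == a0:
                return r0
            t = (target - a0) / (a1 - a0)
            log_rate = math.log(r0) + t * (math.log(r1) - math.log(r0))
            return math.exp(log_rate)
    raise ValueError("target accuracy outside the measured range")


def bitrate_saving(anchor, proposed, target):
    """Relative bitrate saving of `proposed` over `anchor` at equal accuracy."""
    r_anchor = rate_at_accuracy(anchor, target)
    r_proposed = rate_at_accuracy(proposed, target)
    return 1.0 - r_proposed / r_anchor


# Hypothetical (bitrate in bits per pixel, task accuracy) measurements.
anchor = [(0.1, 0.50), (0.2, 0.60), (0.4, 0.70), (0.8, 0.75)]
proposed = [(0.05, 0.55), (0.1, 0.65), (0.2, 0.72), (0.4, 0.76)]

saving = bitrate_saving(anchor, proposed, target=0.70)
print(f"Bitrate saving at equal accuracy: {saving:.1%}")
```

A saving of 0.75 in this sketch would mean that the machine-optimized codec needs only a quarter of the anchor's bitrate to reach the same task accuracy.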