Catalog data as of 15 November 2024



ISBN 9783843903875

€60.00 incl. VAT, plus shipping


978-3-8439-0387-5, Computer Science series (Reihe Informatik)

Michael Duller
Management and Federation of Stream Processing Applications

158 pages, dissertation, Eidgenössische Technische Hochschule (ETH) Zürich (2011), hardcover, A5

Abstract

A decade ago, stream processing enabled a new class of applications by employing a fundamentally different processing model than conventional database management systems. These applications process large volumes of continuous streams of input data with high throughput. Data stream management systems have evolved into industrial-strength solutions for this class of applications and use long-running queries to process high volumes of continuous input data with low latency. However, they still lack flexibility in terms of large-scale deployment, integration, extensibility, and interoperability.

In recent years, a substantial ecosystem of new applications has emerged that can potentially benefit from stream processing. They range from the federation of existing but heterogeneous streaming applications, to automated deployments of streaming applications in large clusters or cloud environments, to processing personal information such as photos as data streams. These applications introduce different requirements for how stream processing solutions can be deployed, integrated, extended, and federated.

This thesis explores stream processing with the help of traditional stream processing applications as well as applications that process personal information as data streams and identifies the fundamental properties that are common to all stream processing systems. The result is a generic model for stream processing and an architecture for a dynamic platform that supports the model. The model separates processing (operators) and data management (buffers) into distinct entities.
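The separation of operators and buffers can be illustrated with a minimal Python sketch. This is not the platform or API described in the thesis: the names `Buffer`, `Operator`, `Stage`, and `demo` are hypothetical and chosen only for illustration. The sketch assumes a single-threaded setting in which one operator sits between an input buffer and an output buffer. Because the buffers own the data, the operator can be swapped while unprocessed items stay in place, which is the kind of runtime replacement the abstract mentions.

```python
from collections import deque
from typing import Callable, Deque, List


class Buffer:
    """Data management only: holds items in FIFO order (hypothetical class)."""

    def __init__(self) -> None:
        self._items: Deque[object] = deque()

    def put(self, item: object) -> None:
        self._items.append(item)

    def take(self) -> object:
        # Callers check len() first; popleft raises IndexError when empty.
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)


class Operator:
    """Processing only: applies a function to one item (hypothetical class)."""

    def __init__(self, fn: Callable[[object], object]) -> None:
        self.fn = fn


class Stage:
    """Wires one operator between an input buffer and an output buffer."""

    def __init__(self, source: Buffer, operator: Operator, sink: Buffer) -> None:
        self.source = source
        self.operator = operator
        self.sink = sink

    def step(self) -> bool:
        """Process at most one buffered item; return False if none was waiting."""
        if len(self.source) == 0:
            return False
        self.sink.put(self.operator.fn(self.source.take()))
        return True

    def replace_operator(self, operator: Operator) -> None:
        # Buffered data is untouched because the buffers are separate entities.
        self.operator = operator


def demo() -> List[object]:
    source, sink = Buffer(), Buffer()
    for value in (1, 2, 3, 4):
        source.put(value)
    stage = Stage(source, Operator(lambda x: x * 2), sink)
    stage.step()
    stage.step()
    # Swap the operator while two items are still waiting in the source buffer.
    stage.replace_operator(Operator(lambda x: x + 100))
    while stage.step():
        pass
    results: List[object] = []
    while len(sink):
        results.append(sink.take())
    return results


if __name__ == "__main__":
    print(demo())  # [2, 4, 103, 104]
```

In this sketch the first two items are doubled by the original operator, and the remaining two are handled by its replacement without being lost or reprocessed.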

These properties enable the automated deployment of applications, facilitate the federation of applications running on heterogeneous stream processing systems, and leverage stream processing in new application domains. This thesis validates the generality of the model, its feasibility in terms of overhead, and the claims made in terms of deployment and integration. Experiments on PlanetLab, on a cluster, and on individual nodes confirm that the model and platform proposed in the thesis enable interoperability between heterogeneous stream processing engines, facilitate distributed deployment, and add functionality to the engines (the ability to replace operators at runtime or to run a distribution-agnostic engine in a distributed setup) with negligible overhead.