ISBN 978-3-8439-1360-7, Computer Science series
Michael Kemmler: Machine Learning Methods for the Automatic Identification of Microorganisms from Raman Spectroscopic Data
331 pages, dissertation, Friedrich-Schiller-Universität Jena (2013), softcover, A5
Microbial life is all around us and influences our world in mostly subtle ways. In particular cases, however, the presence of certain microorganisms can have severe consequences for industrial production processes or even for human life. Control mechanisms are therefore necessary to identify potentially harmful scenarios, such as the presence of pathogenic particles in clinical environments.
After microbes have been isolated, their characteristics can be captured by recent techniques such as Raman spectroscopy, which extracts a molecular fingerprint of the sample in a matter of seconds or minutes. Although these spectra are highly discriminative, they are hard for human experts to interpret, which makes computer-based analysis necessary. This thesis deals with the automatic identification of microorganisms using principles from machine learning that aim to learn a relationship between these molecular characteristics and microbial classes.
We propose to use the probabilistic, kernel-based Gaussian process classifier for this challenging task and demonstrate its superior performance compared to various state-of-the-art methods already known from the chemometrics literature. To overcome the speed and memory issues inherent in general kernel methods, novel approaches to efficient large-scale Gaussian process classification are presented.
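To make the idea of a kernel-based Gaussian process classifier concrete, here is a minimal sketch. It swaps in a deliberate simplification, Gaussian process label regression (regressing on labels -1 and +1 and thresholding the predictive mean at zero), rather than a full classifier with a non-Gaussian likelihood, and it is not taken from the thesis. The function names, the squared-exponential kernel, the hyperparameter values, and the three-bin toy "spectra" are all assumptions made for illustration. The code uses only the standard library so that it runs anywhere; a real implementation would rely on a numerical library.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve_spd(matrix, rhs):
    """Solve matrix @ x = rhs for a symmetric positive-definite matrix (Cholesky)."""
    n = len(rhs)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    # Forward substitution: L y = rhs
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    # Backward substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x

def fit_label_regression(train_x, train_y, noise=0.1, length_scale=1.0):
    """Return alpha = (K + noise * I)^-1 y for labels y in {-1, +1}."""
    n = len(train_x)
    gram = [[rbf_kernel(train_x[i], train_x[j], length_scale) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += noise  # observation noise keeps the system well conditioned
    return solve_spd(gram, train_y)

def predictive_mean(train_x, alpha, query, length_scale=1.0):
    """GP predictive mean at the query; its sign is used as the class decision."""
    return sum(a * rbf_kernel(xi, query, length_scale)
               for a, xi in zip(alpha, train_x))

# Hypothetical toy data: class +1 peaks in the first bin, class -1 in the last.
train_x = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.1, 1.0], [0.1, 0.2, 0.9]]
train_y = [1.0, 1.0, -1.0, -1.0]
alpha = fit_label_regression(train_x, train_y)

score_a = predictive_mean(train_x, alpha, [0.95, 0.1, 0.05])
score_b = predictive_mean(train_x, alpha, [0.05, 0.1, 0.95])
print("score for first-bin peak:", score_a)
print("score for last-bin peak:", score_b)
```

The linear solve against the full kernel matrix costs cubic time and quadratic memory in the number of training spectra, which is the kind of speed and memory issue that motivates efficient large-scale variants.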
This paves the way for handling the very large-scale Raman spectra databases that are required for facing real-world problems. We further introduce the concept of invariant Gaussian processes for tackling the well-known problem of fluorescence. We then concentrate on automatically finding the most relevant spectral features, which can help experts to trace back molecular properties associated with the problem at hand.
Most studies on Raman spectra classification assume that all possible types of microbes are covered by the provided training database. In practical applications, however, this assumption is certainly not met, given the large biodiversity of microbial life. Techniques capable of detecting instances from novel microbial classes are therefore necessary. In addition to evaluating a large body of work appropriate for this task, we propose a new one-class classification method based on Gaussian processes that performs favorably compared to existing work.
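One way to obtain novelty scores from a Gaussian process, sketched here under stated assumptions and not necessarily in the form used in the thesis, is to regress every known training spectrum onto the constant target 1 under a zero-mean prior. The predictive mean then stays close to 1 near the training data and decays toward 0 far from it, while the predictive variance grows. The function names, the squared-exponential kernel, the hyperparameters, and the toy three-bin "spectra" are hypothetical.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve_spd(matrix, rhs):
    """Solve matrix @ x = rhs for a symmetric positive-definite matrix (Cholesky)."""
    n = len(rhs)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = rhs
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x

def one_class_scores(known, query, noise=0.1, length_scale=1.0):
    """Return (predictive mean, predictive variance) of a zero-mean GP
    fitted to the known spectra with all targets set to 1."""
    n = len(known)
    gram = [[rbf_kernel(known[i], known[j], length_scale) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += noise
    k_star = [rbf_kernel(xi, query, length_scale) for xi in known]
    alpha = solve_spd(gram, [1.0] * n)   # (K + noise * I)^-1 1
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_spd(gram, k_star)          # (K + noise * I)^-1 k_star
    variance = rbf_kernel(query, query, length_scale) - sum(
        k * w for k, w in zip(k_star, v))
    return mean, variance

# Hypothetical spectra from a single known class, plus two queries.
known = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.95, 0.15, 0.05]]
mean_familiar, var_familiar = one_class_scores(known, [0.92, 0.15, 0.05])
mean_novel, var_novel = one_class_scores(known, [0.0, 0.1, 3.0])

print("familiar query: mean", mean_familiar, "variance", var_familiar)
print("novel query:    mean", mean_novel, "variance", var_novel)
```

A query would be flagged as belonging to a novel class when its mean falls below (or its variance rises above) a threshold chosen on validation data; the threshold itself is application-specific and is not fixed by this sketch.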