Data as of 15 November 2024

ISBN 978-3-8439-0592-3

€72.00 incl. VAT, plus shipping


Series: Informatik (Computer Science)

Andreas Jahn
Molecular Flexibility Encodings for Virtual Screening and Machine Learning

168 pages, dissertation, Eberhard-Karls-Universität Tübingen (2012), softcover, A5

Abstract

The biological activity of a drug can be regarded as a function of its shape and of the geometric arrangement of the substructures that interact with the target protein. The 3D structure of a molecule is therefore an important source of information for new cheminformatics tools that support the early stages of the drug discovery pipeline. However, the shape and geometry of a molecule are not fixed: molecules are flexible. For a flexible compound, the 3D features can vary widely from one conformation to the next. As a result, the computed similarity values fluctuate considerably and the resulting models are not robust. Because of this weakness of 3D-based similarity functions, the early stages of the drug discovery pipeline are dominated by robust 2D similarity functions, and the loss of 3D information is accepted.

This thesis introduces several techniques that incorporate valuable 3D information without the undesired fluctuations in the computed similarity values. Two strategies are proposed for this purpose.

The first strategy extends structured 2D similarity functions with local 3D information. This local 3D information is derived from empirical or semi-empirical expert systems and is determined only for substructures with low variability. The 3D information of flexible substructures is encoded so that all possible positions of the substructures are described. As a result, the features do not depend on a single conformation of the molecule and are unaffected by variance in geometry and shape.
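One way to picture a feature that describes all possible positions of a flexible substructure is an interval of attainable values rather than a single measurement. The following Python sketch is an illustration under that assumption and is not the encoding used in the thesis: the functions distance_interval and interval_overlap, the tolerance parameter, and the toy distances are all hypothetical.

```python
def distance_interval(distances, tolerance=0.1):
    """Return a (low, high) interval covering every attainable distance.

    `distances` are the values (for example in angstroms) that a distance
    between two substructures can take; how they are obtained is left open
    here. `tolerance` is a hypothetical padding so that a rigid substructure
    still yields an interval of non-zero width.
    """
    return (min(distances) - tolerance, max(distances) + tolerance)


def interval_overlap(a, b):
    """Jaccard-style overlap of two intervals: shared length / covered length."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return 0.0  # the intervals do not intersect
    covered = max(a[1], b[1]) - min(a[0], b[0])
    return shared / covered


# Toy values, invented for illustration only.
rigid = distance_interval([4.9, 5.0, 5.1])      # low variability
flexible = distance_interval([3.5, 5.0, 7.5])   # high variability

# The score depends only on the ranges, not on any single conformation.
score = interval_overlap(rigid, flexible)
```

Because the interval already covers the whole range of positions, picking a different conformation of the same molecule does not change the feature, which is the property the paragraph above describes.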

The second strategy operates on multiple conformations of each molecule and encodes their variability by means of generative models. These models describe the behavior of the molecules and their flexibility. Similarity computation then uses the generative models to compare the behavior of the molecules in conformational space.
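As a minimal sketch of this idea, the Python example below fits a very simple generative model, a Gaussian with diagonal covariance, to the descriptor vectors of a molecule's conformations and compares two molecules through the Bhattacharyya coefficient of their models. The abstract does not state which generative models or similarity measure the thesis uses, so the model choice, the function names, the min_var floor, and the toy descriptor values are assumptions made for illustration.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(conformers, min_var=1e-6):
    """Fit a per-dimension mean and variance to conformer descriptor vectors.

    `conformers` is a list of equal-length numeric vectors, one per
    conformation. `min_var` is a hypothetical floor that keeps the variance
    positive for completely rigid dimensions.
    """
    dims = list(zip(*conformers))  # transpose: one tuple per descriptor
    means = [fmean(d) for d in dims]
    variances = [max(pvariance(d), min_var) for d in dims]
    return means, variances


def bhattacharyya_similarity(model_a, model_b):
    """Bhattacharyya coefficient of two diagonal Gaussians, in (0, 1]."""
    (mu_a, var_a), (mu_b, var_b) = model_a, model_b
    distance = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        # Closed-form Bhattacharyya distance for one Gaussian dimension.
        distance += 0.25 * (ma - mb) ** 2 / (va + vb)
        distance += 0.5 * math.log((va + vb) / (2.0 * math.sqrt(va * vb)))
    return math.exp(-distance)


# Toy descriptors (two distances per conformation), invented for illustration.
rigid_conformers = [[4.9, 7.1], [5.0, 7.0], [5.1, 6.9]]
flexible_conformers = [[3.0, 7.0], [5.0, 9.0], [7.0, 5.0]]

rigid_model = fit_diagonal_gaussian(rigid_conformers)
flexible_model = fit_diagonal_gaussian(flexible_conformers)
similarity = bhattacharyya_similarity(rigid_model, flexible_model)
```

The comparison is made between distributions over the conformational space rather than between two individual conformations, so the score does not change when a different single conformation is sampled first.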

Both strategies were evaluated in virtual screening and quantitative structure-activity relationship (QSAR) experiments to investigate the benefits of the proposed techniques. The results showed that using reliable 3D information can improve the performance of the methods on both cheminformatics tasks.