Data as of 10 December 2024
Verlag Dr. Hut GmbH, Sternstr. 18, 80538 München, Tel: 0175 / 9263392, Mon - Fri, 9 am - 12 noon
ISBN 978-3-8439-1219-8, Computer Science series
Marc Jürgen Röttig: Combining Sequence and Structural Information into Predictors of Enzymatic Activity
167 pages, dissertation, Eberhard-Karls-Universität Tübingen (2012), softcover, A5
Prediction of protein function is one of the major tasks in the annotation phase of any genome sequencing project. For a newly sequenced genome, the products of the detected protein-coding genes must be annotated with their function with high accuracy. Current state-of-the-art methods still rely on sequence-only function prediction, such as annotation via the best BLAST hit against a database of curated proteins or via the best-matching profile HMM.
This thesis demonstrates how a vital source of additional information, namely structural information about the protein sequences, can be used in an automated manner to improve on sequence-only prediction of enzymatic function. For enzyme sequences in particular, a wealth of structural information is readily available. The structure gives detailed information about the amino acid configuration within the active site of the enzyme sequence at hand. Furthermore, modern machine learning techniques such as Support Vector Machines (SVMs) can help to create even more powerful predictors of enzymatic function. The two methods presented in this thesis, Active Site Classification (ASC) and NRPSpredictor2, both rely on combining sequence and structural information.