Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/240181
Title: | Network based Identification and Analysis of Structural Repeat Proteins |
Researcher: | Broto Chakrabarty |
Guide(s): | Nita Parekh |
Keywords: | database of structural repeats; Engineering and Technology, Engineering, Engineering Multidisciplinary; identification of structural repeats; Protein contact network; Protein structure graph; Structural repeat proteins |
University: | International Institute of Information Technology, Hyderabad |
Completed Date: | 2018 |
Abstract: | Tandemly repeated structural motifs form integrated assemblies with multiple binding sites and facilitate various protein-protein, protein-DNA and protein-RNA interactions. The evolutionary conservation and disease association of repeat proteins underscore their structural and functional importance, and their low sequence conservation suggests the need to identify them at the structural level.
In this thesis, we propose graph-based approaches for the identification of repeat proteins and use them to build a database of structural repeats. We observe that the eigenvector centrality of the protein contact network captures the conserved residue interaction pattern within and between the repeating units. The efficacy of the measure is tested by developing a rule-based approach, AnkPred, for identifying Ankyrin repeat proteins. We extend the approach by developing a generalized algorithm, PRIGSA, to identify members of known families and to identify previously uncharacterized repeats de novo. The performance of the approach is compared with state-of-the-art methods on benchmark datasets, and its predictions for 13 known families are compared with UniProt annotation. The PRIGSA algorithm is executed on the complete PDB to build a database of structural repeats, DbStRiPs. The de novo predicted repeats are clustered and manually curated. We report one novel repeat family and 31 De novo Protein Repeat Clusters (DPRCs). In DbStRiPs, 11,901 repeats are reported in 10,816 PDB chains, which are categorized as known protein repeat families, DPRCs or Unclassified.
We also developed a web server for network analysis of protein structures, NAPS, which provides a comprehensive platform for the construction of five types of networks and their analysis, including centrality, shortest path, k-clique and graph spectral analyses. The web server provides a browser-independent, interactive platform for visual analysis of protein structures and networks, along with options to download the results in suitable formats. |
URI: | http://hdl.handle.net/10603/240181 |
Appears in Departments: | Bioinformatics |
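The abstract's central observation, that eigenvector centrality of a protein contact network reflects the residue interaction pattern, can be illustrated with a minimal sketch. This is not the thesis's AnkPred or PRIGSA code. It assumes one C-alpha coordinate per residue and a 7 Å contact cutoff, both chosen here only for illustration, and the function names `contact_network` and `eigenvector_centrality` and the toy coordinates are hypothetical.

```python
import math

def contact_network(coords, cutoff=7.0):
    """Build an adjacency matrix: residues i and j are linked when their
    coordinates lie within `cutoff` angstroms (cutoff value is an assumption)."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj

def eigenvector_centrality(adj, iterations=1000, tol=1e-10):
    """Approximate the principal eigenvector of the adjacency matrix by power iteration."""
    n = len(adj)
    if n == 0:
        return []
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Iterate with (A + I): it has the same eigenvectors as A but does not
        # oscillate on bipartite graphs such as a simple chain.
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec

# Toy example: five residues on a straight line, 3.8 angstroms apart,
# so only sequence neighbours fall within the cutoff.
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
adj = contact_network(toy_coords)
centrality = eigenvector_centrality(adj)
for index, score in enumerate(centrality):
    print(f"residue {index}: {score:.3f}")
```

In this toy chain the central residue receives the highest score and the profile is symmetric about it. In a real structure, coordinates would be read from a PDB file, and a repeating pattern in the centrality profile along the sequence is the kind of signal the abstract describes.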
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
appendix.pdf | Attached File | 513.85 kB | Adobe PDF |
chapter1.pdf | | 1.5 MB | Adobe PDF |
chapter2.pdf | | 2.52 MB | Adobe PDF |
chapter3.pdf | | 2.3 MB | Adobe PDF |
chapter4.pdf | | 2.93 MB | Adobe PDF |
chapter5.pdf | | 2.05 MB | Adobe PDF |
chapter6.pdf | | 310.59 kB | Adobe PDF |
references.pdf | | 398.99 kB | Adobe PDF |
startingpages.pdf | | 639.45 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).