Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/240181
Title: Network based Identification and Analysis of Structural Repeat Proteins
Researcher: Broto Chakrabarty
Guide(s): Nita Parekh
Keywords: database of structural repeats
Engineering and Technology, Engineering, Engineering Multidisciplinary
identification of structural repeats
Protein contact network
Protein structure graph
Structural repeat proteins
University: International Institute of Information Technology, Hyderabad
Completed Date: 2018
Abstract: Tandemly repeated structural motifs form integrated assemblies with multiple binding sites and facilitate various protein-protein, protein-DNA, and protein-RNA interactions. The evolutionary conservation and disease association of repeat proteins underscore their structural and functional importance, and their low sequence conservation suggests the need to identify them at the structural level.

In this thesis, we propose graph-based approaches for the identification of repeat proteins and use them to build a database of structural repeats. We observe that the eigenvector centrality of the protein contact network captures the conserved residue interaction pattern within and between the repeating units. The efficacy of the measure is tested by developing a rule-based approach, AnkPred, for identifying Ankyrin repeat proteins. We extend the approach by developing a generalized algorithm, PRIGSA, to identify members of known families and to identify previously uncharacterized repeats de novo. The performance of the approach is compared with state-of-the-art methods on benchmark datasets, and its predictions for 13 known families are compared with UniProt annotations. The PRIGSA algorithm is executed on the complete PDB to build a database of structural repeats, DbStRiPs. The de novo predicted repeats are clustered and manually curated. We report one novel repeat family and 31 De novo Protein Repeat Clusters (DPRCs). In DbStRiPs, 11901 repeats are reported in 10816 PDB chains, which are categorized into known protein repeat families, DPRC, or Unclassified.

We also developed a web server for network analysis of protein structures, NAPS, providing a comprehensive platform for the construction of 5 types of networks and their analyses, such as centrality, shortest path, k-clique, and graph spectral analyses. The web server provides a browser-independent interactive platform for visual analysis of protein structure and network, along with options to download the results in suitable formats.
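The abstract's central measure, eigenvector centrality of a protein contact network, can be illustrated with a minimal sketch. This is not the thesis's AnkPred or PRIGSA code. The function names `contact_network` and `eigenvector_centrality`, the use of C-alpha coordinates, and the 7.0 angstrom distance cutoff are all assumptions made for illustration only.

```python
import math


def contact_network(coords, cutoff=7.0):
    """Build a 0/1 adjacency matrix: residues i and j are linked when
    their coordinates lie within `cutoff` angstroms (assumed threshold)."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj


def eigenvector_centrality(adj, iterations=1000, tol=1e-10):
    """Approximate the principal eigenvector by power iteration.

    The iteration uses (A + I) rather than A: the eigenvectors are the
    same, but the shift avoids oscillation on bipartite graphs.
    """
    n = len(adj)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec


# Toy input: six points spaced 3.8 angstroms apart on a line, so only
# sequence neighbours fall inside the cutoff (a simple chain).
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
toy_centrality = eigenvector_centrality(contact_network(toy_coords))
```

On this toy chain the centrality profile is symmetric and peaks at the middle residues. For a real structure, the coordinates would come from a PDB file, and a repeated profile along the sequence is the kind of pattern the abstract refers to.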
URI: http://hdl.handle.net/10603/240181
Appears in Departments:Bioinformatics

Files in This Item:
File                Description     Size        Format
appendix.pdf        Attached File   513.85 kB   Adobe PDF
chapter1.pdf                        1.5 MB      Adobe PDF
chapter2.pdf                        2.52 MB     Adobe PDF
chapter3.pdf                        2.3 MB      Adobe PDF
chapter4.pdf                        2.93 MB     Adobe PDF
chapter5.pdf                        2.05 MB     Adobe PDF
chapter6.pdf                        310.59 kB   Adobe PDF
references.pdf                      398.99 kB   Adobe PDF
startingpages.pdf                   639.45 kB   Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
