Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/491132
Title: | Protein Function Identification by Predicting Protein Structure using Machine Learning |
Researcher: | Mehta, Apurva A. |
Guide(s): | Mazumdar, Himanshu S. |
Keywords: | Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology; Machine Learning; Protein Function; Protein Structure; Random Forest |
University: | Dharmsinh Desai University |
Completed Date: | 2022 |
Abstract: | This work shows that machine learning has a substantial influence on computational biology. From the many open problems in computational biology, we address the prediction of protein structure (i.e. protein structural classes and protein folds) using a machine learning algorithm. The machine learning algorithm is also used to identify significant miRNA biomarkers targeting breast cancer patients. Thus, identification of protein function in terms of the role of miRNA is approached using a machine learning algorithm. The main motivation for selecting protein structure prediction is that the discovery rate of protein tertiary structures currently lags far behind that of protein primary structures. Predicting protein structural classes and protein folds with machine learning techniques can help reduce this gap. The Structural Classification of Proteins extended (SCOPe 2.07) database is the latest and largest dataset available at present for predicting protein structural classes and protein folds. It is a valuable repository of thousands of protein sequences from the Protein Data Bank, each associated with a protein fold and belonging to a protein structural class. Protein sequences with less than 40% identity to each other are used for predicting 4 protein structural classes and 27 protein folds. Sensitive features are extracted from the primary and secondary structure representations of the proteins. The secondary structure is obtained using the DSSP algorithm. Statistical experiments are performed to select and order features specific to the secondary structure representations. A probability measure indicating significance for protein structural classes and protein folds played a key role in the statistical experiments. Motifs are selected mainly on the basis of frequency, pitch and spatial arrangement. |
Pagination: | X,69 |
URI: | http://hdl.handle.net/10603/491132 |
Appears in Departments: | Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 185.76 kB | Adobe PDF | View/Open |
02_prelim_pages.pdf | | 811.04 kB | Adobe PDF | View/Open |
03_content.pdf | | 682.66 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 327.67 kB | Adobe PDF | View/Open |
05_chapter-1.pdf | | 637.97 kB | Adobe PDF | View/Open |
08_chapter-4.pdf | | 327.5 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 1.01 MB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).