Protein Function Identification by Predicting Protein Structure using Machine Learning

Mehta, Apurva A.

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/491132

Title:	Protein Function Identification by Predicting Protein Structure using Machine Learning
Researcher:	Mehta, Apurva A.
Guide(s):	Mazumdar, Himanshu S.
Keywords:	Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology; Machine Learning; Protein Function; Protein Structure; Random Forest
University:	Dharmsinh Desai University
Completed Date:	2022
Abstract:	Machine learning has had a substantial influence on computational biology. From the many open problems in computational biology, this work addresses the prediction of protein structure, specifically protein structural classes and protein folds, using machine learning algorithms. Machine learning is also used to identify significant miRNA biomarkers for breast cancer patients. Protein function is thus identified in terms of the role of miRNA, using machine learning.

The main motivation for choosing protein structure prediction is that the discovery of protein tertiary structures currently lags far behind the discovery of protein primary structures. Predicting structural attributes such as protein structural classes and protein folds with machine learning techniques can help narrow this gap. The Structural Classification of Proteins extended (SCOPe 2.07) database is the latest and largest dataset available at the time of writing for predicting protein structural classes and protein folds. It is a repository of thousands of protein sequences from the Protein Data Bank, each associated with a protein fold and a protein structural class. Protein sequences with less than 40% identity to each other are used to predict 4 protein structural classes and 27 protein folds. Sensitive features are extracted from the primary and secondary structure representations of the proteins. The secondary structure is assigned using the DSSP algorithm. Statistical experiments are performed to select and order features specific to the secondary structure representations. A probability measure indicating significance for protein structural classes and protein folds played a key role in these experiments. Motifs are selected mainly on the basis of frequency, pitch, and spatial arrangement.
Pagination:	X,69
URI:	http://hdl.handle.net/10603/491132
Appears in Departments:	Engineering
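
The abstract above mentions features extracted from primary and secondary structure representations, with frequency as one selection criterion. The following is a minimal sketch of what frequency-based features of that kind could look like. It is not the thesis's actual feature set: the function names, the reduced three-state secondary structure alphabet (H, E, C), and the choice of features are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: DSSP's finer-grained states have already been reduced
# to three states: H (helix), E (strand), C (coil).
SS_STATES = "HEC"


def composition(seq, alphabet):
    """Return the relative frequency of each alphabet symbol in seq."""
    counts = Counter(seq)
    total = sum(counts[s] for s in alphabet)
    if total == 0:
        return [0.0] * len(alphabet)
    return [counts[s] / total for s in alphabet]


def segment_counts(ss):
    """Count contiguous runs (segments) of each secondary structure state."""
    runs = Counter()
    prev = None
    for state in ss:
        if state != prev:
            runs[state] += 1
            prev = state
    return [runs[s] for s in SS_STATES]


def feature_vector(primary, secondary):
    """Hypothetical feature vector: amino acid composition (20 values),
    secondary structure composition (3 values), segment counts (3 values)."""
    return (
        composition(primary, AMINO_ACIDS)
        + composition(secondary, SS_STATES)
        + segment_counts(secondary)
    )
```

A vector of this kind, computed per protein, could then be passed to a classifier such as a random forest (listed among the keywords) to predict a structural class or fold label. How the thesis actually encodes pitch and spatial arrangement is not stated in this record, so those features are left out here.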

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	185.76 kB	Adobe PDF
02_prelim_pages.pdf		811.04 kB	Adobe PDF
03_content.pdf		682.66 kB	Adobe PDF
04_abstract.pdf		327.67 kB	Adobe PDF
05_chapter-1.pdf		637.97 kB	Adobe PDF
08_chapter-4.pdf		327.5 kB	Adobe PDF
80_recommendation.pdf		1.01 MB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET