Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/343497
Title: Data Mining and Intelligent Computing Method For Protein Function Prediction
Researcher: Chhote Lal Prasad Gupta
Guide(s): Sudhakar Tripathi
Keywords: Computer Science
Computer Science Theory and Methods
Engineering and Technology
University: Dr. A.P.J. Abdul Kalam Technical University
Completed Date: 2021
Abstract: The study of proteins is an emerging field of research. Protein class prediction is a challenging task because previously unknown proteins are discovered every day. Every protein has its own structure, which is complex and difficult to understand. Because of this complexity, identifying a protein's category takes a great deal of time and human effort, and it also requires extensive infrastructure. As a result, assigning proteins to their respective groups through clinical verification or community effort is very difficult. A computational mechanism is needed to find their structures, features, and appropriate groups quickly. In this regard, machine learning techniques help scholars identify and predict a protein's enzyme class, structure, and features, and they reduce the human effort and time that identification and prediction require. Many machine learning techniques are available: some are based on statistical methods, some on regression techniques, and some on ensemble learning. This research considers widely used machine learning techniques from all of these categories, namely CRT, QUEST, CHAID, C5.0, ANN, SVM, Bayesian, Random Forests, XGBoost, and CatBoost.
For this study, the data were extracted from UniProtKB, the central protein knowledge base repository. This repository contains a large amount of protein-related data, classified into reviewed data, known as Swiss-Prot, and non-reviewed data, known as TrEMBL. The reviewed data are manually annotated by humans and are used in this study; the non-reviewed data, however, are automatically annotated and contain many errors. Rectifying these errors requires manual review, which demands a lot of effort, time, and resources. Computational techniques therefore help scholars reduce that effort, time, and resource cost.
For this study, only the enzyme classes of reviewed data for different organisms, such as human, rat, and mouse, have been used. Each category of dat
URI: http://hdl.handle.net/10603/343497
Appears in Departments: Dean PG Studies and Research

Files in This Item:
File                     Description    Size       Format
80_recommendation.pdf    Attached File  348.78 kB  Adobe PDF
certificate.pdf                         87.79 kB   Adobe PDF
chapter 1.pdf                           378.77 kB  Adobe PDF
chapter 2.pdf                           350.7 kB   Adobe PDF
chapter 3.pdf                           579.46 kB  Adobe PDF
chapter 4.pdf                           585.74 kB  Adobe PDF
chapter 5.pdf                           631.87 kB  Adobe PDF
chapter 6.pdf                           625.52 kB  Adobe PDF
chapter 7.pdf                           815.61 kB  Adobe PDF
chapter 8.pdf                           122.57 kB  Adobe PDF
preliminary pages.pdf                   572.22 kB  Adobe PDF
title.pdf                               16.21 kB   Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
