Exon Prediction Using Machine Learning Approaches

Noopur Singh

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/489506

Title:	Exon Prediction Using Machine Learning Approaches
Researcher:	Noopur Singh
Guide(s):	Ravindra Nath, Dev Bukhsh Singh
Keywords:	Bioinformatics; Biotechnology and Applied Microbiology; Life Sciences; Microbiology
University:	Dr. A.P.J. Abdul Kalam Technical University
Completed Date:	2023
Abstract:	Machine learning is the area of artificial intelligence that focuses on statistics, algorithms, and related scientific techniques for information extraction. A difficult problem in bioinformatics is the accurate identification of exons in eukaryotic deoxyribonucleic acid (DNA) sequences on the basis of their 3-base periodicity. Exon prediction is important because it also leads to the identification of introns and splice sites. Several machine learning approaches have been used for identification and prediction tasks; they assign numerical values to the symbolic DNA sequence and then apply computational tools and statistical analysis to find periodicity components. The proposed approaches increase processing speed and reduce computational complexity. Another benefit of our approach is the efficient detection of exons in large DNA sequences, which many tools fail to handle. The ability of the proposed method to predict exons is compared with some existing approaches at the nucleotide level using sensitivity and specificity values, receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC). According to the modelling results, our approach improves exon detection accuracy compared with existing exon prediction techniques. First, a Hidden Markov Model (HMM) was applied to a random DNA sequence for exon prediction. The model provides a probability distribution over both the gene structure and the DNA sequences. A program based on such models typically determines the most probable gene structure from the nucleotide training data. Each state emits a DNA sequence of arbitrary length. Using a training set of annotated sequences for each species, the distribution of these sequences and the state transition and emission probabilities were computed.
The Forward-Backward method and Markov chain sequence patterns have been used to establish this probability for every state. The value of the probability of observation seq
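The numerical-mapping step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary indicator (Voss) mapping and a Fourier measure of 3-base periodicity, both common choices in the literature; the abstract does not say which mapping or transform the thesis uses, and the function names below are hypothetical.

```python
import cmath


def indicator_sequences(dna):
    """Assumed Voss mapping: one binary indicator sequence per nucleotide."""
    dna = dna.upper()
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ACGT"}


def period3_power(dna):
    """Sum over A, C, G, T of the squared DFT magnitude at frequency 1/3."""
    total = 0.0
    for seq in indicator_sequences(dna).values():
        # DFT coefficient at k = N/3, i.e. one cycle every three bases
        coeff = sum(x * cmath.exp(-2j * cmath.pi * i / 3) for i, x in enumerate(seq))
        total += abs(coeff) ** 2
    return total
```

A codon-like repeat such as `"ATG" * 10` yields a large value from `period3_power`, while a homopolymer such as `"A" * 30` yields a value near zero. In practice the measure would be evaluated over a sliding window so that peaks mark candidate exon regions.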
URI:	http://hdl.handle.net/10603/489506
Appears in Departments:	Dean P.G.S.R

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	2.49 MB	Adobe PDF
02_prelims page.pdf		2.5 MB	Adobe PDF
03_content.pdf		2.49 MB	Adobe PDF
04_abstract.pdf		2.49 MB	Adobe PDF
05_chapter 1.pdf		2.5 MB	Adobe PDF
06_chapter 2.pdf		2.5 MB	Adobe PDF
07_chapter 3.pdf		2.5 MB	Adobe PDF
08_chapter 4.pdf		2.5 MB	Adobe PDF
09_chapter 5.pdf		2.49 MB	Adobe PDF
10_annexures.pdf		2.5 MB	Adobe PDF
80_recommendation.pdf		108.88 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET