Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/489506
Title: Exon Prediction Using Machine Learning Approaches
Researcher: Noopur Singh
Guide(s): Ravindra Nath, Dev Bukhsh Singh
Keywords: Bioinformatics; Biotechnology and Applied Microbiology; Life Sciences; Microbiology
University: Dr. A.P.J. Abdul Kalam Technical University
Completed Date: 2023
Abstract: Machine learning is the area of artificial intelligence that focuses on statistics, algorithms, and related scientific techniques used for information extraction. Accurately identifying exons in eukaryotic deoxyribonucleic acid (DNA) sequences, which exhibit 3-base periodicity, is a difficult problem in bioinformatics. Exon prediction is important because it also leads to the identification of introns and splice sites. Several machine learning approaches have been used for identification and prediction tasks, with a focus on assigning numerical values to the symbolic DNA sequence and then using computational tools and statistical analysis to find periodicity components. The proposed approaches increase processing speed and decrease computational complexity. Another benefit of our approach is the efficient detection of exons in large DNA sequences, which many tools fail to handle. Using specificity-sensitivity values, Receiver Operating Characteristic (ROC) curves, and the area under the ROC curve (AUC), the ability of the proposed method to predict exons is compared with some existing approaches at the nucleotide level. According to the modelling results, our approach improves exon detection accuracy in comparison to existing exon prediction techniques. First, the Hidden Markov Model (HMM) has been used on a random DNA sequence for exon prediction. The model provides a probability distribution over both the gene structure and the DNA sequences. A program based on such models typically determines the most probable gene structure from the nucleotide training data. Each state emits a DNA sequence of arbitrary length. Using a training set of identified sequences for each species, it was possible to compute the distribution of such sequences as well as the state transition and emission probabilities among them.
The Forward-Backward method and Markov chain sequence patterns have been used to establish this probability for every state. The value of the probability of observation seq
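The Forward-Backward computation named in the abstract can be sketched on a toy model. The example below is a minimal illustration, not the thesis's implementation: it assumes a hypothetical two-state HMM (exon, intron) that emits one nucleotide per state, and every start, transition, and emission probability in it is invented for demonstration. It returns the likelihood of the observed sequence and the posterior probability of each state at each position.

```python
# Toy two-state HMM over nucleotides. All numbers below are invented for
# illustration; they are NOT parameters taken from the thesis.
STATES = ("exon", "intron")
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "intron": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def forward(seq):
    """alpha[t][s] = P(seq[0..t], state at position t is s)."""
    alpha = [{s: START[s] * EMIT[s][seq[0]] for s in STATES}]
    for symbol in seq[1:]:
        prev = alpha[-1]
        alpha.append({
            s: EMIT[s][symbol] * sum(prev[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        })
    return alpha


def backward(seq):
    """beta[t][s] = P(seq[t+1..end] | state at position t is s)."""
    beta = [{s: 1.0 for s in STATES}]
    # Walk from the last symbol back to the second one, prepending rows.
    for symbol in reversed(seq[1:]):
        nxt = beta[0]
        beta.insert(0, {
            s: sum(TRANS[s][r] * EMIT[r][symbol] * nxt[r] for r in STATES)
            for s in STATES
        })
    return beta


def posteriors(seq):
    """Return (sequence likelihood, per-position state posteriors)."""
    alpha, beta = forward(seq), backward(seq)
    likelihood = sum(alpha[-1].values())
    post = [
        {s: a[s] * b[s] / likelihood for s in STATES}
        for a, b in zip(alpha, beta)
    ]
    return likelihood, post


if __name__ == "__main__":
    likelihood, post = posteriors("GCGCGCATATAT")
    for base, p in zip("GCGCGCATATAT", post):
        print(base, round(p["exon"], 3))
```

This sketch multiplies raw probabilities, so it would underflow on long sequences; a practical version would rescale each row or work in log space.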
URI: http://hdl.handle.net/10603/489506
Appears in Departments: Dean P.G.S.R

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  2.49 MB    Adobe PDF
02_prelims page.pdf                    2.5 MB     Adobe PDF
03_content.pdf                         2.49 MB    Adobe PDF
04_abstract.pdf                        2.49 MB    Adobe PDF
05_chapter 1.pdf                       2.5 MB     Adobe PDF
06_chapter 2.pdf                       2.5 MB     Adobe PDF
07_chapter 3.pdf                       2.5 MB     Adobe PDF
08_chapter 4.pdf                       2.5 MB     Adobe PDF
09_chapter 5.pdf                       2.49 MB    Adobe PDF
10_annexures.pdf                       2.5 MB     Adobe PDF
80_recommendation.pdf                  108.88 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
