Design of algorithms for gene predictions

Maji, Srabanti

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/227192

Full metadata record

DC Field	Value	Language
dc.coverage.spatial
dc.date.accessioned	2019-01-25T10:30:33Z	-
dc.date.available	2019-01-25T10:30:33Z	-
dc.identifier.uri	http://hdl.handle.net/10603/227192	-
dc.description.abstract	Identifying coding sequences in genomic DNA is a major step in gene identification. Splice sites mark the boundaries between exons and introns, and although the sequences adjacent to them are highly conserved, splice site prediction accuracy remains below 90%. This work therefore pursues both conventional and Computational Intelligence (CI) approaches to predicting splice sites in the DNA sequences of eukaryotic organisms, and evaluates and compares their performance. In the conventional approach, a Hidden Markov Model (HMM) system, the model architecture includes probabilistic descriptions of the splicing, translational, and transcriptional signals. A splice site predictor based on a Unique Hidden Markov Model is developed and trained using the Modified Expectation Maximization (MEM) algorithm. A 12-fold cross-validation technique is also applied to check the reproducibility of the results and to further increase prediction accuracy. The proposed system achieves an accuracy of 98% for true donor sites and 93% for true acceptor sites in the standard DNA (nucleotide) sequence. The second proposed method, Markov Model 2 Feature Support Vector Machine (MM2F-SVM), combines conventional and computational intelligence techniques in three stages: feature extraction, in which a second-order Markov Model (MM2) is used; feature selection, in which principal feature analysis (PFA) is performed; and classification, in which a support vector machine (SVM) with a Gaussian kernel is used. The model is capable of indicating the reliability of each predicted splice site with high accuracy.
dc.format.extent	xv, 95p.
dc.language	English
dc.relation
dc.rights	university
dc.title	Design of algorithms for gene predictions
dc.title.alternative
dc.creator.researcher	Maji, Srabanti
dc.subject.keyword	Bioinformatics
dc.subject.keyword	Gene Identification
dc.subject.keyword	Splice Site
dc.subject.keyword	Support Vector Machine
dc.description.note
dc.contributor.guide	Garg, Deepak
dc.publisher.place	Patiala
dc.publisher.university	Thapar Institute of Engineering and Technology
dc.publisher.institution	Department of Computer Science and Engineering
dc.date.registered
dc.date.completed	2013
dc.date.awarded
dc.format.dimensions
dc.format.accompanyingmaterial	None
dc.source.university	University
dc.type.degree	Ph.D.
Appears in Departments:	Department of Computer Science and Engineering
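
The abstract's first MM2F-SVM stage (feature extraction with a second-order Markov model) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `train_mm2` and `mm2_features`, the additive smoothing, the position-independent transition table, the log-odds feature definition, and the toy sequences are all assumptions made for illustration. Input sequences are assumed to contain only A, C, G, and T.

```python
from collections import defaultdict
from itertools import product
import math

ALPHABET = "ACGT"


def train_mm2(sequences, pseudocount=1.0):
    """Estimate P(x_i | x_{i-2} x_{i-1}) from sequences (hypothetical helper).

    Additive smoothing (an assumption) keeps every probability non-zero.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(2, len(seq)):
            counts[seq[i - 2:i]][seq[i]] += 1.0

    model = {}
    for a, b in product(ALPHABET, repeat=2):
        ctx = a + b
        total = sum(counts[ctx][c] for c in ALPHABET) + pseudocount * len(ALPHABET)
        model[ctx] = {c: (counts[ctx][c] + pseudocount) / total for c in ALPHABET}
    return model


def mm2_features(seq, true_model, false_model):
    """Return one log-odds value per position i >= 2 (hypothetical feature form)."""
    features = []
    for i in range(2, len(seq)):
        ctx, nxt = seq[i - 2:i], seq[i]
        features.append(math.log(true_model[ctx][nxt] / false_model[ctx][nxt]))
    return features


# Toy data, invented for illustration only (not from the thesis).
true_sites = ["AAGGTAAGT", "CAGGTGAGT", "AAGGTAAGA"]
false_sites = ["ACCTTGCAA", "TTGCACCTA", "GCATTCCAG"]

true_model = train_mm2(true_sites)
false_model = train_mm2(false_sites)
candidate_features = mm2_features("CAGGTAAGT", true_model, false_model)
```

A feature vector of this kind would then pass to the feature selection (PFA) and classification (Gaussian-kernel SVM) stages named in the abstract; those stages are not sketched here.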

Files in This Item:

File	Description	Size	Format
file10(publications).pdf	Attached File	11.51 kB	Adobe PDF
file11(references).pdf		121.92 kB	Adobe PDF
file1(title).pdf		16.3 kB	Adobe PDF
file2(certificate).pdf		42.83 kB	Adobe PDF
file3(preliminary pages).pdf		106.09 kB	Adobe PDF
file4(chapter 1).pdf		39.63 kB	Adobe PDF
file5(chapter 2).pdf		66.4 kB	Adobe PDF
file6(chapter 3).pdf		763.53 kB	Adobe PDF
file7(chapter 4).pdf		1.76 MB	Adobe PDF
file8(chapter 5).pdf		359.79 kB	Adobe PDF
file9(chapter 6).pdf		22.48 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET