Analysis and Prediction of Cancer using Genome by Applying Data Mining Algorithms

Upadhyay, Tejal

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/430624

Title:	Analysis and Prediction of Cancer using Genome by Applying Data Mining Algorithms
Researcher:	Upadhyay, Tejal
Guide(s):	Patel, Samir
Keywords:	Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology
University:	Charotar University of Science and Technology
Completed Date:	2022
Abstract:	A genome is the complete set of an organism's DNA, including all of its genes. Every genome contains all the information required to build and maintain that organism. In the human body, more than 300 crore (about 3 billion) DNA base pairs are contained in the nuclei of the cells. The study of the complete genome is called genomics.
Data mining is the process of discovering patterns in large data sets, using methods at the intersection of statistics and database systems. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information (with intelligent methods) from a data set and transform it into a comprehensible structure for further use. Two functionalities, classification and clustering, can be applied to extract information from a large data set.
In this research work, a genome study is carried out, and data mining techniques such as supervised and unsupervised learning are applied to cancer data sets. The data set is available on the Bioconductor website, and R packages are used for further analysis. The implementation is divided into four parts. The first part is based on classification: Leukaemia types are identified by applying classification algorithms. The second part identifies cancer subtypes using unsupervised learning (clustering). The remaining parts focus on different clustering methods that can be applied to genome studies, and on the fusion of clusters.
As part of pre-processing, outliers need to be removed, so data cleaning, transformation, and reduction are applied. Existing data mining algorithms have many shortcomings. This work attempts to overcome the disadvantages of existing algorithms by applying mixed and modified approaches, and to improve the prediction of cancer from genomic data.
URI:	http://hdl.handle.net/10603/430624
Appears in Departments:	Faculty of Technology and Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	84.49 kB	Adobe PDF
02_prelim pages.pdf		321.96 kB	Adobe PDF
03_contents.pdf		142.72 kB	Adobe PDF
04_abstract.pdf		53.37 kB	Adobe PDF
05_chapter 1.pdf		152.33 kB	Adobe PDF
06_chapter 2.pdf		477 kB	Adobe PDF
07_chapter 3.pdf		764.62 kB	Adobe PDF
08_chapter 4.pdf		436.52 kB	Adobe PDF
09_chapter 5.pdf		1.37 MB	Adobe PDF
10_chapter 6.pdf		113.24 kB	Adobe PDF
11_anexures.pdf		2.46 MB	Adobe PDF
80_recommendation.pdf		119.65 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET