Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/430624
Title: Analysis and Prediction of Cancer using Genome by Applying Data Mining Algorithms
Researcher: Upadhyay, Tejal
Guide(s): Patel, Samir
Keywords: Computer Science
Computer Science Interdisciplinary Applications
Engineering and Technology
University: Charotar University of Science and Technology
Completed Date: 2022
Abstract: The genome is the complete set of an organism's DNA, including all of its genes. Every genome contains all the information required to build and maintain that organism. In the human body, more than 3 billion (300 crore) DNA base pairs are held in the nuclei of its cells. The study of the complete genome is called genomics.

Data mining is the process of discovering patterns in large data sets using methods at the intersection of statistics and database systems. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information (with intelligent methods) from a data set and transform it into a comprehensible structure for further use. Two functionalities, classification and clustering, can be applied to extract information from a large data set.

In this research work, the genome is studied and data mining techniques, both supervised and unsupervised learning, are applied to cancer data sets. The data sets are available on the Bioconductor website, and R packages are used for the analysis. The implementation is divided into four parts. The first part is based on classification: leukaemia types are identified by applying classification algorithms. The second part identifies cancer subtypes using unsupervised clustering. The remaining parts focus on different clustering methods that can be applied to genome data, and on fusing clusters.

As part of pre-processing, outliers need to be removed, so data cleaning, transformation, and reduction are applied. Existing data mining algorithms have many shortcomings. This work attempts to overcome the disadvantages of existing algorithms by applying mixed and modified approaches, with the aim of improving the prediction of cancer from genomic data.
Pagination: 
URI: http://hdl.handle.net/10603/430624
Appears in Departments: Faculty of Technology and Engineering

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  84.49 kB   Adobe PDF
02_prelim pages.pdf                    321.96 kB  Adobe PDF
03_contents.pdf                        142.72 kB  Adobe PDF
04_abstract.pdf                        53.37 kB   Adobe PDF
05_chapter 1.pdf                       152.33 kB  Adobe PDF
06_chapter 2.pdf                       477 kB     Adobe PDF
07_chapter 3.pdf                       764.62 kB  Adobe PDF
08_chapter 4.pdf                       436.52 kB  Adobe PDF
09_chapter 5.pdf                       1.37 MB    Adobe PDF
10_chapter 6.pdf                       113.24 kB  Adobe PDF
11_anexures.pdf                        2.46 MB    Adobe PDF
80_recommendation.pdf                  119.65 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
