Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/290372
Title: Design and Development of Data Mining based Integrated Diabetes risk score system for Indian Population
Researcher: Chandrakar Omprakash
Guide(s): Bhatti Dharmendra G and Saini Jatinderkumar R
Keywords: Computer Science; Data Mining; Engineering and Technology
University: Uka Tarsadia University
Completed Date: 2020
Abstract: Data preprocessing plays a crucial role in the data mining process. For example, discretization is used as a preprocessing step to improve the accuracy of data mining algorithms. In the present study, the researchers proposed, implemented, and applied a novel discretization algorithm based on clustering and association rule mining. It considers the semantics of the data in addition to the data values, whereas existing discretization methods focus on the values and ignore the context. The researchers also proposed and applied a majority-vote-based iterative feature selection algorithm to identify the optimal set of attributes for classification. It iteratively uses three parameters (correlation, attribute gain ratio, and attribute information gain) to identify the least significant attributes. Removing these least significant attributes yields an optimal set of attributes. To study the impact of the proposed algorithms, the researchers applied them to the standard Pima Indian Diabetes dataset (taken from the University of California, Irvine Machine Learning Repository) and observed an average improvement in classification accuracy of 2.05%.
India is set to become the diabetes capital of the world very soon, and undiagnosed diabetic persons make up more than half of the total diabetic population. The researchers therefore proposed a novel algorithm for calculating a diabetes risk score and applied it to derive the Indian Weighted Diabetes Risk Score for Type-2 Diabetes. The proposed risk score algorithm is based on data mining techniques. It uses the proposed majority-vote-based iterative feature selection algorithm to identify the most significant risk factors and the semantic discretization algorithm to discretize those risk factors.
The researchers also proposed, designed, and developed an aggregate classification model for type-2 diabetes risk prediction in the Indian population using five different classifiers, namely Naive Bayes, Decision Table, Decision Stump, PART, and J48.
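The abstract does not give the exact voting rule of the feature selection algorithm, so the following is only a minimal sketch of one possible reading, not the thesis's implementation. It assumes that each of the three criteria (absolute Pearson correlation with the class, information gain, and gain ratio) casts one vote against its lowest-scoring attribute, and that an attribute is dropped when at least two of the three criteria agree. The median-based binarization, the `min_features` floor, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def binarize(column):
    """Assumed discretization: two bins split at the upper median.
    This is a stand-in, not the semantic discretization from the thesis."""
    median = sorted(column)[len(column) // 2]
    return [int(v >= median) for v in column]


def info_gain(column, labels):
    """Class entropy minus the weighted class entropy within each bin."""
    bins = binarize(column)
    remainder = 0.0
    for b in set(bins):
        subset = [y for x, y in zip(bins, labels) if x == b]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(column, labels):
    """Information gain normalized by the entropy of the split itself."""
    split_info = entropy(binarize(column))
    return info_gain(column, labels) / split_info if split_info > 0 else 0.0


def abs_correlation(column, labels):
    """Absolute Pearson correlation between an attribute and the class."""
    n = len(column)
    mx, my = sum(column) / n, sum(labels) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = math.sqrt(sum((x - mx) ** 2 for x in column))
    sy = math.sqrt(sum((y - my) ** 2 for y in labels))
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0


CRITERIA = (abs_correlation, info_gain, gain_ratio)


def select_features(data, labels, min_features=1):
    """data maps attribute name -> list of numeric values.
    Repeatedly drops the attribute that a majority of criteria rank last."""
    selected = list(data)
    while len(selected) > min_features:
        # Each criterion votes against the attribute it scores lowest.
        votes = Counter(
            min(selected, key=lambda name: criterion(data[name], labels))
            for criterion in CRITERIA
        )
        name, count = votes.most_common(1)[0]
        if count < 2:  # no majority among the three criteria: stop
            break
        selected.remove(name)
    return selected
```

Because the three criteria tend to agree, this reading keeps removing attributes until the `min_features` floor is reached; a practical variant would likely need an additional stopping rule, such as checking classification accuracy after each removal.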
Pagination: xvii,188p
URI: http://hdl.handle.net/10603/290372
Appears in Departments: Faculty of Computer Science

Files in This Item:
File                   Description    Size       Format
01_title.pdf           Attached File  226.48 kB  Adobe PDF
02_certificates.pdf                   1.26 MB    Adobe PDF
03_preliminary.pdf                    240 kB     Adobe PDF
04_chapter 1.pdf                      381.85 kB  Adobe PDF
05_chapter 2.pdf                      920.05 kB  Adobe PDF
06_chapter 3.pdf                      943.69 kB  Adobe PDF
07_chapter 4.pdf                      926.41 kB  Adobe PDF
08_chapter 5.pdf                      369.52 kB  Adobe PDF
09_chapter 6.pdf                      444.21 kB  Adobe PDF
10_chapter 7.pdf                      565.27 kB  Adobe PDF
11_chapter 8.pdf                      793.81 kB  Adobe PDF
12_chapter 9.pdf                      184.14 kB  Adobe PDF
13_references.pdf                     314.99 kB  Adobe PDF
14_aanexure.pdf                       516.5 kB   Adobe PDF
15_publications.pdf                   215.35 kB  Adobe PDF
80_recommendation.pdf                 599.64 kB  Adobe PDF


Items in Shodhganga are protected by copyright, with all rights reserved, unless otherwise indicated.
