Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/450875
Title: A Big Data Wrangling Framework for Healthcare Data using Machine Intelligence
Researcher: Anto Praveena M D
Guide(s): Bharathi B
Keywords: Computer Science
Computer Science Information Systems
Engineering and Technology
University: Sathyabama Institute of Science and Technology
Completed Date: 2021
Abstract: Healthcare datasets have been successfully used to build a variety of decision-making systems. Nevertheless, there is still much room for improvement in these systems. Many healthcare datasets are incomplete and inconsistent because they are large and collected from various sources, and they therefore yield unreliable and incorrect insights in decision-making systems. To produce quality insights from the data, it is essential to impute the missing values, remove noisy and inconsistent data, and reduce the number of attributes.
Data wrangling is a vital step in ensuring data consistency and assists in the creation of reliable decision-making systems. A number of standard preprocessing methods and tools are available, but they cannot be applied generically to all healthcare datasets because the characteristics of the datasets differ. Therefore, a novel approach is needed to overcome the issues in standard preprocessing methods and generate accurate insights.
The goal of this research is to preprocess the raw healthcare dataset and improve the accuracy of the insights obtained from the decision-making system. The key areas of contribution of this research are:
1. Missing value imputation
2. De-duplication of records
3. Dimensionality reduction
4. Outlier detection and removal
Missing values in healthcare datasets arise through three mechanisms: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR). A Deep Autoencoder Restricted Boltzmann Machine based imputation system is developed for data imputation.
An Optimal Removal of Deduplication (ORD) scheme, utilizing a hybrid trust-based neural network method dubbed Mimic Deep Neural Network (MDNN), removes duplicated records.
The Chaotic Whale Optimization (CWO) method is used in the ORD scheme to compute the trust value from various decision metrics. The calculated trust value and the type of the data are fed into the MDNN in order to classify duplicated data.
Ant Colony Optimization algorithm (ACO) with Quick Branch and Bound algorithm (QBB) are toget
Pagination: A5, VII, 157
URI: http://hdl.handle.net/10603/450875
Appears in Departments: COMPUTER SCIENCE DEPARTMENT

Files in This Item:
File                     Description      Size         Format
10.chapter 6.pdf         Attached File    796.88 kB    Adobe PDF
11.chapter 7.pdf                          425.6 kB     Adobe PDF
12.annextures.pdf                         1.77 MB      Adobe PDF
1.title.pdf                               33.08 kB     Adobe PDF
2.prelim pages.pdf                        1.23 MB      Adobe PDF
3.abstract.pdf                            313.48 kB    Adobe PDF
4.contents.pdf                            211.27 kB    Adobe PDF
5.chapter 1.pdf                           397.93 kB    Adobe PDF
6.chapter 2.pdf                           218.49 kB    Adobe PDF
7.chapter 3.pdf                           747.38 kB    Adobe PDF
80_recommendation.pdf                     33.08 kB     Adobe PDF
8.chapter 4.pdf                           796.08 kB    Adobe PDF
9.chapter 5.pdf                           831.28 kB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
