A Big Data Wrangling Framework for Healthcare Data using Machine Intelligence

Anto Praveena M D

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/450875

Title:	A Big Data Wrangling Framework for Healthcare Data using Machine Intelligence
Researcher:	Anto Praveena M D
Guide(s):	Bharathi B
Keywords:	Computer Science; Computer Science Information Systems; Engineering and Technology
University:	Sathyabama Institute of Science and Technology
Completed Date:	2021
Abstract:	Healthcare datasets have been successfully used to build a variety of decision-making systems. Nevertheless, there is still much room for improvement in these systems. Many healthcare datasets are incomplete and inconsistent because they are large and collected from various sources, and they therefore lead to unreliable and incorrect insights in decision-making systems. It is essential to impute the missing values, remove noisy and inconsistent data, and reduce the attributes to produce quality insights from the data. Data wrangling is a vital step in ensuring data consistency and assists in the creation of reliable decision-making systems. A number of standard preprocessing methods and tools are available, but they cannot be applied generically to all healthcare datasets because the datasets' characteristics differ. Therefore, a novel approach is needed to overcome the issues in standard preprocessing methods and generate accurate insights. The goal of this research is to preprocess the raw healthcare dataset and improve the accuracy of the insights obtained from the decision-making system. The key areas of contribution of this research are: 1. Missing value imputation 2. De-duplication of records 3. Dimensionality reduction 4. Outlier detection and removal. Missing values in healthcare datasets arise through three mechanisms: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR). A Deep Autoencoder Restricted Boltzmann Machine based imputation system is developed for data imputation. An Optimal Removal of Deduplication (ORD) scheme, utilizing a hybrid trust-based neural network method dubbed Mimic Deep Neural Network (MDNN), removes duplicated records. The Chaotic Whale Optimization (CWO) method is used in the ORD scheme to compute the trust value using various decision metrics. The calculated trust value and the type of the data are fed into the MDNN to classify duplicated data. Ant Colony Optimization algorithm (ACO) with Quick Branch and Bound algorithm (QBB) are toget
Pagination:	A5, VII, 157
URI:	http://hdl.handle.net/10603/450875
Appears in Departments:	COMPUTER SCIENCE DEPARTMENT
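
The four contribution areas listed in the abstract (imputation, de-duplication, dimensionality reduction, outlier removal) can be illustrated with a minimal sketch. This is not the thesis's method: it substitutes simple textbook baselines (median imputation, exact-match de-duplication, dropping constant attributes, and an interquartile-range filter) for the Deep Autoencoder RBM, ORD/MDNN, and ACO/QBB techniques named in the abstract. All function names and the sample records are hypothetical.

```python
from statistics import median, quantiles


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    medians = [median(v for v in col if v is not None) for col in zip(*rows)]
    return [[medians[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def deduplicate(rows):
    """Drop exact duplicate records, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_constant_columns(rows):
    """Crude attribute reduction: remove columns that hold a single distinct value."""
    keep = [j for j, col in enumerate(zip(*rows)) if len(set(col)) > 1]
    return [[row[j] for j in keep] for row in rows]


def remove_outliers_iqr(rows, k=1.5):
    """Drop rows with any value outside [Q1 - k*IQR, Q3 + k*IQR] for its column."""
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = quantiles(col, n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    return [row for row in rows
            if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))]


def wrangle(rows):
    """Apply the four steps in the order the abstract lists them."""
    rows = impute_median(rows)
    rows = deduplicate(rows)
    rows = drop_constant_columns(rows)
    return remove_outliers_iqr(rows)


# Hypothetical records: [age, constant flag, glucose reading]
records = [
    [50, 1, 100.0],
    [52, 1, None],    # missing value
    [50, 1, 100.0],   # duplicate of the first record
    [48, 1, 98.0],
    [51, 1, 102.0],
    [49, 1, 500.0],   # implausible reading
    [53, 1, 101.0],
]
cleaned = wrangle(records)
print(cleaned)
```

The median is used rather than the mean so that the imputed value is not dragged toward the extreme reading before the outlier step runs; the order of steps is an assumption taken from the abstract's numbered list.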

Files in This Item:

File	Description	Size	Format
10.chapter 6.pdf	Attached File	796.88 kB	Adobe PDF	View/Open
11.chapter 7.pdf		425.6 kB	Adobe PDF	View/Open
12.annextures.pdf		1.77 MB	Adobe PDF	View/Open
1.title.pdf		33.08 kB	Adobe PDF	View/Open
2.prelim pages.pdf		1.23 MB	Adobe PDF	View/Open
3.abstract.pdf		313.48 kB	Adobe PDF	View/Open
4.contents.pdf		211.27 kB	Adobe PDF	View/Open
5.chapter 1.pdf		397.93 kB	Adobe PDF	View/Open
6.chapter 2.pdf		218.49 kB	Adobe PDF	View/Open
7.chapter 3.pdf		747.38 kB	Adobe PDF	View/Open
80_recommendation.pdf		33.08 kB	Adobe PDF	View/Open
8.chapter 4.pdf		796.08 kB	Adobe PDF	View/Open
9.chapter 5.pdf		831.28 kB	Adobe PDF	View/Open

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET