Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/556101
Title: Enhanced Data Cleaning and Outlier Detection Models with Hybrid Metaheuristic Approaches
Researcher: Kumar Rahul
Guide(s): Rohitash Kumar Banyal
Keywords: Computer Science; Computer Science Information Systems; Engineering and Technology
University: Rajasthan Technical University, Kota
Completed Date: 2023
Abstract: Applications depend on analysis reports built from clean data. Data are generated by many sources, such as social media, e-commerce, blogs, banking, healthcare, transactions, apps, websites, and opinion platforms, so the volume of dirty data is likely to grow. Data are processed for effective use in different industries, including healthcare, and every business enterprise requires noise-free, clean data. Dirty data includes incorrect data, dummy data, duplicate data, inappropriate data, and data violations. It arises from different causes, including abbreviation misuse, incorrect spelling, and outdated data. To avoid wrong conclusions, the data cleaning process becomes vital. This work introduces a novel data-cleaning technique to remove dirty data effectively. The process involves (i) dirty data detection and (ii) dirty data cleaning. The dirty data detection process comprises data normalization, hashing, clustering, and finding the suspected data. In the clustering process, optimal centroid selection is carried out by employing the optimization concept. Once dirty data detection is finished, the dirty data cleaning process begins. The cleaning process comprises the leveling process, Huffman coding, and cleaning the suspected data. The cleaning of suspected data is also performed based on the optimization concept. To solve these optimization problems, a new hybridized algorithm is developed, called the Firefly Update enabled Rider Optimization Algorithm (FU-ROA), which hybridizes the Rider Optimization Algorithm (ROA) and the Firefly (FF) algorithm. Finally, the performance of the implemented data cleaning method is compared with traditional methods such as Particle Swarm Optimization (PSO), FF, Grey Wolf Optimizer (GWO), and ROA in terms of positive and negative measures.
Pagination: 30.4 mb
URI: http://hdl.handle.net/10603/556101
Appears in Departments: Computer Engineering

Files in This Item:
File                   Description    Size       Format
80_recommendation.pdf  Attached File  241.18 kB  Adobe PDF
abstract.pdf                          84.24 kB   Adobe PDF
annexures.pdf                         4.96 MB    Adobe PDF
chapter 1.pdf                         207.48 kB  Adobe PDF
chapter 2.pdf                         236.71 kB  Adobe PDF
chapter 3.pdf                         545.72 kB  Adobe PDF
chapter 4.pdf                         831.45 kB  Adobe PDF
chapter 5.pdf                         522.16 kB  Adobe PDF
chapter 6.pdf                         87.24 kB   Adobe PDF
contents.pdf                          527.7 kB   Adobe PDF
prelim pages.pdf                      3.27 MB    Adobe PDF
title.pdf                             100.87 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
