Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/575205
Title: Anomaly Detection in High Dimensional Data with Volume and Velocity Aspects
Researcher: Upasana Gupta
Guide(s): Vaishali Singh
Keywords: Computer Science
Computer Science Interdisciplinary Applications
Engineering and Technology
University: Maharishi University of Information Technology
Completed Date: 2024
Abstract: Detecting anomalies in high-dimensional data is difficult, particularly in domains such as healthcare, where datasets are both large and dynamic. We address these issues directly and improve anomaly detection accuracy by introducing a robust technique implemented in R. The data are complex and non-linear, which makes them hard for linear methods such as Principal Component Analysis (PCA) to model. High-dimensional data often lie on an inherent low-dimensional manifold; our method acknowledges this and incorporates iterative learning techniques to overcome the limitation of linear methods. This allows the technique to better capture the intricate relationships among the variables. The technique goes beyond finding anomalies: it aims to provide a thorough groundwork for future data exploration, analysis, and anomaly identification. To begin, we use R packages such as ggplot2, dplyr, and Keras to prepare the data thoroughly, handling problems such as missing values and obtaining visual insights from the dataset. In addition, we use statistical methods such as the Mahalanobis distance to identify and remove outliers, so that they do not distort the subsequent analyses. A multi-pronged approach integrating autoencoders and t-SNE captures complex interactions among variables and achieves dimensionality reduction, a crucial step in mitigating the curse of dimensionality. A Multi-Layer Perceptron (MLP), a type of artificial neural network, then uses the cleaned datasets to identify outliers by analyzing reconstruction errors. We also investigate several other methods, such as Metric Multi-Dimensional Scaling using Artificial Neural Networks, to evaluate how well they perform on complex and very large datasets, such as those seen in healthcare.
We validate our methodology through rigorous empirical testing and comparison against existing approaches, providing a viable solution for reliable anomaly identification in high-dimensional data settings. Our approach is resil
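The Mahalanobis-distance outlier screen mentioned in the abstract can be illustrated with a minimal sketch. The thesis works in R; the Python/NumPy version below is only an illustrative stand-in, not the author's implementation. The function names, the synthetic data, and the 97.5% quantile cutoff are all assumptions introduced here for demonstration.

```python
import numpy as np

def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # d_i^2 = (x_i - mu)^T Sigma^{-1} (x_i - mu), evaluated for every row.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(d2, 0.0))

def remove_outliers(X, quantile=0.975):
    """Drop rows whose distance exceeds an empirical quantile.

    The quantile cutoff is an assumption of this sketch; the abstract
    does not state which threshold the thesis uses.
    """
    X = np.asarray(X, dtype=float)
    d = mahalanobis_distances(X)
    keep = d <= np.quantile(d, quantile)
    return X[keep], keep

# Synthetic demonstration data (hypothetical): 200 Gaussian points
# plus one planted point far from the bulk.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 3))
planted = np.array([[8.0, -8.0, 8.0]])
data = np.vstack([inliers, planted])

cleaned, keep = remove_outliers(data)
```

Here `keep` is a boolean mask over the rows of `data`, and `cleaned` holds the rows that pass the screen. Because the mean and covariance are estimated from data that still contain the outliers, a robust covariance estimate may be preferable in practice.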
URI: http://hdl.handle.net/10603/575205
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File | Description | Size | Format
80_recommendation.pdf | Attached File | 286.19 kB | Adobe PDF
abstract.pdf | | 285.17 kB | Adobe PDF
chapter 1.pdf | | 800.32 kB | Adobe PDF
chapter 2.pdf | | 586.22 kB | Adobe PDF
chapter 3.pdf | | 1.05 MB | Adobe PDF
chapter 4.pdf | | 719.9 kB | Adobe PDF
chapter 5.pdf | | 862.29 kB | Adobe PDF
chapter 6.pdf | | 310.65 kB | Adobe PDF
contents.pdf | | 290.13 kB | Adobe PDF
declaration_merged.pdf | | 1.14 MB | Adobe PDF
reference_papers_merged.pdf | | 4.43 MB | Adobe PDF
title.pdf | | 200.02 kB | Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
