Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/522606
Title: Improved big data privacy and security using hybrid elliptic curve cryptography with pillar K means clustering algorithm in secured map reduce layer
Researcher: Arogya Presskila X
Guide(s): Ramesh K
Keywords: Computer Science
Computer Science Information Systems
Engineering and Technology
Hadoop Framework
Pillar K-Means Clustering
Swarm Optimization
University: Anna University
Completed Date: 2023
Abstract: The traditional Big Data framework does not clearly address the sensitivity of structured and unstructured data such as health care records, personal information, and online transaction data. In addition, Big Data privacy and security need to be incorporated into the cluster nodes of the Map Reduce Layer in the Hadoop framework, which is vulnerable and prone to attack. Accordingly, the threat of revealing personal data has subsequently been alleviated in recent years. The proposed work introduces a Secured Map Reduce Layer (SMR) in the Big Data framework to improve the security and privacy of sensitive data through a hybrid approach that integrates Particle Swarm Optimization (PSO) and an Elliptic Curve Cryptographic mechanism. The proposed framework chooses the optimal private key for authentication using Particle Swarm Optimization and protects the data using Elliptic Curve Cryptography. Particle Swarm Optimization uses the global best fitness value to find the optimal private key, enabling efficient authentication. The proposed methodology focuses on the storage of Big Data. In the first phase, the Hadoop Distributed File System (HDFS) is used to store large volumes of data efficiently while ensuring security and preserving privacy as data is processed in the Map Reduce Layer. Efficient storage in HDFS is achieved by applying clustering to segregate related data and group similar data into blocks, which are stored on the distributed nodes of HDFS to ease processing and preserve privacy. This study uses a semi-structured Medical Transcription text dataset. The initial step of clustering is text preprocessing, which removes noise from the text data to improve the clustering outcomes. Traditional text preprocessing techniques such as stop word removal, tokenization, and stemming are applied to the dataset.
The next step converts the text data into a numerical format for clustering using the tf-idf (term frequency-inverse document frequency) method. K-means and P
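As an illustration of the preprocessing and tf-idf steps the abstract describes, the following is a minimal sketch in plain Python. It is not the thesis's implementation: the stop-word list, the sample sentences, and the function names `tokenize` and `tf_idf` are assumptions made for this example, stemming is omitted for brevity, and the weighting shown (relative term frequency times `log(N / df)`) is only one common tf-idf variant.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and sample documents, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "was", "with", "to", "in"}

def tokenize(text):
    """Lowercase the text, split on non-letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

docs = [
    "The patient was treated with aspirin.",
    "The patient reported chest pain.",
    "Surgery of the knee was scheduled.",
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents receive higher weights, so in this sample `aspirin` outweighs `patient` in the first document. The resulting sparse vectors are the kind of numerical representation a clustering algorithm such as K-means can take as input.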
Pagination: xvi, 134 p.
URI: http://hdl.handle.net/10603/522606
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  25.87 kB   Adobe PDF
02_prelim_pages.pdf                    3.56 MB    Adobe PDF
03_content.pdf                         16.5 kB    Adobe PDF
04_abstract.pdf                        12.85 kB   Adobe PDF
05_chapter 1.pdf                       118.48 kB  Adobe PDF
06_chapter 2.pdf                       405.76 kB  Adobe PDF
07_chapter 3.pdf                       262.58 kB  Adobe PDF
08_chapter 4.pdf                       920.61 kB  Adobe PDF
09_chapter 5.pdf                       1.19 MB    Adobe PDF
10_chapter 6.pdf                       8.11 kB    Adobe PDF
11_annexures.pdf                       195.79 kB  Adobe PDF
80_recommendation.pdf                  49.28 kB   Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
