Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/405784
Title: Privacy Preservation Techniques for High Dimensional Data
Researcher: SHASHIDHAR VIRUPAKSHA
Guide(s): VENKATESULU DONDETI
Keywords: Engineering and Technology; Computer Science; Computer Science Theory and Methods
University: Vignans Foundation for Science Technology and Research
Completed Date: 2022
Abstract: Organizations collect data every day and perform data mining on it. When data is shared for data mining, sensitive information about individuals can also be revealed. Many governments have enacted legislation to preserve privacy. Hence, Privacy Preserving Data Mining (PPDM) algorithms were developed. PPDM works by transforming the dataset so that confidential information is preserved while data mining is performed.

In the last few years, many applications and organizations have come to deal with High Dimensional (HD) continuous datasets in data mining. PPDM on HD continuous datasets is challenging because HD datasets contain many irrelevant dimensions. The data lies in subspaces, and data characteristics are not the same across these subspaces, so identifying the cluster to which a point belongs is difficult. Thus, PPDM on HD datasets results in data loss, information loss, and the loss of some original clusters. HD continuous datasets are used especially in medicine and healthcare for disease diagnosis, newborn screening, and gene analysis. PPDM on such HD continuous data is even more difficult because some datasets are noise sensitive and have few records, and the privacy offered is low, since even a small distortion leads to very high data loss, information loss, and loss of original clusters. Hence, this thesis proposes PPDM algorithms for high dimensional continuous data.

PPDM algorithms on continuous data are classified into two major categories: anonymization and noise addition. This thesis proposes a novel anonymization algorithm, Subspace Based Aggregation (SBA); a novel noise addition algorithm, Subspace Based Noise Addition (SBNA); and a novel hybrid algorithm, Anonymized Noise Addition in Subspaces (ANAS). SBA, SBNA, and ANAS decrease data loss and information loss and enhance the clusters that are identified. SBA computes the subspaces first. Records in these subspaces are then grouped based on squared Euclidean distances, and these groups of records are then aggregated.
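The grouping and aggregation steps described for SBA can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not say how subspaces are computed or how groups are formed, so the sketch assumes the subspace (a list of column indices, `dims`) is already known, uses a simple greedy seed-based grouping, and introduces a hypothetical minimum group size `k`. The function names and sample data are likewise illustrative only.

```python
def squared_euclidean(a, b, dims):
    """Squared Euclidean distance between records a and b, restricted to dims."""
    return sum((a[d] - b[d]) ** 2 for d in dims)


def aggregate_in_subspace(records, dims, k):
    """Group records by distance within one subspace and replace the
    subspace values of each group with the group mean.

    Assumptions (not from the source): the subspace `dims` is given,
    groups are built greedily around the first unassigned record, and
    `k` is the minimum group size. If fewer than k records exist in
    total, they form a single smaller group.
    """
    out = [list(r) for r in records]          # copy; the input is not modified
    remaining = list(range(len(records)))
    while remaining:
        if len(remaining) < 2 * k:
            # Too few records left for two full groups: take them all.
            group = remaining[:]
        else:
            seed = remaining[0]
            ordered = sorted(
                remaining,
                key=lambda i: squared_euclidean(records[seed], records[i], dims),
            )
            group = ordered[:k]               # the seed plus its k-1 nearest records
        for d in dims:
            mean = sum(records[i][d] for i in group) / len(group)
            for i in group:
                out[i][d] = mean              # aggregate: replace value with group mean
        members = set(group)
        remaining = [i for i in remaining if i not in members]
    return out


# Illustrative data: columns 0 and 1 form the assumed subspace; column 2 is left as-is.
records = [
    [1.0, 10.0, 5.0],
    [1.2, 10.4, 9.0],
    [8.0, 3.0, 1.0],
    [8.4, 3.2, 7.0],
]
anonymized = aggregate_in_subspace(records, dims=[0, 1], k=2)
```

After the call, records in the same group share identical values in the subspace columns, while columns outside the subspace are unchanged. Averaging only within groups of nearby records is what keeps the distortion small relative to aggregating over the whole dataset.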
Pagination: 145
URI: http://hdl.handle.net/10603/405784
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File                            Description    Size       Format
10_chapter-3.pdf                Attached File  326.01 kB  Adobe PDF
11_chapter-4.pdf                               266.22 kB  Adobe PDF
12-chapter-5.pdf                               283.88 kB  Adobe PDF
13_chapter-6.pdf                               60.53 kB   Adobe PDF
14_publications.pdf                            174.12 kB  Adobe PDF
1_title.pdf                                    126.27 kB  Adobe PDF
2_declaration.pdf                              56.98 kB   Adobe PDF
3_certificate.pdf                              122.37 kB  Adobe PDF
4_acknowledgement.pdf                          129.37 kB  Adobe PDF
5_content.pdf                                  72.05 kB   Adobe PDF
6_list of graphs & tables.pdf                  144.37 kB  Adobe PDF
7_abstract.pdf                                 60.29 kB   Adobe PDF
80_recommendation.pdf                          765.78 kB  Adobe PDF
8_chapter-1.pdf                                182.05 kB  Adobe PDF
9-chapter-2.pdf                                148.42 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
