Privacy Preservation Techniques for High Dimensional Data

SHASHIDHAR VIRUPAKSHA

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/405784

Title:	Privacy Preservation Techniques for High Dimensional Data
Researcher:	SHASHIDHAR VIRUPAKSHA
Guide(s):	VENKATESULU DONDETI
Keywords:	Engineering and Technology; Computer Science; Computer Science Theory and Methods
University:	Vignans Foundation for Science Technology and Research
Completed Date:	2022
Abstract:	Organizations collect data every day and perform data mining on it. When data is shared for data mining, sensitive information about individuals is also revealed. Many governments have enacted legislation to preserve privacy. Hence, Privacy Preserving Data Mining (PPDM) algorithms were developed. PPDM works by transforming the dataset to preserve confidential information while data mining is performed.
In the last few years, many applications and organizations have come to deal with High Dimensional (HD) continuous datasets in data mining. PPDM on HD continuous datasets is challenging because HD datasets contain many irrelevant dimensions. The data lies in subspaces, and data characteristics are not the same across these subspaces, so identifying the cluster to which a point belongs is difficult. Thus, PPDM on HD datasets results in data loss, information loss, and the loss of some original clusters. HD continuous datasets are used especially in medicine and healthcare, for disease diagnosis, newborn screening and gene analysis. PPDM on such HD continuous data is even more difficult because some datasets are noise sensitive and have few records, and the privacy offered is low, since even a small distortion leads to very high data loss, information loss and loss of original clusters. Hence, in this thesis, PPDM algorithms are proposed for high dimensional continuous data.
PPDM algorithms on continuous data are classified into two major categories: anonymization and noise addition. This thesis proposes a novel anonymization algorithm, Subspace Based Aggregation (SBA); a novel noise addition algorithm, Subspace Based Noise Addition (SBNA); and a novel hybrid, Anonymized Noise Addition in Subspaces (ANAS). SBA, SBNA and ANAS decrease data loss and information loss and enhance the clusters that are identified. SBA computes the subspaces first. Records in these subspaces are then grouped based on squared Euclidean distances. These groups of records are then aggregated.
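The three SBA steps named in the abstract (compute subspaces, group records by squared Euclidean distance, aggregate each group) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how subspaces are computed, how groups are formed, or which aggregate is used. The sketch therefore assumes the subspaces are supplied as lists of column indices, forms groups with a simple greedy nearest-neighbour rule, and aggregates with the group mean. The names `sq_dist`, `group_by_distance`, `aggregate_in_subspaces` and the parameter `k` are hypothetical.

```python
from statistics import fmean


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_by_distance(points, k):
    """Greedily split point indices into groups of near neighbours.

    Assumption: a greedy rule with minimum group size k, chosen only for
    illustration. If fewer than k points exist, one smaller group results.
    """
    unassigned = list(range(len(points)))
    groups = []
    while unassigned:
        if len(unassigned) < 2 * k:
            # Too few points left for two groups of size k: keep them together.
            groups.append(unassigned)
            break
        seed = unassigned[0]
        ranked = sorted(unassigned, key=lambda i: sq_dist(points[seed], points[i]))
        group = ranked[:k]  # the seed (distance 0) plus its k - 1 nearest points
        groups.append(group)
        unassigned = [i for i in unassigned if i not in group]
    return groups


def aggregate_in_subspaces(records, subspaces, k):
    """Replace each value by its group mean, one subspace at a time.

    Assumption: `subspaces` is given (a list of column-index lists) and the
    mean is used as the aggregate; neither detail comes from the abstract.
    """
    out = [list(r) for r in records]
    for cols in subspaces:
        projected = [[r[c] for c in cols] for r in records]
        for group in group_by_distance(projected, k):
            for c in cols:
                mean = fmean([records[i][c] for i in group])
                for i in group:
                    out[i][c] = mean
    return out


# Toy data: four records, three continuous attributes, two assumed subspaces.
records = [
    [1.0, 10.0, 5.0],
    [1.2, 10.4, 9.0],
    [8.0, 2.0, 5.2],
    [8.4, 2.2, 9.4],
]
subspaces = [[0, 1], [2]]
anonymized = aggregate_in_subspaces(records, subspaces, k=2)
```

Because grouping is done separately in each subspace, a record can share its values with different neighbours in different subspaces, while each column's overall mean stays unchanged by the mean aggregation.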
Pagination:	145
URI:	http://hdl.handle.net/10603/405784
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
10_chapter-3.pdf	Attached File	326.01 kB	Adobe PDF
11_chapter-4.pdf		266.22 kB	Adobe PDF
12-chapter-5.pdf		283.88 kB	Adobe PDF
13_chapter-6.pdf		60.53 kB	Adobe PDF
14_publications.pdf		174.12 kB	Adobe PDF
1_title.pdf		126.27 kB	Adobe PDF
2_declaration.pdf		56.98 kB	Adobe PDF
3_certificate.pdf		122.37 kB	Adobe PDF
4_acknowledgement.pdf		129.37 kB	Adobe PDF
5_content.pdf		72.05 kB	Adobe PDF
6_list of graphs & tables.pdf		144.37 kB	Adobe PDF
7_abstract.pdf		60.29 kB	Adobe PDF
80_recommendation.pdf		765.78 kB	Adobe PDF
8_chapter-1.pdf		182.05 kB	Adobe PDF
9-chapter-2.pdf		148.42 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET