Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/405784
Title: | Privacy Preservation Techniques for High Dimensional Data |
Researcher: | SHASHIDHAR VIRUPAKSHA |
Guide(s): | VENKATESULU DONDETI |
Keywords: | Engineering and Technology; Computer Science; Computer Science Theory and Methods |
University: | Vignans Foundation for Science Technology and Research |
Completed Date: | 2022 |
Abstract: | Organizations collect data every day and perform data mining on it. When data is shared for data mining, sensitive information about individuals can also be revealed. Many governments have enacted legislation to preserve privacy, and Privacy Preserving Data Mining (PPDM) algorithms have been developed in response. PPDM works by transforming the dataset so that confidential information is preserved while data mining is performed.
In the last few years, many applications and organizations have come to deal with High Dimensional (HD) continuous datasets in data mining. PPDM on HD continuous datasets is challenging because HD datasets contain many irrelevant dimensions. Data is available in subspaces, and data characteristics are not the same across these subspaces, so identifying the cluster to which a point belongs is difficult. As a result, PPDM on HD datasets causes data loss and information loss, and some original clusters are lost. HD continuous datasets are used especially in medicine and healthcare for disease diagnosis, newborn screening, and gene analysis. PPDM on such HD continuous data is even more difficult because some datasets are noise sensitive and have few records, and the privacy offered is low because even a small distortion leads to very high data loss, information loss, and loss of original clusters. Hence, this thesis proposes PPDM algorithms for high dimensional continuous data.
PPDM algorithms on continuous data are classified into two major categories: anonymization and noise addition. This thesis proposes a novel anonymization algorithm, Subspace Based Aggregation (SBA); a novel noise addition algorithm, Subspace Based Noise Addition (SBNA); and a novel hybrid, Anonymized Noise Addition in Subspaces (ANAS). SBA, SBNA, and ANAS decrease data loss and information loss and enhance the clusters that are identified. SBA computes the subspaces first. Records in these subspaces are then grouped based on squared Euclidean distances, and these groups of records are then aggregated. |
Pagination: | 145 |
URI: | http://hdl.handle.net/10603/405784 |
Appears in Departments: | Department of Computer Science and Engineering |
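The three SBA steps named in the abstract (compute subspaces, group records by squared Euclidean distance, aggregate each group) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not say how subspaces are computed or how groups are formed, so the sketch assumes the subspace is already given as a list of column indices (`dims`), uses a simple greedy grouping with a hypothetical minimum group size `k`, and aggregates by replacing each group with its mean. The function name `aggregate_in_subspace` is likewise an illustrative choice.

```python
import numpy as np

def aggregate_in_subspace(data, dims, k):
    """Return a copy of `data` in which the columns `dims` are replaced by
    group means. Groups hold at least `k` records (assumed parameter)."""
    out = np.asarray(data, dtype=float).copy()
    sub = out[:, dims]                  # fancy indexing: a copy of the original values
    remaining = list(range(len(sub)))   # indices of records not yet grouped
    while remaining:
        if len(remaining) < 2 * k:
            group = remaining           # last group absorbs all leftover records
        else:
            seed = sub[remaining[0]]
            # squared Euclidean distance from the seed to every remaining record
            d2 = ((sub[remaining] - seed) ** 2).sum(axis=1)
            nearest = np.argsort(d2, kind="stable")[:k]
            group = [remaining[i] for i in nearest]
        # aggregation step: every record in the group gets the group mean
        out[np.ix_(group, dims)] = sub[group].mean(axis=0)
        remaining = [i for i in remaining if i not in group]
    return out

# Hypothetical toy data: columns 0 and 1 form the assumed subspace.
data = np.array([[0, 0, 5], [1, 0, 6], [10, 10, 7], [11, 10, 8]])
result = aggregate_in_subspace(data, dims=[0, 1], k=2)
print(result)
```

Because each group is replaced by its own mean, the column means inside the subspace are unchanged, while columns outside `dims` are left untouched.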
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
10_chapter-3.pdf | Attached File | 326.01 kB | Adobe PDF |
11_chapter-4.pdf | Attached File | 266.22 kB | Adobe PDF |
12-chapter-5.pdf | Attached File | 283.88 kB | Adobe PDF |
13_chapter-6.pdf | Attached File | 60.53 kB | Adobe PDF |
14_publications.pdf | Attached File | 174.12 kB | Adobe PDF |
1_title.pdf | Attached File | 126.27 kB | Adobe PDF |
2_declaration.pdf | Attached File | 56.98 kB | Adobe PDF |
3_certificate.pdf | Attached File | 122.37 kB | Adobe PDF |
4_acknowledgement.pdf | Attached File | 129.37 kB | Adobe PDF |
5_content.pdf | Attached File | 72.05 kB | Adobe PDF |
6_list of graphs & tables.pdf | Attached File | 144.37 kB | Adobe PDF |
7_abstract.pdf | Attached File | 60.29 kB | Adobe PDF |
80_recommendation.pdf | Attached File | 765.78 kB | Adobe PDF |
8_chapter-1.pdf | Attached File | 182.05 kB | Adobe PDF |
9-chapter-2.pdf | Attached File | 148.42 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).