Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/326268
Title: | Big Data Clustering Based Recommendation System Model Through Correlations |
Researcher: | Pandove, Divya |
Guide(s): | Rani, Rinkle and Goel, Shivani |
Keywords: | Big Data Correlation Clustering Recommendation System |
University: | Thapar Institute of Engineering and Technology |
Completed Date: | 2017 |
Abstract: | Technological advancement has enabled us to store and process huge amounts of data in relatively short spans of time. Data is rapidly growing in dimensionality, becoming multi-dimensional and high-dimensional. There is an immediate need to expand our focus to include the analysis of large, high-dimensional datasets. Data analysis is becoming a mammoth task as a result of the continual increase in data volume and in complexity in terms of data heterogeneity. Because of this dynamic computing environment, existing techniques must either be modified or discarded in order to handle new high-dimensional data. Data clustering is a tool used in many disciplines, including data mining, to extract meaningful knowledge from seemingly unstructured data. Correlation clustering is arguably the most intuitive form of clustering. It admits approximate solutions while automatically selecting the number of clusters. This approach handles scenarios where the focus is on relationships between objects rather than on the actual representations of the objects. The method is also suitable for structured objects for which feature vectors are not easy to obtain. Given the increasing scale of data, correlation clustering has become a powerful addition to the fields of data mining and agnostic learning. In this thesis, we start by proposing an algorithm that defines an intuitive and accurate correlation coefficient metric, known as the General (rank-based) correlation coefficient (G). Further, a framework based on this algorithm is proposed and named G Based Agglomerative Clustering (GBAC). Our approach has been found to be effective for small, large, and high-dimensional data, generating high-quality clusters. 
This framework combines the predictive power of correlation coefficients with the ability of agglomerative hierarchical clustering to find patterns in data. |
Pagination: | 215p. |
URI: | http://hdl.handle.net/10603/326268 |
Appears in Departments: | Department of Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf | Attached File | 423.48 kB | Adobe PDF |
02_contents.pdf | | 62.91 kB | Adobe PDF |
03_list of figures.pdf | | 75.38 kB | Adobe PDF |
04_list of tables.pdf | | 57.72 kB | Adobe PDF |
05_list of algorithms.pdf | | 49.16 kB | Adobe PDF |
06_certificate.pdf | | 310.78 kB | Adobe PDF |
07_acknowledgement.pdf | | 188.62 kB | Adobe PDF |
08_abstract.pdf | | 43.7 kB | Adobe PDF |
09_chapter 1.pdf | | 517.09 kB | Adobe PDF |
10_chapter 2.pdf | | 678.12 kB | Adobe PDF |
11_chapter 3.pdf | | 349.55 kB | Adobe PDF |
12_chapter 4.pdf | | 322.38 kB | Adobe PDF |
13_chapter 5.pdf | | 460.46 kB | Adobe PDF |
14_chapter 6.pdf | | 4.51 MB | Adobe PDF |
15_chapter 7.pdf | | 408.93 kB | Adobe PDF |
16_chapter 8.pdf | | 79.2 kB | Adobe PDF |
17_bibliography.pdf | | 149.13 kB | Adobe PDF |
18_list of publications.pdf | | 50.49 kB | Adobe PDF |
19_appendix a.pdf | | 110.89 kB | Adobe PDF |
80_recommendation.pdf | | 462.39 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).