Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/326268
Title: Big Data Clustering Based Recommendation System Model Through Correlations
Researcher: Pandove, Divya
Guide(s): Rani, Rinkle and Goel, Shivani
Keywords: Big Data; Correlation Clustering; Recommendation System
University: Thapar Institute of Engineering and Technology
Completed Date: 2017
Abstract: Technological advancement has enabled us to store and process huge amounts of data in relatively short spans of time. Data is rapidly growing in dimensionality, becoming multi-dimensional and high-dimensional, so there is an immediate need to expand our focus to include the analysis of large, high-dimensional datasets. Data analysis is becoming a mammoth task as a result of the steady growth in data volume and in complexity arising from the heterogeneity of data. Because of this dynamic computing environment, existing techniques need to be either modified or discarded in order to handle new high-dimensional data. Data clustering is a tool used in many disciplines, including data mining, to extract meaningful knowledge from seemingly unstructured data. Correlation clustering possibly represents the most intuitive form of cluster construction. It gives solutions that can be approximated while automatically selecting the number of clusters. This approach handles scenarios where the focus is on the relationships between objects rather than on the actual representations of the objects. The method is also suitable for structured objects for which feature vectors are not easy to obtain. Given the increasing scale of data, correlation clustering has become a powerful addition to the fields of data mining and agnostic learning. In this thesis, we start by proposing an algorithm that defines an intuitive and accurate correlation coefficient metric, known as the General (rank-based) correlation coefficient (G). Further, a framework based on this algorithm is proposed, named G Based Agglomerative Clustering (GBAC). Our approach has been found to be effective on small, large and high-dimensional datasets, generating high-quality clusters. This framework combines the predictive power of correlation coefficients with the pattern-finding ability of agglomerative hierarchical clustering.
Pagination: 215p.
URI: http://hdl.handle.net/10603/326268
Appears in Departments: Department of Computer Science and Engineering
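
Illustrative note: the abstract describes pairing a rank-based correlation coefficient with agglomerative hierarchical clustering. The sketch below shows that general idea only; it is not the thesis's G coefficient or the GBAC framework, whose definitions are not given in this record. It assumes Spearman's rank correlation as a stand-in measure, average linkage, and a hypothetical max_distance cut-off that stops merging so the number of clusters is not fixed in advance. All function names and the sample ratings are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation (stand-in for G): Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0  # constant vector: treat the correlation as zero
    return cov / (sx * sy)


def agglomerate(items, max_distance=0.5):
    """Average-linkage agglomerative clustering with distance 1 - correlation.

    max_distance is an assumed cut-off: merging stops once the closest pair
    of clusters is farther apart than this value.
    """
    names = list(items)
    dist = {
        frozenset((a, b)): 1.0 - spearman(items[a], items[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical item ratings from four users (made-up data).
ratings = {
    "user_a": [5, 4, 1, 2, 3],
    "user_b": [4, 5, 2, 1, 3],
    "user_c": [1, 2, 5, 4, 3],
    "user_d": [2, 1, 4, 5, 3],
}
clusters = agglomerate(ratings)
print(clusters)
```

In this made-up data, user_a and user_b rank the items in nearly the same order, as do user_c and user_d, so the sketch groups them into two clusters without being told how many to find. A recommendation step could then suggest items favoured by other members of a user's cluster; that step is not shown here.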

Files in This Item:
File                         Description    Size       Format
01_title.pdf                 Attached File  423.48 kB  Adobe PDF
02_contents.pdf                             62.91 kB   Adobe PDF
03_list of figures.pdf                      75.38 kB   Adobe PDF
04_list of tables.pdf                       57.72 kB   Adobe PDF
05_list of algorithms.pdf                   49.16 kB   Adobe PDF
06_certificate.pdf                          310.78 kB  Adobe PDF
07_acknowledgement.pdf                      188.62 kB  Adobe PDF
08_abstract.pdf                             43.7 kB    Adobe PDF
09_chapter 1.pdf                            517.09 kB  Adobe PDF
10_chapter 2.pdf                            678.12 kB  Adobe PDF
11_chapter 3.pdf                            349.55 kB  Adobe PDF
12_chapter 4.pdf                            322.38 kB  Adobe PDF
13_chapter 5.pdf                            460.46 kB  Adobe PDF
14_chapter 6.pdf                            4.51 MB    Adobe PDF
15_chapter 7.pdf                            408.93 kB  Adobe PDF
16_chapter 8.pdf                            79.2 kB    Adobe PDF
17_bibliography.pdf                         149.13 kB  Adobe PDF
18_list of publications.pdf                 50.49 kB   Adobe PDF
19_appendix a.pdf                           110.89 kB  Adobe PDF
80_recommendation.pdf                       462.39 kB  Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
