Big Data Clustering Based Recommendation System Model Through Correlations

Pandove, Divya

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/326268

Title:	Big Data Clustering Based Recommendation System Model Through Correlations
Researcher:	Pandove, Divya
Guide(s):	Rani, Rinkle and Goel, Shivani
Keywords:	Big Data Correlation Clustering Recommendation System
University:	Thapar Institute of Engineering and Technology
Completed Date:	2017
Abstract:	Technological advancement has enabled us to store and process huge amounts of data in relatively short spans of time. Data is rapidly growing in dimensionality, becoming multi-dimensional and high-dimensional. There is an immediate need to expand our focus to include the analysis of large, high-dimensional datasets. Data analysis is becoming a mammoth task as a result of the steady increase in data volume and in complexity, in terms of the heterogeneity of data. Because of this dynamic computing environment, existing techniques need to be either modified or discarded in order to handle new high-dimensional data. Data clustering is a tool used in many disciplines, including data mining, to extract meaningful knowledge from seemingly unstructured data. Correlation clustering possibly represents the most intuitive form of cluster construction. It gives solutions that can be approximated while automatically selecting the number of clusters. This approach handles scenarios where the focus is on the relationships between objects rather than on the actual representations of the objects. The method is also suitable for structured objects for which feature vectors are not easy to obtain. Given the increasing scale of data these days, correlation clustering has become a powerful addition to the fields of data mining and agnostic learning. In this thesis, we start by proposing an algorithm that defines an intuitive and accurate correlation coefficient metric, known as the General (rank based) correlation coefficient (G). Further, a framework based on this algorithm is proposed and named G Based Agglomerative Clustering (GBAC). Our approach has been found to be effective for small, large, and high-dimensional data, and to generate high-quality clusters. This framework combines the predictive power of correlation coefficients with the pattern-finding ability of agglomerative hierarchical clustering.
Pagination:	215p.
URI:	http://hdl.handle.net/10603/326268
Appears in Departments:	Department of Computer Science and Engineering
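
The abstract describes pairing a rank-based correlation coefficient with agglomerative hierarchical clustering. This record does not define the thesis's G coefficient or the GBAC procedure, so the sketch below is only an illustration of the general idea, not the author's method: it substitutes Spearman's rank correlation for G, uses average linkage on the distance 1 - rho, and stops merging at a distance threshold so that the number of clusters is not fixed in advance. The function names, the `threshold` value, and the sample `rows` are all assumptions made for the example.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (stand-in for G)."""
    return pearson(ranks(x), ranks(y))


def correlation_clusters(rows, threshold=0.5):
    """Average-linkage agglomerative clustering on distance 1 - rho.

    Merging stops once the closest pair of clusters is farther apart
    than `threshold` (an assumed, illustrative stopping rule).
    """
    n = len(rows)
    dist = {
        (i, j): 1.0 - spearman(rows[i], rows[j])
        for i, j in combinations(range(n), 2)
    }
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        total = sum(dist[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        if linkage(clusters[p], clusters[q]) > threshold:
            break
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical data: rows 0 and 1 rise together, rows 2 and 3 fall together.
rows = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 9, 12],
    [5, 4, 3, 2, 1],
    [10, 8, 7, 3, 0],
]
print(correlation_clusters(rows))
```

Because the distance depends only on rank order, rows 0 and 1 are treated as identical in shape even though their values differ, which is the property that makes a rank-based coefficient attractive when raw feature scales are not comparable.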

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	423.48 kB	Adobe PDF
02_contents.pdf		62.91 kB	Adobe PDF
03_list of figures.pdf		75.38 kB	Adobe PDF
04_list of tables.pdf		57.72 kB	Adobe PDF
05_list of algorithms.pdf		49.16 kB	Adobe PDF
06_certificate.pdf		310.78 kB	Adobe PDF
07_acknowledgement.pdf		188.62 kB	Adobe PDF
08_abstract.pdf		43.7 kB	Adobe PDF
09_chapter 1.pdf		517.09 kB	Adobe PDF
10_chapter 2.pdf		678.12 kB	Adobe PDF
11_chapter 3.pdf		349.55 kB	Adobe PDF
12_chapter 4.pdf		322.38 kB	Adobe PDF
13_chapter 5.pdf		460.46 kB	Adobe PDF
14_chapter 6.pdf		4.51 MB	Adobe PDF
15_chapter 7.pdf		408.93 kB	Adobe PDF
16_chapter 8.pdf		79.2 kB	Adobe PDF
17_bibliography.pdf		149.13 kB	Adobe PDF
18_list of publications.pdf		50.49 kB	Adobe PDF
19_appendix a.pdf		110.89 kB	Adobe PDF
80_recommendation.pdf		462.39 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET