An Investigation of Machine Learning Algorithm for Clustering

Shrivastava, Shailendra Kumar

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/366490

Title:	An Investigation of Machine Learning Algorithm for Clustering
Researcher:	Shrivastava, Shailendra Kumar
Guide(s):	Jain, R.C. and Rana, J.L.
Keywords:	Computer Science; Computer Science Software Engineering; Engineering and Technology
University:	Rajiv Gandhi Proudyogiki Vishwavidyalaya
Completed Date:	2013
Abstract:	Clustering is an unsupervised learning method for finding groups in a given data set. The task of clustering is NP-hard. In general, clustering a data set requires machine learning techniques. Clustering has attracted much interest in the machine learning community. One of the latest concepts for finding clusters is affinity propagation, which is based on exemplars. The input to affinity propagation is the set of similarities among data points. The output is a set of representative data points that best describe the clusters. These representative data points are known as exemplars, and the clusters are generated by assigning every non-exemplar data point to its nearby exemplar. In this thesis, we have developed four algorithms based on affinity propagation and machine learning concepts. Extensive experiments have been carried out to evaluate the performance of these algorithms. The algorithms are Fast Affinity Propagation based on Machine Learning (FAPML), Phrase Affinity Clustering (PAC), K-Means based on Heterogeneous Transfer Learning (K-Means based on HTL) and Affinity Propagation based on Heterogeneous Transfer Learning (AP based on HTL). FAPML is based on learning by experience, which is the principle of machine learning. FAPML tries to put data points into clusters based on the history of the clusters those data points belonged to in earlier stages. In FAPML we have introduced an affinity learning constant and a dispersion constant, which supervise the clustering process. FAPML also enforces the exemplar consistency and 1-of-N constraints. PAC first finds the phrases using Ukkonen's suffix tree construction algorithm, then builds the vector space model using a tf-idf weighting scheme over the phrases. After that, it calculates the similarity matrix from the VSD using cosine similarity. Affinity propagation is then used to generate the clusters. In K-Means, the output clusters depend on the initialization of the centroids. K-Means based on HTL tries to solve this problem.
Pagination:	17.1MB
URI:	http://hdl.handle.net/10603/366490
Appears in Departments:	Computer Science Engineering
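
The abstract describes affinity propagation as taking similarities among data points and returning exemplars, with every other point assigned to a nearby exemplar. A minimal sketch of the standard message-passing form of affinity propagation (as published by Frey and Dueck) may help make that concrete. It is not the thesis's FAPML, PAC, or HTL variants. The function names, the one-dimensional toy points, the negative squared distance similarity, the preference of -1.0, and the damping and iteration settings are all assumptions chosen for illustration.

```python
def similarity_matrix(points, preference):
    """Hypothetical helper: s(i, k) = -(x_i - x_k)^2 for 1-D points.

    The diagonal holds the "preference", i.e. how willing each point
    is to serve as an exemplar (an assumed, tunable value).
    """
    n = len(points)
    return [[preference if i == k else -(points[i] - points[k]) ** 2
             for k in range(n)] for i in range(n)]


def affinity_propagation(S, damping=0.7, max_iter=300):
    """Standard affinity propagation on a similarity matrix S.

    Returns (exemplars, labels), where labels[i] is the index of the
    exemplar that point i is assigned to.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(max_iter):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            runner_up = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                rival = runner_up if k == best else scores[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)

        # a(k, k) = sum over i' != k of max(0, r(i', k))
        # a(i, k) = min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    # Exemplars label themselves; other points join the most similar exemplar.
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


if __name__ == "__main__":
    points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # two well-separated toy groups
    S = similarity_matrix(points, preference=-1.0)
    exemplars, labels = affinity_propagation(S)
    print("exemplars:", exemplars)
    print("labels:", labels)
```

The preference on the diagonal controls how many exemplars emerge: less negative values tend to produce more clusters. For text data such as that handled by PAC, the similarity matrix would instead come from cosine similarity over tf-idf vectors, which this sketch does not attempt.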

Files in This Item:

File	Description	Size	Format
01 _ title.pdf	Attached File	661.84 kB	Adobe PDF
03 _ tables of contents.pdf		265.73 kB	Adobe PDF
04 _list of tables.pdf		179.78 kB	Adobe PDF
05_ list of figures.pdf		963.83 kB	Adobe PDF
06 _ acknowledgements.pdf		167.85 kB	Adobe PDF
07 _chapter 1.pdf		266.38 kB	Adobe PDF
08 _chapter 2.pdf		1.1 MB	Adobe PDF
09 _chapter 3.pdf		1.17 MB	Adobe PDF
10 _ a chapter 5.pdf		1.4 MB	Adobe PDF
10 _ b chapter 6.pdf		1.21 MB	Adobe PDF
10 _ c chapter 7.pdf		1.22 MB	Adobe PDF
10 _chapter 4.pdf		1.3 MB	Adobe PDF
10 _ d chapter 8.pdf		246.3 kB	Adobe PDF
11 _ references.pdf		161.3 kB	Adobe PDF
12 _ list of publications.pdf		4.78 MB	Adobe PDF
80_recommendation.pdf		187.69 kB	Adobe PDF
abstract.pdf		187.69 kB	Adobe PDF
certificate.pdf		782.31 kB	Adobe PDF
declaration by the candidate.pdf		429.33 kB	Adobe PDF
list of abbreviations.pdf		173.92 kB	Adobe PDF
preliminary page.pdf		661.84 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET