Query estimation in data streams using micro clustering

Gupta, Sudhanshu

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/227194

Full metadata record

DC Field	Value	Language
dc.coverage.spatial
dc.date.accessioned	2019-01-25T10:31:05Z	-
dc.date.available	2019-01-25T10:31:05Z	-
dc.identifier.uri	http://hdl.handle.net/10603/227194	-
dc.description.abstract	Advances in technology have led to the availability of inexpensive electronic devices everywhere. These devices and various applications automatically generate a large amount of data, which is increasing exponentially. The data can grow at a rate of millions of data items per day for business and scientific applications. A large number of applications generate continuous, transient, large streams of data. Examples of applications that naturally generate data streams are financial tickers, log records or click-streams in web tracking and personalization, manufacturing processes, data feeds from sensor applications, sensor networks, performance measurements in network monitoring and traffic management, call detail records in telecommunications, and email messages. The analysis of the large amount of data generated by various applications can create many opportunities, for example analyzing patient data to diagnose the cause of disease, designing marketing strategies, predicting investment strategies, and analyzing customer behavior. We need efficient techniques to analyze and process these unbounded data streams for useful information. However, conventional techniques may not be applicable for their analysis. The processing of a data stream requires single-pass processing with limited memory. A number of techniques have been proposed for the analysis of data streams that meet these rigid processing requirements. These methods use various synopsis techniques such as sampling, wavelets, and sketches. Micro-clustering is a synopsis technique used for clustering and classification of data streams. In this work we investigate how to estimate queries over large data streams using micro-clustering and cosine series. We store a summary of the data stream in micro-clusters and process clusters of data to estimate queries over streams. To assess the technique, we conducted an experimental study. As the results of this study reveal, our technique outperforms the competitor method.
dc.format.extent	x, 102p.
dc.language	English
dc.relation
dc.rights	university
dc.title	Query estimation in data streams using micro clustering
dc.title.alternative
dc.creator.researcher	Gupta, Sudhanshu
dc.subject.keyword	Clustering
dc.subject.keyword	Computer science
dc.subject.keyword	Data streams
dc.subject.keyword	Micro clustering
dc.subject.keyword	Query estimation
dc.description.note
dc.contributor.guide	Garg, Deepak
dc.publisher.place	Patiala
dc.publisher.university	Thapar Institute of Engineering and Technology
dc.publisher.institution	Department of Computer Science and Engineering
dc.date.registered
dc.date.completed	2014
dc.date.awarded
dc.format.dimensions
dc.format.accompanyingmaterial	None
dc.source.university	University
dc.type.degree	Ph.D.
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
file10(appendix).pdf	Attached File	140 kB	Adobe PDF	View/Open
file11(bibliography).pdf		100.8 kB	Adobe PDF	View/Open
file1(title).pdf		660.19 kB	Adobe PDF	View/Open
file2(certificate).pdf		422.01 kB	Adobe PDF	View/Open
file3(preliminary pages).pdf		102.9 kB	Adobe PDF	View/Open
file4(chapter 1).pdf		132.12 kB	Adobe PDF	View/Open
file5(chapter 2).pdf		220.55 kB	Adobe PDF	View/Open
file6(chapter 3).pdf		557.26 kB	Adobe PDF	View/Open
file7(chapter 4).pdf		257.65 kB	Adobe PDF	View/Open
file8(chapter 5).pdf		5.37 MB	Adobe PDF	View/Open
file9(chapter 6).pdf		72.56 kB	Adobe PDF	View/Open

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET