Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/301891
Title: | Investigation on indexing algorithms for big data retrieval |
Researcher: | Gayathiri N R |
Guide(s): | Natarajan A M |
Keywords: | Big data; MongoDB; NoSQL; B-Tree indexing |
University: | Anna University |
Completed Date: | 2019 |
Abstract: | Big Data and its implications have gained recognition in many fields, among which healthcare is emerging as one of the most promising sectors. The healthcare and biomedical sciences have rapidly become data-intensive as investigators generate and use large, complex, high-dimensional, and diverse domain-specific datasets. Because of the diverse data formats, huge volume, and uncertainty associated with Big Data sources, data retrieval from such sources plays a vital role. Data retrieval is the process of using a query to extract data from a large source of data, particularly a large database. Indexing is one of the most important aspects of a retrieval system. Indexing structures are data structures that aim to reduce the number of comparisons and, consequently, the search time. The proposed approach uses hash-based indexing schemes, which vary according to their hashing function, to shorten the search process and minimize retrieval time. Hashing is chosen because it outperforms trees when the input is large, for example in the billions of records. Two popular Big Data platforms, MongoDB (a NoSQL document database) and Hadoop (a distributed computing framework), are used for data storage and fast processing. In the first phase of the work, the most widely used indexes, B-Tree indexing and hash indexing, are introduced for handling Big Data. In this work, the B-Tree indexing method and a hash algorithm written in the Java programming language are analyzed in the MongoDB database. The time complexity of the hashing algorithm is O(1), whereas the time complexity of the B-Tree is O(log n); as the number of records increases, there is a gradual increase in execution time for B-Tree indexing. Hashing is efficient when there are many records, say in the billions, whereas the B-Tree works well for a limited number of records in distributed sharded and unsharded databases. |
Pagination: | xvii,172p. |
URI: | http://hdl.handle.net/10603/301891 |
Appears in Departments: | Faculty of Information and Communication Engineering |
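The complexity comparison in the abstract (O(1) for hash lookups, O(log n) for tree lookups) can be illustrated with a minimal Java sketch. This is not the thesis's code: the class name `IndexLookupSketch`, the synthetic keys, and the helper names are all assumptions made for illustration. `java.util.HashMap` stands in for a hash index (O(1) on average, not in the worst case), and `java.util.TreeMap`, a red-black tree, stands in for a B-Tree only because both give logarithmic lookups; a real B-Tree has a much higher fan-out and therefore fewer levels.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class IndexLookupSketch {
    // Hypothetical key spacing, used only to generate synthetic record ids.
    static final long KEY_STEP = 7919L;

    // Fills the given map with n synthetic (recordId -> position) entries.
    static Map<Long, Integer> buildIndex(Map<Long, Integer> index, int n) {
        for (int i = 0; i < n; i++) {
            index.put(i * KEY_STEP, i);
        }
        return index;
    }

    // Worst-case key comparisons for a binary search over n sorted keys:
    // floor(log2(n)) + 1, valid for n >= 1.
    static int treeComparisons(long n) {
        return 64 - Long.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        Map<Long, Integer> hashIndex = buildIndex(new HashMap<>(), n); // average O(1) lookup
        Map<Long, Integer> treeIndex = buildIndex(new TreeMap<>(), n); // O(log n) lookup

        long key = 4242 * KEY_STEP;
        // Both indexes return the same position; only the lookup cost differs.
        System.out.println("hash index: " + hashIndex.get(key));
        System.out.println("tree index: " + treeIndex.get(key));

        // The logarithmic bound grows slowly but steadily with the record count.
        System.out.println("comparisons at 10^6 keys: " + treeComparisons(1_000_000L));
        System.out.println("comparisons at 10^9 keys: " + treeComparisons(1_000_000_000L));
    }
}
```

The sketch shows only the asymptotic argument. It does not reproduce the thesis's MongoDB or Hadoop measurements, and a tree index additionally supports range queries, which a plain hash index does not.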
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf.pdf | Attached File | 10.03 kB | Adobe PDF |
02_certificates.pdf.pdf | | 827.12 kB | Adobe PDF |
03_abstracts.pdf.pdf | | 92.15 kB | Adobe PDF |
04_acknowledgements.pdf.pdf | | 5.29 kB | Adobe PDF |
05_contents.pdf.pdf | | 178.35 kB | Adobe PDF |
06_list_of_tables.pdf.pdf | | 5.11 kB | Adobe PDF |
07_list_of_figures.pdf.pdf | | 89.89 kB | Adobe PDF |
08_list_of_abbreviations.pdf | | 190.77 kB | Adobe PDF |
09_chapter1.pdf.pdf | | 330.76 kB | Adobe PDF |
10_chapter2.pdf.pdf | | 199.69 kB | Adobe PDF |
11_chapter3.pdf.pdf | | 165.4 kB | Adobe PDF |
12_chapter4.pdf.pdf | | 235.55 kB | Adobe PDF |
13_chapter5.pdf.pdf | | 624.95 kB | Adobe PDF |
14_conclusion.pdf.pdf | | 13.87 kB | Adobe PDF |
15_references.pdf.pdf | | 135.06 kB | Adobe PDF |
16_list_of_publications.pdf | | 88.27 kB | Adobe PDF |
80_recommendation.pdf | | 132.39 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).