Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/310263
Title: | Development of optimized algorithm for extract transform load process using soft computing techniques |
Researcher: | Gupta, Gaurav |
Guide(s): | Chhabra, Indu and Kumar, Neelesh |
Keywords: | Extraction, Hadoop, Loading, Transformation, Web Template |
University: | Panjab University |
Completed Date: | 2020 |
Abstract: | Extract Transform Load (ETL) refers to a database process framework responsible for extracting, transforming, and loading data into a data warehouse. A feature-based web data extraction algorithm is proposed that identifies web templates by clustering similar web pages according to the feature similarity of their DOM structures. A hybrid transformation technique is proposed that employs token-wise sentence sorting along with Levenshtein distance for noise reduction. The RDBMS is replaced by distributed, fail-safe data clusters that serve as the data warehouse, using Hadoop-based techniques. This removes the constraints on processing, storing, and retrieving large data structures. The developed algorithm is validated on the USPTO website. |
Pagination: | xvi, 116p. |
URI: | http://hdl.handle.net/10603/310263 |
Appears in Departments: | Department of Computer Science and Application |
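
The abstract describes a hybrid transformation technique that pairs token-wise sentence sorting with Levenshtein distance for noise reduction. The thesis's actual algorithm is not reproduced in this record, so the Python sketch below is only one plausible reading of that idea: it sorts the tokens of each sentence so that word order no longer matters, then measures the edit distance between the sorted forms. The function names (`levenshtein`, `token_sort`, `token_sorted_distance`, `is_near_duplicate`) and the `max_distance` threshold are assumptions introduced for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def token_sort(text: str) -> str:
    """Lowercase the sentence, split on whitespace, and sort the tokens."""
    return " ".join(sorted(text.lower().split()))


def token_sorted_distance(a: str, b: str) -> int:
    """Levenshtein distance between the token-sorted forms of two sentences."""
    return levenshtein(token_sort(a), token_sort(b))


def is_near_duplicate(a: str, b: str, max_distance: int = 2) -> bool:
    """Hypothetical noise filter: treat sentences within max_distance edits as the same."""
    return token_sorted_distance(a, b) <= max_distance
```

Under this reading, two extracted strings that differ only in word order compare as identical, and small typos still fall within the threshold, so a check such as `is_near_duplicate("method for data extraction", "extraction for data methd")` would flag the pair as one record.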
Files in This Item:
File | Description | Size | Format
---|---|---|---
01_title.pdf | Attached File | 13.27 kB | Adobe PDF
02_certificate.pdf | | 932.37 kB | Adobe PDF
03_acknowledgement.pdf | | 12.45 kB | Adobe PDF
04_contents.pdf | | 607.67 kB | Adobe PDF
05_abstract.pdf | | 34.46 kB | Adobe PDF
06_abbreviations.pdf | | 22.16 kB | Adobe PDF
07_list_of_figures.pdf | | 22.43 kB | Adobe PDF
08_list_of_tables.pdf | | 22.41 kB | Adobe PDF
09_list_of_publications.pdf | | 14.58 kB | Adobe PDF
10_chapter1.pdf | | 814.73 kB | Adobe PDF
11_chapter2.pdf | | 865.39 kB | Adobe PDF
12_chapter3.pdf | | 1.08 MB | Adobe PDF
13_chapter4.pdf | | 1.75 MB | Adobe PDF
14_chapter5.pdf | | 1.02 MB | Adobe PDF
15_chapter6.pdf | | 645.73 kB | Adobe PDF
16_references.pdf | | 776.4 kB | Adobe PDF
80_recommendation.pdf | | 645.73 kB | Adobe PDF
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).