Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/599540
Title: | Identification of potential biomarkers for esophageal squamous cell carcinoma using unsupervised machine learning |
Researcher: | Baruah, Bikash |
Guide(s): | Banerjee, Subhasish and Dutta, Manash Pratim |
Keywords: | Biclustering Algorithms; Community Detection Algorithms; Machine Learning; Robust Analysis |
University: | National Institute of Technology Arunachal Pradesh |
Completed Date: | 2024 |
Abstract: | Esophageal Squamous Cell Carcinoma (ESCC) is known for its high prevalence and aggressiveness. It is often diagnosed at advanced stages due to the lack of specific symptoms, highlighting the urgent need to explore new diagnostic and therapeutic approaches. The identification of reliable biomarkers is pivotal for accurate diagnosis, prognosis, and the development of personalized treatment approaches tailored to individual patient profiles. This comprehensive study harnesses diverse datasets, including microarray, RNA sequencing (RNA-seq), and single-cell RNA sequencing (scRNA-seq), to deeply explore the molecular landscape of ESCC. Because missing data is a persistent challenge in large-scale biological datasets, this study introduces a novel ensemble algorithm for missing data imputation. The algorithm integrates four robust techniques (k-nearest neighbor, local least squares, k-means clustering, and missForest) to effectively mitigate gaps in the datasets. Comparative analyses across eight distinct datasets demonstrate the superior performance and robustness of the proposed imputation method, showcasing its ability to enhance data completeness and reliability. Afterward, the research focuses on biomarker discovery using various biclustering algorithms to identify groups of genes with coherent expression patterns. Additionally, EnsemBic, an ensemble biclustering algorithm, is introduced to bolster the reliability and comprehensiveness of biomarker identification. Topological and biological analyses focusing on elite genes within identified biclusters aid in pinpointing potential biomarkers intricately linked to ESCC, providing insights into the underlying molecular mechanisms of the disease. Subsequently, community detection algorithms are applied to unveil latent structures within the datasets, revealing hidden biological communities.
The development and evaluation of two novel community detection algorithms highlight their efficacy in identifying potential biomarkers. |
Pagination: | xviii, 157 |
URI: | http://hdl.handle.net/10603/599540 |
Appears in Departments: | Department of Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 23.86 kB | Adobe PDF | View/Open |
02_prelim.pdf | Attached File | 1.37 MB | Adobe PDF | View/Open |
03_content.pdf | Attached File | 431.59 kB | Adobe PDF | View/Open |
04_abstract.pdf | Attached File | 418.99 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | Attached File | 372.69 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | Attached File | 470.35 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | Attached File | 1.05 MB | Adobe PDF | View/Open |
08_chapter 4.pdf | Attached File | 1.16 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | Attached File | 1.32 MB | Adobe PDF | View/Open |
10_chapter 6.pdf | Attached File | 509.45 kB | Adobe PDF | View/Open |
11_annexures.pdf | Attached File | 397.87 kB | Adobe PDF | View/Open |
80_recommendation.pdf | Attached File | 210.75 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).