Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/588713
Title: Machine learning based multi omics data analysis to identify subgroups in cancer for precision medicine
Researcher: Khadirnaikar, Seema R
Guide(s): Mahadeva Prasanna, S R and Shukla, Sudhanshu
Keywords: Classification, clustering
Conditional WGAN (cWGAN)
Data augmentation
Dimensionality reduction
Engineering
Engineering and Technology
Engineering Electrical and Electronic
Machine learning
Non-small cell lung cancer (NSCLC)
Precision medicine
University: Indian Institute of Technology Dharwad
Completed Date: 2023
Abstract: The mortality rate associated with cancer is increasing at an exponential rate each year. Cancer is a complex illness with notable diversity, making it crucial to adopt precision medicine approaches. Precision medicine endeavors to categorize patients into smaller subgroups based on molecular similarities. It also advocates for customized treatment plans that address the molecular variations within these subgroups, ultimately enhancing patient care. Currently, the prevailing practice involves classifying cancer patients primarily according to tumor grade and stage, which overlooks molecular variations and proves effective only in certain cases. Hence, there is an imperative to identify subgroups that consider molecular-level variations. Moreover, characterizing patients based on these subgroups can yield valuable insights that facilitate precision therapy.

This work initially focuses on identifying subgroups in non-small cell lung cancer (NSCLC), a leading cause of cancer-related deaths worldwide. To accomplish this, data from multiple molecular levels, including mRNA expression, miRNA expression, methylation, and protein expression, are combined and reduced to a lower dimension using an auto-encoder (AE), a machine learning technique for non-linear dimensionality reduction. Consensus K-means clustering is then applied to group patients with similar characteristics, resulting in the classification of NSCLC patients into five subgroups. Several statistical tests are then employed to identify the specific features that are differentially expressed (DE) in each subgroup, which further aids in their characterization. The subgroup with the most favorable survival time is found to exhibit the fewest genomic alterations. To identify the subgroup for a new sample, classification models such as support vector machines (SVM), random forest (RF), and feed-forward neural networks (FFNN) are trained using the DE features. Moreover, decision-level fused models are constructed by combining the prediction probabilities
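The consensus K-means step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, whose details are not given in this record. Everything below is an assumption made for illustration: the toy 2-D points stand in for the auto-encoder embeddings, three clusters are used instead of the five reported subgroups, K-means is seeded with farthest-point initialisation, and the final partition is taken by average-linkage merging on the consensus matrix. All function names are hypothetical.

```python
import random
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=10):
    """Lloyd's k-means with farthest-point initialisation (an assumption)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def consensus_matrix(points, k, runs=50, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of points co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = rng.sample(range(n), int(frac * n))
        labels = kmeans([points[i] for i in idx], k, rng)
        for (a, la), (b, lb) in combinations(list(zip(idx, labels)), 2):
            sampled[a][b] += 1
            sampled[b][a] += 1
            if la == lb:
                together[a][b] += 1
                together[b][a] += 1
    return [[1.0 if i == j else
             (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]

def avg_consensus(cons, a, b):
    """Mean consensus between two groups of point indices."""
    return sum(cons[i][j] for i in a for j in b) / (len(a) * len(b))

def consensus_labels(cons, k):
    """Average-linkage merging on the consensus matrix until k groups remain."""
    clusters = [[i] for i in range(len(cons))]
    while len(clusters) > k:
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda ab: avg_consensus(cons, clusters[ab[0]], clusters[ab[1]]))
        merged = clusters.pop(y)  # y > x, so index x is unaffected
        clusters[x].extend(merged)
    labels = [0] * len(cons)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

# Hypothetical toy data: three well-separated groups of six 2-D points each.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.2, 0.8)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]

cons = consensus_matrix(points, k=3)
labels = consensus_labels(cons, k=3)
print(labels)
```

Repeating K-means on subsamples and keeping only pairings that recur across runs is what makes the resulting grouping less sensitive to any single initialisation or sample draw.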
Pagination: xxvii, 234 p.
URI: http://hdl.handle.net/10603/588713
Appears in Departments: Department of Electrical Engineering

Files in This Item:
File                   Description    Size       Format
01_title.pdf           Attached File  267.59 kB  Adobe PDF
02_prelims page.pdf                   371.69 kB  Adobe PDF
03_content.pdf                        77.43 kB   Adobe PDF
04_abstract.pdf                       80.08 kB   Adobe PDF
05_chapter 1.pdf                      526.81 kB  Adobe PDF
06_chapter 2.pdf                      262.88 kB  Adobe PDF
07_chapter 3.pdf                      3.29 MB    Adobe PDF
08_chapter 4.pdf                      2.62 MB    Adobe PDF
09_chapter 5.pdf                      2.36 MB    Adobe PDF
10_chapter 6.pdf                      2.94 MB    Adobe PDF
11_chapter 7.pdf                      115.95 kB  Adobe PDF
12_annexures.pdf                      330.21 kB  Adobe PDF
80_recommendation.pdf                 315.08 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
