Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/510695
Title: | Feature engineering for low dimensional representation of genes expression and pathological activities across diverse human tissues |
Researcher: | Rai, Priyadarshini |
Guide(s): | Sengupta, Debarka and Majumdar, Angshul |
Keywords: | Biology; Biology and Biochemistry; Life Sciences |
University: | Indraprastha Institute of Information Technology, Delhi (IIIT-Delhi) |
Completed Date: | 2022 |
Abstract: | The advent of tissue-based and single-cell transcriptomic profiling technologies has allowed precise characterization of tissue-specific gene activities in the context of development and disease. Human cells express about 20,000 genes whose interplay enables all physical activities that define our life. However, alongside expression signals, most transcriptomic platforms also produce high levels of noise, and this is more prominent in single-cell transcriptomic experiments. As such, it is important to represent cells and tissues with minimal gene sets, which poses the classical challenge of dimension reduction. To reduce this feature space, we developed SelfE (self expression), a novel de novo feature selection algorithm based on ℓ2,0-minimization that determines an optimal subset of feature vectors (genes) preserving the subspace structures observed in single-cell RNA-sequencing data. We compared SelfE with the commonly used feature selection methods for single-cell expression data analysis. Unlike bulk RNA-sequencing data, single-cell gene expression readouts feature excessive dropout events, which confound downstream bioinformatic analyses. Keeping these limitations in mind, we proposed a method that employs deep dictionary learning for the clustering of single-cell data. This is a first step in the effort to create a deep learning-based approach for clustering. We make the framework compatible with clustering by introducing a cluster-aware loss (K-means and sparse subspace) into the learning problem. The potential of our method is demonstrated by comparison with general deep learning-based clustering techniques and with techniques designed specifically for single-cell RNA clustering. |
Pagination: | 127 p. |
URI: | http://hdl.handle.net/10603/510695 |
Appears in Departments: | Department of Computational Biology |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 104.03 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 324.18 kB | Adobe PDF | View/Open |
03_content.pdf | | 67.89 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 90.79 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 288.3 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 1.24 MB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 652.38 kB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 5.88 MB | Adobe PDF | View/Open |
09_annexures.pdf | | 225.07 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 161.16 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).