Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/510695
Title: Feature engineering for low dimensional representation of genes expression and pathological activities across diverse human tissues
Researcher: Rai, Priyadarshini
Guide(s): Sengupta, Debarka and Majumdar, Angshul
Keywords: Biology; Biology and Biochemistry; Life Sciences
University: Indraprastha Institute of Information Technology, Delhi (IIIT-Delhi)
Completed Date: 2022
Abstract: The advent of tissue and single-cell transcriptomic profiling technologies has allowed precise characterization of tissue-specific gene activities in the context of development and disease. Human cells express about 20,000 genes, whose interplay enables all the physical activities that define life. However, along with expression signals, most transcriptomic platforms also produce high levels of noise, a problem that is more prominent in single-cell transcriptomic experiments. It is therefore important to represent cells and tissues using minimal gene sets, which poses the classical challenge of dimension reduction. To reduce this feature space, we developed SelfE (self-expression), a novel l2,0-minimization algorithm for de novo feature selection that determines an optimal subset of feature vectors (genes) preserving the subspace structures observed in single-cell RNA-sequencing data. We compared SelfE with commonly used feature selection methods for single-cell expression data analysis. Unlike bulk RNA-sequencing data, single-cell gene expression readouts feature excessive dropout events, which confound downstream bioinformatic analyses. With these limitations in mind, we proposed a method that employs deep dictionary learning for clustering single-cell data. This is the first part of an effort to create a deep learning-based approach for clustering. We render the framework clustering-compatible by introducing a cluster-aware loss (K-means and sparse subspace) into the learning problem. The potential of our method is demonstrated by comparison with general deep learning-based clustering techniques and with techniques designed specifically for single-cell RNA clustering.
Pagination: 127 p.
URI: http://hdl.handle.net/10603/510695
Appears in Departments: Department of Computational Biology

Files in This Item:
File                    Description     Size        Format
01_title.pdf            Attached File   104.03 kB   Adobe PDF
02_prelim pages.pdf                     324.18 kB   Adobe PDF
03_content.pdf                          67.89 kB    Adobe PDF
04_abstract.pdf                         90.79 kB    Adobe PDF
05_chapter 1.pdf                        288.3 kB    Adobe PDF
06_chapter 2.pdf                        1.24 MB     Adobe PDF
07_chapter 3.pdf                        652.38 kB   Adobe PDF
08_chapter 4.pdf                        5.88 MB     Adobe PDF
09_annexures.pdf                        225.07 kB   Adobe PDF
80_recommendation.pdf                   161.16 kB   Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
