Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/426451
Title: Algorithms for Stochastic Optimization Statistical Estimation and Markov Decision Processes
Researcher: Kamanchi, Chandramouli
Guide(s): Bhatnagar, Shalabh
Keywords: Automation and Control Systems
Computer Science
Engineering and Technology
University: Indian Institute of Science Bangalore
Completed Date: 2020
Abstract: Stochastic approximation deals with the problem of finding zeros of a function expressed as an expectation of a random variable. In this thesis we propose convergent algorithms for problems in optimization, statistical estimation and reinforcement learning. We first formulate these problems in the stochastic approximation setting. Subsequently we utilize the ordinary differential equations based analysis of stochastic approximation algorithms to prove the convergence of our proposed algorithms. Additionally we explore second order methods in the context of Markov Decision Processes. Through experimental evaluations, we validate our theoretical results. First, we consider a Stochastic Optimization (SO) problem where the aim is to optimize a given objective function in the presence of noise. Most of the solution techniques in SO estimate gradients from the noise corrupted observations of the objective and adjust parameters of the objective along the direction of the estimated gradients to obtain locally optimal solutions. Two prominent algorithms in SO namely Random Direction Kiefer-Wolfowitz (RDKW) and Simultaneous Perturbation Stochastic Approximation (SPSA) obtain noisy gradient estimates by randomly perturbing all the parameters simultaneously. This forces the search direction to be random in these algorithms and causes them to suffer additional noise on top of the noise incurred from the samples of the objective. Owing to this additional noise, the idea of using deterministic perturbations instead of random perturbations for gradient estimation has also been studied in the past. Specifically, two constructions of the deterministic perturbation sequence using lexicographical ordering and Hadamard matrices have been explored in the context of SPSA algorithms and encouraging results have been reported in the literature. In our work, we characterize the class of deterministic perturbation sequences that can be utilized in the RDKW algorithm...
Pagination: x, 78
URI: http://hdl.handle.net/10603/426451
Appears in Departments:Computer Science and Automation

Files in This Item:
File Description SizeFormat 
01_title.pdfAttached File68.71 kBAdobe PDFView/Open
02_prelim pages.pdf155.38 kBAdobe PDFView/Open
03_table of content.pdf41.27 kBAdobe PDFView/Open
04_abstract.pdf42.77 kBAdobe PDFView/Open
05_chapter 1.pdf83.19 kBAdobe PDFView/Open
06_chapter 2.pdf351.51 kBAdobe PDFView/Open
07_chapter 3.pdf492.49 kBAdobe PDFView/Open
08_chapter 4.pdf260.16 kBAdobe PDFView/Open
09_chapter 5.pdf239.07 kBAdobe PDFView/Open
10_annexure.pdf168.68 kBAdobe PDFView/Open
80_recommendation.pdf119.1 kBAdobe PDFView/Open
Show full item record


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Altmetric Badge: