ECE 5759 Nonlinear Programming Lec - Detailed Analysis & Overview
A version of the maximum principle for discrete-time control systems. A Lagrangian method coupled with the method of multipliers. Convergence proof using the Banach contraction mapping theorem. Markov decision problems: discounted-cost, average-cost, and total-cost problems, and the optimality of Markov policies. Approximation of dynamic programs using the rolling horizon approach, the rollout algorithm, and reinforcement learning. Application of the contraction mapping principle to establish convergence of Lagrangian methods. Convexity of the dual problem, geometric interpretation of the weak duality theorem, dual of
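
As a rough illustration of how the contraction mapping theorem gives convergence for discounted-cost Markov decision problems, here is a minimal sketch of value iteration. It is not taken from the lectures: the two-state, two-action problem (COSTS, TRANSITIONS, DISCOUNT) and all function names are hypothetical, chosen only to make the example runnable. The idea is that the Bellman operator is a contraction in the sup norm with modulus equal to the discount factor, so repeated application converges to its unique fixed point.

```python
# Value iteration for a discounted-cost MDP, as an instance of Banach fixed-point iteration.
# All data below are hypothetical illustration values, not from the course.

def bellman_update(values, costs, transitions, discount):
    """Apply the Bellman optimality operator once (cost minimization)."""
    new_values = []
    for s in range(len(values)):
        # For each action: stage cost plus discounted expected cost-to-go.
        candidates = [
            costs[s][a]
            + discount * sum(p * values[t] for t, p in enumerate(transitions[s][a]))
            for a in range(len(costs[s]))
        ]
        new_values.append(min(candidates))
    return new_values


def sup_distance(u, v):
    """Sup-norm distance between two value vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def value_iteration(costs, transitions, discount, tol=1e-10, max_iters=10_000):
    """Iterate the Bellman operator from zero until successive iterates differ by less than tol."""
    values = [0.0] * len(costs)
    for _ in range(max_iters):
        new_values = bellman_update(values, costs, transitions, discount)
        gap = sup_distance(new_values, values)
        values = new_values
        if gap < tol:
            break
    return values


# Hypothetical problem: COSTS[s][a] is the stage cost,
# TRANSITIONS[s][a][t] is the probability of moving from state s to state t under action a.
COSTS = [[1.0, 2.0], [0.0, 0.5]]
TRANSITIONS = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
]
DISCOUNT = 0.9

if __name__ == "__main__":
    v_star = value_iteration(COSTS, TRANSITIONS, DISCOUNT)
    print("approximate optimal cost-to-go:", v_star)
```

Because the operator contracts distances by the factor DISCOUNT, the number of iterations needed grows roughly like log(1/tol) / log(1/DISCOUNT); the same fixed-point argument is what the description refers to for the Lagrangian methods, although the operator there is different.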