BEGIN:VCALENDAR
VERSION:2.0
PRODID:OpenCms 20.0.18
BEGIN:VTIMEZONE
TZID:Europe/Berlin
X-LIC-LOCATION:Europe/Berlin
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230606T151051Z
UID:8ff7616a-046b-11ee-8aed-000e0c3db68b
SUMMARY:Talk by Dr. Luiz Chamon
DESCRIPTION:Dr. Luiz Chamon\nELLIS-SimTech Independent Research Group Leader\nUniversity of Stuttgart\nStuttgart, Germany\nTuesday, 2023-06-20, 4:00 p.m.\nIST Seminar Room 2.255 - Pfaffenwaldring 9 - Campus Stuttgart-Vaihingen\nAbstract\nThe transformative power of learning lies in automating the design of complex systems. Today, however, learning does not incorporate requirements organically, which leads to data-driven solutions prone to tampering and unsafe behavior. In this talk, I will show when and how it is possible to learn under requirements by developing the theoretical underpinnings of constrained learning. For concreteness, I will start by considering the learning of safe policies in the reinforcement learning (RL) setting, where we aim to control a Markov Decision Process (MDP) whose transition probabilities are unknown but from which we can sample trajectories. By safety, I mean the agent must remain in a safe state-space set with high probability during operation. We begin by transforming this problem into a constrained MDP that we show has a small duality gap for rich policy parametrizations despite its non-convexity. This leads to a practical primal-dual algorithm that leverages traditional RL methods. I illustrate the performance of this method in a navigation problem. Despite its effectiveness, however, I will show that there are problems whose optimal policy cannot be obtained by linear combinations of rewards. Hence, not all constrained RL problems can be solved using regularized or primal-dual methods. Nevertheless, this shortcoming can be addressed by augmenting the state with Lagrange multipliers and reinterpreting dual updates as the dynamics that drive these multipliers' evolution. This approach provides a systematic state augmentation procedure that is guaranteed to solve reinforcement learning problems with constraints. Thus, while primal-dual methods can fail to find optimal policies, we show that this algorithm provably samples actions from the optimal policy.\nBiographical Information\nLuiz F. O. Chamon received the B.Sc. and M.Sc. degrees in electrical engineering from the University of São Paulo, São Paulo, Brazil, in 2011 and 2015, respectively, and the Ph.D. degree in electrical and systems engineering from the University of Pennsylvania (Penn), Philadelphia, in 2020. Until 2022, he was a postdoctoral fellow at the Simons Institute at the University of California, Berkeley. He is currently an independent research group leader at the University of Stuttgart, Germany. In 2009, he was an undergraduate exchange student in the Master's in Acoustics program at the École Centrale de Lyon, Lyon, France, and worked as an Assistant Instructor and Consultant on nondestructive testing at INSACAST Formation Continue. From 2010 to 2014, he worked as a Signal Processing and Statistics Consultant on a research project with EMBRAER. He received both the best student paper and the best paper awards at IEEE ICASSP 2020 and was recognized by the IEEE Signal Processing Society for his distinguished work on the editorial board of the IEEE Transactions on Signal Processing in 2018. His research interests include optimization, signal processing, machine learning, statistics, and control.
DTSTART;TZID=Europe/Berlin:20230620T160000
URL;VALUE=URI:https://www.ist.uni-stuttgart.de/events/Talk-of-Dr.-Luiz-Chamon/
END:VEVENT
END:VCALENDAR