Month: March 2021

Machine Learning Monitoring

As systems based on Machine Learning techniques spread on a large scale and are applied in increasingly complex contexts, practitioners who work with these models increasingly face the challenge of monitoring the quality of the systems' predictions over time. What is the difference between a software […]

Policy Optimization as Online Learning with Mediator Feedback

Authors: Alberto Maria Metelli, Matteo Papini, Pierluca D’Oro, Marcello Restelli
Conference: AAAI 2021
Abstract: Policy Optimization (PO) is a widely used approach to address continuous control tasks. In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the […]

Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate

Authors: Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli
Conference: AAAI 2021
Abstract: In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that it can learn an optimal task-agnostic exploration policy? In this paper, we argue that the entropy […]

Newton Optimization on Helmholtz Decomposition for Continuous Games

Authors: Giorgia Ramponi, Marcello Restelli
Conference: AAAI 2021
Abstract: Many learning problems involve multiple agents optimizing different interactive functions. In these problems, the standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, algorithms must take […]
ML cube