Research

03 Mar 22

Policy Optimization via Optimal Policy Evaluation

Authors: Alberto Maria Metelli, Samuele Meta, Marcello Restelli

Abstract: Off-policy methods are the basis of a large number of effective Policy Optimization (PO) algorithms. In this setting, Importance Sampling (IS) is typically employed as a what-if analysis tool, with the goal of estimating the performance of a target policy, given samples collected with a different […]

03 Mar 22

Time-variant variational transfer for value functions

Authors: Giuseppe Canonaco, Andrea Soprani, Matteo Giuliani, Andrea Castelletti, Manuel Roveri, Marcello Restelli

Abstract: In most transfer learning approaches to reinforcement learning (RL), the distribution over the tasks is assumed to be stationary. Therefore, the target and source tasks are i.i.d. samples of the same distribution. Unfortunately, this assumption rarely holds in real-world […]

03 Mar 22

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

Authors: Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

Abstract: We study the role of the representation of state-action value functions in regret minimization in finite-horizon Markov Decision Processes (MDPs) with linear structure. We first derive a necessary condition on the representation, called universally spanning optimal features (UNISOFT), to achieve constant […]

03 Mar 22

Learning in Non-Cooperative Configurable Markov Decision Processes

Authors: Giorgia Ramponi, Alberto Maria Metelli, Alessandro Concetti, Marcello Restelli

Abstract: The Configurable Markov Decision Process framework includes two entities: a Reinforcement Learning agent and a configurator that can modify some environmental parameters to improve the agent’s performance. This presupposes that the two actors have the same reward functions. What if the configurator does not […]

03 Mar 22

Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning

Authors: Alberto Maria Metelli, Alessio Russo, Marcello Restelli

Abstract: Importance Sampling (IS) is a widely used building block for a large variety of off-policy estimation and learning algorithms. However, empirical and theoretical studies have progressively shown that vanilla IS leads to poor estimations whenever the behavioral and target policies are too dissimilar. In this paper, […]

03 Mar 22

A voltage dynamic-based state of charge estimation method for batteries storage systems

Authors: Marco Mussi, Luigi Pellegrino, Marcello Restelli, Francesco Trovò

Abstract: In recent years, the use of Lithium-ion batteries in smart power systems and hybrid/electric vehicles has become increasingly popular since they provide a flexible and cost-effective way to store and deliver power. Their full integration into more complex systems requires an accurate estimate of the […]

03 Mar 22

Advancing drought monitoring via feature extraction

Authors: Alberto Metelli, Andrea Castelletti, Marcello Restelli

Abstract: A drought is a slowly developing natural phenomenon that can occur in all climatic zones and can be defined as a temporary but significant decrease in water availability. Over the past three decades, the cost of droughts in Europe has amounted to over 100 billion euros and […]

05 Mar 21

Policy Optimization as Online Learning with Mediator Feedback

Authors: Alberto Maria Metelli, Matteo Papini, Pierluca D’Oro, Marcello Restelli

Conference: AAAI 2021

Abstract: Policy Optimization (PO) is a widely used approach to address continuous control tasks. In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the […]

05 Mar 21

Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate

Authors: Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli

Conference: AAAI 2021

Abstract: In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that it can learn an optimal task-agnostic exploration policy? In this paper, we argue that the entropy […]

18 Feb 21

Learning Probably Approximately Correct Maximin Strategies in Games with Infinite Strategy Spaces

Authors: Alberto Marchesi, Francesco Trovò, Nicola Gatti

Conference: AAAI 2021

Abstract: We tackle the problem of learning equilibria in simulation-based games. In such games, the players’ utility functions cannot be described analytically, as they are given through a black-box simulator that can be […]

I3Lung: personalized medical care based on artificial intelligence

Machine Learning Models Life Cycle

Configurable Environments in Reinforcement Learning: An Overview

Bayesian Persuasion in Online Settings

Multi-Receiver Online Bayesian Persuasion

Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results

Bayesian Agency: Linear versus Tractable Contracts

Election Manipulation on Social Networks: Seeding, Edge Removal, Edge Addition