Psychological, economic, and ethical factors in human feedback for a chatbot-based smoking cessation intervention Nele Albers, Francisco S. Melo, Mark A. Neerincx, Olya Kudina, Willem-Paul Brinkman npj Digital Medicine, 2025 Integrating human support with chatbot-based behavior change interventions raises three challenges: (1) attuning the support to an individual’s state (e.g., motivation) for enhanced engagement, (2) limiting the use of the human resources involved for enhanced efficiency, and (3) optimizing outcomes on ethical aspects (e.g., fairness). Therefore, we conducted a study in which 679 smokers and vapers had a 20% chance of receiving human feedback between five chatbot sessions. We find that having received feedback increases retention and effort spent on preparatory activities. However, analyzing a reinforcement learning (RL) model fit on the data shows there are also states where not providing feedback is better. Even this “standard” benefit-maximizing RL model is value-laden. It not only prioritizes people who would benefit most, but also those who are already doing well and want feedback. We show how four other ethical principles can be incorporated to favor other smoker subgroups, yet interdependencies exist.
Centralized training with hybrid execution in multi-agent reinforcement learning via predictive observation imputation Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha, Pedro A. Santos, Ana Paiva, Francisco S. Melo Artificial Intelligence, 2025 We study hybrid execution in multi-agent reinforcement learning (MARL), a paradigm where agents aim to complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations at execution time. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the impact of partial observability in MARL. Experimental results show that our method consistently outperforms relevant baselines, allowing agents to act with faulty communication while successfully exploiting shared information.
Regularization and Two Time Scales for Convergence of Reinforcement Learning Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo Applied Mathematics and Optimization, 2025 Reinforcement learning algorithms aim at solving discrete time stochastic control problems with unknown underlying dynamical systems by an iterative process of interaction. The process is formalized as a Markov decision process, where at each time step, a control action is given, the system provides a reward, and the state changes stochastically. The objective of the controller is the expected sum of rewards obtained throughout the interaction. When the set of states and/or actions is large, it is necessary to use some form of function approximation. But even if the function approximation set is simply a linear span of fixed features, the reinforcement learning algorithms may diverge. In this work, we propose and analyze regularized two-time-scale variations of the algorithms, and prove that they are guaranteed to converge almost surely to a unique solution to the reinforcement learning problem.
Reinforcement learning in convergently non-stationary environments: Feudal hierarchies and learned representations Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo Artificial Intelligence, 2025 We study the convergence of Q-learning-based methods in convergently non-stationary environments, particularly in the context of hierarchical reinforcement learning and of dynamic features encountered in deep reinforcement learning. We demonstrate that Q-learning achieves convergence in tabular representations when applied to convergently non-stationary dynamics, such as the ones arising in a feudal hierarchical setting. Additionally, we establish convergence for Q-learning-based deep reinforcement learning methods with convergently non-stationary features, such as the ones arising in representation-based settings. Our findings offer theoretical support for the application of Q-learning in these complex scenarios and present methodologies for extending established theoretical results from standard cases to their convergently non-stationary counterparts.
Optimize and Coordinate Multiple DMPs under Constraints to Achieve a Collaborative Manipulation Task Ali H. Kordia, Francisco S. Melo Proceedings IEEE International Conference on Robotics and Automation, 2025 This paper addresses a significant challenge in achieving collaborative tasks: how can a robot or multiple robots, endowed with a library of pre-learned primitive movements, generate multiple simultaneous coordinated robotic movements, adapting and optimizing those in the library, to complete one collaborative task? This work can thus be seen as a follow-up to work in which a motion is represented as a dynamic movement primitive (DMP), now considering collaborative tasks and the existence of multiple robots/manipulators. Specifically, we start with a simple task using one DMP and extend it to accommodate the coordinated execution of multiple DMPs in robots with multiple manipulators or, alternatively, multiple robots with a single manipulator. We investigate mechanisms to jointly optimize multiple DMPs to perform one task in a coordinated fashion. The joint trajectory is built from initial DMPs learned for a single manipulator, and its optimization must comply with task-specific constraints. We illustrate the application of our approach both in a simulated environment and on a simulated and a real Baxter robot.
Networked Agents in the Dark: Team Value Learning under Partial Observability Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2025
A Comparative Study of Continual Backpropagation Jacopo Silvestrin, Francisco S. Melo, Manuel Lopes Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2025
The Number of Trials Matters in Infinite-Horizon General-Utility Markov Decision Processes Proceedings of Machine Learning Research, 2025
Distributed Value Decomposition Networks with Networked Agents: Extended Abstract Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2025
Implicit Repair with Reinforcement Learning in Emergent Communication Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2025
Preface Steven Davy, Danyal Aftab Frontiers in Artificial Intelligence and Applications, 2024
NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks Advances in Neural Information Processing Systems, 2024
Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2024
Learning to Perceive in Deep Model-Free Reinforcement Learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2023
How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2022
Cooperation and Learning Dynamics under Risk Diversity and Financial Incentives Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2022
Teaching unknown learners to classify via feature importance Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2021
Interactive Teaching with Groups of Unknown Bayesian Learners Carla Guerra, Francisco S. Melo, Manuel Lopes Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2021
Cooperation between independent reinforcement learners under wealth inequality and collective risks Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2021
Helping People on the Fly: Ad Hoc Teamwork for Human-Robot Teams João G. Ribeiro, Miguel Faria, Alberto Sardinha, Francisco S. Melo Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2021
Ad Hoc Teamwork in the Presence of Non-stationary Teammates Pedro M. Santos, João G. Ribeiro, Alberto Sardinha, Francisco S. Melo Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2021
Preface Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2021
A new convergent variant of Q-learning with linear function approximation Advances in Neural Information Processing Systems, 2020
Emergence of Cooperation in N-Person Dilemmas through Actor-Critic Reinforcement Learning ALA 2020 Adaptive and Learning Agents Workshop at AAMAS 2020, 2020
Playing games in the dark: An approach for cross-modality transfer in reinforcement learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2020
Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy Francisco S. Melo, Alberto Sardinha, David Belo, Marta Couto, Miguel Faria, Anabela Farias, Hugo Gambôa, Cátia Jesus, Mithun Kinarullathil, Pedro Lima, Luís Luz, André Mateus, Isabel Melo, Plinio Moreno, Daniel Osório, Ana Paiva, Jhielson Pimentel, João Rodrigues, Pedro Sequeira, Rubén Solera-Ureña, Miguel Vasco, Manuela Veloso, Rodrigo Ventura Artificial Intelligence in Medicine, 2019
Group Intelligence on Social Robots Filipa Correia, Francisco S. Melo, Ana Paiva ACM/IEEE International Conference on Human-Robot Interaction, 2019
Exploring Prosociality in Human-Robot Teams Filipa Correia, Samuel F. Mascarenhas, Samuel Gomes, Patricia Arriaga, Iolanda Leite, Rui Prada, Francisco S. Melo, Ana Paiva ACM/IEEE International Conference on Human-Robot Interaction, 2019
An optimization approach for structured agent-based provider/receiver tasks Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2019
Online motion concept learning: A novel algorithm for sample-efficient learning and recognition of human actions Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2019
Effects of agents’ transparency on teamwork Silvia Tulli, Filipa Correia, Samuel Mascarenhas, Samuel Gomes, Francisco S. Melo, Ana Paiva Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2019
For the record - A public goods game for exploring human-robot collaboration Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2019
Flow adaptation in serious games for health Tomas Alves, Sandra Gama, Francisco S. Melo 2018 IEEE 6th International Conference on Serious Games and Applications for Health SeGAH 2018, 2018
Exploring the impact of fault justification in human-robot trust: Socially Interactive Agents Track Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2018
Learning and Teaching Biodiversity Through a Storyteller Robot Maria José Ferreira, Valentina Nisi, Francisco Melo, Ana Paiva Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2017
Online learning for conversational agents Vânia Mendonça, Francisco S. Melo, Luísa Coheur, Alberto Sardinha Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2017
A conversational agent powered by online learning Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2017
Associate latent encodings in learning from demonstrations 31st AAAI Conference on Artificial Intelligence AAAI 2017, 2017
A social robot as a card game player Proceedings of the 13th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment AIIDE 2017, 2017
Ad hoc teamwork by learning teammates' task Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2016
Emergence of emotional appraisal signals in reinforcement learning agents Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2016
Rapidly-Exploring Random Tree approach for Geometry Friends Proceedings of the 1st Joint International Conference of Digital Games Research Association and Foundation of Digital Games DiGRA/FDG 2016, 2016
Me and you together: A study on collaboration in manipulation tasks AAAI Fall Symposium Technical Report, 2016
Dynamics of fairness in groups of autonomous learning agents Fernando P. Santos, Francisco C. Santos, Francisco S. Melo, Ana Paiva, Jorge M. Pacheco Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2016
Synthesizing robotic handwriting motion by learning from human demonstrations IJCAI International Joint Conference on Artificial Intelligence, 2016
Learning to be fair in multiplayer Ultimatum Games Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2016
An interactive tangram game for children with Autism Beatriz Bernardo, Patrícia Alves-Oliveira, Maria Graça Santos, Francisco S. Melo, Ana Paiva Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2016
The geometry friends game AI competition Rui Prada, Phil Lopes, Joao Catarino, Joao Quiterio, Francisco S. Melo 2015 IEEE Conference on Computational Intelligence and Games CIG 2015 Proceedings, 2015
Modeling students self-studies behaviors Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2015
The "favors game": A framework to study the emergence of cooperation through social importance (extended abstract) Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2015
Personalized assistance for dressing users Steven D. Klee, Beatriz Quintino Ferreira, Rui Silva, João Paulo Costeira, Francisco S. Melo, Manuela Veloso Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2015
"It's amazing, we are all feeling it!" Emotional climate as a group-level emotional expression in HRI AAAI Fall Symposium Technical Report, 2015
A testbed for autonomous robot surveillance 13th International Conference on Autonomous Agents and Multiagent Systems AAMAS 2014, 2014
A flexible approach to modeling unpredictable events in MDPs ICAPS 2013 Proceedings of the 23rd International Conference on Automated Planning and Scheduling, 2013
Towards agents with human-like decisions under uncertainty Cooperative Minds Social Interaction and Group Dynamics Proceedings of the 35th Annual Meeting of the Cognitive Science Society CogSci 2013, 2013
Decentralized multiagent planning for balance control in smart grids CEUR Workshop Proceedings, 2012
QueryPOMDP: POMDP-based communication in multiagent systems Francisco S. Melo, Matthijs T. J. Spaan, Stefan J. Witwicki Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2012
Learning from demonstration using MDP induced metrics Francisco S. Melo, Manuel Lopes Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2010
Learning of coordination: Exploiting sparse interactions in multiagent systems Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2009
Interaction-driven Markov games for decentralized multiagent planning under uncertainty Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2008
Emerging coordination in infinite team Markov games Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems AAMAS, 2008
A unified framework for imitation-like behaviors AISB 07 Artificial and Ambient Intelligence, 2007
Q-learning with linear function approximation Francisco S. Melo, M. Isabel Ribeiro Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2007
Convergence of independent adaptive learners Francisco S. Melo, Manuel C. Lopes Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2007
Transition entropy in partially observable Markov decision processes Intelligent Autonomous Systems 9 IAS 2006, 2006
Entropic Risk-Aware Monte Carlo Tree Search PP Santos, J Silvestrin, A Sardinha, FS Melo arXiv preprint arXiv:2601.17667, 2026
Regularization and Two Time Scales for Convergence of Reinforcement Learning DS Carvalho, PA Santos, FS Melo Applied Mathematics & Optimization 92 (2), 30, 2025
Reinforcement learning in convergently non-stationary environments: Feudal hierarchies and learned representations DS Carvalho, PA Santos, FS Melo Artificial Intelligence 347, 104382, 2025 Citations: 7
Optimizing 2D Packing Strategies for Autoclave Loading Using Deep Reinforcement Learning VU Pugliese, DS Carvalho, OF Ferreira, FA Faria, FS Melo EPIA Conference on Artificial Intelligence, 41-53, 2025
"Teammates, Am I Clear?": Analysing Legible Behaviours in Teams M Faria, FS Melo, A Paiva arXiv preprint arXiv:2507.21631, 2025
RecBayes: Recurrent Bayesian Ad Hoc Teamwork in Large Partially Observable Domains JG Ribeiro, Y Oren, A Sardinha, M Spaan, FS Melo arXiv preprint arXiv:2506.15756, 2025
Psychological, economic, and ethical factors in human feedback for a chatbot-based smoking cessation intervention N Albers, FS Melo, MA Neerincx, O Kudina, WP Brinkman npj Digital Medicine 8 (1), 326, 2025 Citations: 1
Solving General-Utility Markov Decision Processes in the Single-Trial Regime with Online Planning PP Santos, A Sardinha, FS Melo arXiv preprint arXiv:2505.15782, 2025
Optimize and coordinate multiple DMPs under constraints to achieve a collaborative manipulation task AH Kordia, FS Melo 2025 IEEE International Conference on Robotics and Automation (ICRA), 1-7, 2025
Implicit repair with reinforcement learning in emergent communication F Vital, A Sardinha, FS Melo arXiv preprint arXiv:2502.12624, 2025 Citations: 1
Distributed Value Decomposition Networks with Networked Agents GS Varela, A Sardinha, FS Melo arXiv preprint arXiv:2502.07635, 2025
Networked agents in the dark: Team value learning under partial observability GS Varela, A Sardinha, FS Melo arXiv preprint arXiv:2501.08778, 2025 Citations: 4
NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks B Esteves, M Vasco, FS Melo Advances in Neural Information Processing Systems 37, 3458-3498, 2024
Combining active learning and learning to reject for anomaly detection L Stradiotti, L Perini, J Davis 27th European Conference on Artificial Intelligence, 19–24 October 2024 …, 2024 Citations: 5
The number of trials matters in infinite-horizon general-utility markov decision processes PP Santos, A Sardinha, FS Melo arXiv preprint arXiv:2409.15128, 2024 Citations: 2
A comparative study of continual backpropagation J Silvestrin, FS Melo, M Lopes EPIA Conference on Artificial Intelligence, 324-334, 2024 Citations: 1
The impact of data distribution on Q-learning with function approximation PP Santos, DS Carvalho, A Sardinha, FS Melo Machine Learning 113 (9), 6141-6163, 2024 Citations: 7
When a robot is your teammate F Correia, FS Melo, A Paiva Topics in Cognitive Science 16 (3), 527-553, 2024 Citations: 18
HOTSPOT: An ad hoc teamwork platform for mixed human-robot teams JG Ribeiro, LM Henriques, S Colcher, JC Duarte, FS Melo, RL Milidiú, ... PLoS ONE 19 (6), e0305705, 2024 Citations: 2
“Guess what I'm doing”: Extending legibility to sequential decision tasks M Faria, FS Melo, A Paiva Artificial Intelligence 330, 104107, 2024 Citations: 7
MOST CITED SCHOLAR PUBLICATIONS
An analysis of reinforcement learning with function approximation FS Melo, SP Meyn, MI Ribeiro Proceedings of the 25th international conference on Machine learning, 664-671, 2008 Citations: 369
Active learning for reward estimation in inverse reinforcement learning M Lopes, F Melo, L Montesano Joint European conference on machine learning and knowledge discovery in …, 2009 Citations: 268
Affordance-based imitation learning in robots M Lopes, FS Melo, L Montesano 2007 IEEE/RSJ international conference on intelligent robots and systems …, 2007 Citations: 177
Q-Learning with Linear Function Approximation FS Melo, MI Ribeiro International Conference on Computational Learning Theory, 308-322, 2007 Citations: 160
Exploring the impact of fault justification in human-robot trust F Correia, C Guerra, S Mascarenhas, FS Melo, A Paiva Proceedings of the 17th international conference on autonomous agents and …, 2018 Citations: 140
Decentralized MDPs with sparse interactions FS Melo, M Veloso Artificial Intelligence 175 (11), 1757-1789, 2011 Citations: 139
Empathic robot for group learning: A field study P Alves-Oliveira, P Sequeira, FS Melo, G Castellano, A Paiva ACM Transactions on Human-Robot Interaction (THRI) 8 (1), 1-34, 2019 Citations: 129
Geometric multimodal contrastive representation learning P Poklukar, M Vasco, H Yin, FS Melo, A Paiva, D Kragic International Conference on Machine Learning, 17782-17800, 2022 Citations: 117
Learning of coordination: Exploiting sparse interactions in multiagent systems FS Melo, M Veloso Proceedings of The 8th International Conference on Autonomous Agents and …, 2009 Citations: 117
Interaction-driven Markov games for decentralized multiagent planning under uncertainty MTJ Spaan, FS Melo Proceedings of the 7th international joint conference on Autonomous agents …, 2008 Citations: 112
Group-based emotions in teams of humans and robots F Correia, S Mascarenhas, R Prada, FS Melo, A Paiva Proceedings of the 2018 ACM/IEEE international conference on human-robot …, 2018 Citations: 109
Just follow the suit! trust in human-robot interactions during card game playing F Correia, P Alves-Oliveira, N Maia, T Ribeiro, S Petisca, FS Melo, ... 2016 25th IEEE international symposium on robot and human interactive …, 2016 Citations: 76
Personalized assistance for dressing users SD Klee, BQ Ferreira, R Silva, JP Costeira, FS Melo, M Veloso International Conference on Social Robotics, 359-369, 2015 Citations: 71
An empathic robotic tutor for school classrooms: Considering expectation and satisfaction of children as end-users P Alves-Oliveira, T Ribeiro, S Petisca, E Di Tullio, FS Melo, A Paiva International Conference on social robotics, 21-30, 2015 Citations: 66
Project INSIDE: towards autonomous semi-unstructured human–robot social interaction in autism therapy FS Melo, A Sardinha, D Belo, M Couto, M Faria, A Farias, H Gamboa, ... Artificial intelligence in medicine 96, 198-216, 2019 Citations: 65
Monte carlo tree search experiments in hearthstone A Santos, PA Santos, FS Melo 2017 IEEE conference on computational intelligence and games (CIG), 272-279, 2017 Citations: 65
Discovering social interaction strategies for robots from restricted-perception Wizard-of-Oz studies P Sequeira, P Alves-Oliveira, T Ribeiro, E Di Tullio, S Petisca, FS Melo, ... 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI …, 2016 Citations: 63
Emotion-based intrinsic motivation for reinforcement learning agents P Sequeira, FS Melo, A Paiva International conference on affective computing and intelligent interaction …, 2011 Citations: 63
Exploring prosociality in human-robot teams F Correia, SF Mascarenhas, S Gomes, P Arriaga, I Leite, R Prada, ... 2019 14th ACM/IEEE international conference on human-robot interaction (HRI …, 2019 Citations: 62
Abstraction levels for robotic imitation: Overview and computational approaches M Lopes, F Melo, L Montesano, J Santos-Victor From Motor Learning to Interaction Learning in Robots, 313-355, 2010 Citations: 61