COM-MABs: From Users' Feedback to Recommendation

Alexandre Letard; Tassadit Amghar; Olivier Camp; Nicolas Gutowski

doi:10.32473/flairs.v35i.130560

Authors

Alexandre Letard (Kara Technology, LERIA)
Tassadit Amghar
Olivier Camp
Nicolas Gutowski

Abstract

Recently, the COMbinatorial Multi-Armed Bandits (COM-MAB) problem has emerged as an active research field. In systems that interact with humans, these reinforcement learning approaches use a feedback strategy as their reward function. This paper presents three contributions to the study of such strategies: 1) we model a feedback strategy as a three-step process in which each step influences the performance of an agent; 2) based on this model, we propose a novel Reward Computing process, BUSBC, which significantly increases the global accuracy reached by optimistic COM-MAB algorithms, by up to 16.2%; 3) we conduct an empirical analysis of our approach and of several feedback strategies from the literature on three real-world application datasets, which confirms our propositions.
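
To make the setting concrete, the following is a minimal sketch of an optimistic combinatorial bandit loop in which user feedback is turned into a reward. It is not the BUSBC process and does not reproduce the three-step model from the paper, whose details are not given in this abstract. The UCB1-style index, the simulated click model, and all names (`ucb_index`, `select_super_arm`, `per_arm_reward`, `run`) are assumptions made for illustration only.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """Optimistic index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return float("inf")  # force every arm to be tried at least once
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def select_super_arm(means, pulls, t, k):
    """Pick the k arms with the highest optimistic index."""
    ranked = sorted(
        range(len(means)),
        key=lambda a: ucb_index(means[a], pulls[a], t),
        reverse=True,
    )
    return ranked[:k]


def per_arm_reward(clicked):
    """Hypothetical feedback strategy: 1 for a clicked item, 0 otherwise."""
    return 1.0 if clicked else 0.0


def run(click_probs, k, rounds, seed=0):
    """Recommend k items per round and learn from simulated clicks."""
    rng = random.Random(seed)
    n = len(click_probs)
    means = [0.0] * n  # empirical mean reward per arm
    pulls = [0] * n    # number of times each arm was recommended
    for t in range(1, rounds + 1):
        chosen = select_super_arm(means, pulls, t, k)
        for arm in chosen:
            clicked = rng.random() < click_probs[arm]  # simulated user
            reward = per_arm_reward(clicked)
            pulls[arm] += 1
            means[arm] += (reward - means[arm]) / pulls[arm]  # running mean
    return means, pulls
```

Calling `run([0.9, 0.8, 0.1, 0.05], k=2, rounds=2000)` returns the per-arm mean rewards and recommendation counts. Replacing `per_arm_reward` with another function is where a different feedback strategy would plug into this sketch.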

Published

2022-05-04

How to cite

Letard, A., Amghar, T., Camp, O., & Gutowski, N. (2022). COM-MABs: From Users’ Feedback to Recommendation. The International FLAIRS Conference Proceedings, 35. https://doi.org/10.32473/flairs.v35i.130560

Issue

Vol. 35 (2022): Proceedings of FLAIRS-35

Section

Main Track Proceedings

License

Copyright 2022 Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
