Multi-armed stochastic bandits and their applications to quantum information - Josep Lumbreras Zarapico

In this talk, I'll address the exploration-exploitation dilemma in reinforcement learning, focusing on its formalization as the multi-armed stochastic bandit (MAB) problem. I'll discuss key techniques such as optimism in the face of uncertainty, exemplified by the upper confidence bound (UCB) and LinUCB algorithms. Turning to quantum tasks, I'll apply the MAB framework to learning properties of quantum states, to online and adaptive quantum state tomography, and to building recommender systems for quantum data.
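
As a rough illustration of optimism in the face of uncertainty, here is a minimal sketch of the classical UCB1 rule on Bernoulli arms. It is not taken from the talk: the function name `ucb1`, the arm means, the horizon, and the exploration bonus sqrt(2 ln t / n) are all assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms (hypothetical setup) and return pull counts."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k   # number of times each arm was pulled
    sums = [0.0] * k   # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so that each empirical mean is defined.
            arm = t - 1
        else:
            # Optimism: pick the arm with the largest upper confidence bound,
            # i.e. empirical mean plus an exploration bonus that shrinks
            # as the arm is pulled more often.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Bernoulli reward with the (unknown to the learner) mean of the arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


# Assumed example: three arms, the last one being the best.
counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts)
```

Under these assumptions the pull counts should concentrate on the arm with the highest mean, while the suboptimal arms are still sampled occasionally because their confidence bonus grows slowly with t.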