stochastic multi-armed bandits, regret minimization
Chernoff bounds
Apply Markov's inequality
Finally, from the above we obtain
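The equations for this step did not survive in the text, so the following is only a sketch of the generic textbook argument, not necessarily the exact form used here. It assumes a real random variable X, a threshold a, and a free parameter t > 0 (all three symbols are assumptions introduced for illustration). Exponentiating both sides and applying Markov's inequality to the nonnegative variable e^{tX} gives the Chernoff bound:

```latex
% Generic Chernoff bound via Markov's inequality.
% X, a, and t are illustrative symbols, not taken from the original notes.
\begin{align*}
\Pr[X \ge a]
  &= \Pr\!\left[e^{tX} \ge e^{ta}\right]
     && \text{for any } t > 0 \\
  &\le \frac{\mathbb{E}\!\left[e^{tX}\right]}{e^{ta}}
     && \text{(Markov's inequality)} \\
\intertext{Since this holds for every $t > 0$, we may take the best one:}
\Pr[X \ge a]
  &\le \inf_{t > 0} \; e^{-ta}\, \mathbb{E}\!\left[e^{tX}\right].
\end{align*}
```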
Hoeffding's inequality
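For reference, the standard statement of Hoeffding's inequality can be written as below. The notation is an assumption, since the original formula is not present: X_1, ..., X_n are independent random variables with X_i taking values in [a_i, b_i], S_n is their sum, and t > 0.

```latex
% Standard form of Hoeffding's inequality.
% X_i, a_i, b_i, S_n, and t are illustrative symbols, not taken from the original notes.
\begin{align*}
S_n &= \sum_{i=1}^{n} X_i, \qquad X_i \in [a_i, b_i] \text{ independent}, \\
\Pr\!\left[S_n - \mathbb{E}[S_n] \ge t\right]
  &\le \exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n} (b_i - a_i)^2}\right).
\end{align*}
```

In the common special case where every X_i lies in [0, 1], the denominator reduces to n, which is the form usually used to build confidence intervals for the mean reward of a bandit arm.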
Stochastic multi-armed bandits