At round ( t ) we draw independent samples ( \tilde\theta_a^{(1)},\dots,\tilde\theta_a^{(k)} ) from the posterior and compute the expected reward for each arm under each draw: