Thompson sampling

Type of heuristic technique / From Wikipedia, the free encyclopedia

Thompson sampling,^[1]^[2]^[3] named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. At each step, it chooses the action that maximizes the expected reward with respect to a randomly drawn belief, that is, a sample from the current posterior distribution over the unknown reward parameters.
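
As an illustration, the selection rule can be sketched for one common special case. The setup below is an assumption made for this example and is not taken from the article: each arm pays a reward of 0 or 1 with a fixed unknown probability (a Bernoulli bandit), and the belief about each arm is a Beta distribution starting from a uniform Beta(1, 1) prior. The names `thompson_step` and `run_bandit` are hypothetical helpers.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose an arm: draw one sample per arm from its Beta posterior
    and return the index of the largest sample.

    Assumption: Bernoulli rewards with a uniform Beta(1, 1) prior, so the
    posterior for an arm is Beta(successes + 1, failures + 1).
    """
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit.

    true_probs holds the (normally unknown) success probability of each
    arm; it is used here only to generate simulated rewards.
    Returns per-arm success and failure counts.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_step(successes, failures, rng)
        # Simulated environment: reward is 1 with probability true_probs[arm].
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
    pulls = [a + b for a, b in zip(s, f)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen with the probability that it is the best arm under the current posterior, poorly performing arms are still tried occasionally early on, while the arm with the highest estimated reward receives most of the pulls as evidence accumulates.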

Concrete example of Thompson sampling applied to simulate treatment efficacy evaluation