Multi-Armed Bandit Algorithms with Deep Learning

Built using Python 3.6.6 with tox 3.4.0 and TensorFlow 1.12.2.

Policies implemented:

  • Epsilon Decreasing
  • Epsilon First
  • Epsilon Greedy
  • EXP3
  • GreedyMix
  • Softmax
  • Softmax Decreasing
  • SoftMix
  • UCB1
  • UCB1-Tuned
  • UCB2
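
As a rough illustration of how two of the listed policies choose an arm, here is a minimal standard-library sketch of Epsilon Greedy and UCB1. The function names (`epsilon_greedy`, `ucb1`, `update`) and the `counts`/`values` bookkeeping are assumptions made for this example and are not taken from this repository's code, which may structure its policies differently.

```python
import math
import random


def epsilon_greedy(counts, values, epsilon, rng):
    """Explore a uniformly random arm with probability epsilon,
    otherwise exploit the arm with the highest estimated mean reward."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(values)), key=lambda i: values[i])


def ucb1(counts, values):
    """Play every arm once, then pick the arm that maximizes
    mean reward + sqrt(2 * ln(total pulls) / pulls of that arm)."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda i: values[i] + math.sqrt(2.0 * math.log(total) / counts[i]),
    )


def update(counts, values, arm, reward):
    """Update the pull count and running mean reward of the chosen arm."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]
```

A typical loop under these assumptions would call one of the selection functions, observe a reward from the chosen arm, and pass it to `update`. With `epsilon=0.0` the Epsilon Greedy sketch reduces to pure exploitation.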

Arms:

  • Bernoulli
  • Normal
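
The two arm types can be pictured as reward distributions that a policy samples from. The sketch below is only an illustration using the standard library: the class names `BernoulliArm` and `NormalArm`, the `pull` method, and the constructor parameters are hypothetical and may not match the classes in this repository.

```python
import random


class BernoulliArm:
    """Pays 1.0 with probability p and 0.0 otherwise."""

    def __init__(self, p, rng=None):
        self.p = p
        self.rng = rng or random.Random()

    def pull(self):
        return 1.0 if self.rng.random() < self.p else 0.0


class NormalArm:
    """Pays a reward drawn from a normal distribution N(mu, sigma^2)."""

    def __init__(self, mu, sigma, rng=None):
        self.mu = mu
        self.sigma = sigma
        self.rng = rng or random.Random()

    def pull(self):
        return self.rng.gauss(self.mu, self.sigma)
```

Passing a seeded `random.Random` instance as `rng` makes a simulation repeatable, which helps when comparing policies on the same sequence of rewards.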