beam_search [2017/08/27 20:21] external edit
beam_search [2018/11/03 11:11] (current)
Line 87:
 http://julian.togelius.com/Bravi2017Evolving.pdf Evolving game-specific UCB alternatives
 for General Video Game Playing
 +https://arxiv.org/pdf/1708.00111.pdf A Continuous Relaxation of Beam Search for End-to-end Training of Neural
 +Sequence Models
 +https://arxiv.org/abs/1811.00512v1 Learning Beam Search Policies via Imitation Learning
 +Beam search is widely used for approximate decoding in structured prediction problems. Models often use a beam at test time but ignore its existence at train time, and therefore do not explicitly learn how to use the beam. We develop a unifying meta-algorithm for learning beam search policies using imitation learning. In our setting, the beam is part of the model, and not just an artifact of approximate decoding. Our meta-algorithm captures existing learning algorithms and suggests new ones. It also lets us show novel no-regret guarantees for learning beam search policies.
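
For readers unfamiliar with the decoding procedure the abstract refers to, the following is a minimal sketch of plain test-time beam search. It is not the meta-algorithm from the paper and does not involve any learning. The function `beam_search`, the scoring callback `next_log_probs`, the toy table `TOY`, and the tokens `<s>` and `</s>` are all hypothetical names chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def beam_search(
    start: str,
    next_log_probs: Callable[[Sequence], Dict[str, float]],
    beam_width: int,
    max_len: int,
    end_token: str = "</s>",
) -> List[Tuple[Sequence, float]]:
    """Keep the `beam_width` highest-scoring partial sequences at each step."""
    beam: List[Tuple[Sequence, float]] = [((start,), 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[Sequence, float]] = []
        for seq, score in beam:
            if seq[-1] == end_token:
                # Finished hypotheses are carried over unchanged.
                candidates.append((seq, score))
                continue
            for token, log_p in next_log_probs(seq).items():
                candidates.append((seq + (token,), score + log_p))
        # Prune: sort by cumulative log-probability, keep the top entries.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beam):
            break
    return beam


# Hypothetical toy model: the locally best first token ("a") leads to a
# worse complete sequence than the locally second-best token ("b").
TOY: Dict[Sequence, Dict[str, float]] = {
    ("<s>",): {"a": math.log(0.6), "b": math.log(0.4)},
    ("<s>", "a"): {"x": math.log(0.5), "y": math.log(0.5)},
    ("<s>", "b"): {"z": math.log(1.0)},
}


def toy_model(prefix: Sequence) -> Dict[str, float]:
    # Any prefix not in the table ends the sequence with probability 1.
    return TOY.get(prefix, {"</s>": 0.0})


greedy = beam_search("<s>", toy_model, beam_width=1, max_len=5)
wide = beam_search("<s>", toy_model, beam_width=2, max_len=5)
print(greedy[0])
print(wide[0])
```

In this toy setup a beam of width 1 (greedy decoding) commits to `a` and ends with a sequence of probability 0.3, while a beam of width 2 keeps `b` alive and recovers the sequence of probability 0.4. This illustrates why the beam matters at test time. A model trained without the beam never learns which hypotheses are worth keeping in it, which is the gap the abstract describes.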