Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
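
For intuition, here is a minimal sketch of that smoothing, assuming a negative-entropy regularizer (one natural choice of strongly convex regularizer), in which case the smoothed max becomes a temperature-scaled log-sum-exp whose gradient is a softmax. The function names, the gamma temperature, and the (T, S, S) layout of the transition scores are illustrative assumptions for this sketch, not code from the paper.

<code python>
import numpy as np

def smoothed_max(x, gamma=1.0):
    # Entropy-regularized max: gamma * log-sum-exp(x / gamma).
    # Its gradient, softmax(x / gamma), acts as a soft argmax.
    m = x.max()
    return m + gamma * np.log(np.exp((x - m) / gamma).sum())

def smoothed_viterbi_value(theta, gamma=1.0):
    # theta[t, i, j]: score of moving from state j at step t-1 to state i
    # at step t (emission scores assumed folded in).  Replacing the hard
    # max of Viterbi with smoothed_max makes the optimal-path value smooth
    # in theta, so it can sit inside a network trained by backpropagation.
    T, S, _ = theta.shape
    v = np.zeros(S)
    for t in range(T):
        v = np.array([smoothed_max(theta[t, i] + v, gamma) for i in range(S)])
    return smoothed_max(v, gamma)
</code>

As gamma approaches 0 the smoothed value recovers the exact Viterbi score; larger gamma trades exactness for smoother gradients.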

https://arxiv.org/abs/1704.07183v1 Stochastic Constraint Programming as Reinforcement Learning

https://arxiv.org/abs/1510.03317v1 The Inductive Constraint Programming Loop

https://arxiv.org/abs/1802.03676v2 Differentiable Dynamic Programming for Structured Prediction and Attention

We propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators.

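The time-series counterpart mentioned in the abstract, smoothed DTW, follows the same pattern with a soft-min in place of DTW's hard min. The sketch below is again only an illustration under the negative-entropy assumption; the cost-matrix convention and the names are made up for the example.

<code python>
import numpy as np

def softmin(values, gamma=1.0):
    # Smoothed min: -gamma * log-sum-exp(-values / gamma).
    a = -np.asarray(values, dtype=float) / gamma
    m = a.max()
    return -gamma * (m + np.log(np.exp(a - m).sum()))

def smoothed_dtw(D, gamma=1.0):
    # D[i, j]: pairwise cost between element i of one series and element j
    # of the other.  Swapping the hard min of classic DTW for softmin makes
    # the alignment cost differentiable in D (and hence in the inputs).
    n, m = D.shape
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i, j] = D[i - 1, j - 1] + softmin(
                np.array([R[i - 1, j - 1], R[i - 1, j], R[i, j - 1]]), gamma)
    return R[n, m]
</code>

Because R[n, m] is smooth in D, gradients with respect to the input series can be obtained by ordinary backpropagation through the cost matrix.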
https://arxiv.org/abs/1807.01672v1 Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization

https://arxiv.org/abs/1807.04676v1 A Library for Constraint Consistent Learning

http://proceedings.mlr.press/v80/rosenfeld18a.html Learning to Optimize Combinatorial Functions

https://arxiv.org/abs/1804.06355v1 An Exponential Speedup in Parallel Running Time for Submodular Maximization without Loss in Approximation

https://arxiv.org/abs/1810.10659 Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search