Differences

This shows you the differences between two versions of the page.

activation [2017/10/18 10:36]
admin
activation [2017/12/07 13:19]
admin
Line 134: Line 134:
 https://arxiv.org/abs/1710.05941 Swish: a Self-Gated Activation Function
  
 +https://arxiv.org/pdf/1712.01897.pdf Online Learning with Gated Linear Networks
 +Rather than relying on non-linear transfer functions, our method gains representational power by the use of data conditioning. We state under general conditions a learnable capacity theorem that shows this approach can in principle learn any bounded Borel-measurable function on a compact subset of euclidean space; the result is stronger than many universality results for connectionist architectures because we provide both the model and the learning procedure for which convergence is guaranteed.
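
As a rough illustration of what "data conditioning" in place of a non-linear transfer function can look like, the sketch below implements a single gated linear neuron in Python. It is one reading of the abstract, not the paper's reference code: the class name GatedLinearNeuron, the halfspace gating through the origin, the uniform weight initialisation, the clipping constant, and the learning rate are all illustrative assumptions. The neuron mixes its input probabilities linearly in the logit domain, and the side information z only decides which weight vector is used.

```python
import math
import random

EPS = 1e-6  # assumed clipping constant, keeps logit() finite


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))


class GatedLinearNeuron:
    """Hypothetical single neuron: one weight vector per gating context."""

    def __init__(self, n_inputs, side_dim, n_halfspaces=2, lr=0.1, seed=0):
        rng = random.Random(seed)
        # Fixed random hyperplanes; they are never trained (assumption).
        self.hyperplanes = [
            [rng.gauss(0.0, 1.0) for _ in range(side_dim)]
            for _ in range(n_halfspaces)
        ]
        # 2**n_halfspaces contexts, each starting as a uniform average.
        self.weights = [
            [1.0 / n_inputs] * n_inputs for _ in range(2 ** n_halfspaces)
        ]
        self.lr = lr

    def context(self, z):
        """Map side information z to a context index via halfspace tests."""
        idx = 0
        for k, v in enumerate(self.hyperplanes):
            if sum(vi * zi for vi, zi in zip(v, z)) > 0.0:
                idx |= 1 << k
        return idx

    def predict(self, p, z):
        """Mix input probabilities p using the weights selected by z."""
        w = self.weights[self.context(z)]
        return sigmoid(sum(wi * logit(pi) for wi, pi in zip(w, p)))

    def update(self, p, z, target):
        """One online gradient step on the log loss for the active context."""
        c = self.context(z)
        out = self.predict(p, z)
        err = out - target  # gradient of log loss w.r.t. the mixed logit
        self.weights[c] = [
            wi - self.lr * err * logit(pi)
            for wi, pi in zip(self.weights[c], p)
        ]
        return out


neuron = GatedLinearNeuron(n_inputs=2, side_dim=2)
inputs, side_info = [0.6, 0.7], [1.0, -0.5]
before = neuron.predict(inputs, side_info)
for _ in range(20):
    neuron.update(inputs, side_info, target=1.0)
after = neuron.predict(inputs, side_info)
```

For a fixed context the prediction is linear in the input logits, so any non-linearity in the overall input-output map comes from the choice of context. Only the weight vector of the active context is updated at each step, which is why the update needs no backpropagation through other neurons.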