Differences

asynchronous_training [2018/10/09 15:45]
admin
asynchronous_training [2018/10/09 15:47] (current)
admin
Line 46:
  
 https://arxiv.org/abs/1810.01021v1 Large batch size training of neural networks with adversarial training and second-order information
 +
 +Our method allows one to increase batch size and learning
 +rate automatically, based on Hessian information. This helps significantly reduce the number of parameter
 +updates, and it achieves superior generalization performance, without the need to tune any
 +of the additional hyper-parameters. Finally, we show that a block Hessian can be used to approximate
 +the trend of the full Hessian to reduce the overhead of using second-order information. These
 +improvements are useful to reduce NN training time in practice.
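
The excerpt above does not say how the Hessian information is turned into a batch size schedule, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It estimates the largest Hessian eigenvalue by power iteration on finite-difference Hessian-vector products, then applies a made-up rule: when the estimate falls below a fraction of its previous value, multiply batch size and learning rate by the same factor. The names `hvp`, `top_eigenvalue`, `adapt`, and `quad_grad`, the toy quadratic loss, and the parameters `shrink`, `factor`, and `max_batch` are all hypothetical.

```python
import math
import random


def hvp(grad_fn, w, v, eps=1e-5):
    """Approximate the Hessian-vector product H(w) v by central differences of the gradient."""
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus = grad_fn(w_plus)
    g_minus = grad_fn(w_minus)
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def top_eigenvalue(grad_fn, w, iters=50, seed=0):
    """Estimate the dominant Hessian eigenvalue at w with power iteration."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eig = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        # Rayleigh quotient v^T H v for the current unit vector v.
        eig = sum(a * b for a, b in zip(v, hv))
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]
    return eig


def adapt(batch_size, lr, prev_eig, eig, shrink=0.5, factor=2, max_batch=4096):
    """Hypothetical rule (an assumption, not taken from the paper):
    if the curvature estimate dropped below shrink * prev_eig,
    scale batch size and learning rate by the same factor."""
    if eig < shrink * prev_eig and batch_size * factor <= max_batch:
        return batch_size * factor, lr * factor
    return batch_size, lr


def quad_grad(w):
    """Gradient of the toy loss 0.5 * (4 w0^2 + 1 w1^2 + 0.5 w2^2); its Hessian is diag(4, 1, 0.5)."""
    coeffs = [4.0, 1.0, 0.5]
    return [c * wi for c, wi in zip(coeffs, w)]


if __name__ == "__main__":
    eig = top_eigenvalue(quad_grad, [1.0, 1.0, 1.0])
    print("estimated top eigenvalue:", eig)
    print("adapted (batch, lr):", adapt(128, 0.1, prev_eig=4.0, eig=1.5))
```

In a real network, `grad_fn` would be a mini-batch gradient from an autodiff framework, and the Hessian-vector product would normally come from automatic differentiation rather than finite differences.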