REDUCING THE COMMUNICATION BANDWIDTH FOR DISTRIBUTED TRAINING

https://arxiv.org/abs/1804.08275v1 Deep Semantic Hashing with Generative Adversarial Networks

In this work, a novel deep semantic hashing model with GANs (DSH-GANs) is presented. It consists of four components: a deep convolutional neural network (CNN) for learning image representations, an adversary stream to distinguish synthetic images from real ones, a hash stream for encoding image representations into hash codes, and a classification stream. The whole architecture is trained end-to-end by jointly optimizing three losses: an adversarial loss to label each sample correctly as synthetic or real, a triplet ranking loss to preserve the relative similarity ordering in the input real-synthetic triplets, and a classification loss to classify each sample accurately.

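A minimal sketch (assuming a PyTorch-style setup, not the authors' code) of how the three streams' losses could be combined into one joint objective; the loss weights and argument names are assumptions:

<code python>
import torch.nn.functional as F

def dsh_gans_objective(adv_logits, is_real,             # adversary stream
                       anchor_code, pos_code, neg_code,  # hash stream (triplet)
                       class_logits, class_labels,       # classification stream
                       margin=1.0, w_adv=1.0, w_rank=1.0, w_cls=1.0):
    """Joint loss: adversarial + triplet ranking + classification (weights assumed)."""
    # Adversarial loss: score each sample as real or synthetic.
    adv = F.binary_cross_entropy_with_logits(adv_logits, is_real)
    # Triplet ranking loss: keep the anchor's hash code closer to the positive
    # code than to the negative code by at least `margin`.
    rank = F.triplet_margin_loss(anchor_code, pos_code, neg_code, margin=margin)
    # Classification loss: predict each sample's semantic label.
    cls = F.cross_entropy(class_logits, class_labels)
    return w_adv * adv + w_rank * rank + w_cls * cls
</code>
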
https://github.com/NervanaSystems/distiller

https://arxiv.org/abs/1802.07044v3 The Description Length of Deep Learning Models

This might explain the relatively poor practical performance of variational methods in deep learning. On the other hand, simple incremental encoding methods yield excellent compression values on deep networks, vindicating Solomonoff's approach.

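A minimal sketch of the incremental (prequential) coding idea behind those compression values: the labels are transmitted chunk by chunk, and each chunk is encoded with a model trained only on the data already sent. The helper functions passed in here are hypothetical, not from the paper's code:

<code python>
import math

def prequential_description_length(chunks, num_classes,
                                    train_model, neg_log_likelihood_bits):
    """Total codelength (in bits) of the labels, encoded chunk by chunk."""
    total_bits = 0.0
    seen = []
    for chunk in chunks:
        if not seen:
            # No model yet: encode the first chunk with a uniform code over labels.
            total_bits += len(chunk) * math.log2(num_classes)
        else:
            model = train_model(seen)  # fit only on already-transmitted data
            # Codelength of the new chunk under that model: sum of -log2 p(y | x).
            total_bits += neg_log_likelihood_bits(model, chunk)
        seen.extend(chunk)
    return total_bits
</code>
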
https://arxiv.org/abs/1810.09274v1 From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference

https://arxiv.org/abs/1807.10251v2 Aggregated Learning: A Vector Quantization Approach to Learning with Neural Networks

https://arxiv.org/abs/1704.02681v1 Pyramid Vector Quantization for Deep Learning