https://arxiv.org/abs/1709.01041 Domain-adaptive deep network compression
  
We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing. We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes the target domain into account, it can remove redundancy in the weights more effectively.
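
The closed-form solution here is reduced-rank regression on the layer outputs. Below is a minimal NumPy sketch of that idea for a single fully-connected layer, assuming the target-domain activations have full column rank; the function and variable names are illustrative, not taken from the paper's code.

```python
import numpy as np

def activation_aware_lowrank(W, X, r):
    """Compress a d_out x d_in weight matrix W to rank r using
    target-domain activations X (n x d_in).

    Rather than truncating the SVD of W alone, minimize the error on
    the layer outputs, ||X W^T - X What^T||_F, subject to
    rank(What) <= r. Assuming X has full column rank, the closed-form
    solution projects W onto the top right singular vectors of the
    output matrix X W^T (reduced-rank regression).
    """
    Y = X @ W.T                              # layer outputs on target data
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    Vr = Vt[:r].T                            # d_out x r top output directions
    return Vr, Vr.T @ W                      # What = A @ B has rank <= r

# Usage: replace y = W x with two thinner layers y = A (B x).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))
X = rng.standard_normal((10000, 512))        # target-domain activations
A, B = activation_aware_lowrank(W, X, r=64)
err = np.linalg.norm(X @ W.T - X @ (A @ B).T) / np.linalg.norm(X @ W.T)
print(f"relative output error at rank 64: {err:.3f}")
```

Plain truncated SVD of W minimizes error in weight space instead; when the activation distribution shifts under domain transfer, the two objectives select different subspaces, which is the motivation for using target-domain statistics.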

https://arxiv.org/abs/1711.01068 Compressing Word Embeddings via Deep Compositional Code Learning

https://openreview.net/pdf?id=SkhQHMW0W Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training

https://arxiv.org/abs/1804.08275v1 Deep Semantic Hashing with Generative Adversarial Networks

A novel deep semantic hashing with GANs (DSH-GANs) approach is presented, which mainly consists of four components: a deep convolutional neural network (CNN) for learning image representations, an adversary stream to distinguish synthetic images from real ones, a hash stream for encoding image representations into hash codes, and a classification stream. The whole architecture is trained end-to-end by jointly optimizing three losses: an adversarial loss to correctly label each sample as synthetic or real, a triplet ranking loss to preserve the relative similarity ordering in the input real-synthetic triplets, and a classification loss to classify each sample accurately.
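
Since the abstract enumerates the streams and losses, a compact PyTorch sketch of the joint objective may help; the tiny stand-in heads, feature sizes, and equal loss weights below are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the real DSH-GANs uses a full CNN backbone.
FEAT, BITS, CLASSES = 128, 48, 10

shared = nn.Sequential(nn.Linear(FEAT, FEAT), nn.ReLU())      # stands in for the CNN
adv_head = nn.Linear(FEAT, 1)                                  # adversary stream
hash_head = nn.Sequential(nn.Linear(FEAT, BITS), nn.Tanh())    # relaxed hash codes
cls_head = nn.Linear(FEAT, CLASSES)                            # classification stream

bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()
triplet = nn.TripletMarginLoss(margin=1.0)

def joint_loss(real_x, synth_x, anchor_x, pos_x, neg_x, labels):
    # Adversarial loss: label real samples 1, synthetic samples 0.
    adv = bce(adv_head(shared(real_x)), torch.ones(len(real_x), 1)) + \
          bce(adv_head(shared(synth_x)), torch.zeros(len(synth_x), 1))
    # Triplet ranking loss on codes preserves relative similarity ordering.
    rank = triplet(hash_head(shared(anchor_x)),
                   hash_head(shared(pos_x)),
                   hash_head(shared(neg_x)))
    # Classification loss keeps the codes semantically discriminative.
    cls = ce(cls_head(shared(real_x)), labels)
    return adv + rank + cls   # equal weights here; the paper may weight differently

x = lambda n: torch.randn(n, FEAT)
loss = joint_loss(x(8), x(8), x(8), x(8), x(8), torch.randint(0, CLASSES, (8,)))
loss.backward()
```

At retrieval time one would presumably keep only the shared network and the hash stream, binarizing the tanh outputs to obtain the final codes.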

https://github.com/NervanaSystems/distiller

https://arxiv.org/abs/1802.07044v3 The Description Length of Deep Learning Models

Variational methods provide surprisingly poor compression bounds, despite being explicitly built to minimize such bounds. This might explain the relatively poor practical performance of variational methods in deep learning. On the other hand, simple incremental encoding methods yield excellent compression values on deep networks, vindicating Solomonoff's approach.
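
For concreteness, here is a hedged sketch of one simple incremental (prequential) encoding scheme: each block of labels is encoded with a model trained only on the preceding blocks, and the code length is the summed negative log-likelihood in bits. The chunking scheme, helper callables, and logistic-regression stand-in are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def prequential_description_length(fit, logprob, X, y, chunks, n_classes):
    """Prequential code length of labels y in bits: encode each chunk
    with a model trained on all previous chunks, then move on."""
    bounds = np.linspace(0, len(y), chunks + 1, dtype=int)
    bits = bounds[1] * np.log2(n_classes)      # first chunk: uniform code
    for i in range(1, chunks):
        lo, hi = bounds[i], bounds[i + 1]
        model = fit(X[:lo], y[:lo])            # train on data seen so far
        bits += -logprob(model, X[lo:hi], y[lo:hi]) / np.log(2)
    return bits

# Toy usage (assumes every class appears in the first chunk).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
y = (X[:, 0] > 0).astype(int)
fit = lambda X, y: LogisticRegression(max_iter=1000).fit(X, y)
logprob = lambda m, X, y: m.predict_log_proba(X)[np.arange(len(y)), y].sum()
bits = prequential_description_length(fit, logprob, X, y, chunks=10, n_classes=2)
print(f"{bits:.0f} bits vs. 1000 bits for a uniform code")
```

A model that generalizes well drives this total far below the uniform code length, which is the sense in which a network "compresses" the training data.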

https://arxiv.org/abs/1810.09274v1 From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference

https://arxiv.org/abs/1807.10251v2 Aggregated Learning: A Vector Quantization Approach to Learning with Neural Networks

https://arxiv.org/abs/1704.02681v1 Pyramid Vector Quantization for Deep Learning