Related To

Filter Groups DenseNet

References

https://arxiv.org/pdf/1611.07661v1.pdf Neural Multigrid

Rather than manipulating representations living on a single spatial grid, our network layers operate across scale space, on a pyramid of tensors. They consume multigrid inputs and produce multigrid outputs; convolutional filters themselves have both within-scale and cross-scale extent. This aspect is distinct from simple multiscale designs, which only process the input at different scales. Viewed in terms of information flow, a multigrid network passes messages across a spatial pyramid. As a consequence, receptive field size grows exponentially with depth, facilitating rapid integration of context. Most critically, multigrid structure enables networks to learn internal attention and dynamic routing mechanisms, and use them to accomplish tasks on which modern CNNs fail.

https://github.com/buttomnutstoast/Multigrid-Neural-Architectures
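
As a minimal sketch of what one such layer might look like (assuming PyTorch; the `MultigridConv` name, the three-scale toy pyramid, and the average-pool/nearest-neighbour resampling choices are our own illustration, not the authors' implementation — see the repository linked above for that): each scale convolves its own features together with a downsampled copy of the finer neighbour and an upsampled copy of the coarser neighbour, giving the filters cross-scale extent.

```python
# Hedged sketch (not the authors' code): one multigrid convolution layer.
# Each scale sees its own features plus a pooled copy of the finer scale
# and an upsampled copy of the coarser scale before convolving.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultigridConv(nn.Module):
    """Operates on a pyramid: a list of tensors, finest scale first."""

    def __init__(self, channels_per_scale, kernel_size=3):
        super().__init__()
        convs = []
        n = len(channels_per_scale)
        for i, c_out in enumerate(channels_per_scale):
            # Input channels: this scale plus its existing neighbours.
            c_in = channels_per_scale[i]
            if i > 0:
                c_in += channels_per_scale[i - 1]   # finer neighbour, pooled down
            if i < n - 1:
                c_in += channels_per_scale[i + 1]   # coarser neighbour, upsampled
            convs.append(nn.Conv2d(c_in, c_out, kernel_size, padding=kernel_size // 2))
        self.convs = nn.ModuleList(convs)

    def forward(self, pyramid):
        outputs = []
        n = len(pyramid)
        for i, conv in enumerate(self.convs):
            parts = [pyramid[i]]
            if i > 0:                               # bring the finer scale down
                parts.append(F.avg_pool2d(pyramid[i - 1], kernel_size=2))
            if i < n - 1:                           # bring the coarser scale up
                parts.append(F.interpolate(pyramid[i + 1], scale_factor=2, mode="nearest"))
            outputs.append(F.relu(conv(torch.cat(parts, dim=1))))
        return outputs

# Toy pyramid: 32x32, 16x16 and 8x8 grids with 8, 16 and 32 channels.
pyramid = [torch.randn(1, 8, 32, 32), torch.randn(1, 16, 16, 16), torch.randn(1, 32, 8, 8)]
layer = MultigridConv([8, 16, 32])
print([t.shape for t in layer(pyramid)])
```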

https://arxiv.org/pdf/1512.02767v2.pdf Affinity CNN: Learning Pixel-Centric Pairwise Relations for Figure/Ground Embedding

http://redwood.berkeley.edu/vs265/olshausen-etal93.pdf A Neurobiological Model of Visual Attention and Invariant Pattern Recognition Based on Dynamic Routing of Information

https://arxiv.org/pdf/1611.09326v1.pdf The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation

In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to the smart construction of the model, our approach has far fewer parameters than the currently published best entries for these datasets.

https://arxiv.org/pdf/1708.07038v1.pdf Non-linear Convolution Filters for CNN-based Learning

Typical convolutional layers are linear systems, so their expressiveness is limited. To overcome this, various non-linearities have been used as activation functions inside CNNs, and many pooling strategies have been applied as well. We address the issue of developing a convolution method in the context of a computational model of the visual cortex, exploring quadratic forms through Volterra kernels. Such forms, which constitute a richer function space, are used as approximations of the response profile of visual cells.

The Volterra series model is a sequence of approximations for continuous functions, developed to represent the input-output relationship of non-linear dynamical systems using a polynomial functional expansion. The full expansion can contain terms of arbitrarily high order, but practical implementations use truncated versions, retaining the terms up to some order r.
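
As a concrete instance of such a truncation, in our own notation (a sketch following the description above, not copied from the paper): keeping terms up to order r = 2, the response of a filter to a local patch x is the familiar linear term plus a learned quadratic form.

```latex
% Second-order (r = 2) truncation of a Volterra expansion over a patch x;
% w_1 plays the role of the usual linear convolution kernel, w_2 is the
% quadratic (second-order) kernel.
y(x) = w_0 + \sum_i w_1(i)\, x(i) + \sum_i \sum_j w_2(i,j)\, x(i)\, x(j)
```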

http://vcl.iti.gr/volterra-based-convolution-filter-implementation-in-torch/
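
The link above points to a Torch implementation. Below is a rough PyTorch sketch of the same idea written by us, assuming 3x3 patches and a dense second-order kernel; the class and parameter names are illustrative, not taken from the paper or the linked code.

```python
# Hedged sketch: a truncated second-order Volterra "convolution" over 3x3
# patches. For each patch x (one column of the unfolded input), the response
# is  y = w1 . x + x^T W2 x, i.e. a linear filter plus a quadratic form.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VolterraConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.kernel_size = kernel_size
        d = in_channels * kernel_size * kernel_size     # patch dimension
        self.w1 = nn.Parameter(torch.randn(out_channels, d) * 0.01)     # 1st-order kernel
        self.w2 = nn.Parameter(torch.randn(out_channels, d, d) * 0.01)  # 2nd-order kernel

    def forward(self, x):
        b, _, h, w = x.shape
        pad = self.kernel_size // 2
        # patches: (batch, d, h*w) -- one column per spatial location
        patches = F.unfold(x, kernel_size=self.kernel_size, padding=pad)
        linear = torch.einsum("od,bdl->bol", self.w1, patches)
        quadratic = torch.einsum("ode,bdl,bel->bol", self.w2, patches, patches)
        return (linear + quadratic).reshape(b, -1, h, w)

x = torch.randn(2, 3, 16, 16)
layer = VolterraConv2d(3, 8)
print(layer(x).shape)   # torch.Size([2, 8, 16, 16])
```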

https://arxiv.org/pdf/1707.08308v1.pdf Tensor Regression Networks

To date, most convolutional neural network architectures output predictions by flattening 3rd-order activation tensors and applying fully-connected output layers. This approach has two drawbacks: (i) we lose rich, multi-modal structure during the flattening process and (ii) fully-connected layers require many parameters. We present the first attempt to circumvent these issues by expressing the output of a neural network directly as the result of a multi-linear mapping from an activation tensor to the output. By imposing low-rank constraints on the regression tensor, we can efficiently solve problems for which existing solutions are poorly parameterized. Our proposed tensor regression layer replaces flattening operations and fully-connected layers by leveraging multi-modal structure in the data and expressing the regression weights via a low-rank tensor decomposition. Additionally, we combine tensor regression with tensor contraction to further increase efficiency. Augmenting the VGG and ResNet architectures, we demonstrate large reductions in the number of parameters with negligible impact on performance on the ImageNet dataset.
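
A minimal sketch of a tensor regression layer along these lines, assuming PyTorch and a Tucker-style factorization of the regression weight tensor; the class name, rank choices, and initialization are our own illustrative assumptions, not the paper's code. Instead of flattening the activation X (C x H x W) and applying a dense layer, the weights form a 4-way tensor W (C x H x W x outputs) kept in low-rank form, and the prediction is the full contraction of X with W.

```python
# Hedged sketch: tensor regression via a Tucker-factored weight tensor.
# The number of parameters is roughly |core| + sum of factor matrices,
# far fewer than a dense C*H*W x outputs weight matrix.
import torch
import torch.nn as nn

class TensorRegressionLayer(nn.Module):
    def __init__(self, in_shape, n_outputs, ranks):
        super().__init__()
        c, h, w = in_shape
        rc, rh, rw, ro = ranks
        # Tucker factors: a small core plus one factor matrix per mode.
        self.core = nn.Parameter(torch.randn(rc, rh, rw, ro) * 0.01)
        self.Uc = nn.Parameter(torch.randn(c, rc) * 0.01)
        self.Uh = nn.Parameter(torch.randn(h, rh) * 0.01)
        self.Uw = nn.Parameter(torch.randn(w, rw) * 0.01)
        self.Uo = nn.Parameter(torch.randn(n_outputs, ro) * 0.01)
        self.bias = nn.Parameter(torch.zeros(n_outputs))

    def forward(self, x):                      # x: (batch, C, H, W)
        # Project each activation mode onto its low-rank factor,
        # contract with the core, then expand to the output mode.
        x = torch.einsum("bchw,ca,hd,we->bade", x, self.Uc, self.Uh, self.Uw)
        y = torch.einsum("bade,adeo->bo", x, self.core)
        return torch.einsum("bo,no->bn", y, self.Uo) + self.bias

x = torch.randn(4, 512, 7, 7)                  # e.g. a ResNet-style activation
layer = TensorRegressionLayer((512, 7, 7), n_outputs=1000, ranks=(32, 3, 3, 32))
print(layer(x).shape)                          # torch.Size([4, 1000])
```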