## Towards de-mystification of deep learning: function space analysis of the representation layers

- Nov. 21, 2017
- 2 p.m.
- LeConte 312

## Abstract

We propose a function space approach to Representation Learning [1] and to the analysis of the representation layers in deep learning architectures. We show how to compute a 'weak-type' Besov smoothness index that quantifies the geometry of the clustering in the feature space. This approach has already been applied successfully to improve the performance of machine learning algorithms such as the Random Forest [2] and tree-based Gradient Boosting [3]. Our experiments demonstrate that in well-known and well-performing trained networks, the Besov smoothness of the training set, measured in the corresponding hidden layer feature map representation, increases from layer to layer, which relates to the 'unfolding' of the clustering in the feature space. We also contribute to the understanding of generalization [4] by showing how the Besov smoothness of the representations decreases as we add more mislabeling to the training data. We hope this approach will contribute to the de-mystification of some aspects of deep learning.

This is joint work with Oren Elisha, TAU. A preprint is available at: https://www.shaidekel.com/

References:

[1] Y. Bengio, A. Courville and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(8) (2013), 1798-1828.

[2] O. Elisha and S. Dekel, Wavelet decompositions of Random Forests - smoothness analysis, sparse approximation and applications, Journal of Machine Learning Research 17 (2016), 1-38.

[3] S. Dekel, O. Elisha and O. Morgan, Wavelet decomposition of Gradient Boosting, preprint.

[4] C. Zhang, S. Bengio, M. Hardt, B. Recht and O. Vinyals, Understanding deep learning requires rethinking generalization, in ICLR 2017 conference proceedings.