## Lecture 1 – Deep Learning Challenge. Is There Theory?

**Readings**

- Deep Deep Trouble
- Why 2016 is The Global Tipping Point...
- Are AI and ML Killing Analytics...
- The Dark Secret at The Heart of AI
- AI Robots Learning Racism...
- FaceApp Forced to Pull ‘Racist’ Filters...
- Losing a Whole Generation of Young Men to Video Games

## Lecture 2 – Overview of Deep Learning From a Practical Point of View

**Readings**

- Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images
- ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)
- Very Deep Convolutional Networks for Large-Scale Image Recognition (VGG)
- Going Deeper with Convolutions (GoogLeNet)
- Deep Residual Learning for Image Recognition (ResNet)
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Visualizing and Understanding Convolutional Networks

## Lecture 3

**Readings**

- A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction
- Energy Propagation in Deep Convolutional Neural Networks
- Discrete Deep Feature Extraction: A Theory and New Architectures
- Topology Reduction in Deep Convolutional Feature Extraction Networks

## Lecture 4

**Readings**

- A Probabilistic Framework for Deep Learning
- Semi-Supervised Learning with the Deep Rendering Mixture Model
- A Probabilistic Theory of Deep Learning

## Lecture 5

**Readings**

- Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: A Review
- Learning Functions: When is Deep Better Than Shallow

## Lecture 6

**Readings**

- Convolutional Patch Representations for Image Retrieval: An Unsupervised Approach
- Convolutional Kernel Networks
- Kernel Descriptors for Visual Recognition
- End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
- Learning with Kernels
- Kernel Based Methods for Hypothesis Testing

## Lecture 7

**Readings**

- Geometry of Neural Network Loss Surfaces via Random Matrix Theory
- Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice
- Nonlinear random matrix theory for deep learning

## Lecture 8

**Readings**

- Deep Learning without Poor Local Minima
- Topology and Geometry of Half-Rectified Network Optimization
- Convexified Convolutional Neural Networks
- Implicit Regularization in Matrix Factorization

## Lecture 9

**Readings**

- Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position
- Perception as an inference problem
- A Neurobiological Model of Visual Attention and Invariant Pattern Recognition Based on Dynamic Routing of Information

## Lecture 10

**Readings**

- Working Locally Thinking Globally: Theoretical Guarantees for Convolutional Sparse Coding
- Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
- Multi-Layer Convolutional Sparse Modeling: Pursuit and Dictionary Learning
- Convolutional Dictionary Learning via Local Processing

## To be discussed and extra

- Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images by Olshausen and Field
- Auto-Encoding Variational Bayes by Kingma and Welling
- Generative Adversarial Networks by Goodfellow et al.
- Understanding Deep Learning Requires Rethinking Generalization by Zhang et al.
- Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy? by Giryes et al.
- Robust Large Margin Deep Neural Networks by Sokolić et al.
- Tradeoffs between Convergence Speed and Reconstruction Accuracy in Inverse Problems by Giryes et al.
- Understanding Trainable Sparse Coding via Matrix Factorization by Moreau and Bruna
- Why are Deep Nets Reversible: A Simple Theory, With Implications for Training by Arora et al.
- Stable Recovery of the Factors From a Deep Matrix Product and Application to Convolutional Network by Malgouyres and Landsberg
- Optimal Approximation with Sparse Deep Neural Networks by Bölcskei et al.
- Convolutional Rectifier Networks as Generalized Tensor Decompositions by Cohen and Shashua
- Emergence of Invariance and Disentanglement in Deep Representations by Achille and Soatto
- Deep Learning and the Information Bottleneck Principle by Tishby and Zaslavsky