This semester, we are running a Deep Learning Reading group every Tuesday at 7.00 pm before the weekly talk. Below is a collection of the papers we have read so far.
With an improved architecture and classifier guidance, diffusion models generate higher-quality images than the prior state of the art, achieving an FID of 2.97 on ImageNet 128x128.
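As a rough sketch of the classifier-guidance step (not the authors' code): the reverse-diffusion mean is shifted by the gradient of a classifier's log-probability for the target class. The classifier and the Gaussian parameters below are assumed inputs.

```python
import torch

def classifier_guided_mean(mean, variance, x_t, y, classifier, scale=1.0):
    """Shift the reverse-diffusion mean toward samples the classifier assigns to class y.

    `classifier` is assumed to map (noisy) images to logits; `mean` and `variance`
    are the Gaussian parameters the diffusion model predicts for p(x_{t-1} | x_t).
    """
    x_in = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x_in), dim=-1)
    selected = log_probs[range(len(y)), y].sum()
    grad = torch.autograd.grad(selected, x_in)[0]
    # Guided mean: mu + s * Sigma * grad_x log p(y | x_t)
    return mean + scale * variance * grad
```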
MLP-Mixer is a new architecture based exclusively on multi-layer perceptrons (MLPs) that is competitive with state-of-the-art models on image classification benchmarks, without using convolutions or attention.
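A minimal sketch of one Mixer layer, assuming the usual (batch, patches, channels) input layout; the hidden sizes are illustrative, not the paper's configurations.

```python
import torch
from torch import nn

class MixerBlock(nn.Module):
    """One Mixer layer: a token-mixing MLP across patches, then a channel-mixing MLP."""

    def __init__(self, num_patches, channels, token_hidden=256, channel_hidden=2048):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden), nn.GELU(),
            nn.Linear(token_hidden, num_patches),
        )
        self.norm2 = nn.LayerNorm(channels)
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channel_hidden), nn.GELU(),
            nn.Linear(channel_hidden, channels),
        )

    def forward(self, x):                          # x: (batch, patches, channels)
        # Mix information across patches (transpose so Linear acts on the patch axis).
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        # Mix information across channels within each patch.
        x = x + self.channel_mlp(self.norm2(x))
        return x
```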
The paper introduces a novel criterion for efficiently pruning convolutional neural networks, inspired by methods that explain nonlinear classification decisions in terms of input variables.
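A hedged sketch of criterion-based filter pruning: the paper scores each filter with the relevance assigned to it by an explanation method, whereas the proxy score below (|activation x gradient| summed over batch and space) is only a stand-in to show the ranking-and-pruning step.

```python
import torch

def filter_scores(activations, gradients):
    """Score each conv filter by the signal flowing through its output map.

    Illustrative proxy only: |activation * gradient| summed over batch and
    spatial dimensions. activations, gradients: (batch, filters, H, W).
    """
    return (activations * gradients).abs().sum(dim=(0, 2, 3))

def filters_to_prune(scores, prune_fraction=0.1):
    """Indices of the lowest-scoring filters, to be removed before fine-tuning."""
    num_prune = int(prune_fraction * scores.numel())
    return torch.argsort(scores)[:num_prune].tolist()
```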
"Big Transfer" (BiT) improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision.
Self-supervised Vision Transformer (ViT) features contain explicit information about the semantic segmentation of an image and perform well as k-NN classifiers.
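A bare-bones version of the k-NN evaluation, assuming L2-normalised features extracted from a frozen self-supervised ViT; the weighting details of the paper's evaluation are simplified here.

```python
import torch

@torch.no_grad()
def knn_classify(train_feats, train_labels, test_feats, k=20, num_classes=1000):
    """Classify test images by a similarity-weighted k-NN vote over frozen features."""
    sims = test_feats @ train_feats.T                     # cosine similarity (features normalised)
    topk_sims, topk_idx = sims.topk(k, dim=1)
    topk_labels = train_labels[topk_idx]                  # (num_test, k)
    votes = torch.zeros(test_feats.size(0), num_classes, device=test_feats.device)
    votes.scatter_add_(1, topk_labels, topk_sims)         # accumulate weighted votes per class
    return votes.argmax(dim=1)
```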
The paper presents the Transformer, a simple new network architecture based solely on attention mechanisms, which outperforms existing sequence transduction models while being more parallelizable and requiring less time to train.
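The core building block is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V; a minimal single-head version (without the multi-head projections) looks like this:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (..., seq_len, d_k); positions where mask == 0
    are excluded from the softmax.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v
```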
The Vision Transformer (ViT), a pure transformer applied directly to sequences of image patches, performs competitively with convolutional networks on image classification tasks.
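A sketch of the patch-embedding step that turns an image into the token sequence a ViT consumes; the strided convolution is a common implementation shortcut, and positional embeddings and the class token are omitted for brevity.

```python
from torch import nn

class PatchEmbedding(nn.Module):
    """Split an image into fixed-size patches and linearly project each one."""

    def __init__(self, patch_size=16, channels=3, dim=768):
        super().__init__()
        # A conv with kernel = stride = patch_size is equivalent to
        # "flatten each patch, then apply a shared linear layer".
        self.proj = nn.Conv2d(channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                          # x: (batch, channels, H, W)
        x = self.proj(x)                           # (batch, dim, H/patch, W/patch)
        return x.flatten(2).transpose(1, 2)        # (batch, num_patches, dim)
```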
The paper presents the "lottery ticket hypothesis", which states that dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that when trained in isolation reach test accuracy comparable to the original network in a similar number of iterations.
The deep Q-network (DQN) agent learns policies directly from high-dimensional sensory inputs via end-to-end reinforcement learning, bridging the divide between perception and action and excelling at a diverse array of challenging tasks.
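A sketch of the core DQN update: the online network's Q(s, a) is regressed toward the bootstrapped target r + γ·max_a' Q_target(s', a'). The batch layout and network interfaces below are assumptions.

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """One DQN update step on a replay-buffer batch.

    `batch` is assumed to hold tensors "states", "actions", "rewards",
    "next_states", "dones"; both networks map states to per-action Q-values.
    """
    q_values = q_net(batch["states"]).gather(1, batch["actions"].unsqueeze(1)).squeeze(1)
    with torch.no_grad():                                   # target network is held fixed
        next_q = target_net(batch["next_states"]).max(dim=1).values
        targets = batch["rewards"] + gamma * next_q * (1.0 - batch["dones"])
    return F.smooth_l1_loss(q_values, targets)              # Huber loss, equivalent to error clipping
```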
This paper describes Layer-wise Relevance Propagation (LRP), a method for making deep neural networks explainable by highlighting the input features used for predictions.
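A minimal sketch of an LRP ε-rule for a single fully connected layer, redistributing output relevance onto inputs in proportion to their contributions; real implementations chain this backwards through every layer and use analogous rules for convolutional and pooling layers.

```python
import torch

def lrp_linear(a, w, b, relevance_out, eps=1e-6):
    """Epsilon-rule LRP for one fully connected layer.

    a: inputs (batch, in), w: weights (in, out), b: bias (out,),
    relevance_out: relevance of the layer's outputs (batch, out).
    Each input i receives relevance proportional to its contribution z_ij = a_i * w_ij.
    """
    z = a @ w + b                        # forward pre-activations
    z = z + eps * torch.sign(z)          # stabiliser to avoid division by ~0
    s = relevance_out / z                # (batch, out)
    return a * (s @ w.T)                 # relevance of the inputs, (batch, in)
```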