- A fast learning algorithm for deep belief nets. [[pdf]](docs/2006/A fast learning algorithm for deep belief nets(2006).pdf) [url] ⭐
- A Tutorial on Energy-Based Learning. [[pdf]](docs/2006/A Tutorial on Energy-Based Learning.pdf) [url]
- [LeNet] Gradient-based learning applied to document recognition. [pdf] ⭐
- Constructing Informative Priors using Transfer Learning. [url]
- Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. [url]
- Deep Boltzmann Machines. [[pdf]](docs/2009/Deep Boltzmann Machines.pdf) [url] ⭐
- Exploring Strategies for Training Deep Neural Networks. [[pdf]](docs/2009/Exploring Strategies for Training Deep Neural Networks.pdf) [url]
- Efficient Learning of Sparse Representations with an Energy-Based Model. [[pdf]](docs/2006/Efficient Learning of Sparse Representations with an Energy-Based Model(2006).pdf) [url] ⭐
- Efficient sparse coding algorithms. [[pdf]](docs/2007/Efficient sparse coding algorithms(2007).pdf) [url] ⭐
- Energy-Based Models in Document Recognition and Computer Vision. [[pdf]](docs/2007/Energy-Based Models in Document Recognition and Computer Vision.pdf) [url]
- Extracting and Composing Robust Features with Denoising Autoencoders. [[pdf]](docs/2008/Extracting and Composing Robust Features with Denoising Autoencoders.pdf) [url] ⭐
- Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition. [[pdf]](docs/2008/Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition.pdf) [url]
- Gaussian Process Models for Link Analysis and Transfer Learning. [url]
- Greedy Layer-Wise Training of Deep Networks. [[pdf]](docs/2007/Greedy Layer-Wise Training of Deep Networks(2007).pdf) [url] ⭐
- Learning Invariant Features through Topographic Filter Maps. [[pdf]](docs/2009/Learning Invariant Features through Topographic Filter Maps.pdf) [url]
- Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification. [[pdf]](docs/2009/Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification.pdf) [url] ⭐
- Mapping and Revising Markov Logic Networks for Transfer Learning. [url]
- Nonlinear Learning using Local Coordinate Coding. [[pdf]](docs/2009/Nonlinear Learning using Local Coordinate Coding.pdf) [url] ⭐
- Notes on Convolutional Neural Networks. [[pdf]](docs/2006/Notes on Convolutional Neural Networks(2006).pdf) [url]
- Reducing the Dimensionality of Data with Neural Networks. [science] ⭐
- To Recognize Shapes, First Learn to Generate Images. [[pdf]](docs/2006/To Recognize Shapes,First Learn to Generate Images(2006).pdf) [url]
- Scaling Learning Algorithms towards AI. [[pdf]](docs/2007/Scaling Learning Algorithms towards AI.pdf) [url] ⭐
- Sparse deep belief net model for visual area V2. [[pdf]](docs/2007/Sparse deep belief net model for visual area V2.pdf) [url] ⭐
- Sparse Feature Learning for Deep Belief Networks. [[pdf]](docs/2007/Sparse Feature Learning for Deep Belief Networks.pdf) [url]
- Training restricted Boltzmann machines using approximations to the likelihood gradient. [[pdf]](docs/2008/Training restricted Boltzmann machines using approximations to the likelihood gradient.pdf) [url]
- Training Products of Experts by Minimizing Contrastive Divergence. [[pdf]](docs/before2016/Training Products of experts by minimizing contrastive divergence(2000).pdf) [[url]](Training Products of Experts by Minimizing Contrastive Divergence) ⭐
- Using Fast Weights to Improve Persistent Contrastive Divergence. [[pdf]](docs/2009/Using Fast Weights to Improve Persistent Contrastive Divergence.pdf) [url] ⭐
- Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition. [[pdf]](docs/2007/Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition.pdf) [url]
- What is the Best Multi-Stage Architecture for Object Recognition? [[pdf]](docs/2009/What is the Best Multi-Stage Architecture for Object Recognition?.pdf) [url] ⭐
- A Survey on Transfer Learning. [url] ⭐
- Modeling Transfer Relationships Between Learning Tasks for Improved Inductive Transfer. [pdf]
- To Transfer or Not To Transfer. [url]
- Transfer learning for text classification. [url]
- Transfer learning for collaborative filtering via a rating-matrix generative model. [url]
- Transfer learning from multiple source domains via consensus regularization. [url]
- Transfer Learning for Reinforcement Learning Domains: A Survey. [url] ⭐
- An improved categorization of classifier’s sensitivity on sample selection bias. [pdf]
- Boosting for transfer learning. [url] ⭐
- A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. [pdf] ⭐
- Correcting sample selection bias by unlabeled data. [pdf]
- Cross domain distribution adaptation via kernel mapping. [pdf]
- Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation. [pdf]
- Discriminative learning for differing training and test distributions. [pdf]
- Domain Adaptation via Transfer Component Analysis. [pdf] ⭐
- Instance Weighting for Domain Adaptation in NLP. [pdf]
- Logistic regression with an auxiliary data source. [pdf]
- Transferring Naive Bayes Classifiers for Text Classification. [pdf]
- A Spectral Regularization Framework for Multi-Task Structure Learning. [pdf]
- Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. [pdf]
- Co-clustering based Classification for Out-of-domain Documents. [pdf] ⭐
- Domain adaptation with structural correspondence learning. [pdf]
- Frustratingly easy domain adaptation. [pdf] ⭐
- Kernel-based inductive transfer. [pdf]
- Learning a meta-level prior for feature relevance from multiple related tasks. [pdf]
- Multi-task feature and kernel selection for SVMs. [pdf]
- Multi-task feature learning. [pdf] ⭐
- Self-taught Clustering. [pdf]
- Self-taught Learning-Transfer Learning from Unlabeled Data. [[pdf]](docs/2007/Self-taught Learning-Transfer Learning from Unlabeled Data.pdf) [url] ⭐
- Spectral domain-transfer learning. [url] ⭐
- Transfer learning via dimensionality reduction. [pdf]
- Knowledge transfer via multiple model local structure mapping. [pdf]
- Learning Gaussian Process Kernels via Hierarchical Bayes. [pdf]
- Learning to learn with the informative vector machine. [pdf]
- Multi-task Gaussian Process Prediction. [pdf]
- Regularized multi-task learning. [pdf]
- The more you know, the less you learn: from knowledge transfer to one-shot learning of object categories. [pdf]