Description: The predominant approach to language modeling to date is based on recurrent neural networks. In this paper we present a convolutional approach to language modeling. We introduce a novel gating mechanism that eases gradient propagation and which ... | Uploaded by <panda_wendao> | Size: 572kb
Description: Distilling the Knowledge in a Neural Network, by Geoffrey Hinton. A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions ... | Uploaded by <panda_wendao> | Size: 104kb