Convolutional Neural Networks for Sentence Classification
Abstract
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
- Task: sentence-level classification tasks
- a simple CNN with little hyperparameter tuning and static vectors.
Model architecture
$x_i \in \mathbb{R}^k$: the $k$-dimensional word vector of the $i$-th word
A sentence of length $n$:
$x_{1:n} = x_1 \otimes x_2 \otimes \cdots \otimes x_n$
$\otimes$ is the concatenation operator.
$x_{i:i+j}$ denotes the concatenation of the words $x_i, x_{i+1}, \ldots, x_{i+j}$.
$w \in \mathbb{R}^{hk}$: the convolution filter, applied to a window of $h$ words.
Convolution operation
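A minimal sketch of the convolution step, assuming the feature for each window is $c_i = f(w \cdot x_{i:i+h-1} + b)$ with $f$ a non-linearity (tanh is used here purely as an example choice). The helper names `concat` and `feature_map` and the toy numbers are illustrative assumptions, not taken from the notes.

```python
import math

def concat(vectors):
    """Concatenate a list of k-dimensional word vectors into one flat list."""
    return [value for vec in vectors for value in vec]

def feature_map(sentence, w, b, h):
    """Slide one filter over every window of h consecutive words.

    sentence: list of n word vectors, each of length k
    w: filter weights, a flat list of length h * k
    b: scalar bias
    Returns [c_1, ..., c_{n-h+1}] with c_i = tanh(w . x_{i:i+h-1} + b).
    """
    n = len(sentence)
    c = []
    for i in range(n - h + 1):
        window = concat(sentence[i:i + h])
        c.append(math.tanh(sum(wj * xj for wj, xj in zip(w, window)) + b))
    return c

# Toy input (assumed values): n = 4 words, k = 2 dimensions, h = 2 words.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
w = [0.5, -0.5, 0.25, 0.25]
c = feature_map(sentence, w, b=0.0, h=2)
print(len(c))  # n - h + 1 = 3
```

Each filter yields one feature map whose length depends on the sentence length, which is why a pooling step is needed before the classifier.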
a max-over-time pooling operation
$\hat{c} = \max\{c\}$
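As a rough illustration of max-over-time pooling, the sketch below keeps only the largest value of each filter's feature map. The feature-map values and the names `max_over_time` and `pool_all` are made up for the example.

```python
def max_over_time(feature_map):
    """Return the largest value in one filter's feature map."""
    return max(feature_map)

def pool_all(feature_maps):
    """One pooled value per filter, whatever the sentence length."""
    return [max_over_time(c) for c in feature_maps]

# Feature maps from three hypothetical filters over a short sentence ...
short_maps = [[0.1, 0.7], [0.3, -0.2], [0.0, 0.5]]
# ... and over a longer sentence (more windows, so longer maps).
long_maps = [[0.1, 0.7, 0.2, 0.9], [0.3, -0.2, 0.4, 0.1], [0.0, 0.5, -0.1, 0.2]]

z_short = pool_all(short_maps)
z_long = pool_all(long_maps)
print(z_short)  # [0.7, 0.3, 0.5]
print(z_long)   # [0.9, 0.4, 0.5]
```

Both sentences end up with a vector of three values, one per filter, so the layers after pooling can have a fixed input size.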
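This pooling scheme naturally handles variable sentence lengths.
Dropout is applied at the penultimate layer to prevent overfitting.
- the penultimate layer: the layer just before the output layer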
a fully connected softmax layer
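The last two steps can be sketched as follows, assuming dropout zeroes each pooled feature with probability `p` during training and the result feeds a fully connected layer with a softmax output. The weights, the seed, and the names `dropout`, `softmax`, and `classify` are illustrative assumptions. Test-time rescaling is left out.

```python
import math
import random

def dropout(z, p, rng):
    """Zero each unit with probability p (training-time masking only)."""
    return [0.0 if rng.random() < p else value for value in z]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(z, weights, biases):
    """Fully connected layer followed by softmax."""
    scores = [sum(w * v for w, v in zip(row, z)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

rng = random.Random(0)                          # fixed seed, assumed for the example
z = [0.7, 0.3, 0.5]                             # pooled features, one per filter
weights = [[1.0, -1.0, 0.5], [-0.5, 2.0, 0.0]]  # two classes, made-up weights
biases = [0.0, 0.1]
probs = classify(dropout(z, p=0.5, rng=rng), weights, biases)
print(len(probs), sum(probs))
```

The output is a probability distribution over the classes. Which features are dropped depends on the random mask, so the individual probabilities vary with the seed.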
Datasets
MR
SST-1:
SST-2:
Subj
TREC
CR
MPQA
Update algorithm
- Stochastic gradient descent with the Adadelta update rule
- Pre-trained word vectors: the publicly available word2vec vectors
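For reference, a minimal sketch of one Adadelta update, assuming the standard form of the rule: running averages of squared gradients and squared updates set a per-parameter step size. The decay `rho=0.95`, `eps=1e-6`, the name `adadelta_step`, and the toy objective are assumptions for illustration, not settings taken from these notes.

```python
import math

def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """Apply one Adadelta update to the parameter list x, in place."""
    for i, g in enumerate(grad):
        # Running average of squared gradients.
        state["g2"][i] = rho * state["g2"][i] + (1 - rho) * g * g
        # Step scaled by RMS of past updates over RMS of gradients.
        step = -math.sqrt(state["dx2"][i] + eps) / math.sqrt(state["g2"][i] + eps) * g
        # Running average of squared updates.
        state["dx2"][i] = rho * state["dx2"][i] + (1 - rho) * step * step
        x[i] += step

# Toy check: minimise f(x) = x^2, whose gradient is 2x, starting from x = 1.
x = [1.0]
state = {"g2": [0.0], "dx2": [0.0]}
start = x[0]
for _ in range(100):
    adadelta_step(x, [2 * x[0]], state)
print(x[0] < start)  # True
```

No learning rate is set by hand here. The step size comes from the two running averages, which is the main practical appeal of the rule.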
Model variants
- CNN-rand
- CNN-static
- CNN-non-static
- CNN-multichannel
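The four variants differ only in how the word vectors are initialised and whether they are updated during training. The sketch below encodes that reading of the paper as a small lookup table. The class `EmbeddingChannel`, the dictionary `VARIANTS`, and the helper `trainable_channels` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmbeddingChannel:
    pretrained: bool  # initialised from word2vec rather than randomly
    trainable: bool   # updated by backpropagation during training

VARIANTS = {
    # Random initialisation, learned during training.
    "CNN-rand": [EmbeddingChannel(pretrained=False, trainable=True)],
    # Pre-trained vectors kept fixed.
    "CNN-static": [EmbeddingChannel(pretrained=True, trainable=False)],
    # Pre-trained vectors fine-tuned for the task.
    "CNN-non-static": [EmbeddingChannel(pretrained=True, trainable=True)],
    # Two copies of the pre-trained vectors: one fixed, one fine-tuned.
    "CNN-multichannel": [
        EmbeddingChannel(pretrained=True, trainable=False),
        EmbeddingChannel(pretrained=True, trainable=True),
    ],
}

def trainable_channels(name):
    """Count the embedding channels that receive gradient updates."""
    return sum(ch.trainable for ch in VARIANTS[name])

for name, channels in VARIANTS.items():
    print(name, len(channels), trainable_channels(name))
```

CNN-multichannel corresponds to the modification mentioned in the abstract that allows both task-specific and static vectors to be used together.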