Recurrent neural networks, particularly the long short-term memory networks, are extremely appealing for sequence-tosequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generat
State-of-the-art sequence labeling systems traditionally require large mounts of task-specific knowledge in the form of hand-crafted features and data pre-processing.In this paper, we introduce a novel neu-tral network architecture that benefits fro