The effect of morphology in named entity recognition with sequence tagging
Onur Güngör, Tunga Güngör, Suzan Uskudarli
Natural Language Engineering
Abstract
This work proposes a sequential tagger for named entity recognition in morphologically rich languages. Several schemes for representing the morphological analysis of a word in the context of named entity recognition are examined. Word representations are formed by concatenating word and character embeddings with the morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish and Spanish produce state-of-the-art results for all these languages, indicating that the representation of morphological information improves performance.
BibTeX
@article{gungor2019effect,
title = {The effect of morphology in named entity recognition with sequence tagging},
author = {G{\"u}ng{\"o}r, Onur and G{\"u}ng{\"o}r, Tunga and Uskudarli, Suzan},
year = 2019,
journal = {Natural Language Engineering},
publisher = {Cambridge University Press},
volume = 25,
number = 1,
pages = {147--169},
doi = {10.1017/S1351324918000281},
link = {https://www.cambridge.org/core/journals/natural-language-engineering/article/abs/effect-of-morphology-in-named-entity-recognition-with-sequence-tagging/81DCFC0417AF7719AAA1F4C4F0117761},
abstract = {This work proposes a sequential tagger for named entity recognition in morphologically rich languages. Several schemes for representing the morphological analysis of a word in the context of named entity recognition are examined. Word representations are formed by concatenating word and character embeddings with the morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish and Spanish produce state-of-the-art results for all these languages, indicating that the representation of morphological information improves performance.},
group = {journal}
}