Annotation and Representation of a Diachronic Corpus of Spanish

@inproceedings{Marco2010AnnotationAR,
  title={Annotation and Representation of a Diachronic Corpus of Spanish},
  author={Cristina S{\'a}nchez Marco and Gemma Boleda and Josep Maria Fontana and Judith Domingo},
  booktitle={LREC},
  year={2010}
}
In this article we describe two different strategies for the automatic tagging of a Spanish diachronic corpus involving the adaptation of existing NLP tools developed for modern Spanish. In the initial approach we follow a state-of-the-art strategy, which consists in standardizing the spelling and the lexicon. This approach boosts POS-tagging accuracy to 90%, which represents a raw improvement of over 20% with respect to the results obtained without any pre-processing. In order to enable non…
