Using an Alignment-based Lexicon for Canonicalization of Historical Text

@inproceedings{Jurish2013UsingAA,
  title={Using an Alignment-based Lexicon for Canonicalization of Historical Text},
  author={Bryan Jurish},
  year={2013}
}
Virtually all conventional text-based natural language processing techniques – from traditional information retrieval systems to full-fledged parsers – require reference to a fixed lexicon accessed by surface form, typically trained from or constructed for synchronic input text adhering strictly to contemporary orthographic conventions. Unorthodox input such as historical text which violates these conventions therefore presents difficulties for any such system due to lexical variants present in…
