Corpus ID: 208547632

An Annotated Dataset of Coreference in English Literature

@article{Bamman2019AnAD,
  title={An Annotated Dataset of Coreference in English Literature},
  author={David Bamman and Olivia Lewke and Anya Mansoor},
  journal={ArXiv},
  year={2019},
  volume={abs/1912.01140}
}
  • David Bamman, Olivia Lewke, Anya Mansoor
  • Published in ArXiv 2019
  • Computer Science
  • We present a new dataset of coreference annotations for works of literature in English, covering 29,104 mentions in 210,532 tokens from 100 works of fiction published between 1719 and 1922. This dataset differs from previous coreference corpora in that its documents have an average length (2,105.3 words) four times longer than that of other benchmark datasets (463.7 for OntoNotes), and it contains examples of difficult coreference problems common in literature. This dataset allows for an…

    Citations

    Publications citing this paper.

    A Dutch coreference resolution system with an evaluation on literary fiction

    References

    Publications referenced by this paper.

    End-to-end Neural Coreference Resolution

    Adam: A Method for Stochastic Optimization

    On Coreference Resolution Performance Metrics
