Corpus ID: 232170369

CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review

@article{Hendrycks2021CUADAE,
  title={CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review},
  author={Dan Hendrycks and Collin Burns and Anya Chen and Spencer Ball},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.06268}
}
Many specialized domains remain untouched by deep learning, as large labeled datasets require expensive expert annotators. We address this bottleneck within the legal domain by introducing the Contract Understanding Atticus Dataset (CUAD), a new dataset for legal contract review. CUAD was created with dozens of legal experts from The Atticus Project and consists of over 13,000 annotations. The task is to highlight salient portions of a contract that are important for a human to review. We find…


Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents

References

CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension
A Benchmark for Lease Contract Review
Extracting contract elements
COLIEE-2018: Evaluation of the Competition on Legal Information Extraction and Entailment
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems