Fine-Grained Named Entity Recognition in Legal Documents

Elena Leitner, Georg Rehm, Julián Moreno Schneider. International Conference on Semantic Systems.
This paper describes an approach to Named Entity Recognition (NER) in German-language documents from the legal domain. For this purpose, a dataset consisting of German court decisions was developed. The source texts were manually annotated with 19 semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, court decision, and legal literature. The dataset consists of…

A Dataset of German Legal Documents for Named Entity Recognition

A dataset developed for Named Entity Recognition in German federal court decisions that consists of approx.

Automatic Induction of Named Entity Classes from Legal Text Corpora

The introduced methodology for automatically inducing fine-grained classes of named entities for the legal domain is implemented and tested in an experiment on a large German legal corpus that is manually annotated with almost 54,000 named entities.

Annotating Entities with Fine-Grained Types in Austrian Court Decisions

This work investigates an approach to produce fine-grained named entity annotations of a large corpus of Austrian court decisions from a small manually annotated training data set, and applies a general purpose Named Entity Recognition model to produce annotations of common coarse-grained types.

RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain

The whole pipeline is described, from text annotation and baseline development to the design of a shared task intended to improve the baseline, with the finding that current NER and RE technologies are far from mature and do not yet overcome the challenges.

Named Entity Recognition for Public Interest Litigation Based on a Deep Contextualized Pretraining Approach

The proposed entity recognition method is more effective than the existing methods, which reach 96% and 90% in the F1 index of NER and NERP entities, respectively.

Semantic Segmentation of Legal Documents via Rhetorical Roles

This paper proposes a new corpus of legal documents annotated with a set of 13 labels for semantically coherent units (referred to as Rhetorical Roles) and develops a deep model based on multitask learning (MTL), with document rhetorical role label shift as an auxiliary task, for segmenting a legal document.

Annotation of Fine-Grained Geographical Entities in German Texts

The fine-grained classification of geographical entities, the resulting annotations and preliminary results on automatically tagging toponyms in a small, bootstrapped gold corpus are presented.

Named Entity Recognition for Public Interest Litigation Based on a Deep Contextualized Pretraining Approach

An entity recognition method based on pretraining is proposed that is more effective than the existing methods, which reach 96% and 90% in the F1 index of NER and NERP entities, respectively.

Fine-grained Named Entity Annotations for German Biographic Interviews

A fine-grained NER annotation scheme with 30 labels, general across both domains, is presented and applied to German data; its generality is confirmed, and good inter-annotator agreement is achieved.

Applying Model Fusion to Augment Data for Entity Recognition in Legal Documents

A novel data augmentation method for named entity recognition is proposed that fuses multiple models: entities identified with high correctness across the multiple experimental results are taken as effective entities and added to the training set for the next training round.



A low-cost, high-coverage legal named entity recognizer, classifier and linker

This paper tries to improve Information Extraction in legal texts by creating a legal Named Entity Recognizer, Classifier and Linker, developed with relatively little effort by mapping the LKIF ontology to the YAGO ontology and, through it, taking advantage of the mentions of entities in Wikipedia.

Named Entity Recognition and Resolution in Legal Text

An actual system for finding named entities in legal text is described and its accuracy evaluated, along with three methods for named entity recognition: lookup, context rules, and statistical models.

NoSta-D Named Entity Annotation for German: Guidelines and Dataset

The approach to creating annotation guidelines based on linguistic and semantic considerations is described, and how they were iteratively refined and tested in the early stages of annotation to arrive at the largest publicly available dataset for German NER, consisting of over 31,000 manually annotated sentences from German Wikipedia and German online news.

Named Entity Recognition with Bidirectional LSTM-CNNs

A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.

GermaNER: Free Open German Named Entity Recognition Tool

The tagger is trained and evaluated on the GermEval 2014 dataset for named entity recognition and comes close to the performance of the best (proprietary) system in the competition with 76% F-measure test set performance on the four standard NER classes.

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

A novel neural network architecture is introduced that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF, thus making it applicable to a wide range of sequence labeling tasks.

Lexicon Infused Phrase Embeddings for Named Entity Resolution

A new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both CoNLL and OntoNotes NER, are presented.

Named Entity Recognition, Extraction, and Linking in German Legal Contracts

Different approaches to Named Entity Recognition are incorporated into an Apache UIMA pipeline, enabling semantic analysis and structuring of legal contracts by implementing a software component.

Improving efficiency and accuracy in multilingual entity extraction

This paper discusses some implementation and data processing challenges encountered while developing a new multilingual version of DBpedia Spotlight that is faster, more accurate and easier to configure, and compares the solution to the previous system.

Named entity recognition: Exploring features

A conditional random field based system is presented that achieves 91.02% F1-measure on the CoNLL 2003 (Sang and Meulder, 2003) dataset and 81.4% on the OntoNotes version 4 (Hovy et al., 2006) CNN dataset, which, to the authors' knowledge, are the best results in the state of the art for those respective benchmarks.