Corpus ID: 64480851

A preliminary constraint grammar for Russian

@inproceedings{Tyers2015APC,
  title={A preliminary constraint grammar for Russian},
  author={Francis M. Tyers and Robert Joshua Reynolds},
  year={2015}
}
This paper presents preliminary work on a constraint-grammar-based disambiguator for Russian. Russian is a Slavic language with a high degree of both in-category and out-category homonymy in the inflectional system. The pipeline consists of a finite-state morphological analyser and a constraint grammar. The constraint grammar is tuned for high recall (over 0.99) at the expense of precision.
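The recall/precision trade-off mentioned in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used for disambiguators that may leave several readings per token: recall is the share of tokens whose correct reading survives, and precision is the share of surviving readings that are correct. The function name `evaluate`, the tag strings, and the toy data are hypothetical and are not taken from the paper.

```python
def evaluate(kept_readings, gold_readings):
    """Score a disambiguator that may leave several readings per token.

    kept_readings: list of sets, the readings left on each token.
    gold_readings: list of strings, the single correct reading per token.
    Returns (precision, recall, ambiguity). These definitions are an
    assumption for illustration, not the paper's stated method.
    """
    if not gold_readings or len(kept_readings) != len(gold_readings):
        raise ValueError("inputs must be non-empty and of equal length")

    # A token counts as correct if its gold reading was not removed.
    true_pos = sum(
        1 for kept, gold in zip(kept_readings, gold_readings) if gold in kept
    )
    total_kept = sum(len(kept) for kept in kept_readings)

    recall = true_pos / len(gold_readings)
    precision = true_pos / total_kept if total_kept else 0.0
    ambiguity = total_kept / len(gold_readings)  # mean readings per token
    return precision, recall, ambiguity


# Toy data: a cautious grammar keeps every correct reading (recall 1.0)
# but leaves extra readings on two tokens, which lowers precision.
kept = [{"n.nom", "n.acc"}, {"v.pres"}, {"adj.gen", "n.gen", "adj.acc"}, {"pr"}]
gold = ["n.nom", "v.pres", "n.gen", "pr"]
precision, recall, ambiguity = evaluate(kept, gold)
print(f"precision={precision:.3f} recall={recall:.3f} ambiguity={ambiguity:.2f}")
```

In this toy case no correct reading is removed, so recall stays at 1.0, while the three leftover incorrect readings pull precision down to 4/7.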
Morphological analysis and disambiguation for Breton
This paper presents an extended description of two resources for natural language processing of Breton, a morphological analyser and a constraint-grammar-based disambiguator, and introduces a new morphologically disambiguated corpus of Breton.
Analysing Constraint Grammar with SAT
  • 2018
Constraint Grammar (CG) is a robust and language-independent formalism for part-of-speech tagging and shallow parsing. A grammar consists of disambiguation rules for initially ambiguous, …

References

Automatic word stress annotation of Russian unrestricted text
The effectiveness of finite-state tools for automatically annotating word stress in Russian unrestricted text is evaluated, highlighting the need for morphosyntactic disambiguation in the word stress placement task for Russian and setting a standard for future research on this task.
Hand-Crafted Rules
As already stated in Chapter 8, a linguistic tagger can consist of the following modules: tokenizer, morphological analyser, and heuristic grammar(s).
Designing and Evaluating a Russian Tagset
The principles behind designing a tagset to cover Russian morphosyntactic phenomena, modifications of the core tagset, and its evaluation are reported; the evaluation shows about 95% accuracy on the disambiguated portion of the Russian National Corpus.
A General Computational Model For Word-Form Recognition And Production
A language-independent model for recognition and production of word forms is presented, based on a new way of describing morphological alternations that is capable of both analyzing and synthesizing word forms.
Combining Stochastic and Rule-Based Methods for Disambiguation in Agglutinative Languages
In this paper we present the results of the combination of stochastic and rule-based disambiguation methods applied to the Basque language. The methods we have used in disambiguation are Constraint …
Serial Combination of Rules and Statistics: A Case Study in Czech Tagging
A hybrid system is described which combines the strengths of manual rule-writing and statistical learning, obtaining results superior to both methods applied separately. An experiment in Czech tagging has been performed with encouraging results.
The Best of Two Worlds: Cooperation of Statistical and Rule-Based Taggers for Czech
Three different statistical taggers are used in a tagging experiment using the Prague Dependency Treebank, and the results of the hybrid systems are better than any other method tried for Czech tagging so far.
Reusing Grammatical Resources for New Languages
It is argued that there is a notable gain in reusing grammatical resources when porting technology to new languages, particularly with respect to the closely related Lule and South Sami languages.
Combining Hand-crafted Rules and Unsupervised Learning in Constraint-based Morphological Disambiguation
A constraint-based morphological disambiguation approach is presented that is applicable to languages with complex morphology, specifically agglutinative languages with productive inflectional and derivational morphological phenomena, and that can attain a recall of 96 to 97% with a corresponding precision of 93 to 94% and an ambiguity of 1.02 to 1.03 parses per token.
Boosting statistical tagger accuracy with simple rule-based grammars
The results show that one can boost the accuracy of the best performing n-gram taggers by quickly developing a rough rule-based grammar to complement the statistically induced one and then combining the output of the two.