Timofey Arkhangelskiy

Despite recent progress in developing annotated corpora for minority languages of Russia, still only about a dozen out of about 100 have comprehensive corpora, and even fewer have computational tools such as machine translation systems or speech recognition modules. However, given that many of them have resources such as dictionaries and grammars, the…
Learner corpora, also known as interlanguage (IL) or second language (L2) corpora, have become increasingly popular resources in language research in the past decade. Learner corpora provide a large volume of rich data for theoretical and applied language studies. Just like native (or L1) corpora, learner corpora are particularly useful for…
Four electronic corpora created in 2011 within the framework of the “Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages” Program of Fundamental Research of the RAS are presented. The interface and functionalities of these corpora are described, engineering problems to be solved in their creation are elucidated, and the promises of…