Aleksandr Sboev

This article presents methods for constructing a morpho-syntactic parsing procedure based on the analysis of corpus data. It covers 1) a method for eliminating morphological ambiguities using existing morphological parsers and then converting the parsing results into the format of the language corpus used; 2) a method of selecting parameters for syntactic...
Authorship profiling, the process of extracting information about the unknown author of a text (demographics, psychological traits, etc.) based on the analysis of linguistic parameters, is a problem of great importance. Research in authorship profiling has always been constrained by the limited availability of training data, since collecting...
This work focuses on the principal ways in which new words and expressions enter the Chinese Internet language, the sources of new meanings for old words and phrases, and neologisms and chengyu with modified meaning and structure. Some new tendencies in the development of the Chinese Internet language, such as the wide use of dialect-originated words, archaic...
Automatic extraction of information about the authors of texts (gender, age, psychological type, etc.) based on the analysis of linguistic parameters has gained particular significance, as there are more and more online texts whose authors either avoid providing any personal data or make it intentionally deceptive, despite its being of practical importance in...
The existing methodological approaches to analyzing noncarcinogenic risk fail to fully solve the tasks set within the main lines of activity of the Russian Agency for Consumer Surveillance, since the capacity for quantitative assessment of noncarcinogenic risk to human health is limited. An algorithm is proposed for basing the indicators...
An approach to visualizing nested topics within large collections of documents is proposed. The approach is based on a set of parameters: information entropy, Kullback-Leibler divergence, the Ginzburg algorithm, and the similarity of the distributions of keywords and key phrases in the documents to Bernoulli's theoretical distributions. The results of comparisons of...
In the present article, we evaluate the gain in accuracy from using a deep learning network for two language tasks: automatic text classification according to the author's gender and identification of text sentiment. A preexisting corpus of Russian-language texts, RusPersonality, labeled with information on their authors (gender, age, ...