Aleksandr Sboev

This article presents methods for constructing a morpho-syntactic parsing procedure based on the analysis of corpus data. It covers 1) a method for eliminating morphological ambiguities using existing morphological parsers and then converting the parsing results into the format of the language corpus used; 2) a method of selecting parameters for syntactic…
Automatic extraction of information about authors of texts (gender, age, psychological type, etc.) based on the analysis of linguistic parameters has gained particular significance, as there are more online texts whose authors either avoid providing any personal data or make it intentionally deceptive, despite its being of practical importance in…
Authorship profiling, the process of extracting information about the unknown author of a text (demographics, psychological traits, etc.) based on the analysis of linguistic parameters, is a problem of great importance. Research in authorship profiling has always been constrained by the limited availability of training data, since collecting…
In the present article, we evaluate the gain in accuracy from using a deep learning network for two language tasks: automatic classification of texts by author gender and identification of text sentiment. A preexisting corpus of Russian-language texts, RusPersonality, labeled with information on their authors (gender, age,…
An approach for visualization of nested topics within large collections of documents is proposed. The approach is based on a set of parameters: information entropy, Kullback-Leibler divergence, the Ginzburg algorithm, and the similarity of the distributions of keywords and key phrases in the documents to Bernoulli's theoretical distributions. The results of comparisons of…