This article presents methods for constructing a morpho-syntactic parsing procedure based on the analysis of corpus data. It covers 1) a method for eliminating morphological ambiguities using existing morphological parsers and then converting the parsing results into the format of the language corpus used; 2) a method of selecting parameters for syntactic…
Automatic extraction of information about the authors of texts (gender, age, psychological type, etc.) based on the analysis of linguistic parameters has gained particular significance, as there are more and more online texts whose authors either avoid providing any personal data or make it intentionally deceptive despite it being of practical importance in…
Authorship profiling, which is the process of extracting information about the unknown author of a text (demographics, psychological traits, etc.) based on the analysis of linguistic parameters, is a problem of great importance. Research in authorship profiling has always been constrained by the limited availability of training data since collecting…
An approach to visualizing nested topics within large collections of documents is proposed. The approach is based on a set of parameters: information entropy, Kullback-Leibler divergence, the Ginzburg algorithm, and the similarity of the distributions of keywords and key phrases in the documents to Bernoulli's theoretical distributions. The results of comparisons of…
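Two of the parameters named in the abstract above, information entropy and Kullback-Leibler divergence, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the add-one smoothing, and the toy keyword lists are assumptions made only for illustration.

```python
import math
from collections import Counter

def keyword_distribution(tokens, vocab):
    """Relative keyword frequencies over a fixed vocabulary.
    Add-one smoothing is an assumption made here so that no
    probability is zero; the abstract does not specify smoothing."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def entropy(p):
    """Shannon entropy H(p) in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.
    Assumes q[w] > 0 wherever p[w] > 0."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

# Hypothetical keyword lists for two documents (illustrative data only).
vocab = ["topic", "entropy", "corpus", "neuron"]
doc_a = ["topic", "topic", "entropy", "corpus"]
doc_b = ["neuron", "neuron", "neuron", "topic"]

p = keyword_distribution(doc_a, vocab)
q = keyword_distribution(doc_b, vocab)
print("H(p) =", entropy(p))
print("D(p || q) =", kl_divergence(p, q))
```

A large divergence between two documents' keyword distributions suggests they belong to different topics, while a divergence of zero means the distributions coincide.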
The existing methodological approaches to analyzing noncarcinogenic risk fail to fully solve the tasks set within the main lines of activity of the Russian Agency for Consumer Surveillance, because the capacity for quantitative assessment of noncarcinogenic risk to human health is limited. An algorithm is proposed for basing the indicators…
The influence of different factors on the spike-timing-dependent plasticity learning process was investigated. The following factors were analyzed: the choice of spike pairing scheme, the shapes of postsynaptic currents, and the choice of input signal type for learning. The factors giving the best learning performance were identified.
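To make the notion of a spike pairing scheme concrete, the following Python sketch contrasts all-to-all pairing with one common nearest-neighbor variant under a standard pair-based exponential STDP window. The window shape, the amplitudes and time constants, and the function names are assumptions for illustration; the abstract does not state which rule or parameters the study used.

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, dt = t_post - t_pre (ms).
    Parameter values are illustrative assumptions."""
    if dt > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:   # post before pre: depression
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def dw_all_to_all(pre_times, post_times):
    """Every presynaptic spike is paired with every postsynaptic spike."""
    return sum(stdp_dw(t_post - t_pre)
               for t_pre in pre_times for t_post in post_times)

def dw_nearest_neighbor(pre_times, post_times):
    """Each spike is paired only with the most recent preceding spike
    of the other neuron (one of several nearest-neighbor variants)."""
    total = 0.0
    for t_post in post_times:
        earlier = [t for t in pre_times if t < t_post]
        if earlier:
            total += stdp_dw(t_post - max(earlier))
    for t_pre in pre_times:
        earlier = [t for t in post_times if t < t_pre]
        if earlier:
            total += stdp_dw(max(earlier) - t_pre)
    return total

# Hypothetical spike times in ms.
pre, post = [0.0, 5.0], [10.0]
print(dw_all_to_all(pre, post), dw_nearest_neighbor(pre, post))
```

With two presynaptic spikes preceding one postsynaptic spike, the all-to-all scheme counts both pairs, whereas the nearest-neighbor scheme counts only the closer one, so the two schemes yield different weight updates for the same spike trains.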
In the present article, we evaluate the gain in accuracy from using a deep learning network on two language tasks: automatic text classification by author gender and identification of text sentiment. A preexisting corpus of Russian-language texts, RusPersonality, labeled with information on their authors (gender, age,…