Methods for Intrinsic Plagiarism Detection and Author Diarization
TLDR
A plagiarism detection method that constructs an author style function from sentence-level features and detects outliers. The method is adapted to the diarization problem by segmenting author style statistics into text parts that correspond to different authors.
Style Breach Detection with Neural Sentence Embeddings
TLDR
A method for the style breach detection task that maps sentences into a high-dimensional vector space using a pre-trained encoder-decoder model, constructs an author style function, and detects outliers.
CrossLang: the system of cross-lingual plagiarism detection
TLDR
The CrossLang system for cross-lingual plagiarism detection for the English-Russian language pair is presented; its integration into the Antiplagiat system is described and its technical characteristics are provided.
Variational learning across domains with triplet information.
TLDR
The Variational Bi-domain Triplet Autoencoder (VBTA) is proposed, which learns a joint distribution of objects from different domains. The VBTA objective function is extended with relative constraints, or triplets, sampled from the shared latent space across domains.
HiRID-ICU-Benchmark - A Comprehensive Machine Learning Benchmark on High-resolution ICU Data
TLDR
This work defines multiple clinically relevant tasks developed in collaboration with clinicians using the HiRID-I dataset, and provides a reproducible end-to-end pipeline to construct both data and labels.
Near-duplicate handwritten document detection without text recognition
The paper presents a novel method for near-duplicate detection in handwritten document collections of school essays. A large amount of online resources with available academic essays currently makes ...
Variational Bi-domain Triplet Autoencoder
TLDR
The Variational Bi-domain Triplet Autoencoder (VBTA) is proposed, which learns a joint distribution of objects from different domains. It is comparable with some existing generative models and outperforms some of them.
A monolingual approach to detection of text reuse in Russian-English collection
TLDR
A method for cross-lingual (Russian and English) text reuse detection is developed, based on a monolingual approach: texts are translated into one language, which reduces the task to a text similarity problem.