Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies
TLDR
This work introduces embedding-based methods for cross-lingually projecting English frames to Russian, and offers new ways to identify subtle media manipulation strategies at the intersection of agenda-setting and framing.
Personalized Machine Translation: Preserving Original Author Traits
TLDR
It is shown that an author’s gender has a powerful, clear signal in original texts, but that this signal is obfuscated in human and machine translation; simple domain-adaptation techniques are proposed that help retain the original gender traits in the translation without harming translation quality.
On the features of translationese
TLDR
It is demonstrated that some feature sets are indeed good indicators of translationese, thereby corroborating some hypotheses, whereas others perform much worse, indicating that some ‘universal’ assumptions have to be reconsidered.
High-accuracy Annotation and Parsing of CHILDES Transcripts
TLDR
An ongoing project is described that aims to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures, with manually curated gold-standard annotations.
Definiteness in the Hebrew noun phrase
  • S. Wintner
  • Linguistics
    Journal of Linguistics
  • 1 July 2000
TLDR
An analysis of Modern Hebrew noun phrases in the framework of HPSG concludes that the article combines with nominals in the lexicon, and is no longer available for syntactic processes, leading to an analysis of noun phrases as NPs, rather than as DPs.
Found in Translation: Reconstructing Phylogenetic Language Trees from Translations
TLDR
This work automatically reconstructs phylogenetic language trees from monolingual texts (translated from several source languages) and indicates that source language interference is the most dominant characteristic of translated texts, overshadowing the more subtle signals of universal properties of translation.
Native Language Cognate Effects on Second Language Lexical Choice
TLDR
It is shown that the lexical choices of non-natives are affected by cognates in their native language, an effect so powerful that the phylogenetic language tree of the Indo-European language family is reconstructed solely from the frequencies of specific lexical items in the English of authors with various native languages.
Topics to Avoid: Demoting Latent Confounds in Text Classification
TLDR
This work proposes a method that represents the latent topical confounds and a model which “unlearns” confounding features by predicting both the label of the input text and the confound, and shows that this model generalizes better and learns features that are indicative of the writing style rather than the content.
Morphological Analysis of the Qur'an
TLDR
The system facilitates a variety of queries on the Qur'anic text that make reference not only to the words, but also to their linguistic attributes, and exemplifies its usefulness for investigating several morphological, syntactic, semantic, and stylistic aspects of the Qur'anic text.
Identifying Semitic Roots: Machine Learning with Linguistic Constraints
TLDR
This work presents a machine learning approach, augmented by limited linguistic knowledge, to the problem of identifying the roots of Semitic words; it is one of the few attempts to directly address non-concatenative morphology using machine learning, and sheds light on the problem of combining classifiers under (linguistically motivated) constraints.