In this paper we provide a description of TimeML, a rich specification language for event and temporal expressions in natural language text, developed in the context of the AQUAINT program on Question Answering Systems. Unlike most previous work on event annotation, TimeML captures three distinct phenomena in temporal markup: (1) it systematically anchors…
The TempEval task proposes a simple way to evaluate automatic extraction of temporal relations. It avoids the pitfalls of evaluating a graph of interrelated labels by defining three subtasks that allow pairwise evaluation of temporal relations. The task not only allows straightforward evaluation but also avoids the complexities of full temporal parsing.
We describe the design, implementation, and evaluation of EMBERS, an automated, 24x7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future…
Civil unrest (protests, strikes, and "occupy" events) is a common occurrence in both democracies and authoritarian regimes. The study of civil unrest is a key topic for political scientists as it helps capture an important mechanism by which citizenry express themselves. In countries where civil unrest is lawful, qualitative analysis has revealed that…
"The Spanish [γ] is often not very fricative, and more like an approximant. It may be more accurately transcribed using the symbol for a voiced velar approximant […]." (Ladefoged 1982:148) Spanish has in its phonetic inventory three sounds, transcribed here as [β], [δ], and [γ], often described as voiced fricatives in the literature. Based on…
We seek to automatically estimate typical durations for events and habits described in tweets. A corpus of more than 14 million tweets containing temporal duration information was collected. These tweets were classified by habituality status using a bootstrapped decision tree. For each verb lemma, associated duration information was…
The overall goal of this project is to evaluate the performance of word sense alignment (WSA) systems, focusing on obtaining examples appropriate to language learners. Building a gold standard dataset based on human expert judgments is costly in time and labor, and thus we gauge the utility of using semi-experts in performing the annotation. In an online…