Synopsis: The paper argues that Maximum Entropy (MaxEnt) models are preferable to Stochastic Optimality Theory (StOT) models, as MaxEnt models allow low-ranked constraints to 'gang up' on high-ranked constraints. That is, they allow cumulativity. In addition to ganging-up cumulativity, the authors distinguish counting cumulativity. Counting cumulativity is simply ...
In the wake of the January 12 earthquake in Haiti, it quickly became clear that the existing emergency response services had failed, but text messages were still getting through. A number of people quickly came together to establish a text-message-based emergency reporting system. There was one hurdle: the majority of the messages were in Haitian Kreyol, ...
Crisis-affected populations are often able to maintain digital communications, but in a sudden-onset crisis aid organizations will have the fewest free resources to process such communications. Information that aid agencies can actually act on, 'actionable' information, will be sparse, so there is great potential to (semi)automatically identify actionable ...
In this paper, we propose that machine translation (MT) is an important technology in crisis events, something that can and should be an integral part of a rapid-response infrastructure. By integrating MT services directly into a messaging infrastructure (whatever the type of messages being serviced, e.g., text messages, Twitter feeds, or blog postings), MT can be used to ...
This paper investigates three dimensions of cross-domain analysis for humanitarian information processing: citizen reporting vs organizational reporting; Twitter vs SMS; and English vs non-English communications. Short messages sent during the response to the recent earthquake in Haiti and floods in Pakistan are analyzed. It is clear that SMS and Twitter ...
This article reports on Mission 4636, a real-time humanitarian crowdsourcing initiative that processed 80,000 text messages (SMS) sent from within Haiti following the 2010 earthquake. It was the first time that crowdsourcing (microtasking) had been used for international relief efforts, and it is the largest deployment of its kind to date. This article ...
The role of data in language documentation is rather different from the way that data is traditionally treated in language description. For description, the main concern is the production of grammars and dictionaries whose primary audience is linguists (Himmelmann 1998, Woodbury 2003). In these products, language data serves essentially as exemplification ...
We present a compendium of recent and current projects that utilize crowdsourcing technologies for language studies, finding that the quality is comparable to that of controlled laboratory experiments, and in some cases superior. While crowdsourcing has primarily been used for annotation in recent language studies, the results here demonstrate that far richer data ...