There have recently been considerable advances in fast inference for (online) latent Dirichlet allocation (LDA). While it is widely recognized that the scheduling of documents in stochastic optimization and in turn in LDA may have significant consequences, this issue remains largely unexplored. Instead, practitioners schedule documents essentially uniformly ...
One major problem in text mining and semantic retrieval is that detected entity mentions have to be assigned to the true underlying entity. The ambiguity of a name results from both the polysemy and the synonymy problem, as the name of a unique entity may be written in variant ways and different unique entities may have the same name. The term "bush" for ...
The mouse genes for cytosolic phosphoenolpyruvate carboxykinase-1 (Pck-1) and neuronal nicotinic acetylcholine receptor alpha 4 subunit (Acra-4) both map to distal chromosome 2 (Siracusa et al. 1989; Bessis et al. 1990). We have utilized Southern blot analysis on human/rodent somatic cell hybrids to map the human homologues of both of these genes, PCK1 and ...
Name ambiguity is a major problem in information retrieval: The name "Metropolis" may refer to a movie, a physicist, or Superman's hometown. Recent work resolves ambiguity in natural language text by linking name mentions against the corresponding Wikipedia concept (Wikification). Standard methods comparing a single mention with the corresponding Wikipedia ...
We present an approach for the disambiguation of textual mentions of ambiguous names: disambiguation here means the identification of the true entity denoted by a name phrase appearing in a query context through its assignment to the corresponding Wikipedia article. If this article does not exist, we assign this query to a default entity. Ambiguity of ...
The output of a speech recognition system is a stream of text features that is overlaid by noise resulting from errors in the system's statistical classification of the audio input. Conditional Random Fields (CRFs), which have already proven themselves to be efficient, high-performance Named Entity Recognizers (NERs) for named entities from text, offer the ...
In news stories, verbatim quotes of persons play a very important role, as they carry reliable information about the opinion of that person concerning specific aspects. As thousands of new quotes are published every hour, it is very difficult to keep track of them. In this paper we describe a set of algorithms to solve the knowledge management problem of ...
