Corpus ID: 16977701

Named entity extraction and disambiguation for informal text: the missing link

@inproceedings{Morgan2014NamedEE,
  title={Named entity extraction and disambiguation for informal text: the missing link},
  author={Mena Badieh Habib Morgan},
  year={2014}
}
Social media content represents a large portion of all textual content appearing on the Internet. These streams of user-generated content (UGC) present both an opportunity and a challenge for media analysts: to analyze huge amounts of new data and use them to infer and reason about new information. A major sector for social media analysis is customer feedback through social media. With so many feedback channels, organizations can mix and match them to best suit corporate needs…
Semantic-Enhanced Training Data Augmentation Methods for Long-Tail Entity Recognition Models
TLDR
This work aims to show how enhancing the size and quality of the training data with different techniques can improve the performance of Long-tail Entity Recognition (L-tER).
Extracting actionable information from microtexts
TLDR
Crisislex: A lexicon for collecting and filtering microblogged communications in crises, and a linguistically motivated approach to information extraction from social media.
Combinatorial and compositional aspects of bilingual aligned corpora
The subject of investigation of this thesis is the building blocks of translation in Statistical Machine Translation (SMT). We find that these building blocks, namely phrase-level dictionary entries,
GeoTextTagger: High-Precision Location Tagging of Textual Documents using a Natural Language Processing Approach
TLDR
This paper presents an algorithm for location tagging of textual documents that makes use of previous work in natural language processing by using a state-of-the-art part-of-speech tagger and named entity recognizer to find blocks of text which may refer to locations.
Location Tagging in Text
Location tagging, also known as geotagging, is the process of assigning geographical coordinates to input data. In this project we present an algorithm for location tagging text. Our algorithm makes
Automatic assistants for database exploration
TLDR
This thesis presents four assistants to help data explorers interrogate a database to discover its content: Claude, Blaeu, Ziggy and Raimond, which are an attempt to generalize semi-automatic exploration to text data.
Process mining with streaming data
TLDR
This thesis explores, develops, and analyses process mining techniques that are able to handle streaming event data, and identifies three main types of process mining analysis: process discovery, conformance checking, and process enhancement.
Hierarchical process mining for scalable software analysis
TLDR
This thesis shows how process mining can be used for analyzing software systems, and addresses the lack of support for hierarchical subprocesses, recursive behavior, and cancelation behavior, which is commonly found in software behavior.
On local and global graph structure mining

References

SHOWING 1-10 OF 311 REFERENCES
A Generic Open World Named Entity Disambiguation Approach for Tweets
TLDR
This paper shares ideas from information retrieval (IR) and NED to propose solutions for named entity disambiguation in Twitter messages, and uses a Support Vector Machine (SVM) to rank the candidate pages to find the best representative entities.
Finding people and their utterances in social media
TLDR
The scope of this research focuses on three tasks in which the interaction between the two is key: utterances that are relevant and people that are of interest, and gaining insight into how people use a product and what features they wish for, eases the development of new products.
Entity-based Classification of Twitter Messages
TLDR
This paper presents various techniques for classifying tweet messages containing a given keyword, whether they are related to a particular company with that name or not, and extensively analyze the sources of errors in the classification.
Adding Meaning to Social Network Microposts via Multiple Named Entity Disambiguation APIs and Tracking Their Data Provenance
TLDR
This paper describes how one can keep track of data provenance and credit back the contributions of each single API to the joint result of the combined mash-up API, and shows how provenance metadata can help understand the way a combined result is formed and optimize the result formation process.
Online named entity recognition method for microtexts in social networking services: A case study of twitter
TLDR
Experimental results demonstrate the feasibility of the proposed NER method for extracting relevant information in online social network applications, and three properties of contextual association among the microtexts to discover contextual clusters of the microtexts are proposed.
Named entity extraction and disambiguation: the missing link
TLDR
The benefit of using this reinforcement effect on two domains: NEE and NED for toponyms in formal text; and for arbitrary entity types in informal short text in tweets is shown.
NER from Tweets: SRI-JU System @MSM 2013
TLDR
This article reports the author's participation in the Concept Extraction Challenge, Making Sense of Microposts (#MSM2013), for which three different system runs were submitted.
Unsupervised Improvement of Named Entity Extraction in Short Informal Context Using Disambiguation Clues
TLDR
An unsupervised Semantic Web-driven approach to improve the extraction process by using clues from the disambiguation process, using a simple Knowledge-Base matching technique combined with a clustering-based approach for disambiguation.
Information Extraction: Algorithms and Prospects in a Retrieval Context
TLDR
The book elaborates on the past and current most successful algorithms and their application in a variety of domains and reveals a number of ideas towards an advanced understanding and synthesis of textual content.
Mining Wiki Resources for Multilingual Named Entity Recognition
TLDR
A system by which the multilingual characteristics of Wikipedia can be utilized to annotate a large corpus of text with Named Entity Recognition (NER) tags requiring minimal human intervention and no linguistic expertise is described.