We present Sentiment Analyzer (SA), which extracts sentiment (or opinion) about a subject from online text documents. Instead of classifying the sentiment of an entire document about a subject, SA detects all references to the given subject and determines the sentiment in each reference using natural language processing (NLP) techniques. Our sentiment…
This paper illustrates a sentiment analysis approach that extracts sentiments with positive or negative polarity for specific subjects in a document, instead of classifying the whole document as positive or negative. The essential issues in sentiment analysis are to identify how sentiments are expressed in texts and whether the expressions…
This paper proposes an unsupervised lexicon building method for the detection of polar clauses, which convey positive or negative aspects in a specific domain. The lexical entries to be acquired are called polar atoms, the minimum human-understandable syntactic structures that specify the polarity of clauses. As a clue to obtain candidate polar atoms, we…
This paper proposes a new paradigm for sentiment analysis: translation from text documents to a set of sentiment units. The techniques of deep language analysis for machine translation are applicable also to this kind of text mining task. We developed a high-precision sentiment analysis system at a low development cost, by making use of an existing…
In cross-language information retrieval, it is often important to align words that are similar in meaning in two corpora written in different languages. Previous research shows that using context similarity to align words is helpful when no dictionary entry is available. We suggest a new method that selects a subset of words (pivot words) associated with a…
Large text databases potentially contain a great wealth of knowledge. However, text represents factual information (and information about the author's communicative intentions) in a complex, rich, and opaque manner. Consequently, unlike numerical and fixed field data, it cannot be analyzed by standard statistical data mining methods. Relying on human…
Complex documents stored in a flat or partially marked-up file format require layout-sensitive pre-processing before any natural language processing can be carried out on their textual content. Contemporary technology for the discovery of basic textual units is based on either spatial or other content-insensitive methods. However, there are many cases…