Satoshi Oyama

Learn More
Domain-specific Web search engines are effective tools for reducing the difficulty experienced when acquiring information from the Web. Existing methods for building domain-specific Web search engines require human expertise or specific facilities. However, we can build a domain-specific search engine simply by adding domain-specific keywords, called(More)
We describe methods to search with a query by example in a known domain for information in an unknown domain by exploiting Web search engines. Relational search is an effective way to obtain information in an unknown field for users. For example, if an <i>Apple</i> user searches for <i>Microsoft</i> products, similar <i>Apple</i> products are important(More)
Pairwise classification has many applications including network prediction, entity resolution, and collaborative filtering. The pairwise kernel has been proposed for those purposes by several research groups independently, and become successful in various fields. In this paper, we propose an efficient alternative which we call Cartesian kernel. While the(More)
We propose a kernel method for using combinations of features across example pairs in learning pairwise classifiers. Identifying two instances in the same class is an important technique in duplicate detection, entity matching, and other clustering problems. However, it is a difficult problem when instances have few discriminative features. One typical(More)
In this paper, we discuss problems with HITS (HyperlinkInduced Topic Search) algorithm, which capitalizes on hyperlinks to extract topic-bound communities of web pages. Despite its theoretically sound foundations, we observed HITS algorithm failed in real applications. In order to understand this problem, we developed a visualization tool LinkViewer, which(More)
A lot of future-related information is available in news articles or Web pages. This information can however differ to large extent and may fluctuate over time. It is therefore difficult for users to manually compare and aggregate it, and to re-construct the most probable course of future events. In this paper we approach a problem of automatically(More)
Pairwise classification has many applications including network prediction, entity resolution, and collaborative filtering. The pairwise kernel has been proposed for those purposes by several research groups independently, and has been used successfully in several fields. In this paper, we propose an efficient alternative which we call a Cartesian kernel.(More)
We have developed a method for using confidence scores to integrate labels provided by crowdsourcing workers. Although confidence scores can be useful information for estimating the quality of the provided labels, a way to effectively incorporate them into the integration process has not been established. Moreover, some workers are overconfident about the(More)