Masatoshi Tsuchiya

Learn More
The Japanese language has various types of compound functional expressions, which are very important for recognizing the syntactic structures of Japanese sentences and for understanding their semantic contents. In this paper, we formalize the task of identifying Japanese compound functional expressions in a text as a chunking problem. We apply a machine(More)
A topic-dependent-class (TDC)-based <i>n</i>-gram language model (LM) is a topic-based LM that employs a semantic extraction method to reveal latent topic information extracted from noun-noun relations. A topic of a given word sequence is decided on the basis of most frequently occuring (weighted) noun classes in the context history through voting. Our(More)
A data sparseness problem for modeling a language often occurs in many language models (LMs). This problem is caused by the insufficiency of training data, which in turn, makes the infrequent words have unreliable probability. Mapping from words into classes gives the infrequent words more confident probability, because they can rely on other more frequent(More)
A new cell culture system has been developed that reflects the vascular microenvironment. By means of this system the cultured cells are exposed not only to shear stress by the circulating culture medium, but also to an oxygen concentration gradient and certain critical blood components such as low-density lipoprotein (LDL) and monocytes. DNA microarray(More)
This paper proposes an approach of processing Japanese compound functional expressions by identifying them and analyzing their dependency relations through a machine learning technique. First, we formalize the task of identifying Japanese compound functional expressions in a text as a machine learning based chunking problem. Next, against the results of(More)
This paper explains our developing Corpus of Japanese classroom Lecture speech Contents (henceforth, denoted as CJLC). Increasing e-Learning contents demand a sophisticated interactive browsing system for themselves, however, existing tools do not satisfy such a requirement. Many researches including large vocabulary continuous speech recognition and(More)
This paper discusses the speech recognition of Japanese classroom lecture speech. In particular, we mention the influences of microphone differences and the language model differences on the speech recognition performance of classroom lectures. First, we collected actual classroom lecture contents from several universities in Japan. In this paper, we(More)
Language models (LMs) are an important field of study in automatic speech recognition (ASR) systems. LM helps acoustic models find the corresponding word sequence of a given speech signal. Without it, ASR systems would not understand the language and it would be hard to find the correct word sequence. During the past few years, researchers have tried to(More)
Although large-scale spontaneous speech corpora are crucial resource for various domains of spoken language processing, they are usually limited due to their construction cost especially in transcribing precisely. On the other hand, inexact transcribed corpora like shorthand notes, meeting records and closed captions are widely available. Unfortunately, it(More)