A Simple and Effective Unsupervised Word Segmentation Approach

Abstract

In this paper, we propose a new unsupervised approach for word segmentation. The core idea of our approach is a novel word induction criterion called WordRank, which estimates the goodness of word hypotheses (character or phoneme sequences). We devise a method to derive exterior word boundary information from the link structures of adjacent word hypotheses…

4 Figures and Tables
