Kyoko Sakuraba

Speech communication has several steps of production, encoding, transmission, decoding, and hearing. In every step, acoustic distortions are involved inevitably as differences of vocal tract length, gender, age, microphone, room, line, hearing characteristics, etc. These are static non-linguistic factors and completely irrelevant to speech recognition.(More)
There may exist some common factors, independent of language and culture, in human perception of emotion via speech sounds. This study investigated these factors using subjects from Japan, the United States and China, none of whom had experience living abroad. An emotional speech database sans linguistic information was used in this study and evaluated(More)
This paper proposes a new method of estimating perceptual femininity (PF) of an input utterance using Gaussian Mixture Model (GMM) supervectors and support vector regression (SVR). The method is used to develop a femininity estimation tool, which is introduced to voice therapy of Gender Identity Disorder (GID) clients, especially MtF (Male to Female)(More)
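Abstract: With the dramatic improvement in computer performance in recent years, large-vocabulary speech recognition has reached the stage of practical use. In speech synthesis as well, technologies are being developed with various application settings in mind, such as synthesis methods that can control speaker identity and speaking style. On the other hand, if the goal of speech engineering research is taken to be the implementation on computers of spoken-language processing ability "comparable to that of humans," it has also been pointed out that a large gap still remains between humans and machines. In this study, we first point out that if there actually existed a human who performed information processing equivalent to current speech recognition and speech synthesis, that person's behavior would resemble that of a person with severe autism who has difficulty acquiring spoken language. We then consider, from a broad perspective, what basic technologies current speech technology lacks for realizing speech information processing like that of (typically developed) humans(More)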
This work describes the development of an automatic estimator of perceptual femininity (PF) of an input utterance using speaker verification techniques. The estimator was designed for its clinical use and the target speakers are Gender Identity Disorder (GID) clients, especially MtF (Male to Female) transsexuals. The voice therapy for MtFs, which is(More)
In speech communication, various acoustic distortions are inevitably involved by speakers, channels, and hearers. However, infants acquire a spoken language mainly with speech samples of their mothers and fathers. They can easily solve the variability problem only with a remarkably biased speech corpus. Why and how is it possible? To answer this hard(More)
This paper describes the development of an estimator of perceptual femininity (PF) of an input utterance using speaker recognition techniques. The estimator was designed for its clinical use and the target speakers are gender identity disorder (GID) clients, especially MtF (male to female) transsexuals. The voice therapy for MtFs is composed of three kinds(More)
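(microphone characteristics, part of speaker identity) are considered. Additive background noise is not considered. The two kinds of acoustic distortion above are approximated by a linear (affine) transformation of the cepstrum, c′ = Ac + b. As shown in Fig. 1, linear-transformational distortion and convolutional distortion correspond to variation of the spectrum in the horizontal direction (A) and the vertical direction (b), respectively. As shown in Fig. 3, the set of inter-distribution distances (Bhattacharyya distance, etc.) computed after converting the cepstrum sequence into a sequence of distributions (Gaussian mixture distributions), that is, the distance matrix, is a physical quantity invariant to this transformation. An n-sided polygon, as a geometric structure consisting of n points, is uniquely determined by all of its point-to-point distances, that is, by an n×n distance matrix. Similarly, a distance matrix obtained as a set of inter-distribution distances also determines a geometric structure (in some non-Euclidean space). As shown in Section 2.1, from this distance matrix alone(More)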
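The invariance described in the preceding abstract can be checked numerically in a heavily simplified setting. The sketch below is an illustration only, not the authors' implementation: it assumes single one-dimensional Gaussians instead of Gaussian mixtures, and a scalar transform c′ = a·c + b instead of a matrix A. The function names and the example values are hypothetical.

```python
import math

def bhattacharyya_1d(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two 1-D Gaussians N(mu1, var1) and N(mu2, var2)."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

def affine(mu, var, a, b):
    """Parameters of a Gaussian after the scalar transform c' = a*c + b (assumed a != 0)."""
    return a * mu + b, a * a * var

def distance_matrix(gaussians):
    """n x n matrix of pairwise Bhattacharyya distances."""
    n = len(gaussians)
    return [[bhattacharyya_1d(*gaussians[i], *gaussians[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical "speech events", each modelled as (mean, variance).
events = [(0.0, 1.0), (2.0, 0.5), (-1.5, 2.0)]

# Hypothetical distortion: a plays the role of A, b the role of the bias term.
a, b = 1.7, -3.0
distorted = [affine(mu, var, a, b) for mu, var in events]

original_matrix = distance_matrix(events)
distorted_matrix = distance_matrix(distorted)

# Largest element-wise difference between the two distance matrices.
max_diff = max(abs(x - y)
               for row_o, row_d in zip(original_matrix, distorted_matrix)
               for x, y in zip(row_o, row_d))
print("max difference between distance matrices:", max_diff)
```

In this scalar case the factor a² cancels in both terms of the distance, so the two matrices agree up to floating-point rounding. Whether the same holds for the mixture distributions used in the paper depends on details that the truncated abstract does not give.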
In speech communication, acoustic distortions are inevitably involved by speakers, channels, and hearers. However, infants acquire a spoken language mainly with speech samples of their mothers and fathers. They can solve the variability problem only with a remarkably biased speech corpus. Why and how is it possible? To answer this hard question, we already(More)
In speech communication, various acoustic distortions are inevitably involved by speakers, channels, and hearers. However, infants acquire spoken language mainly with speech samples of their mothers and fathers. They can naturally acquire the solution of the variability problem only with a remarkably biased speech corpus. Why and how is it possible? To(More)