An interactive tool for investigating the relations between the vocal tract transfer function,
pole frequencies and bandwidths, shape, LSF (LSP), and synthesized sounds has been added to
the Matlab realtime speech tools. Standalone applications for Mac and Windows,
which do not require a Matlab installation, are also available. (15/May/2015)
Temporally Variable Multi attribute Morphing of Arbitrarily Many Voices for Exploratory Research of Speech Prosody,
in Speech Prosody in Speech Synthesis: Modeling and generation of prosody for high quality and flexible speech synthesis,
(eds. Keikichi Hirose and Jianhua Tao),
Springer Berlin Heidelberg, pp.109-120, 2015.
(DOI:10.1007/978-3-662-45258-5_8). This is a good introduction to a powerful extended morphing method.
Hideki Kawahara, Masanori Morise, Ryuichi Nisimura and Toshio Irino:
HIGHER ORDER WAVEFORM SYMMETRY MEASURE AND ITS APPLICATION TO PERIODICITY DETECTORS FOR SPEECH AND SINGING WITH FINE TEMPORAL RESOLUTION,
ICASSP2013, Vancouver Canada, 26-31 May, 2013. (Accepted) (30/May/2013).
Hideki Kawahara, Masanori Morise, Ryuichi Nisimura and Toshio Irino:
An interference-free representation of group delay
for periodic signals,
Proc. APSIPA, 3-6 December, OS.17-SLA 8, 2012 California, USA. (4/Dec./2012)
Hideki Kawahara, Masanori Morise, Ryuichi Nisimura, Toshio Irino:
Deviation measure of waveform symmetry and its
application to high-speed and temporally-fine F0
extraction for vocal sound texture manipulation,
Interspeech2012, 2012. (10/Sept./2012)
Hideki Kawahara, Toshio Irino and Masanori Morise,
An interference-free representation of instantaneous frequency of periodic signals and its application to F0 extraction,
Proc. ICASSP 2011, May 2011. (doi:10.1109/ICASSP.2011.5947584 )
Laetitia Bruckert, Patricia Bestelmeyer, Marianne Latinus, Julien Rouger, Ian Charest, Guillaume A. Rousselet, Hideki Kawahara, Pascal Belin, Vocal Attractiveness Increases by Averaging, Current Biology, Volume 20, Issue 2, 116-120 (26 January 2010)
Romi Zäske, Stefan R. Schweinberger, Jürgen M. Kaufmann, Hideki Kawahara:
In the ear of the beholder: neural correlates of adaptation to voice gender,
European Journal of Neuroscience, Vol.30, No.3, pp.527-534 (August 2009)
Osamu Fujimura, Kiyoshi Honda, Hideki Kawahara, Yasuyuki Konparu, Masanori Morise and J.C. Williams, Noh Voice Quality, J. Logopedics Phoniatrics Vocology, 34(4), 157-170 (04 June 2009)
H. Kawahara, R. Nisimura, T. Irino, M. Morise, T. Takahashi, H. Banno, Temporally variable multi-aspect auditory morphing enabling extrapolation without objective and perceptual breakdown, Proc. ICASSP, Taipei, Taiwan, 19-24 (2009).
Stefan R. Schweinberger, Christoph Casper, Nadine Hauthal, Juergen M. Kaufmann, Hideki Kawahara, Nadine Kloth, David M.C. Robertson, Adrian P. Simpson and Romi Zaeske,
Auditory Adaptation in Voice Perception, Current Biology 18, 684-688, May 6, (2008).
Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, Hideki Banno, A TEMPORALLY STABLE POWER SPECTRAL REPRESENTATION FOR PERIODIC SIGNALS AND APPLICATIONS TO INTERFERENCE-FREE SPECTRUM, F0, AND APERIODICITY ESTIMATION, Proc. ICASSP 2008, Las Vegas, pp.3933-3936 (2008)
Hideki Banno, Hiroaki Hata, Masanori Morise, Toru Takahashi, Toshio Irino and Hideki Kawahara,
"Implementation of realtime STRAIGHT speech manipulation system: Report on its first implementation,"
Acoustical Science and Technology, Vol.28, pp.140-146 (2007)
Hideki Kawahara: STRAIGHT, Exploration of the other aspect of VOCODER:
Perceptually isomorphic decomposition of speech sounds,
Acoustical Science and Technology, Vol.27, No.6, pp.349-353 (2006). [invited]
Toshio Irino, Roy D. Patterson, and Hideki Kawahara, "Speech
segregation using an auditory vocoder with event-synchronous
enhancements," IEEE Trans. Audio, Speech, and Language Process.,
Vol.14, Issue 6, pp.2212-2221 (2006).
Hideki Kawahara, Alain de Cheveigne, Hideki Banno, Toru Takahashi and Toshio Irino,
Nearly Defect-free F0 Trajectory Extraction for Expressive Speech Modifications based on STRAIGHT,
Proc. Interspeech2005, Lisboa, pp.537-540, Sept. 2005.
David R. R. Smith, Roy D. Patterson, Richard Turner, Hideki Kawahara and Toshio Irino,
The processing and perception of size information in speech sounds,
Journal of the Acoustical Society of America, 117(1), pp.305-318, Jan.2005.
Hideki Kawahara, Hideki Banno, Toshio Irino and Parham Zolfaghari,
ALGORITHM AMALGAM: Morphing waveform based methods, sinusoidal models and STRAIGHT,
Proc. ICASSP'2004, Montreal Canada, vol.1, pp.13-16, 2004
Hideki Kawahara and Hisami Matsui,
Auditory morphing based on an elastic perceptual distance metric in an interference-free
time-frequency representation, ICASSP'2003, pp.256-259 (2003).
Alain de Cheveigné and Hideki Kawahara,
"YIN, a fundamental frequency estimator for speech and music,"
Journal of the Acoustical Society of America, Vol.111, No.4, pp.1917-1930 (2002)
H. Kawahara, Jo Estill and O. Fujimura: Aperiodicity extraction
and control using mixed mode excitation and group delay manipulation
for a high quality speech analysis, modification and synthesis
system STRAIGHT, MAVEBA 2001, Sept.13-15, Firenze, Italy, 2001.
Hideki Kawahara, Yoshinori Atake and Parham Zolfaghari: Accurate
vocal event detection method based on a fixed-point to weighted
average group delay, ICSLP-2000, Beijing, pp.664-667, 2000.
Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigne, Roy
D. Patterson: Fixed Point Analysis of Frequency to Instantaneous
Frequency Mapping for Accurate Estimation of F0 and Periodicity,
Proc. EUROSPEECH'99, Volume 6, pp.2781-2784 (1999).
Hideki Kawahara, Ikuyo Masuda-Katsuse and Alain de Cheveigne:
Restructuring speech representations using a pitch-adaptive time-frequency
smoothing and an instantaneous-frequency-based F0 extraction:
Possible role of a repetitive structure in sounds, Speech Communication,
27, pp.187-207 (1999). [1998-1999 EURASIP best paper award]
Alain de Cheveigne and Hideki Kawahara, "Missing-data Model of Vowel Identification," J. Acoust. Soc. Am., Vol.105, pp.3497-3508, 1999.
Hideki Kawahara, Alain de Cheveigne and Roy D. Patterson:
An instantaneous-frequency-based pitch extraction method for
high-quality speech transformation: revised TEMPO in the STRAIGHT-suite,
Proc. 5th Int. Conf. on Spoken Language Processing (ICSLP '98),
Hiroko Kato and Hideki Kawahara: "An Application of the
Bayesian Time Series Model and Statistical System Analysis for
F0 Control", Speech Communication, (1998)
Hideki Banno, J. Ju, Satoshi Nakamura, Kiyohiro Shikano and
Hideki Kawahara: "Efficient Representation of Short-time Phase
Based on Group Delay", ICASSP'98, SP26.6, Seattle, (1998.5).
Alain de Cheveigne (CNRS), Hideki Kawahara, Minoru Tsuzaki,
and Kiyoaki Aikawa: "Concurrent Vowel Identification I: Effects
of relative Level and F0 Differences," J. Acoust. Soc. Am.,
Vol.101, pp.2839-2847 (1997.5)
Hideki Kawahara: "Speech Representation and Transformation
using Adaptive Interpolation of Weighted Spectrum: VOCODER Revisited,"
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing
(ICASSP '97), vol.2, pp.1303-1306 (1997.4)
Auditory morphing demonstrations on emotional speech. (Flash movie)
This demonstration was on display at the National Museum of Emerging Science and Innovation, Tokyo,
from 23 April to 15 August 2006. The title of the special event was
"Love stories - Why you are not alone."
The interface was designed by Takashi Yamaguchi, and
the auditory morphing sounds were synthesized by Hideki Kawahara using
a STRAIGHT-based morphing algorithm.
A STRAIGHT-based TTS (Text To Speech) system won first place in the Blizzard Challenge, and this was reported at
INTERSPEECH2005. (Sept. 2005).
The "Temporal media design project" supported by CREST (also known as the CrestMuse project;
2005 to 2010; the PI is Prof. Katayose and I will take part) uses STRAIGHT as one of its key components. (Sept. 2005).
Invited talk on "Manipulating the pulse rate and resonance scale in speech and animal calls",
at the special session "Size information in speech and animal calls" (organized by
at the 149th ASA meeting in Vancouver, Canada. (May 2005).