STRAIGHT is a versatile
speech manipulation tool invented by Hideki
Kawahara while he was at ATR.
A series of refinements and further developments was carried out in the
"Brain Project" under the CREST
program sponsored by JST.
- Affine transformation
- The basic application of STRAIGHT is to modify the fundamental
frequency, the frequency axis and the time axis of a speech
sample independently, each by a proportional factor. The following
examples illustrate the results of these basic manipulations.
- Japanese sentence (spoken by a female speaker)[original]
- Arbitrary transformation
- Transformations applicable to STRAIGHT parameters are not
necessarily proportional. Nonlinear and non-stationary transformations
of parameters are allowed as long as they do not violate the physical
feasibility of the modified representations.
- Auditory morphing of speech sounds
- Auditory morphing transforms one speech example into
another speech example in a parameterized manner.
- Music application
- A synthetic chorus using STRAIGHT won first place at RENCON'04
among four synthetic singing systems.
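The proportional manipulations described under "Affine transformation" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not STRAIGHT's actual code or interface: it assumes a speech sample has already been analyzed into a per-frame F0 track (in Hz, with 0.0 marking unvoiced frames) and per-frame spectral envelopes on a linear frequency grid. All function names are hypothetical. The last function shows a time-varying F0 change of the kind mentioned under "Arbitrary transformation", with a simple positivity check standing in for physical feasibility.

```python
# Illustrative sketch only: the parameter layout and every function name
# below are assumptions made for this example, not part of STRAIGHT.

def scale_f0(f0, ratio):
    """Multiply every voiced F0 value (Hz) by one constant ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return [f * ratio if f > 0 else 0.0 for f in f0]


def interpolate(values, position):
    """Linearly interpolate values at a fractional index, clamped to the ends."""
    position = min(max(position, 0.0), len(values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(values) - 1)
    weight = position - lower
    return (1.0 - weight) * values[lower] + weight * values[upper]


def scale_frequency_axis(envelope, ratio):
    """Stretch one spectral envelope frame along the frequency axis.

    A feature at input bin j moves to output bin j * ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return [interpolate(envelope, k / ratio) for k in range(len(envelope))]


def scale_time_axis(frames, ratio):
    """Change duration by ratio by repeating or skipping whole frames."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    n_out = max(1, round(len(frames) * ratio))
    last = len(frames) - 1
    return [frames[min(last, int(i / ratio))] for i in range(n_out)]


def modify_f0(f0, ratios):
    """Apply a separate ratio to each frame (a non-stationary change)."""
    if len(f0) != len(ratios):
        raise ValueError("one ratio per frame is required")
    if any(r <= 0 for r in ratios):
        # A non-positive F0 has no physical meaning, so reject it.
        raise ValueError("every ratio must be positive")
    return [f * r if f > 0 else 0.0 for f, r in zip(f0, ratios)]


# Example: raise the pitch by a factor of 1.5, then double the duration.
f0_track = [0.0, 200.0, 210.0, 220.0]
higher = scale_f0(f0_track, 1.5)
slower = scale_time_axis(higher, 2.0)
```

Because each function touches only one parameter, the three manipulations can be applied independently or combined in any order, which mirrors the independence described in the list above.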
Here is the link to demonstrations.
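The "parameterized manner" of auditory morphing can be illustrated with a single morphing rate between 0 and 1. The sketch below is a simplified illustration, not the published morphing procedure: it assumes the two speech examples have already been time-aligned frame by frame, it interpolates F0 and spectral envelope values on a logarithmic scale (a choice made for this sketch), and all names are hypothetical.

```python
import math

# Illustrative sketch only: names, data layout and the log-scale
# interpolation are assumptions made for this example.

def log_interpolate(a, b, rate):
    """Interpolate two positive values geometrically; rate 0 gives a, 1 gives b."""
    return math.exp((1.0 - rate) * math.log(a) + rate * math.log(b))


def check_inputs(track_a, track_b, rate):
    """Reject rates outside [0, 1] and tracks that are not aligned."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie between 0 and 1")
    if len(track_a) != len(track_b):
        raise ValueError("tracks must be time-aligned to the same length")


def morph_f0(f0_a, f0_b, rate):
    """Morph two aligned F0 tracks (Hz); 0.0 marks an unvoiced frame."""
    check_inputs(f0_a, f0_b, rate)
    morphed = []
    for a, b in zip(f0_a, f0_b):
        if a > 0 and b > 0:
            morphed.append(log_interpolate(a, b, rate))
        else:
            # Simplification: a frame unvoiced in either example stays unvoiced.
            morphed.append(0.0)
    return morphed


def morph_envelope(env_a, env_b, rate):
    """Morph two aligned envelope frames with strictly positive magnitudes."""
    check_inputs(env_a, env_b, rate)
    return [log_interpolate(a, b, rate) for a, b in zip(env_a, env_b)]


# Example: a sequence of intermediate F0 tracks between two speakers.
steps = [morph_f0([100.0, 110.0], [200.0, 220.0], r / 4) for r in range(5)]
```

Sweeping the rate from 0 to 1, as in the last line, yields a continuum of intermediate examples between the two inputs.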
- Hideki Kawahara, Ikuyo Masuda-Katsuse and Alain de Cheveigne:
Restructuring speech representations using a pitch-adaptive time-frequency
smoothing and an instantaneous-frequency-based F0 extraction:
Possible role of a repetitive structure in sounds, Speech Communication,
27, 3-4, pp.187-207 (1999). [The EURASIP Best-Paper Award 1998/99]
- Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigne, Roy
D. Patterson: Fixed Point Analysis of Frequency to Instantaneous
Frequency Mapping for Accurate Estimation of F0 and Periodicity,
Proc. EUROSPEECH'99, vol.6, pp.2781-2784 (1999).
- Hideki Kawahara, Yoshinori Atake and Parham Zolfaghari: Accurate
vocal event detection method based on a fixed-point to weighted
average group delay, ICSLP-2000, Beijing, pp.664-667 (2000).
- Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki
Kawahara: Investigation of analysis and synthesis parameters
of STRAIGHT by subjective evaluation, ICSLP-2000, Beijing
- H. Kawahara and P. Zolfaghari: Systematic F0 glitches around
vowel nasal transitions, EUROSPEECH'2001, pp.2459-2462, 2001.
- H. Kawahara, Jo Estill and O. Fujimura: Aperiodicity extraction
and control using mixed mode excitation and group delay manipulation
for a high quality speech analysis, modification and synthesis
system STRAIGHT, MAVEBA 2001, Sept.13-15, Firenze, Italy, 2001.
- Hisami Matsui and Hideki Kawahara:
Auditorily motivated elastic spectral distance and its application
to emotional morphing of portrayal speech, First Pan-American/Iberian
Meeting on Acoustics, 2-6 December 2002, Cancun, 3pSC11.
- Hideki Kawahara and Hisami Matsui: Auditory morphing based
on an elastic perceptual distance metric in an interference-free
time-frequency representation, Proc. ICASSP'2003, vol.I, pp.256-259 (2003).
Notes on availability
Essential components of STRAIGHT are patented by ATR and JST.
If a company wishes to use STRAIGHT as a research tool,
a written agreement between JST and the company is required before
the company can access the code and the technical documents.
(Please contact the JST office
for details.) For non-commercial research and/or educational institutes,
please contact to
For evaluation purposes, you can use
the trial version of STRAIGHT.
Last update: Thu Sep 16 13:21:31 JST 2004