STRAIGHT, a speech analysis, modification and synthesis system
last update: Mon Jun 1 18:13:21 JST 2015
STRAIGHT is a tool for manipulating voice quality, timbre, pitch, speed and other attributes flexibly.
It is a continually evolving system that aims at sound quality
close to that of the original natural speech by
introducing advanced signal processing algorithms and findings on computational aspects of auditory processing.
STRAIGHT decomposes sounds into source information and resonator (filter) information.
This conceptually simple decomposition makes it easy to conduct experiments on
speech perception using STRAIGHT, the initial design objective of this tool,
and to interpret experimental results
in terms of the huge body of classical studies.
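The idea of separating source information from resonator (filter) information can be illustrated with a small, self-contained sketch. It is not the STRAIGHT algorithm: it substitutes classical cepstral liftering, chosen only because it is short, and every function name and parameter value below (`pulse_train`, `resonator`, `cepstral_decompose`, the 700 Hz resonance, the 2 ms lifter cutoff) is an assumption made for this illustration.

```python
import numpy as np


def pulse_train(f0_hz, duration_s, fs):
    """Source: unit pulses spaced one (integer-rounded) pitch period apart."""
    x = np.zeros(int(duration_s * fs))
    x[::int(round(fs / f0_hz))] = 1.0
    return x


def resonator(x, freq_hz, bandwidth_hz, fs):
    """Filter: a single two-pole resonance applied sample by sample."""
    r = np.exp(-np.pi * bandwidth_hz / fs)
    a1 = -2.0 * r * np.cos(2.0 * np.pi * freq_hz / fs)
    a2 = r * r
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n]
        if n >= 1:
            y[n] -= a1 * y[n - 1]
        if n >= 2:
            y[n] -= a2 * y[n - 2]
    return y


def cepstral_decompose(frame, fs, cutoff_s=0.002, f0_range=(60.0, 400.0)):
    """Split one frame into an F0 estimate and a smooth log-spectral envelope.

    Illustrative stand-in (cepstral liftering), not STRAIGHT's own method.
    """
    windowed = frame * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real

    # Low quefrencies carry the slowly varying envelope (filter information).
    cut = int(cutoff_s * fs)
    lifter = np.zeros(len(cepstrum))
    lifter[:cut] = 1.0
    lifter[-cut + 1:] = 1.0  # keep the symmetric half as well
    envelope = np.fft.fft(cepstrum * lifter).real[: len(frame) // 2 + 1]

    # The strongest high-quefrency peak marks the pitch period (source information).
    lo = int(fs / f0_range[1])
    hi = int(fs / f0_range[0])
    period = lo + int(np.argmax(cepstrum[lo:hi]))
    return fs / period, envelope


if __name__ == "__main__":
    fs = 8000
    for f0 in (100.0, 150.0):
        voiced = resonator(pulse_train(f0, 0.5, fs), 700.0, 100.0, fs)
        estimate, _ = cepstral_decompose(voiced[2000:3024], fs)
        print(f"synthesised F0 {f0:.0f} Hz, estimated F0 {estimate:.1f} Hz")
```

Because the two parts are held separately, the same resonator can be driven at a different F0, or the same source passed through a different resonator, which is the kind of independent manipulation that makes perception experiments easy to design.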
The most up-to-date version of STRAIGHT is called TANDEM-STRAIGHT (ICASSP 2008), a complete reformulation based on a new
representation of the power spectra of periodic signals.
The original idea that led to STRAIGHT emerged at the ATR
Human Information Processing Research Laboratory in 1996.
The first stage of development was supported by
the Japan Science and Technology Agency (JST),
under its CREST research promotion program.
The "Auditory Brain Project"
established the foundation of STRAIGHT and initiated subsequent research and development of
various applications based on STRAIGHT.
Availability of STRAIGHT
There are two contact points for obtaining access to the STRAIGHT code. For academic use,
please contact the author of the system.
For non-academic use,
please contact the technical liaison office.
Please put the word "STRAIGHT" in the Subject field of your mail.
- A tutorial on STRAIGHT will be held at APSIPA 2014. (December 12-19, 2014)
- Presented "Temporally variable multi-aspect N-way morphing" at APSIPA 2013 ASC. (31/October/2013)
(My presentation at APSIPA 2013 (pdf))
- Presented a new F0 extractor (again!) and an important "take-home message" at ICASSP2013. Please
check my presentation at ICASSP2013. (30/May/2013)
- Presented an invited talk
at PPRU-Workshop VII,
Friedrich-Schiller-University of Jena, Germany (2013.4.26)
- A fast and accurate F0 extractor was presented at
the "pitch and harmonic analysis" session at
Interspeech 2012. (10 September 2012)
- Invited talk at
Acoustics 2012 Hong Kong, 13-18 May 2012
- Invited talk at
LISTA workshop, Edinburgh, 2-3 May 2012
- Organized the special session
(SS-L9: Advances in singing-voice synthesis, transformation, and application) in
ICASSP2012 (25-30 March 2012, Kyoto)
(My ICASSP presentation is now available: 6/April/2012)
- Invited talk on Signal processing challenge for singing voice texture was presented at the SIGMUS94
special event on Singing information processing
(USTREAM video: in Japanese) using
( interactive iBook: in Japanese ) (3/Feb./2012)
- The first workshop on singing voice (InterSinging2010):
(Engineering bldg.6, University of Tokyo, 1-2 Oct., 2010)
Movie visualizations on how TANDEM-STRAIGHT works
are accessible now. Detailed explanations will be presented at a tutorial session in
SSW7, a satellite workshop of Interspeech 2010. (22-24/Sept./2010)
- CrestMuse Symposium 2010 (Kwansei Gakuin University, 13/Sept./2010)
What STRAIGHT can do (our work)
- Emotional morphing with extrapolation based on APSIPA2013 formulation (Temporally variable multi-aspect N-way morphing)
- Song morphing demo based on ICASSP2009 formulation
- New demo movies on TANDEM-STRAIGHT and morphing. (03/Dec./2009)
- TANDEM-STRAIGHT-based morphing with temporally variable morphing rates
- TANDEM-STRAIGHT, a complete reformulation of STRAIGHT:
26 March, 2007: Monthly meeting of the Speech Committee of IEICEJ and Acoustical Society of Japan
19 December, 2006:
2006 Symposium of
the Comprehensive Development of e-Society Foundation; Software program
26 October, 2006:
First CrestMuse symposium of
the CrestMuse project
23 April to 15 August, 2005: Demonstration presented at the event
"LOVE STORIES - Why you are not alone", held at
the National Museum of Emerging Science and Innovation.
3 to 5 June, 2004: Participated in
the 4th Rencon (Contest for Performance Rendering Systems)
1 to 3 September, 2003: Participated in Education Arena held in
February, 2001: Joint meeting of speech committee of IEICEJ,
hearing meeting of the Acoustical Society of Japan and Special Interest group on MUSic and computer of
（ related materials [in Japanese] ）
March, 2000: Monthly meeting of the Speech Committee of IEICEJ and Acoustical Society of Japan
(demonstrations are in Japanese)
What STRAIGHT can do (contributions of users and collaborators)
A list of citations on STRAIGHT
provides an overview of the range of applications. The following are selected topics.
Selected articles on STRAIGHT
- Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, Hideki Banno,
Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation,
Proc. ICASSP 2008, Las Vegas, pp.3933-3936 (2008)
- Stefan R. Schweinberger, Christoph Casper, Nadine Hauthal, Jürgen M. Kaufmann, Hideki Kawahara, Nadine Kloth, David M.C. Robertson, Adrian P. Simpson and Romi Zäske,
Auditory Adaptation in Voice Perception, Current Biology 18(9), 684-688, May 6, (2008).
- H. Kawahara, R. Nisimura, T. Irino, M. Morise, T. Takahashi, H. Banno, Temporally variable multi-aspect auditory morphing enabling extrapolation without objective and perceptual breakdown, Proc. ICASSP, Taipei, Taiwan, 19-24 (2009).
- Hideki Kawahara, Toru Takahashi, Masanori Morise and Hideki Banno: Development of exploratory research tools based on TANDEM-STRAIGHT, Proc. APSIPA, Sapporo, pp.111-120 (2009).
- Laetitia Bruckert, Patricia Bestelmeyer, Marianne Latinus, Julien Rouger, Ian Charest, Guillaume A. Rousselet, Hideki Kawahara, Pascal Belin, Vocal Attractiveness Increases by Averaging, Current Biology, Volume 20, Issue 2, 116-120 (26 January 2010)
- Hideki Kawahara and Masanori Morise,
Technical foundations of TANDEM-STRAIGHT, a speech analysis, modification and synthesis framework,
SADHANA - Academy Proceedings in Engineering Sciences, Vol.36, Part 5, pp.713-722, 2011.
- Hideki Kawahara, Toshio Irino and Masanori Morise,
An interference-free representation of instantaneous frequency of periodic signals and its application to F0 extraction,
Proc. ICASSP 2011, May 2011. (doi:10.1109/ICASSP.2011.5947584 )
Selected articles and presentations
- Hideki Kawahara, Ikuyo Masuda-Katsuse and Alain de Cheveigne: Restructuring
speech representations using a pitch-adaptive time-frequency smoothing
and an instantaneous-frequency-based F0 extraction: Possible role of a
repetitive structure in sounds, Speech Communication, 27, pp.187-207 (1999).
- Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigne, Roy D.
Patterson: Fixed Point Analysis of Frequency to Instantaneous Frequency
Mapping for Accurate Estimation of F0 and Periodicity , Proc. EUROSPEECH'99,
Volume 6, Page 2781-2784 (1999).
- Hideki Kawahara, Yoshinori Atake and Parham Zolfaghari: Accurate
vocal event detection method based on a fixed-point to weighted average
group delay, ICSLP-2000, Beijing, pp.664-667, 2000.
- Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara:
Investigation of analysis and synthesis parameters of STRAIGHT by subjective
evaluation, ICSLP-2000, Beijing
- H. Kawahara and P. Zolfaghari: Systematic F0 glitches around vowel
nasal transitions, EUROSPEECH'2001, pp.2459-2462, 2001.
- H. Kawahara, Jo Estill and O. Fujimura: Aperiodicity extraction
and control using mixed mode excitation and group delay manipulation for
a high quality speech analysis, modification and synthesis system STRAIGHT,
MAVEBA 2001, Sept.13-15, Firenze, Italy, 2001.
- Hideki Kawahara, Parham Zolfaghari and Alain de Cheveigne, "On F0 Trajectory for very high-quality speech manipulation," ICSLP'2002 (2002).
- Hideki Kawahara and Hisami Matsui:
Auditory morphing based on an elastic perceptual distance metric in an interference-free time-frequency representation,
Proc. ICASSP'2003, vol.I, pp.256-259, 2003.
- Hisami Matsui, Hideki Kawahara,
Investigation of Emotionally Morphed Speech Perception and its Structure Using a High Quality Speech Manipulation System
Proc. Eurospeech'03, pp. 2113-2116, 2003.
- Hideki Kawahara: Exemplar-based Voice Quality Analysis and Control
using a High Quality Auditory Morphing Procedure based on STRAIGHT,
VOQUAL'03, ISCA Tutorial and Research Workshop, Geneva, August 27-29, 2003, pp.109-114.
- Hideki Kawahara, Hideki Banno, Toshio Irino and Parham Zolfaghari,
Morphing waveform based methods, sinusoidal models and STRAIGHT,
Proc. ICASSP'2004, Montreal Canada, pp.13-16 (2004)
- Hideki Kawahara, Alain de Cheveigne, Hideki Banno, Toru Takahashi and Toshio Irino,
Nearly Defect-free F0 Trajectory Extraction for Expressive Speech Modifications based on STRAIGHT,
Proc. Interspeech2005, Lisboa, pp.537-540, Sept. 2005.
- Hideki Kawahara: STRAIGHT, Exploration of the other aspect of VOCODER:
Perceptually isomorphic decomposition of speech sounds,
Acoustical Science and Technology, Vol.27, No.6, pp.349-353 (2006)
(link to PDF page)
- Hideki Kawahara, Osamu Fujimura and Yasuyuki Konparu,
Voice as Artistic Expression in Noh, presented at the Joint Meeting of ASA and ASJ, Honolulu 2006
(link to the lay language paper on ASA press room)
- Hideki Banno, Hiroaki Hata, Masanori Morise, Toru Takahashi, Toshio Irino and Hideki Kawahara,
"Implementation of realtime STRAIGHT speech manipulation system: Report on its first implementation,"
Acoustical Science and Technology, Vol.28, pp.140-146 (2007)
(link to PDF page)
Please send mail to (firstname.lastname@example.org).