Finite element synthesis of diphthongs using tuned two-dimensional vocal tracts

M. Arnela, O. Guasch

Volume

Issue

Publisher

IEEE

Year

2017

Month

October

First page

2013

Last page

2023

Research Group

HER

Research Line

Acoustics and Audio

DOI

10.1109/TASLP.2017.2735179

Three-dimensional (3-D) vocal tract acoustic modeling has the potential to generate high-quality, natural voice sounds, but at the price of a large computational cost. Alternatively, 2-D models based on tuned vocal tracts have been shown to provide results similar to those of 3-D models with lower computational demands. However, they are currently limited to the synthesis of static vowel sounds. In this paper, the tuned 2-D approach is extended by considering moving vocal tracts to generate dynamic vowel sounds, such as diphthongs. Four tuning steps are followed to build a dynamic 2-D vocal tract model that can recover, to a large extent, the formant locations, bandwidths, and energies of a 3-D vocal tract with circular cross section, set in a spherical baffle representing the human head. Acoustic waves propagating through the time-evolving vocal tract and radiating into the free field are simulated using the finite element method in the time domain. As examples, the diphthongs [Ai] and [Au] have been generated using the tuning approach and compared, by means of objective and subjective evaluations, to those resulting from 3-D and conventional 2-D simulations.

Authors