A team of computer scientists and linguists from Bielefeld University and Paderborn University has investigated how different components of human speech can be separated from one another and thus analyzed and modified more effectively. The results will feed into the research of TRR subproject C06, "Technically enabled explanation of speaker traits".
"The human voice is a complex construct made up of overlaps of various influencing factors. As a result, it has different characteristics that are difficult to identify," says Professor Dr. Reinhold Häb-Umbach, Professor of Communications Engineering at Paderborn University and one of the leaders of subproject C06. "By breaking down speech signals into different components, we can learn more about what makes our voices unique."
The researchers distinguish between linguistic content (what someone says) and tonal characteristics (how the voice sounds while saying it). In their publication, they show how the individual components are related at the tonal level. To do this, they created a neural network model that separates the different acoustic aspects of the voice. The model can then be used to generate synthetic speech with specifically modified attributes, for example a desired average pitch.
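To make the idea of a "desired average pitch" concrete, the following is a minimal sketch at the signal level. It is not the researchers' neural network model: it simply rescales a fundamental-frequency (F0) contour so that its average lands on a chosen target while the shape of the intonation is preserved. The function name `shift_average_pitch`, the convention that 0 marks unvoiced frames, and the example values are all assumptions made for illustration.

```python
import numpy as np

def shift_average_pitch(f0_hz, target_mean_hz):
    """Scale voiced frames so their geometric-mean pitch equals target_mean_hz.

    f0_hz: per-frame fundamental frequency in Hz; 0 marks unvoiced frames
    (an assumed convention for this sketch).
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0 > 0
    if not voiced.any():
        return f0.copy()
    # Work in the log domain: a constant offset there is a constant ratio
    # in Hz, which keeps the shape of the intonation contour.
    log_f0 = np.log(f0[voiced])
    offset = np.log(target_mean_hz) - log_f0.mean()
    shifted = f0.copy()
    shifted[voiced] = np.exp(log_f0 + offset)
    return shifted

# Hypothetical contour: two unvoiced frames (0 Hz) and four voiced frames.
contour = np.array([0.0, 110.0, 120.0, 130.0, 0.0, 125.0])
raised = shift_average_pitch(contour, target_mean_hz=200.0)
```

A disentangling model of the kind described in the article would change such an attribute in a learned representation and then resynthesize the audio; the sketch only shows what "changing the average pitch while keeping everything else" means for the pitch track itself.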
The researchers presented the results in their article "Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual Speaker Characteristics". "With this publication, we contribute to understanding how we can use computers to understand and modify different aspects of speech," summarizes Frederik Rautenberg, co-author of the paper and also a researcher in subproject C06. "This will allow us to develop speech modification programs that can help people with speech difficulties, for example."
The article was presented at the 49th Annual Conference on Acoustics (DAGA). DAGA is the largest acoustics conference in the German-speaking region and was held in Hamburg from March 6 to 9, 2023.
Project C06 "Technically enabled explanation of speaker traits"
Subproject C06 investigates voice characteristics and how they can be manipulated by computer. The goal is to develop an intelligent system that experts can use to explain the phenomenon of voice to laypersons.
Further information:
- Link to subproject C06
- Website of the Annual Conference on Acoustics in Hamburg: https://www.daga2023.de/
- Article "Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual Speaker Characteristics" by Frederik Rautenberg, Michael Kuhlmann, Janek Ebbers, Jana Wiechmann, Fritz Seebauer, Petra Wagner, and Reinhold Häb-Umbach: https://ris.uni-paderborn.de/record/44849