Matthias Janke, Lorenz Diener
Reference:
EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals (Matthias Janke, Lorenz Diener), in TASLP – IEEE/ACM Transactions on Audio, Speech and Language Processing, volume 25, number 12, pages 2375–2385, November 2017
BibTeX Entry:
@article{janke2017emg,
title = {EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals},
author = {Janke, Matthias and Diener, Lorenz},
year = 2017,
month = nov,
day = 23,
journal = {{TASLP} -- {IEEE/ACM} Transactions on Audio, Speech and Language Processing},
volume = 25,
number = 12,
pages = {2375--2385},
doi = {10.1109/TASLP.2017.2738568},
abstract = {Silent speech interfaces are systems that enable speech communication even when an
acoustic signal is unavailable. Over the last years, public interest in such interfaces has
intensified. They provide solutions for some of the challenges faced by today's speech-driven
technologies, such as robustness to noise and usability for people with speech impediments. In
this paper, we provide an overview over our silent speech interface. It is based on facial
surface electromyography (EMG), which we use to record the electrical signals that control
muscle contraction during speech production. These signals are then converted directly to an
audible speech waveform, retaining important paralinguistic speech cues for information such as
speaker identity and mood. This paper gives an overview over our state-of-the-art direct
EMG-to-speech transformation system. This paper describes the characteristics of the speech EMG
signal, introduces techniques for extracting relevant features, presents different EMG-to-speech
mapping methods, and finally, presents an evaluation of the different methods for real-time
capability and conversion quality.},
url = {https://halcy.de/cites/pdf/janke2017emg.pdf},
}