Publications:
Simantiraki, O., Charonyktakis, P., Pampouchidou, A., Tsiknakis, M., Cooke, M. (2017) Glottal Source Features for Automatic Speech-Based Depression Assessment. Proc. Interspeech 2017, 2700-2704, DOI: 10.21437/Interspeech.2017-1251.
Llorach G., Blat J. (2017) Say Hi to Eliza. In: Beskow J., Peters C., Castellano G., O’Sullivan C., Leite I., Kopp S. (eds) Intelligent Virtual Agents. IVA 2017. Lecture Notes in Computer Science, vol 10498. Springer, Cham. DOI: 10.1007/978-3-319-67401-8_34
Hendrikse, M., Llorach, G., Grimm, G., Hohmann, V. Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters. Speech Communication, Vol 101, July 2018, p. 70-84. DOI: 10.1016/j.specom.2018.05.008.
Govender, A., King, S. (2018). Using Pupillometry to Measure the Cognitive Load of Synthetic Speech. Proc. Interspeech 2018, 2838-2842, DOI: 10.21437/Interspeech.2018-1174.
Govender, A., King, S. (2018). Measuring the Cognitive Load of Synthetic Speech Using a Dual Task Paradigm. Proc. Interspeech 2018, 2843-2847, DOI: 10.21437/Interspeech.2018-1199.
Simantiraki, O., Cooke, M., King, S. (2018). Impact of Different Speech Types on Listening Effort. Proc. Interspeech 2018, 2267-2271, DOI: 10.21437/Interspeech.2018-1358.
Shifas PV, M., Tsiaras, V., Stylianou, Y. (2018) Speech Intelligibility Enhancement Based on a Non-causal Wavenet-like Model. Proc. Interspeech 2018, 1868-1872, DOI: 10.21437/Interspeech.2018-2119.
Espic Calderón, F., Govender, A., Ribeiro, M. S., Valentini Botinhao, C., & Watts, O. (2018). The CSTR entry to the 2018 Blizzard Challenge. In Blizzard Challenge 2018 Workshop, Hyderabad, India.
Kaplan, E., Wagner, A. & Baskent, D. (2018). Are musicians at an advantage when processing speech on speech? In Parncutt, R., & Sattmann, S. (Eds.) (2018). Proceedings of ICMPC15/ESCOM10 (p. 233-236). Graz, Austria: Centre for Systematic Musicology, University of Graz.
G. Llorach, G. Grimm, M. M. E. Hendrikse, V. Hohmann. Towards Realistic Immersive Audiovisual Simulations for Hearing Research: Capture, virtual scenes and reproduction, Proceedings of 2018 Workshop on Audio-Visual Scene Understanding for Immersive Multimedia (AVSU’18), p. 33-40, 26 October 2018, Seoul, Republic of Korea, DOI: 10.1145/3264869.3264874
Raman, S., Hernaez, I., Navas, E., Serrano, L. (2018) Listening to Laryngectomees: A study of Intelligibility and Self-reported Listening Effort of Spanish Oesophageal Speech. Proc. IberSPEECH 2018, 107-111, DOI: 10.21437/IberSPEECH.2018-23
Serrano, L., Tavarez, D., Sarasola, X., Raman, S., Saratxaga, I., Navas, E., Hernaez, I. (2018) LSTM based voice conversion for laryngectomees. Proc. IberSPEECH 2018, 122-126, DOI: 10.21437/IberSPEECH.2018-26
Llorach G., Agenjo J., Blat J., Sayago S. (2019) Web-Based Embodied Conversational Agents and Older People. In: Sayago S. (eds) Perspectives on Human-Computer Interaction Research with Older People. Human–Computer Interaction Series. Springer, Cham. DOI: 10.1007/978-3-030-06076-3_8
Amy Hall, Jan Rennies-Hochmuth, Axel Winneke (2019). Assessing and reducing listening effort of listening to speech in adverse conditions. DAGA 2019, Conference proceedings, p. 958-961.
Llorach, G., Oetting, D., Krüger M., Vormann, M., Fitschen, C., Schulte, M., Hohmann, V., Meis, M. (2019) Vehicle Noise: Loudness Ratings, Loudness Models and Future Experiments with Audiovisual Immersive Simulations, Proceedings of Internoise 2019.
Raman, S.; Serrano, L.; Winneke, A.; Navas, E.; Hernaez, I. Intelligibility and Listening Effort of Spanish Oesophageal Speech. Appl. Sci. 2019, 9, 3233. DOI: 10.3390/app9163233
Shen C., Janse E., 2019 Articulatory Control in Speech Production. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.) Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 2533-2537). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
Marcoux, K.P, Ernestus, M.T.C, 2019. Pitch in Native and Non-Native Lombard Speech. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.) Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 2605-2609). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
Hendrikse, M. M. E., Llorach, G., Grimm, G., Hohmann, V. (2019). Movement and Gaze Behavior in Virtual Audiovisual Everyday-Life Listening Environments. Trends in Hearing 23 (2019) p. 2331216519872362, DOI: 10.1177/2331216519872362
Sfakianaki, A. (2019). Designing a Modern Greek sentence corpus for audiological and speech technology research. Proceedings of the 14th International Conference on Greek Linguistics (ICGL14), September 5-8, 2019, University of Patras, Greece. Proceedings not yet published.
Cooke, M., King, S., Hazan, V., Stylianou, Y., Janse, E., Baskent, D., Hohmann, V., Winneke, A., Hernaez, I., (2019) Enriched communication across the lifespan- Comunicación enriquecida a lo largo de la vida. Procesamiento del lenguaje natural 63 (2019), pp.175-178. DOI: 10.26342/2019-63-24
Simantiraki, O., Cooke, M. (2019) Listeners’ Speech Rate Preferences in Stationary and Modulated Maskers. ICA 2019 Conference Proceedings, pp. 5736-5738. DOI: 10.18154/RWTH-CONV-239387
Raman, S., Hernaez, I., Navas, E., Serrano, L. A Multifaceted Enrichment of Oesophageal Speech. ICA 2019 Conference Proceedings, pp. 5739-5741. DOI: 10.18154/RWTH-CONV-239415
Exenberger, A., Iverson, O. Speech enrichment: Listening effort and intelligibility. ICA 2019 Conference Proceedings, pp. 5700-5702. DOI: 10.18154/RWTH-CONV-238874
Paulus, M., Hazan, V., Wagner, A., Adank, P. (2019). Talker intelligibility and listening effort: The role of speaking rate. ICA 2019 Conference Proceedings, pp. 5708-5712. DOI: 10.18154/RWTH-CONV-239169
Chermaz, C., Valentini Botinhao, C., Schepker, H., & King, S. Near End Listening Enhancement in Realistic Environments. In ICA 2019 Proceedings, pp. 5731-5735. DOI: 10.18154/RWTH-CONV-239327
Govender, A., King, S. and Valentini-Botinhao, C. (2019). Evaluating Cognitive Load of Text-To-Speech (TTS) synthesis. In ICA 2019 Proceedings, pp. 5759-5763. DOI: 10.18154/RWTH-CONV-239695
Shen C., Cooke M., Janse E. (2019) Individual Articulatory Control in Speech Enrichment. In ICA 2019 Proceedings, pp. 5761-5765. DOI: 10.18154/RWTH-CONV-239282
Marcoux, K.P, Ernestus, M.T.C. (2019) Differences between Native and Non-Native Lombard Speech in terms of pitch range. In ICA 2019 Proceedings, pp. 5713-5720. DOI: 10.18154/RWTH-CONV-239240
Padinjaru Veettil, M.S., Santelli, C., and Stylianou, Y. (2019) Towards a Neural-Based Single Channel Speech Enhancement Model for Hearing-Aids. In ICA 2019 Proceedings, pp. 5745-5748. DOI: 10.18154/RWTH-CONV-239594
Padinjaru Veettil, M.S., Chermaz, C., Chimona, T., Tsiaras, V. and Stylianou, Y. (2019) Benefits of the WaveNet-Based Speech Intelligibility Enhancement for Normal and Hearing Impaired Listeners. In ICA 2019 Proceedings, pp. 5721-5725. DOI: 10.18154/RWTH-CONV-239258
Paul, D., Pantazis, Y. and Stylianou, Y. (2019). Weighted Generative Adversarial Network for many-to-many Voice Conversion. In ICA 2019 Proceedings, pp. 5742-5744. DOI: 10.18154/RWTH-CONV-239420
Kirwan, J., Wagner, A. and Baskent, D. (2019) Pupillary Correlates of Auditory Emotion Recognition in Hearing-Impaired Listeners. In ICA 2019 Proceedings, pp. 5771-5772. DOI: 10.18154/RWTH-CONV-239805
Kaplan, E.C., Baskent, D. and Wagner, A. (2019) Differences in Processing Speech-on-Speech Between Musicians and Non-musicians: The Role of Prosodic Cues. In ICA 2019 Proceedings, pp. 5756-5758. DOI: 10.18154/RWTH-CONV-239680
Llorach, G. and Hohmann, V. (2019) Word error and confusion patterns in an audiovisual German matrix sentence test (OLSA). In ICA 2019 Proceedings, pp. 5749-5751. DOI: 10.18154/RWTH-CONV-239621
Hall, A., Winneke, A., Rennies-Hochmuth, J. (2019) EEG alpha power as a measure of listening effort reduction in adverse conditions. ICA 2019 Conference Proceedings, pp. 5752-5755. DOI: 10.18154/RWTH-CONV-239632
Serrano, L., Raman, S., Tavarez, D., Navas, E., Hernaez, I. (2019) Parallel vs. Non-Parallel Voice Conversion for Esophageal Speech. Proc. Interspeech 2019, 4549-4553, DOI: 10.21437/Interspeech.2019-2194.
Paulus, M., Hazan, V., Adank, P. (2019) Talker Intelligibility and Listening Effort with Temporally Modified Speech. Proc. Interspeech 2019, 3128-3132, DOI: 10.21437/Interspeech.2019-1402.
Chermaz, C., Valentini Botinhao, C., Schepker, H., & King, S. Evaluating Near End Listening Enhancement Algorithms in Realistic Environments. In Proceedings Interspeech 2019, 1373-1377, DOI: 10.21437/Interspeech.2019-1800. Nominated for the best student paper at Interspeech 2019.
Govender, A., Wagner, A.E., King, S. (2019) Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise. Proc. Interspeech 2019, 1551-1555, DOI: 10.21437/Interspeech.2019-1783.
Muhammed Shifas, P. V., Adiga, N., Tsiaras, V., & Stylianou, Y. (2019). A Non-Causal FFTNet Architecture for Speech Enhancement. Proc. Interspeech 2019, p. 1826-1830. DOI: 10.21437/Interspeech.2019-2622.
Paul, D., Pantazis, Y., Stylianou, Y. (2019) Non-Parallel Voice Conversion Using Weighted Generative Adversarial Networks. Proc. Interspeech 2019, 659-663, DOI: 10.21437/Interspeech.2019-2869.
Eloff, R., Nortje, A., Niekerk, B.V., Govender, A., Nortje, L., Pretorius, A., Biljon, E.V., Westhuizen, E.V.D., Staden, L.V., Kamper, H. (2019) Unsupervised Acoustic Unit Discovery for Speech Synthesis Using Discrete Latent-Variable Neural Networks. Proc. Interspeech 2019, 1103-1107, DOI: 10.21437/Interspeech.2019-1518.
Govender, A., Valentini-Botinhao, C., King, S. (2019) Measuring the contribution to cognitive load of each predicted vocoder speech parameter in DNN-based speech synthesis. Proc. 10th ISCA Speech Synthesis Workshop, 121-126, DOI: 10.21437/SSW.2019-22
Y. Pantazis, D. Paul, M. Fasoulakis and Y. Stylianou (2019). Training Generative Adversarial Networks With Weights, 2019 27th European Signal Processing Conference (EUSIPCO), A Coruna, Spain, 2-6 Sept 2019, pp. 1-5, DOI: 10.23919/EUSIPCO.2019.8902934.
Koutsogiannaki, M., Simantiraki, O., Cooke, M., Lallier, M. (2020), Listening effort of natural speaking styles. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p.61-62.
Simantiraki, O., Cooke, M., (2020) Exploring listeners’ speech modification preferences. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p.15-16.
Raman, S., Winneke, A., Hernaez, I., Navas, E. (2020) Listening effort and oesophageal speech: An EEG study. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p.60-61.
Simantiraki, O., Cooke, M., Pantazis, Y. (2020) Effects of Spectral Tilt on Listeners’ Preferences And Intelligibility. ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 6254-6258, DOI: 10.1109/ICASSP40776.2020.9054117.
Paulus, M., Hazan, V., Adank, P. (2020). The relationship between talker acoustics, intelligibility, and effort in degraded listening conditions. The Journal of the Acoustical Society of America, 147(5), 3348-3359, DOI: 10.1121/10.0001212.
Shen C., Janse E. (2020). Maximum Speech Performance and Executive Control in Young Adult Speakers. Journal of Speech, Language, and Hearing Research. ePub Ahead of Issue. DOI: 10.1044/2020_JSLHR-19-00257
Simantiraki, O. and Cooke, M., (2020). Exploring listeners’ speech rate preferences. In Proceedings Interspeech 2020, p. 1346-1350. DOI: 10.21437/Interspeech.2020-1832
Chermaz, C., King, S. (2020). A Sound Engineering Approach to Near End Listening Enhancement. In Proceedings Interspeech 2020, p. 1356-1360. DOI: 10.21437/Interspeech.2020-2748
Paul, D., Pantazis, Y., Stylianou, Y. (2020) Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions. Proc. Interspeech 2020, 235-239, DOI: 10.21437/Interspeech.2020-2786.
Paul, D., Muhammed Shifas, P. V., Pantazis, Y. and Stylianou, Y. (2020). Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion. In Proceedings Interspeech 2020, p. 1361-1365. DOI: 10.21437/Interspeech.2020-2793
Rennies, J., Schepker, H., Valentini-Botinhao, C., Cooke, M. (2020) Intelligibility-Enhancing Speech Modifications — The Hurricane Challenge 2.0. Proc. Interspeech 2020, 1341-1345, DOI: 10.21437/Interspeech.2020-1641.
Govender, A., et al (2020). ASVspoof 2019: A large-scale public database of synthetized, converted and replayed speech. Computer Speech & Language, Vol. 64, November 2020, 101114. DOI: 10.1016/j.csl.2020.101114
Serrano, L., Raman, S., Hernaez, I., Navas, E., Sanchez, J., Saratxaga, I. (2020). A Spanish Multispeaker Database of Esophageal Speech. Computer Speech and Language, Volume 66, March 2021, 101168. DOI: 10.1016/j.csl.2020.101168
Kaplan EC, Wagner AE, Toffanin P and Başkent D (2021) Do Musicians and Non-musicians Differ in Speech-on-Speech Processing? Front. Psychol. 12:623787. DOI: 10.3389/fpsyg.2021.623787
Llorach, G., Kirschner, F., Grimm, G., Zokoll, M.A., Wagener, K.C. and Hohmann, V., (2020). Development and Evaluation of Video Recordings for the OLSA Matrix Sentence Test. arXiv preprint arXiv:1912.04700
Padinjaru Veettil, M.S., Santelli, C., Stylianou, Y. (2020) A fully recurrent feature extraction for single channel speech enhancement. arXiv preprint arXiv:2006.05233
Pantazis, Y., Paul, D., Fasoulakis, M., Stylianou, Y., and Katsoulakis, M. (2020) Cumulant GAN. arXiv preprint arXiv:2006.06625
Llorach, G., Hendrikse, M.M., Grimm, G. and Hohmann, V., 2020. Comparison of a Head-Mounted Display and a Curved Screen in a Multi-Talker Audiovisual Listening Task. arXiv preprint arXiv:2004.01451
Databases:
Shen, Chen; Janse, Esther; King, Simon. (2018). Radboud Lombard Corpus_Dutch, 2017. Radboud University. Centre for Language Studies.
Marcoux, Katherine; Ernestus, Mirjam; King, Simon. (2018). Dutch English Lombard Speech Native and Non-Native (DELNN). Radboud University. Centre for Language Studies.
Shen, Chen, & Janse, Esther. (2018). Radboud Lombard Corpus (Dutch). Zenodo. DOI: 10.5281/zenodo.4040685
Katherine P. Marcoux, & Mirjam Ernestus (2019). Dutch English Native Non-Native Lombard (DELNN) Corpus. Zenodo. DOI: 10.5281/zenodo.4267819
Maartje M. E. Hendrikse, Giso Grimm, Gerard Llorach, & Volker Hohmann. (2018). Audiovisual recordings of acted casual conversations between four speakers in German. Zenodo. DOI: 10.5281/zenodo.1257333
Llorach, Gerard, Kirschner, Frederike, Grimm, Giso, & Hohmann, Volker. (2020). Video recordings for the female German Matrix Sentence Test (OLSA). Zenodo. DOI: 10.5281/zenodo.3673062
Llorach, Gerard, Grimm, Giso, Vormann, Matthias, Hohmann, Volker, & Meis, Markus. (2020). Vehicle driving actions for loudness and annoyance perception. Zenodo. DOI: 10.5281/zenodo.3822311
Sfakianaki, Anna; Kafentzis, George; Stylianou, Yannis (2020). Greek Harvard.
Govender, Avashna (2018). Pupillometry toolkit.
Shen, C., Janse, E. (2021). Radboud Tongue Twister Corpus_Dutch. To be uploaded to Zenodo in the near future.
_ _ _ _ _ _ _ _
Unless otherwise noted, this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.