The Speaker and Language Recognition Workshop


Tutorial Speakers

Dr. Massimiliano Todisco is an Assistant Professor within the Digital Security Department at EURECOM, France. He received his Ph.D. degree in Sensorial and Learning Systems Engineering from the University of Rome Tor Vergata in 2012. He currently serves as principal investigator and coordinator for TReSPAsS-ETN, an H2020 Marie Skłodowska-Curie Innovative Training Network (ITN), and for RESPECT, a PRCI project funded by the French ANR and the German DFG. He co-organizes the ASVspoof challenge series, a set of community-led challenges that promote the development of countermeasures to protect automatic speaker verification (ASV) from the threat of spoofing. He is the inventor of constant Q cepstral coefficients (CQCC), the most commonly used anti-spoofing features for speaker verification, and first author of the highest-cited technical contribution in the field in the last three years. He has more than 90 publications. His current interests are in developing end-to-end architectures for speech processing and speaker recognition, fake audio detection and anti-spoofing, and the development of privacy preservation algorithms for speech signals based on encryption solutions that support computation upon signals, templates and models in the encrypted domain.

Dr. Johan Rohdin received the M.Sc. degree in Engineering Physics and Mathematics from Chalmers University of Technology in 2008 and the Ph.D. degree in Computer Science from Tokyo Institute of Technology in 2015. From 2013 to 2016 he worked on speech technology development at Inferret Limited. In 2016, he joined Brno University of Technology (BUT) as a post-doctoral researcher sponsored by the South Moravian Programme for Distinguished Researchers (SoMoPro) to work on neural network approaches to speaker recognition. Since 2019 he has worked at BUT as an assistant professor, and also for Omilia - Conversational Intelligence on speech technology R&D. His research interests include machine learning and speech processing, in particular as applied to automatic speaker recognition.

Dr. Xin Wang is a project researcher at the National Institute of Informatics, Japan. He received the Ph.D. degree from SOKENDAI, Japan, in 2018. Before that, he received the M.S. and B.E. degrees from the University of Science and Technology of China and the University of Electronic Science and Technology of China in 2015 and 2012, respectively. His research interests include statistical speech synthesis and machine learning.

Dr. Yotaro Kubo received the B.E., M.E., and Dr.Eng. degrees from Waseda University, Tokyo, Japan, in 2007, 2008, and 2010, respectively. He was a visiting scientist at RWTH Aachen University for six months in 2010. After that period, he joined Nippon Telegraph and Telephone Corporation (NTT), where he was with NTT Communication Science Laboratories. From 2014 to 2019, he was with Amazon, where he developed and investigated speech recognition for voice search and personal assistants. Since 2019, he has been a research scientist at Google. His research interests include generative/discriminative hybrid modeling, kernel-based probabilistic models, and integration of probabilistic systems. He is a member of the IEEE, the International Speech Communication Association (ISCA), and the Acoustical Society of Japan (ASJ).

Dr.-Ing. Sakriani Sakti is currently a Research Associate Professor with NAIST and a Research Scientist with the RIKEN Center for Advanced Intelligence Project (AIP), Japan. She received the B.E. degree (cum laude) in informatics from the Bandung Institute of Technology, Indonesia, in 1999, the M.Sc. degree in 2002, and the Ph.D. degree from the Dialog Systems Group, University of Ulm, Germany, in 2008. From 2003 to 2009, she worked as a researcher at ATR SLC Labs, Japan, and from 2006 to 2011, she worked as an expert researcher at NICT SLC Groups, Japan. She was actively involved in collaboration activities such as the Asian Pacific Telecommunity Project (2003-2007), A-STAR and U-STAR (2006-2011). From 2009 to 2011, she served as a visiting professor in the Computer Science Department, University of Indonesia (UI), Indonesia. From 2011 to 2017, she was an assistant professor at the Augmented Human Communication Laboratory, NAIST, Japan. She also served as a visiting scientific researcher at INRIA Paris-Rocquencourt, France, in 2015-2016, under the “JSPS Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation”. Her research interests include statistical pattern recognition, graphical modeling frameworks, deep learning, multilingual speech recognition and synthesis, spoken language translation, affective dialog systems, and cognitive communication. She is a member of JNS, SFN, ASJ, ISCA, and IEICE. She is also an Officer of the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL) and a Board Member of the Spoken Language Technologies for Under-Resourced Languages (SLTU).

Mr. Shigeki Karita is a research software engineer at Google, Tokyo, Japan. He received the B.E. and M.E. degrees from Osaka University, Japan, in 2014 and 2016, respectively. He was a research scientist at NTT Communication Science Laboratories, Kyoto, Japan, from 2016 to 2020. He received the Young Researcher's Award from IEICE, Japan, in 2014. His research interests include speech recognition, speech translation, and speech enhancement. Recently, he has been working on semi-supervised training and sequence discriminative training for end-to-end ASR. He is also a core developer of ESPnet, an end-to-end speech processing toolkit.