QU Dan, YANG Xukui, YAN Honggang, CHEN Yaqi, NIU Tong
Abstract:
Low-resource few-shot speech recognition is an urgent technical need in the speech recognition industry. This article first briefly discusses the framework technology for few-shot speech recognition. It then highlights research progress in several important low-resource speech technologies, including feature extraction, acoustic modeling, and resource expansion. Building on the development of continuous speech recognition frameworks, it next focuses on recent advances in deep learning, such as generative adversarial networks, self-supervised representation learning, deep reinforcement learning, and meta-learning, as applied to few-shot speech recognition. On that basis, the problems this technology faces, namely limited complementarity, task imbalance, and model deployment, are analyzed to guide subsequent development. Finally, a summary of and outlook for few-shot continuous speech recognition are given.