Research Interest
I work in the fields of Audio Signal Processing, Audio Codec Models, Self-supervised Learning, Large Language Models, Multimodal Models, Machine Learning, and Deep Learning, supervised by
Prof. Xie Chen. I will try my best in the next five exciting years! 💪 Currently, I focus on the following research topics:
- Audio Codec Models, which convert continuous latent representations into discrete tokens
- Audio Self-supervised learning
- Speech Multimodal Large Language Model
Education and Internships
Publications
Models and Methods for Speech SSL:
- Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen*.
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning.
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU Workshop),
2023.
[Link]
[PDF]
[BibTeX]
- Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen.
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech.
[Link]
[PDF]
[BibTeX]
Projects
Open-Source Projects:
- thorough-pytorch: A Chinese PyTorch tutorial that has collected more than 1,800 stars and 333 forks on GitHub.
- More open-source contents can be found on my GitHub.
Research Projects
Honors and Awards
- 2022, National Scholarship, Ministry of Education in China.
- 2021, Meritorious Winner, Interdisciplinary Contest in Modeling.
- 2021, 2023, The First Prize Scholarship, Xidian University.
Activities
- 2021.11-Now, member of Datawhale (an open-source AI organization), helping data science enthusiasts get involved in the AI community.