CBC-Based Synthetic Speech Detection

Jichen Yang, Qianhua He, Yongjian Hu, Weiqiang Pan
Copyright: © 2019 | Volume: 11 | Issue: 2 | Pages: 12
ISSN: 1941-6210 | EISSN: 1941-6229 | EISBN13: 9781522565154 | DOI: 10.4018/IJDCF.2019040105
Cite Article

MLA

Yang, Jichen, et al. "CBC-Based Synthetic Speech Detection." IJDCF, vol. 11, no. 2, 2019, pp. 63-74. http://doi.org/10.4018/IJDCF.2019040105

APA

Yang, J., He, Q., Hu, Y., & Pan, W. (2019). CBC-Based Synthetic Speech Detection. International Journal of Digital Crime and Forensics (IJDCF), 11(2), 63-74. http://doi.org/10.4018/IJDCF.2019040105

Chicago

Yang, Jichen, et al. "CBC-Based Synthetic Speech Detection," International Journal of Digital Crime and Forensics (IJDCF) 11, no. 2 (2019): 63-74. http://doi.org/10.4018/IJDCF.2019040105

Abstract

Previous studies of synthetic speech detection (SSD) have relied mostly on features derived from a linear power spectrum. In contrast, this article proposes a feature extraction method for SSD based on the octave power spectrum obtained from the constant-Q transform (CQT). Combining the CQT, a block transform (BT), and the discrete cosine transform (DCT) yields a new feature, constant-Q block coefficients (CBC). The CQT transforms speech from the time domain into the frequency domain, the BT segments the octave power spectrum into blocks, and the DCT extracts the principal information of each block. Experimental results on the ASVspoof 2015 corpus show that CBC outperforms other front-end features benchmarked on the ASVspoof 2015 evaluation set in terms of equal error rate (EER).
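
The three stages named in the abstract (CQT, block segmentation, per-block DCT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the block layout, the number of retained coefficients, or any CQT settings, so everything below is an assumption made for illustration. The function names (`naive_cqt_power`, `dct_ii`, `cbc_like_features`), the brute-force CQT, the log compression, the one-block-per-octave split, and all parameter defaults are hypothetical.

```python
import numpy as np


def naive_cqt_power(x, sr, fmin=110.0, bins_per_octave=12, n_octaves=4, hop=512):
    """Brute-force constant-Q log power spectrum, shape (n_bins, n_frames).

    Assumption: a direct (slow) CQT with Hann windows; the log compression
    is also an assumption, not something stated in the abstract.
    """
    n_bins = bins_per_octave * n_octaves
    # Constant quality factor: centre frequency divided by bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    lengths = np.ceil(q * sr / freqs).astype(int)
    # Zero-pad so that every window centred on a frame stays in range.
    pad = int(lengths.max()) // 2 + 1
    xp = np.pad(x, pad)
    n_frames = 1 + len(x) // hop
    power = np.empty((n_bins, n_frames))
    for k, (f, n) in enumerate(zip(freqs, lengths)):
        # Windowed complex exponential; longer windows for lower frequencies.
        kernel = np.hanning(n) * np.exp(-2j * np.pi * f * np.arange(n) / sr) / n
        for m in range(n_frames):
            start = pad + m * hop - n // 2
            power[k, m] = np.abs(np.dot(xp[start:start + n], kernel)) ** 2
    return np.log(power + 1e-10)


def dct_ii(block, n_coeffs):
    """Unnormalised DCT-II along axis 0, keeping the first n_coeffs rows."""
    n = block.shape[0]
    k = np.arange(n_coeffs)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * (i + 0.5) * k / n)
    return basis @ block


def cbc_like_features(log_power, bins_per_block=12, coeffs_per_block=4):
    """Split the frequency axis into blocks and apply a DCT to each block.

    Assumption: equal, non-overlapping blocks (one per octave by default);
    the paper's actual block definition may differ.
    """
    n_blocks = log_power.shape[0] // bins_per_block
    feats = []
    for b in range(n_blocks):
        block = log_power[b * bins_per_block:(b + 1) * bins_per_block]
        feats.append(dct_ii(block, coeffs_per_block))
    # Stack per-block coefficients into one feature vector per frame.
    return np.concatenate(feats, axis=0)


if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr // 2) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)  # synthetic test signal, not speech
    log_power = naive_cqt_power(tone, sr)
    features = cbc_like_features(log_power)
    print(log_power.shape, features.shape)
```

With these hypothetical defaults, each frame is described by 4 octave blocks of 4 DCT coefficients each. In practice, a dedicated CQT implementation would replace the brute-force loop, and the block size and coefficient count would follow the settings reported in the full article.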