Audio-Visual and Visual-Only Speech and Speaker Recognition: Issues about Theory, System Design, and Implementation

Derek J. Shiell, Louis H. Terry, Petar S. Aleksic, Aggelos K. Katsaggelos

Source Title: Visual Speech Recognition: Lip Segmentation and Mapping

ISBN13: 9781605661865|ISBN10: 1605661864|ISBN13 Softcover: 9781616925338|EISBN13: 9781605661872

DOI: 10.4018/978-1-60566-186-5.ch001

Cite Chapter

MLA

Shiell, Derek J., et al. "Audio-Visual and Visual-Only Speech and Speaker Recognition: Issues about Theory, System Design, and Implementation." Visual Speech Recognition: Lip Segmentation and Mapping, edited by Alan Wee-Chung Liew and Shilin Wang, IGI Global, 2009, pp. 1-38. https://doi.org/10.4018/978-1-60566-186-5.ch001

APA

Shiell, D. J., Terry, L. H., Aleksic, P. S., & Katsaggelos, A. K. (2009). Audio-Visual and Visual-Only Speech and Speaker Recognition: Issues about Theory, System Design, and Implementation. In A. Liew & S. Wang (Eds.), Visual Speech Recognition: Lip Segmentation and Mapping (pp. 1-38). IGI Global. https://doi.org/10.4018/978-1-60566-186-5.ch001

Chicago

Shiell, Derek J., et al. "Audio-Visual and Visual-Only Speech and Speaker Recognition: Issues about Theory, System Design, and Implementation." In Visual Speech Recognition: Lip Segmentation and Mapping, edited by Alan Wee-Chung Liew and Shilin Wang, 1-38. Hershey, PA: IGI Global, 2009. https://doi.org/10.4018/978-1-60566-186-5.ch001

Abstract

The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person’s voice and lip movement, with little user cooperation. These types of unobtrusive biometric systems are needed to promote widespread adoption of biometric technology in today’s society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, the authors discuss various open issues about system design and implementation, and present future research and development directions in this area.
