The performance of automatic speech recognition (ASR) systems degrades significantly in adverse environments because of ambient noise and reverberation. The problem is worse in hands-free speech applications, where the microphones can be placed far from the speaker of interest. Poor environmental robustness has become a major barrier that keeps ASR out of a wide range of applications, such as voice recognition in a car and voice-controlled hand-held devices.

In this research, the importance of phase in robust speech recognition is explored. First, the effect of phase uncertainty on the recognition accuracy of human listeners is investigated, with the goal of obtaining a quantitative measure of the importance of phase. The results show that the importance of phase varies with the signal-to-noise ratio (SNR): at low SNR, phase can have a significant impact on speech recognition accuracy.

Next, motivated by the importance of phase in multi-microphone signal processing, a phase-based dual-microphone noise masking approach is proposed for speech enhancement. Using the time delay of the speech source of interest to the two microphones and the actual phases of the signals recorded by both microphones, the algorithm filters the noise signal in the short-time Fourier transform domain. As a result, the noise components are distorted beyond recognition and speech recognition accuracy improves. The effectiveness of this approach is demonstrated through performance comparisons with alternative techniques.

Lastly, an automatic parameter estimation technique is developed to further optimize the filter's performance. The parameter of the phase-based dual-microphone filter is adjusted automatically at run time by performing likelihood calculations on the enhanced speech features using a prior speech model. Speech recognition tests show that this adaptive approach not only achieves better recognition accuracy but also improves the filter's robustness when time delay estimates are inaccurate.
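The abstract does not give the masking formula, so the following is only a minimal sketch of what a phase-based dual-microphone mask could look like for a single short-time Fourier transform frame. The function names, the sign convention for `delay` (the target reaches microphone 2 later than microphone 1 by `delay` seconds), the sharpness parameter `gamma`, and the gain form `1 / (1 + gamma * error**2)` are all assumptions made for illustration, not the thesis's stated method.

```python
import cmath
import math


def wrap_phase(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def phase_error_mask(x1, x2, omegas, delay, gamma=5.0):
    """Return one gain in (0, 1] per frequency bin of a single frame.

    x1, x2 : complex STFT coefficients from microphones 1 and 2
    omegas : angular frequency of each bin, in radians per second
    delay  : assumed extra travel time of the target to microphone 2 (seconds)
    gamma  : hypothetical sharpness parameter; larger values attenuate more
    """
    gains = []
    for a, b, w in zip(x1, x2, omegas):
        # Observed inter-microphone phase difference minus the difference
        # a source arriving with the assumed delay would produce.
        error = wrap_phase(cmath.phase(a) - cmath.phase(b) - w * delay)
        # Bins whose phase matches the target keep a gain near 1;
        # bins with a large phase error are attenuated.
        gains.append(1.0 / (1.0 + gamma * error * error))
    return gains


def apply_mask(x1, gains):
    """Scale the microphone-1 coefficients by the per-bin gains."""
    return [g * a for g, a in zip(gains, x1)]
```

In this sketch, `gamma` plays the role of the single filter parameter that an adaptive scheme could tune at run time; how the thesis actually parameterizes and adapts its filter is not stated in the abstract.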

Previews available in: English
Edition | Availability |
---|---|
1 | |
2. Phase-based speech processing. 2006, World Scientific, World Scientific Publishing Company. In English. ISBN 9812566120, 9789812566126 | |
3. Phase-Based Speech Processing. December 30, 2005, World Scientific Publishing Company. Hardcover. In English. ISBN 9812566120, 9789812566126 | |
4. Phase-Based Speech Processing. December 30, 2005, World Scientific Publishing Company. Paperback. In English. ISBN 9812566139, 9789812566133 | |
Book Details
Edition Notes
Source: Dissertation Abstracts International, Volume: 67-06, Section: B, page: 3354.
Advisor: Parham Aarabi.
Thesis (Ph.D.)--University of Toronto, 2006.
Includes bibliographic references.
Electronic version licensed for access by U. of T. users.
History
- Created October 23, 2008
- 2 revisions
- December 15, 2009: Edited by WorkBot (link works)
- October 23, 2008: Created by ImportBot (imported from University of Toronto MARC record)