Sunday, June 8, 2008



Objective speech quality measurement using statistical data mining
Source
EURASIP Journal on Applied Signal Processing, Volume 2005, Issue 1 (January 2005)
Pages: 1410 - 1424
Year of Publication: 2005
ISSN:1110-8657
Authors
Wei Zha
Power, Acquisition and Telemetry Group, Schlumberger Technology Corporation, Sugar Land, TX
Wai-Yip Chan
Department of Electrical & Computer Engineering, Queen's University, Kingston, ON, Canada
Publisher
Hindawi Publishing Corp. New York, NY, United States
ABSTRACT
Measuring speech quality by machine overcomes two major drawbacks of subjective listening tests: their low speed and high cost. Real-time, accurate, and economical objective measurement of speech quality opens up a wide range of applications that cannot be supported with subjective listening tests. In this paper, we propose a statistical data mining approach to design objective speech quality measurement algorithms. A large pool of perceptual distortion features is extracted from the speech signal. We examine using classification and regression trees (CART) and multivariate adaptive regression splines (MARS), separately and jointly, to select the most salient features from the pool, and to construct good estimators of subjective listening quality based on the selected features. We show designs that use perceptually significant features and outperform the state-of-the-art objective measurement algorithm. The designed algorithms are computationally simple, making them suitable for real-time implementation. The proposed design method is scalable with the amount of learning data; thus, performance can be improved with more offline or online training.
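To make the feature-selection idea in the abstract concrete, the following is a minimal sketch of the split criterion used by regression trees: for every candidate feature and threshold, compute how much the squared error of the quality scores drops when the samples are divided at that threshold, and keep the best one. This is not the authors' algorithm, only a toy illustration of one CART-style split. The function names `sse` and `best_split` are hypothetical, and the feature values and scores are made up.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(features, scores):
    """Return (feature_index, threshold, sse_reduction) for the best single split.

    features: list of rows, one row of distortion features per speech sample.
    scores:   one subjective quality score per row (for example, a MOS value).
    """
    total = sse(scores)
    best = (None, None, 0.0)
    for j in range(len(features[0])):
        column = sorted(set(row[j] for row in features))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(column, column[1:]):
            t = (lo + hi) / 2
            left = [s for row, s in zip(features, scores) if row[j] <= t]
            right = [s for row, s in zip(features, scores) if row[j] > t]
            gain = total - sse(left) - sse(right)
            if gain > best[2]:
                best = (j, t, gain)
    return best


# Toy data, invented for illustration: two features per sample, four samples.
features = [
    [0.1, 5.0],
    [0.2, 1.0],
    [0.8, 4.0],
    [0.9, 2.0],
]
scores = [4.5, 4.3, 2.1, 1.9]

index, threshold, gain = best_split(features, scores)
print(index, threshold, round(gain, 3))
```

In this toy data the first feature separates high scores from low scores cleanly, so it is the one the split selects; the second feature carries almost no information about the score. A full tree would apply the same search recursively to each side of the split.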
INDEX TERMS
Primary Classification: I. Computing Methodologies > I.2 ARTIFICIAL INTELLIGENCE > I.2.7 Natural Language Processing (Subjects: Speech recognition and synthesis)
Additional Classification:
H. Information Systems > H.2 DATABASE MANAGEMENT > H.2.8 Database applications (Subjects: Data mining)
H. Information Systems > H.5 INFORMATION INTERFACES AND PRESENTATION (I.7) > H.5.5 Sound and Music Computing (Subjects: Signal analysis, synthesis, and processing)
I. Computing Methodologies > I.5 PATTERN RECOGNITION > I.5.1 Models (Subjects: Statistical)
I. Computing Methodologies > I.5 PATTERN RECOGNITION > I.5.4 Applications (Subjects: Signal processing)
General Terms: Algorithms, Design, Measurement
Keywords: classification trees, data mining, mean opinion scores, regression, speech perception, speech quality
