This blog is a database of writing by 3rd-semester students.
Sunday, 7 January 2018
Article "On the impact of children's emotional speech on acoustic and language models" (4th)
Clarisa Livia
16611022
This article shows how difficult it is to automatically recognize children's speech, especially spontaneous and affective speech. Emphatic and Angry speech are recognized best, even better than Neutral speech, although the baseline ASR system is trained on Neutral speech only. The reason could be that emphatic speech, or the stern speech produced in a mild form of anger, is articulated clearly and that its acoustic realization is very similar to neutral speech. This does not apply to Motherese speech, which produces a high error rate.
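To make the idea of an "error rate" per emotional state concrete, here is a minimal sketch. It assumes the error rate is a word error rate (word-level edit distance divided by the number of reference words), which this summary does not state explicitly. The function names and the toy utterances are made up for illustration and are not data from the article.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_by_state(samples):
    """Return {state: word error rate} for (state, reference, hypothesis) triples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for state, reference, hypothesis in samples:
        ref, hyp = reference.split(), hypothesis.split()
        errors[state] += edit_distance(ref, hyp)
        words[state] += len(ref)
    return {s: errors[s] / words[s] for s in words if words[s] > 0}


# Hypothetical toy data, not taken from the article.
toy_samples = [
    ("Neutral", "go to the left", "go to the left"),
    ("Motherese", "good boy come", "good toy"),
]
rates = wer_by_state(toy_samples)
```

Pooling errors and reference words per state, rather than averaging per utterance, keeps short utterances from dominating the result.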
ASR performance is influenced by many factors. In this study, the authors tried to keep as many factors as possible constant. For all experiments, the speech recognizers were trained with the same vocabulary. However, the four subsets of the test sets differ in the size of the vocabulary actually used in the different emotional states. Since this is spontaneous speech, this factor cannot be controlled. It remains unclear how ASR performance is affected by these different vocabularies. Perhaps acoustically similar words occur more frequently in the Neutral state than in the other two states, Emphatic and Anger. Furthermore, the acoustic realization of Motherese in the training set seems to be too different from that in the test set, so the acoustic models cannot be adapted successfully.
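The point about vocabulary size can be checked with a few lines of code. The sketch below counts the distinct words actually used in each emotional state. It is only an illustration: the function name `vocabulary_by_state` and the toy transcripts are hypothetical and do not come from the article.

```python
def vocabulary_by_state(samples):
    """Return {state: set of distinct words} for (state, transcript) pairs."""
    vocab = {}
    for state, transcript in samples:
        vocab.setdefault(state, set()).update(transcript.lower().split())
    return vocab


# Hypothetical toy transcripts, not taken from the article.
toy_transcripts = [
    ("Neutral", "go to the left"),
    ("Neutral", "go to the right"),
    ("Angry", "stop stop"),
]
vocab = vocabulary_by_state(toy_transcripts)
sizes = {state: len(words) for state, words in vocab.items()}
```

Even when every recognizer shares one vocabulary, the words that speakers actually use in each state can differ a lot, which is the factor the authors say they could not control.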