<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Zerari, Naima</style></author><author><style face="normal" font="default" size="100%">Samir Abdelhamid</style></author><author><style face="normal" font="default" size="100%">Bouzgou, Hassen</style></author><author><style face="normal" font="default" size="100%">Raymond, Christian</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Bidirectional deep architecture for Arabic speech recognition</style></title><secondary-title><style face="normal" font="default" size="100%"> Open Computer Science</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2019</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://www.degruyter.com/document/doi/10.1515/comp-2019-0004/html</style></url></web-urls></urls><volume><style face="normal" font="default" size="100%">9</style></volume><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p style=&quot;text-align: justify;&quot;&gt;
	Nowadays, real-life constraints necessitate controlling modern machines using human intervention by means of sensorial organs. The voice is one of the human senses that can control/monitor modern interfaces. In this context, Automatic Speech Recognition is principally used to convert natural voice into computer text as well as to perform an action based on the instructions given by the human. In this paper, we propose a general framework for Arabic speech recognition that uses a Long Short-Term Memory (LSTM) network and a neural network classifier (Multi-Layer Perceptron: MLP) to cope with the non-uniform sequence length of the speech utterances produced by two feature extraction techniques: (1) Mel Frequency Cepstral Coefficients (MFCC), with static and dynamic features, and (2) Filter Bank (FB) coefficients. The neural architecture recognizes isolated Arabic speech via a classification technique. The proposed system involves, first, extracting pertinent features from the natural speech signal using MFCC (static and dynamic features) and FB. Next, the extracted features are padded in order to deal with the non-uniformity of the sequence lengths. Then, a deep recurrent architecture, either LSTM or GRU (Gated Recurrent Unit), is used to encode the sequences of MFCC/FB features as a fixed-size vector that is fed to a Multi-Layer Perceptron (MLP) network to perform the classification (recognition). The proposed system is assessed using two different databases: the first concerns spoken digit recognition, where a comparison with other related works in the literature is performed, whereas the second contains spoken TV commands. The obtained results show the superiority of the proposed approach.
&lt;/p&gt;
</style></abstract><issue><style face="normal" font="default" size="100%">1</style></issue></record></records></xml>