An Innovative Approach to Continuous Sign Language Recognition: Combining CNN and BILSTM Models with Iterative Training


Kondragunta Rama Krishnaiah, Alahari Hanumant Prasad

Abstract

In this research, we introduce an advanced framework for continuous sign language (SL) recognition using deep neural networks. Our primary goal is to accurately transcribe videos of SL sentences into sequences of ordered gloss labels. Traditional methods for continuous SL recognition have relied on hidden Markov models, which are limited in how effectively they capture temporal information. To overcome this, we propose a novel architecture that combines a deep convolutional neural network with stacked temporal fusion layers for feature extraction and a bi-directional recurrent neural network (BILSTM) for sequence learning. A key challenge is the limited size of available datasets, which prevents a purely end-to-end training approach from fully exploiting the capacity of such a complex deep network. To address this, we developed an iterative optimization process to train our CNN-BILSTM architecture effectively. First, gloss-level gestural supervision is obtained through forced alignment from the end-to-end system, and this alignment directly guides the training of the feature extractor. The BILSTM system is then fine-tuned on top of the improved feature extractor, which in turn yields a more refined alignment for the feature extraction module. Repeating these steps allows the CNN-BILSTM model to keep learning from progressively refined gestural alignments. To evaluate our framework, we employed the 'SignumDataset', which contains 24 different signs. Our proposed architecture demonstrated promising results on this dataset, showing the potential of deep neural networks in SL recognition.
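The abstract does not give implementation details, so the following is a minimal PyTorch sketch of how such a CNN-BILSTM recognizer could be structured: a per-frame 2D CNN, stacked temporal (1D) convolutions acting as the temporal fusion layers, a bi-directional LSTM, and a gloss classifier. The class names (FrameCNN, TemporalFusion, CNNBiLSTM), layer sizes, and the specific choice of 2D-then-1D convolutions are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a CNN-BILSTM recognizer for continuous SL videos.
# All layer sizes and module names are illustrative assumptions.
import torch
import torch.nn as nn

class FrameCNN(nn.Module):
    """Per-frame 2D CNN feature extractor (layer sizes are hypothetical)."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(256, feat_dim)

    def forward(self, frames):                      # frames: (B*T, 3, H, W)
        return self.proj(self.backbone(frames).flatten(1))

class TemporalFusion(nn.Module):
    """Stacked temporal (1D) convolutions that fuse neighbouring frames."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(feat_dim, feat_dim, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )

    def forward(self, feats):                       # feats: (B, T, feat_dim)
        return self.fuse(feats.transpose(1, 2)).transpose(1, 2)  # (B, T//4, feat_dim)

class CNNBiLSTM(nn.Module):
    """CNN + temporal fusion front end followed by a BiLSTM gloss classifier."""
    def __init__(self, num_glosses, feat_dim=512, hidden=256):
        super().__init__()
        self.cnn = FrameCNN(feat_dim)
        self.fusion = TemporalFusion(feat_dim)
        self.bilstm = nn.LSTM(feat_dim, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_glosses)

    def forward(self, video):                       # video: (B, T, 3, H, W)
        b, t = video.shape[:2]
        feats = self.cnn(video.flatten(0, 1)).view(b, t, -1)
        feats = self.fusion(feats)
        out, _ = self.bilstm(feats)
        return self.head(out)                       # (B, T', num_glosses) gloss scores
```

The iterative optimization could then be organized along the lines below: each round re-computes a forced alignment with the current model, supervises the feature extractor at the gloss level using that alignment, and fine-tunes the BiLSTM on the improved features. The forced_align routine, the dataset interface, and the cross-entropy losses used at both stages are assumptions made purely for illustration and may differ from the authors' training setup.

```python
# Hypothetical sketch of the iterative training strategy.
# `forced_align`, the dataset format, and the per-stage losses are assumptions.
import torch
import torch.nn as nn
from torch.optim import Adam

def train_iteratively(model, dataset, forced_align, num_glosses,
                      rounds=3, epochs=5, lr=1e-4):
    # Auxiliary frame-level classifier used only to supervise the feature extractor.
    aux_head = nn.Linear(512, num_glosses)  # 512 matches the CNN feat_dim above

    for _ in range(rounds):
        # 1) Forced alignment: the current end-to-end model assigns one gloss
        #    label to every fused time step of every training video.
        alignments = [forced_align(model, video, glosses)
                      for video, glosses in dataset]

        # 2) Gloss-level supervision of the feature extractor (CNN + temporal fusion).
        feat_opt = Adam(list(model.cnn.parameters()) +
                        list(model.fusion.parameters()) +
                        list(aux_head.parameters()), lr=lr)
        for _ in range(epochs):
            for (video, _), frame_labels in zip(dataset, alignments):
                t = video.shape[0]                              # video: (T, 3, H, W)
                feats = model.fusion(model.cnn(video).view(1, t, -1))
                loss = nn.functional.cross_entropy(
                    aux_head(feats).flatten(0, 1), frame_labels)
                feat_opt.zero_grad(); loss.backward(); feat_opt.step()

        # 3) Fine-tune the BiLSTM (and output head) on the improved features.
        seq_opt = Adam(list(model.bilstm.parameters()) +
                       list(model.head.parameters()), lr=lr)
        for _ in range(epochs):
            for (video, _), frame_labels in zip(dataset, alignments):
                logits = model(video.unsqueeze(0))              # (1, T', num_glosses)
                loss = nn.functional.cross_entropy(
                    logits.flatten(0, 1), frame_labels)
                seq_opt.zero_grad(); loss.backward(); seq_opt.step()
    return model
```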



How to Cite
Kondragunta Rama Krishnaiah, Alahari Hanumant Prasad. (2023). An Innovative Approach to Continuous Sign Language Recognition: Combining CNN and BILSTM Models with Iterative Training. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 14(1), 321–333. https://doi.org/10.17762/turcomat.v14i1.14004