AN ENSEMBLE LEARNING FRAMEWORK FOR SIGN LANGUAGE TRANSLATION USING BIDIRECTIONAL CNNS AND TRANSFORMERS

Authors

  • Anigbogu Kenechukwu S. Department of Computer Science, Nnamdi Azikiwe University, Awka.
  • Okafor Paul C. Department of Computer Science, Nnamdi Azikiwe University, Awka.
  • Nwankpa Joshua M. Department of Computer Science, Nnamdi Azikiwe University, Awka.
  • Asogwa Emmanuel C. Department of Computer Science, Nnamdi Azikiwe University, Awka.

Keywords:

American Sign Language (ASL), Bura Sign Language (BSL), Yoruba Sign Language (YSL), Hausa Sign Language, and Adamorobe Sign Language (Ghana).

Abstract

Effective communication is essential for human interaction, yet individuals with hearing or speech impairments often face significant barriers. Recognizing and translating sign language in real time can bridge the communication gap between people who rely on sign language and those who do not know it. This work examines sign language conventions used in Nigeria and beyond, including their distinct phonetic and semantic structures and the messages they convey. The languages covered are American Sign Language (ASL), Bura Sign Language (BSL), Yoruba Sign Language (YSL), Hausa Sign Language, and Adamorobe Sign Language (Ghana). The system was designed with an object-oriented programming (OOP) methodology. The backend was implemented in Python with the TensorFlow, Pandas, and NumPy libraries, and the user interface was developed with React and Node.js. The system implements an ensemble learning approach to bidirectional sign language translation that combines Convolutional Neural Networks (CNNs) with Bidirectional Encoder Representations from Transformers (BERT). These architectures were combined in a forward and backward encoder mechanism to translate general sign languages, with the aim of both high performance and robustness. Community data gathering, validation, refinement, and analysis techniques were used to build a reliable and diverse dataset. The model achieved an accuracy of 98.7%, a precision of 97.6%, and an F1 score of 98.2%. It was evaluated on ASL datasets (both images and videos) and with community feedback from language experts.
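For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed for a multi-class sign classifier. The abstract does not state which averaging scheme was used, so macro-averaging is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical illustrations rather than the authors' evaluation code.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro-averaging is an assumption: each class contributes equally,
    regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class received a wrong sample
            fn[truth] += 1  # true class missed one of its samples

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    # Toy labels standing in for recognized signs (hypothetical data).
    truth = ["A", "A", "B", "B", "C", "C"]
    predicted = ["A", "B", "B", "B", "C", "A"]
    acc, prec, rec, f1 = macro_metrics(truth, predicted)
    print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, reporting precision and F1 together also constrains recall, which is why the sketch returns all four values.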

Published

2026-03-31