Sign Language Translator AI (SLTA)

khasyahfr
Project Owner

Expert Rating

n/a
  • Proposal for BGI Nexus 1
  • Funding Request $50,000 USD
  • Funding Pools Beneficial AI Solutions
  • Total 4 Milestones

Overview

Sign Language Translator AI (SLTA) is an AI service designed to automatically translate sign language from video files into text. Leveraging computer vision and machine learning, SLTA will interpret the hand gestures in sign language videos and provide real-time translations. The service aims to break down communication barriers for millions of deaf and hard-of-hearing individuals by making sign language understandable to non-signers. SLTA's inclusion in the SingularityNET marketplace will enrich the ecosystem with a socially impactful AI solution, promoting inclusivity and accessibility in line with BGI's commitment to social good.

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

SLTA aligns with the BGI mission by addressing a pressing social need: the communication barriers faced by the deaf and hard-of-hearing community. By enabling real-time sign language translation, SLTA empowers individuals who rely on sign language to engage fully with non-signers. SLTA promotes inclusivity and offers equal opportunities for millions of people, and its adoption through SingularityNET enhances accessibility, in line with BGI's commitment to social good.

Our Team

We are well-equipped to execute this project. Khasyah, a software engineer with previous experience as a funded proposer in Deep Funding Round 4, brings project management and technical skills. Pandu, a seasoned data scientist and machine learning engineer, brings hands-on expertise in building AI models. We have already identified the key dataset and initial model architecture, giving us a clear path forward. With a lean yet capable team, we can rapidly iterate and deliver the project efficiently.

AI services (New or Existing)

Sign Language to Text Translator

Type

New AI service

Purpose

To automatically translate sign language gestures from video files into written text, providing real-time communication assistance for non-signers and promoting accessibility for the deaf and hard-of-hearing community.

AI inputs

Video file containing sign language gestures.

AI outputs

Text output representing the translated sign language in a readable format.

The core problem we are aiming to solve

The core problem we aim to solve is the communication barrier between deaf individuals who use sign language and the broader population. Today, sign language translation relies heavily on human interpreters, who are not always available, practical, or affordable. This creates a communication gap that limits access to essential services, professional opportunities, and educational resources for the estimated 72 million deaf people worldwide who use sign language. By automating the translation of sign language into text, we address this issue head-on with a scalable, practical solution that enhances accessibility and promotes inclusion across sectors.

Our specific solution to this problem

Our solution addresses the communication gap by automatically translating sign language into text through advanced computer vision and machine learning techniques. First, we are constructing a diverse dataset of sign language gestures, encompassing various demographics and environmental conditions, which will form the foundation of our system. This data is then processed to extract key hand landmarks using techniques like MediaPipe, capturing the critical hand movements needed for accurate interpretation. The core of our system will be a sequence-based deep learning model, such as an LSTM or transformer, designed to handle the temporal nature of sign language. By training the model on this structured data and optimizing it for real-world scenarios, we ensure that the AI can effectively interpret a wide range of signing styles and conditions. The result is a scalable, reliable AI service on SingularityNET that provides accurate sign language translations in real time, making communication between signers and non-signers more accessible.

Given the rapid advancements in AI and machine learning, we may adapt our model or implementation approach over time, while ensuring that the SLTA consistently delivers high-quality and accurate translations.
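
To make the intended flow concrete, the following minimal sketch shows the inference path from video file to landmarks to sequence model to text. It is illustrative only: extract_landmarks and model stand in for the Milestone 2 and Milestone 3 components, and the actual implementation may differ as noted above.

```python
import cv2
import numpy as np

def translate(video_path: str, extract_landmarks, model) -> str:
    """Illustrative flow: video file -> hand landmarks -> sequence model -> text."""
    cap = cv2.VideoCapture(video_path)
    sequence = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        landmarks = extract_landmarks(rgb)        # e.g. MediaPipe hand key points (Milestone 2)
        if landmarks is not None:
            sequence.append(landmarks.flatten())
    cap.release()
    if not sequence:
        return ""                                 # no hands detected in the clip
    return model.predict(np.stack(sequence))      # LSTM/transformer decoder (Milestone 3)
```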

Open Source Licensing

Apache License

Proposal Video

Placeholder for Spotlight Day pitch presentations. Videos will be added by the DF team when available.

  • Total Milestones

    4

  • Total Budget

    $50,000 USD

  • Last Updated

    24 Feb 2025

Milestone 1 - Dataset Preparation

Description

This milestone focuses on building a comprehensive sign language dataset that captures the diversity and complexity of sign language gestures. The work encompasses gathering data from multiple sources and ensuring sufficient variety in terms of demographics, environmental conditions, and signing styles. This foundation is crucial for developing a robust and inclusive sign language recognition system.

Deliverables

- Collection and aggregation of sign language datasets from various sources (e.g. the Kaggle ASL alphabet dataset) to ensure diverse representation across races, genders, lighting conditions, and other variations
- Verification of proper consent and ethical data collection for all sign language video datasets, ensuring diverse representation while protecting participant privacy
- Implementation of data augmentation techniques to expand the dataset through controlled variations such as slight rotations, scale adjustments, flipping, and noise addition to enhance model robustness (a sketch of this step follows this list)
- Creation of a supplementary dataset through custom recording sessions if external datasets prove insufficient for comprehensive coverage
- Organization of the complete dataset into structured training (70%), validation (15%), and test (15%) sets, ensuring balanced representation across all sign categories
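
As a minimal sketch of the augmentation step, assuming OpenCV and NumPy; the parameter ranges below are illustrative, not final design choices:

```python
import cv2
import numpy as np

def augment_frame(frame: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply random, label-preserving variations to one video frame."""
    h, w = frame.shape[:2]

    # Slight rotation and scale jitter around the frame center.
    angle = rng.uniform(-10, 10)          # degrees
    scale = rng.uniform(0.9, 1.1)
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    out = cv2.warpAffine(frame, m, (w, h), borderMode=cv2.BORDER_REFLECT)

    # Horizontal flip half the time (note: flipping can alter the meaning of
    # some signs, so in practice it must be applied selectively per sign).
    if rng.random() < 0.5:
        out = cv2.flip(out, 1)

    # Additive Gaussian noise to simulate sensor and lighting variation.
    noise = rng.normal(0, 5, out.shape).astype(np.float32)
    return np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```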

Budget

$15,000 USD

Success Criterion

- Comprehensive labeled dataset with clear gloss annotations for each sample, organized in a structured folder hierarchy
- Balanced representation across different demographic groups and environmental conditions

Milestone 2 - Preprocessing

Description

This milestone addresses the transformation of raw sign language video data into a structured format suitable for machine learning. The focus is on extracting meaningful spatial and temporal features from the sign language gestures while ensuring consistency and standardization across all samples, creating a foundation for effective model training.

Deliverables

- Extraction of hand landmarks from video frames using MediaPipe, capturing comprehensive spatial information including key points for the fingers, palm, and other critical hand components (see the sketch after this list)
- Implementation of coordinate normalization to create a uniform reference system with palm-centric coordinates, where the palm center serves as the origin (0, 0), using MediaPipe for consistent processing
- Processing of dynamic signs into standardized sequence lengths to capture temporal transitions between hand movements, with flexibility to adjust for sign complexity and speed
- Creation of structured data formats suitable for model input, including sequential storage of landmark coordinates (x, y, z) for each frame in accessible formats such as NumPy arrays or CSV files
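
A minimal sketch of landmark extraction and palm-centric normalization, assuming the MediaPipe Hands Python API; approximating the palm center as the mean of the wrist and finger MCP joints is our assumption here, and the exact anchoring may change during implementation:

```python
from typing import Optional

import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

def extract_normalized_landmarks(rgb_frame: np.ndarray, hands) -> Optional[np.ndarray]:
    """Return a (21, 3) array of palm-centered hand landmarks, or None if no hand is found."""
    results = hands.process(rgb_frame)  # MediaPipe expects an RGB frame
    if not results.multi_hand_landmarks:
        return None
    landmarks = results.multi_hand_landmarks[0].landmark  # 21 hand key points
    pts = np.array([[p.x, p.y, p.z] for p in landmarks], dtype=np.float32)
    # Palm-centric normalization (assumption: palm center approximated as the
    # mean of the wrist and the four finger MCP joints, indices 0, 5, 9, 13, 17).
    palm_center = pts[[0, 5, 9, 13, 17]].mean(axis=0)
    return pts - palm_center  # the palm center becomes the origin

# Usage: process all frames with one Hands instance for consistent tracking.
# with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
#     lm = extract_normalized_landmarks(frame_rgb, hands)
```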

Budget

$20,000 USD

Success Criterion

- Complete extraction of landmarks for all video frames, with proper CSV file storage
- Verified normalization of landmark coordinates using palm-centered anchoring
- Ready-to-use dataset format suitable for model training

Milestone 3 - Training

Description

This milestone encompasses the development and optimization of the sign language recognition model. The focus is on implementing and training a sequence-based deep learning architecture capable of accurately recognizing continuous sign language gestures, with particular attention to handling the temporal aspects of signing.

Deliverables

- Selection and implementation of an appropriate sequence-based model architecture, evaluating options including Long Short-Term Memory (LSTM) RNNs and transformer models against project requirements
- Implementation of the Connectionist Temporal Classification (CTC) loss function to handle variable sequence lengths and misalignments between input video frames and output glosses/words (a training sketch follows this list)
- Comprehensive hyperparameter optimization through grid search or random search, covering learning rate, batch size, sequence length, and model architecture parameters
- Implementation of the Word Error Rate (WER) evaluation metric to assess sequence-level prediction accuracy rather than frame-level performance
- Monitoring of training progress to maintain similar performance levels across demographic categories
- Development of model saving and loading functionality, with verification of successful model restoration and inference
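
A minimal sketch of how CTC training could look, assuming PyTorch; the architecture, vocabulary size, and tensor dimensions are placeholders rather than final design decisions:

```python
import torch
import torch.nn as nn

NUM_GLOSSES = 100            # illustrative gloss vocabulary size, excluding the blank
FEATURES = 21 * 3            # 21 hand landmarks x (x, y, z) per frame

class SignLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(FEATURES, 128, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * 128, NUM_GLOSSES + 1)  # +1 for the CTC blank

    def forward(self, x):                      # x: (batch, time, FEATURES)
        out, _ = self.lstm(x)
        return self.head(out).log_softmax(-1)  # (batch, time, classes)

model = SignLSTM()
ctc = nn.CTCLoss(blank=NUM_GLOSSES)            # last class index is the blank

x = torch.randn(4, 60, FEATURES)               # 4 clips, 60 frames each
targets = torch.randint(0, NUM_GLOSSES, (4, 5))  # 5 glosses per clip
log_probs = model(x).permute(1, 0, 2)          # CTCLoss expects (time, batch, classes)
loss = ctc(log_probs, targets,
           input_lengths=torch.full((4,), 60),
           target_lengths=torch.full((4,), 5))
loss.backward()
```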

Budget

$10,000 USD

Success Criterion

- Detailed model performance report with comprehensive metrics
- Verified model saving and loading functionality with successful inference testing

Milestone 4 - Safety Review and Platform Deployment

Description

This milestone focuses on making the trained sign language recognition model accessible and operational within the SingularityNET ecosystem while ensuring that safety, privacy, and ethical considerations are rigorously addressed. We will ensure no user data is stored or retained during the translation process.

Deliverables

- Full integration of privacy measures so that no user data is stored or retained: all interactions with the service are processed in real time without keeping any sensitive or personal information (a sketch of this stateless design follows this list)
- Assurance that no personally identifiable information (PII) is captured, stored, or misused
- Inclusivity ensured through a diverse dataset representing various groups, avoiding biases in the sign language translation and providing equitable access and accurate translations for all users
- Complete service configuration setup and deployment of the sign language recognition service within the SingularityNET platform
- Publication of the service to the SingularityNET marketplace, ensuring proper accessibility for all users
- Execution of comprehensive testing, including functionality verification, performance validation, reliability testing, and integration testing with other platform services to ensure robust operation
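
The stateless, no-retention design can be summarized in a short sketch. All names here, including decode_frames and model.predict, are illustrative placeholders; the deployed service will sit behind the standard SingularityNET service interface:

```python
from typing import Callable, Iterable, Optional

import numpy as np

def translate_video(
    video_bytes: bytes,
    decode_frames: Callable[[bytes], Iterable[np.ndarray]],   # hypothetical in-memory decoder
    extract_landmarks: Callable[[np.ndarray], Optional[np.ndarray]],
    model,
) -> str:
    """Translate one clip entirely in memory.

    The request payload is processed transiently: no frames, landmarks, or
    translations are written to disk or logged, so nothing about the user
    persists after this function returns.
    """
    sequence = []
    for frame in decode_frames(video_bytes):
        landmarks = extract_landmarks(frame)
        if landmarks is not None:
            sequence.append(landmarks.flatten())
    if not sequence:
        return ""
    return model.predict(np.stack(sequence))  # placeholder inference call
```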

Budget

$5,000 USD

Success Criterion

- Successful deployment and accessibility verification on SingularityNET
- Complete resolution of all testing-identified issues
- Smooth operation within the SingularityNET environment
