Think Inclusion Conferencing App

Remo Start
Project Owner


Expert Rating

n/a
  • Proposal for: BGI Nexus 1
  • Funding Request: $45,000 USD
  • Funding Pools: Beneficial AI Solutions
  • Total Milestones: 4

Overview

Understanding the need for inclusive and accessible spaces, institutions create accessible pathways for the mobility impaired among us, sign language interpreters for the hearing impaired among us, and a host of other structures to accommodate these important members of our society. However, this seems to have become a lost art as we moved into the digital space. We have unconsciously neglected some demographics among us; we do not build technologies with them actively in mind, and the visually, hearing, and speech impaired are the ones most affected. The Think Inclusion project creates AI tools that help these specific demographics with access, with an AI meeting app being the focus of this specific proposal.

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

One of the goals of BGI Nexus is to develop a platform that enables anyone, anywhere, to determine what a beneficial and inclusive future for humankind and the earth should be. This aligns directly with the Think Inclusion Conferencing app, because it seeks to bring a marginalised group of people, in this case those with hearing and speech impairments, into the conventional digital space, creating a future of mutually shared digital space for them.

Our Team

Ubio Obu: Team Lead: CEO of Remostart and an AI engineer with several publications in journals such as IEEE and the American Institute of Physics.

Ediyangha Ottoho: Full Stack Developer: More than 10 years of development experience; has worked on several top projects, some of which have grown into unicorns.

Daniel Effiom: Project and Ethics Manager: Chief Operating Officer at Remostart; places a high value on ensuring projects are executed on time and to a high standard.

AI services (New or Existing)

Speech to text to Sign Language Model

Type

New AI service

Purpose

Converts spoken words into text, then into sign language

AI inputs

Speech

AI outputs

Sign Language

Sign Language to Speech Model

Type

New AI service

Purpose

Converts sign language into spoken words

AI inputs

Sign language

AI outputs

Speech
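
Sketched below is one way the two services above could be exposed as callable interfaces. The function names, the dataclass, and the default parameters are illustrative assumptions rather than a finalized API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SignFrame:
    """One frame of recognized or rendered sign language (e.g. avatar pose keypoints)."""
    timestamp_ms: int
    keypoints: List[float]  # flattened hand/face landmark coordinates


def speech_to_sign(audio_wav: bytes, sign_language: str = "ASL") -> List[SignFrame]:
    """Hypothetical service: transcribe speech to text, then map the text to sign frames."""
    raise NotImplementedError


def sign_to_speech(frames: List[SignFrame], voice_id: str = "default") -> bytes:
    """Hypothetical service: recognize sign frames as text, then synthesize speech audio."""
    raise NotImplementedError
```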

Company Name (if applicable)

Remostart

The core problem we are aiming to solve

The core problem we aim to solve is the digital exclusion of over 450 million people with speech impairments, as highlighted by WHO. While physical spaces have become more inclusive, digital platforms remain largely inaccessible, limiting participation in virtual interactions. Existing assistive tools are often prohibitively expensive, creating a significant barrier for millions who wish to be included and have their voices heard. This lack of affordable, inclusive technology deprives them of equal opportunities in communication, collaboration, and decision-making.

Our specific solution to this problem

Our solution is an AI-powered meeting app that bridges the communication gap for speech-impaired individuals by integrating a real-time sign language translator. The system converts spoken conversations into sign language, allowing the speech-impaired user to follow discussions effortlessly. In return, the user's sign language responses are translated into synthesized speech, enabling seamless participation as if they were speaking audibly.

Key Features include:

  • Interruptions & Turn-Taking: A gesture-based or on-screen prompt system ensures speech-impaired users can indicate when they want to contribute.
  • Voice Representation: Users can select or customize an AI-generated voice that aligns with their identity.
  • Multi-Mode Accessibility: The app supports text-based inputs for those preferring typing and integrates haptic feedback for additional engagement.
  • Affordability & Scalability: Leveraging AI and open-source technologies keeps the solution cost-effective, ensuring wider adoption.

Project details

Solution Overview

The Think Inclusion Meeting App is an innovative AI-powered communication platform designed to bridge the accessibility gap for individuals with speech impairments. It ensures full participation in virtual meetings by integrating a real-time sign language translator that converts spoken conversations into sign language and, in return, translates sign language responses into natural-sounding synthesized speech. This enables speech-impaired individuals to engage in discussions as seamlessly as their hearing and speaking counterparts.

The system leverages computer vision, NLP, and speech synthesis to interpret gestures, process spoken language, and generate a smooth and natural audio output. It ensures that users can interrupt, respond, and contribute in meetings without being sidelined. Additionally, the tool provides multiple accessibility modes, including text-based input, haptic feedback, and voice customization, allowing users to personalize their communication experience.

Technical Architecture

The solution consists of multiple integrated components that work together to ensure a fluid, real-time experience:

1. Input Layer

  • Spoken Conversations: Captured via the app’s audio input, transcribed using an automatic speech recognition (ASR) model like Whisper or DeepSpeech.
  • Sign Language Interpretation: The system uses real-time video analysis powered by computer vision models like OpenPose or MediaPipe to track hand movements and facial expressions. These inputs are processed using a sign language recognition model trained on datasets like ASLLVD (American Sign Language Lexicon Video Dataset) and PHOENIX-2014T (a minimal input-layer sketch follows this list).
  • Text Input: Users can type responses if they prefer a non-gesture-based approach.
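
Below is a minimal sketch of this input layer, assuming the openai-whisper package for transcription and MediaPipe Hands for landmark tracking, as named above; the file names and model size are illustrative placeholders.

```python
import cv2
import mediapipe as mp
import whisper

# Speech input: transcribe meeting audio with Whisper (ASR).
asr_model = whisper.load_model("base")
transcript = asr_model.transcribe("meeting_audio.wav")["text"]

# Sign input: track hand landmarks frame by frame with MediaPipe Hands.
hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
cap = cv2.VideoCapture("signer_webcam.mp4")
landmark_sequence = []
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Flatten the (x, y, z) coordinates of every detected landmark in this frame.
        landmark_sequence.append([
            coord
            for hand in results.multi_hand_landmarks
            for lm in hand.landmark
            for coord in (lm.x, lm.y, lm.z)
        ])
cap.release()

# `transcript` feeds the speech-to-sign pipeline; `landmark_sequence` feeds sign recognition.
```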

2. Processing Layer

  • Natural Language Processing (NLP): Used to structure and interpret transcribed speech and sign language responses.
  • Gesture-to-Text Translation: The AI model maps hand and facial movements to a structured text representation of sign language (see the sketch after this list).
  • Response Handling: Users can signal interruptions using predefined gestures or on-screen prompts, ensuring active participation.
  • Voice Personalization: A text-to-speech (TTS) module, using models like Tacotron 2 or VITS, converts structured sign-language-translated text into a natural voice output.
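
As one illustration of the gesture-to-text step, the sketch below frames sign recognition as sequence labelling over the per-frame landmark vectors produced by the input layer. The GRU architecture, dimensions, and gloss vocabulary size are assumptions for illustration, not the finalized model.

```python
import torch
import torch.nn as nn


class SignToTextModel(nn.Module):
    """Illustrative recognizer: a GRU over per-frame landmark vectors, decoded to gloss tokens."""

    def __init__(self, landmark_dim: int = 126, hidden_dim: int = 256, vocab_size: int = 1200):
        super().__init__()
        self.encoder = nn.GRU(landmark_dim, hidden_dim, num_layers=2, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, vocab_size)  # per-frame gloss logits

    def forward(self, landmark_sequence: torch.Tensor) -> torch.Tensor:
        # landmark_sequence: (batch, frames, landmark_dim)
        encoded, _ = self.encoder(landmark_sequence)
        return self.classifier(encoded)  # (batch, frames, vocab_size), e.g. for CTC decoding


# Example: one 90-frame clip of two-hand landmarks (21 landmarks * 3 coords * 2 hands = 126 dims).
model = SignToTextModel()
logits = model(torch.randn(1, 90, 126))
print(logits.shape)  # torch.Size([1, 90, 1200])
```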

3. Output Layer

  • Sign Language Visualization: The system renders a sign language avatar for hearing users who prefer to receive signed translations.
  • AI-Generated Speech Output: The user’s translated text is synthesized into speech and broadcast in the meeting (a synthesis sketch follows this list).
  • Haptic Feedback: Tactile notifications assist speech-impaired users in staying engaged.
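
For the speech output step, the sketch below assumes the open-source Coqui TTS library as one possible stand-in for the Tacotron 2 / VITS models mentioned earlier; the model name and file paths are illustrative.

```python
from TTS.api import TTS  # Coqui TTS: one open-source option for Tacotron 2 / VITS style synthesis

# Load a pretrained English model and synthesize the signer's translated text to a wav file.
tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")
tts.tts_to_file(
    text="I would like to add a point to the current agenda item.",
    file_path="signer_response.wav",
)
# The resulting audio can then be injected into the meeting's outgoing audio stream.
```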

4. Cloud & Edge AI Infrastructure

  • Cloud-Based AI Models: Speech-to-text, sign recognition, and text-to-speech are deployed on cloud inference servers for efficient processing (a sketch of one such endpoint follows this list).
  • On-Device Computation: Edge AI models optimize low-latency processing on mobile devices for responsive real-time interactions.
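
A minimal sketch of one cloud inference endpoint, assuming FastAPI; the route, payload shape, and the stubbed model call are illustrative assumptions.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class SignRecognitionRequest(BaseModel):
    # Per-frame flattened landmark vectors produced on-device by the edge tracker.
    landmark_sequence: list[list[float]]


class SignRecognitionResponse(BaseModel):
    text: str


def run_sign_model(landmark_sequence: list[list[float]]) -> str:
    """Stub standing in for the cloud-hosted sign recognition model."""
    return "hello everyone"


@app.post("/recognize-sign", response_model=SignRecognitionResponse)
def recognize_sign(request: SignRecognitionRequest) -> SignRecognitionResponse:
    # Run the (stubbed) recognition model on the uploaded landmark sequence.
    return SignRecognitionResponse(text=run_sign_model(request.landmark_sequence))
```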

How It Works

  1. A spoken conversation begins. The app's ASR model transcribes speech in real time.
  2. Sign language translation initiates. The AI model converts the transcript into animated sign language, displayed on-screen for the speech-impaired participant.
  3. Speech-impaired users respond using sign language. The AI recognizes and translates these signs into structured text.
  4. Text is converted into speech. The system synthesizes the response into natural-sounding audio, allowing the speech-impaired individual to contribute seamlessly.
  5. Users can signal to speak. A gesture-based system or on-screen button allows them to indicate when they wish to participate.
  6. A fully inclusive meeting unfolds. The AI ensures all users, regardless of impairment, communicate in a natural and synchronized manner (a sketch of this loop follows).
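
The sketch below ties these six steps into one illustrative meeting turn; every helper is a hypothetical stub standing in for the components described above.

```python
# Hypothetical stubs standing in for the real components described in the steps above.
def transcribe_speech(audio_chunk): return "welcome to the meeting"
def render_sign_avatar(text): return [f"<frame for '{w}'>" for w in text.split()]
def display_to_signer(frames): print("showing", len(frames), "avatar frames")
def recognize_sign(frames): return "thank you" if frames else ""
def synthesize_speech(text, voice): return b"..."  # synthesized audio bytes
def broadcast_to_meeting(audio): print("broadcasting", len(audio), "bytes of audio")
def detect_interrupt_gesture(frames): return False
def notify_meeting_of_turn_request(): print("signer requested the floor")


def run_inclusive_meeting_turn(audio_chunk, signer_video_frames):
    """One illustrative turn of the conferencing loop (steps 1 to 6 above)."""
    transcript = transcribe_speech(audio_chunk)                    # step 1: ASR
    display_to_signer(render_sign_avatar(transcript))              # step 2: speech -> sign
    signed_text = recognize_sign(signer_video_frames)              # step 3: sign -> text
    if signed_text:
        broadcast_to_meeting(synthesize_speech(signed_text, voice="user_selected"))  # step 4
    if detect_interrupt_gesture(signer_video_frames):              # step 5: turn-taking
        notify_meeting_of_turn_request()


run_inclusive_meeting_turn(audio_chunk=b"pcm-audio", signer_video_frames=["frame1", "frame2"])
```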

Uniqueness of the Solution

  • Bidirectional Accessibility: Most tools cater to one-way accessibility (e.g., speech-to-text), whereas this app ensures two-way communication by translating speech to sign language and sign language to speech.
  • Customizable AI Voice Representation: Unlike traditional systems, users can select a preferred AI-generated voice for natural, personalized communication.
  • Interrupt Mechanism for Equal Participation: A robust gesture-based interruption system ensures speech-impaired individuals can engage in discussions in real time.
  • Affordable & Scalable: Many existing assistive tools are prohibitively expensive. By leveraging AI-driven, open-source components, this tool remains cost-effective and widely accessible.

Social Impact

  1. Workplace Inclusion: Enables professionals with speech impairments to actively participate in meetings, fostering diverse and inclusive workplaces.
  2. Education & Learning: Improves accessibility in online classrooms, ensuring equitable learning opportunities.
  3. Telemedicine & Healthcare: Allows patients with speech impairments to communicate effectively with doctors in virtual consultations.
  4. Public & Civic Engagement: Promotes inclusivity in government meetings, public discussions, and community events, ensuring no voice goes unheard.
  5. Remote Work & Digital Communication: Bridges the gap for speech-impaired individuals in remote work settings, allowing them to engage in virtual teams without barriers.

Ethical Considerations

  • Bias in AI Training: Ensuring sign language models are trained on diverse datasets covering multiple sign languages (ASL, BSL, etc.) to prevent biases and misinterpretations.
  • User Data Privacy: Implementing strict data encryption and edge AI processing to minimize storage and sharing of sensitive communication data.
  • Cultural Sensitivity in Sign Language: Sign languages differ across regions, so the system must adapt dynamically based on the user's geographic and linguistic context.
  • Consent-Based AI Interactions: Users must have full control over their data, AI-generated voice selection, and system preferences.
  • Non-Disruptive Integration: The system should function as an assistive layer within existing meeting platforms (Zoom, Google Meet, Microsoft Teams), ensuring easy adoption.

Existing resources

We have a pool of more than 5,000 tech talents on our platform who will act as beta testers for our solution, helping us improve it.

Open Source Licensing

MIT - Massachusetts Institute of Technology License


Proposal Video

Placeholder for Spotlight Day pitch presentations. Videos will be added by the DF team when available.

  • Total Milestones

    4

  • Total Budget

    $45,000 USD

  • Last Updated

    24 Feb 2025

Milestone 1 - Architecture Design

Description

Design the full project architecture document and the expected roadmap.

Deliverables

A shareable PDF document for the project containing a clear architectural diagram.

Budget

$4,000 USD

Success Criterion

The document contains a diagram of the architecture and an explanation of it.

Milestone 2 - Speech to Sign Language Model Development

Description

Build or fine-tune a model that specifically performs speech-to-sign-language translation in a sign language of choice.

Deliverables

A link to the model in a Git repository and a video demo of it working.

Budget

$15,000 USD

Success Criterion

The link is accessible and the video demonstrates how the model works.

Milestone 3 - Sign Language to speech model

Description

Build or fine-tune a model that specifically performs sign-language-to-speech translation in a sign language of choice.

Deliverables

A link to the model in a Git repository and a video demo of it working.

Budget

$15,000 USD

Success Criterion

The link is accessible and the video demonstrates how the model works.

Milestone 4 - MVP Integration into an Interface

Description

Make a usable MVP that captures the most important features of the solution

Deliverables

Create an interface that allows the models to be used in MVP mode, integrate the models into the interface, and make a video showing how to use the solution.

Budget

$11,000 USD

Success Criterion

A workable MVP

Join the Discussion (5)


5 Comments
  • 0
    stella
    Feb 25, 2025 | 4:09 AM

    I want to ask the team: how exactly will they build the text-to-sign-language model? Does it use AI-generated video? Because AFAIK, AI-generated images and videos are famously bad at rendering fingers (often missing thumbs or generating six or more fingers), so how can they possibly produce valid sign gestures? Also, isn't the output of a DeepFunding proposal supposed to be deploying the model on the SingularityNET platform? Why does the proposal involve creating a standalone app?

    • 1
      Remo Start
      Feb 27, 2025 | 11:55 AM

      I guess you mean the speech to text to sign language model, right? Yes, it's an avatar that will be doing the translating and rendering; it is simply doing the reverse of the sign language to speech it has been trained on. The other option is to simply use the fingers without putting an avatar face to it. On the question of apps: DeepFunding does actually allow the creation of apps; the only difference is that you must also create an AI service that will be onboarded to the marketplace, and there are many projects that were funded and created apps to that effect. Moreover, this is a BGI Nexus round, and the core demand is to create using AI; it is not a MUST to onboard it to the marketplace, as is the case with the usual DeepFunding rounds.

  • 2
    MarcAndersen
    Feb 25, 2025 | 3:49 AM

    As someone from the hard of hearing community, I believe the sign language to speech idea misses the point. Sign language exists because spoken language is not accessible to people like myself. Converting it to speech defeats its whole purpose by focusing on a medium that isn't useful to signers. Even for communication with non-signers, I still think sign language to text is more practical and inclusive. Text is universally accessible, whether someone is deaf, hard of hearing, or doesn’t know sign language. It's encouraging to see more AI proposals in this area, and I hope they focus more on the actual needs of signers.

    • 1
      Remo Start
      Feb 27, 2025 | 11:44 AM

      Hi Marc, thank you for your opinion. In addressing it, I would like you to recall the contextualization of our solution in the summary. We drew the analogy of how institutions, in order to foster inclusion, make their facilities accessible to people with disabilities. This same premise is what we seek to achieve: we are not aiming to build another exclusive app for the signer community, because for us that is not the kind of inclusion we seek; we seek an inclusion that brings everyone into the same space. Our solution is to make these two communities coexist. I agree sign to text may be easier, but is it necessarily the most inclusive? Does it necessarily bring both parties together? Take, for example, a conference call on Zoom, Google Meet, etc., where a signer takes their turn and is signing: which would be more effective, converting it to text or to speech? We thought about it and concluded that both should be applied, hence in our methodology we stated that it is sign to text to speech, which is the natural flow. This means our app offers both options; we emphasized speech because sign to text already exists, and our innovation is in converting it to speech in a voice of the person's choosing. Another consideration is that keeping it at the text level actually excludes those who can hear but have deficiencies in sight and may therefore struggle with reading the text output. Our goal for this app is purely inclusion, as the name implies, and we are creating an AI conferencing app with the pure intent of bringing everyone into the same space for meaningful dialogue and conversation. With respect to your commendation that we are thinking in this direction, thank you a lot; we think more of this conversation is needed for us to build an inclusive world.

      • 0
        Devbasrahtop
        Mar 8, 2025 | 5:00 PM

        Awesome response. I like the innovative approach of sign language to text and to speech. As someone who finds it difficult to understand signs from a friend, which most times leads to frustration on both sides, I could not think otherwise than to say this is a great innovation. If my friend uses sign language to communicate with me and I am audibly able to understand what it means, and I respond with either text or speech that is translated into signs for effective communication, that is really a big game changer. My only reservation is about the model's accuracy in the translations. I love this initiative; it includes everyone and is not only for the signer community.

Expert Ratings

Reviews & Ratings

    No Reviews Available

