SNET Machine Translation for Minority Languages

Ubio Obu
Project Owner

Funding Requested

$25,000 USD

Expert Review: 4 out of 5 stars
Community: 4.4 out of 5 stars (9 ratings)

Overview

SingularityNet understands the importance of machine translation as a means of ensuring inclusion and making AGI more accessible to all. This is why Ben Goertzel established the ML team for machine translation of undersourced languages, with a pilot ongoing in Ethiopia in languages such as Amharic and Tigray. While that pilot continues, this proposal aims to create a community-centric effort that complements the work of the iCog Labs team in Ethiopia and provides a community corpus of various minority languages. The SNET machine translation team will use this corpus to build a more comprehensive translation tool that reflects the diversity of our world.

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

This RFP will allow the community to contribute to a project that is currently under way within the SNET core team, building a relationship between the Deepfunding community and the SNET core team. The output of this project will also be an asset available to anyone who intends to use the model once it is published on the AI marketplace.

Our Team

I, Ubio Obu, will write the RFP, because of the insight I have gained on the issue from the SNET and iCog Labs teams. The Remostart team will handle editing and proofreading, bringing in professionals to review whatever has been written.

Company Name (if applicable)

Remostart

The core problem we are aiming to solve

There are about 7000 known languages globally, yet only about 100 of them are covered by machine translation today. Most machine learning translation projects focus on popular languages such as English, French, and Chinese, because building corpora for them has obvious commercial value. For true AGI to be built, however, we must go beyond commerce and recognize the place of cultural heritage and language preservation. In an increasingly AI-dependent future, most languages are likely to face extinction if they are not translated into forms that AI can understand and adapt to, which is why inclusion is a MUST for those working in this space.
SingularityNet understands this problem and, as one of the leaders in true AGI, has started incorporating undersourced languages into its ML translation models. However, we understand that this will be almost impossible to achieve quickly and on time without a community-driven, decentralized approach, and that is what this proposal is about.

Our specific solution to this problem

Taking a cue from the work of the iCog Labs team in Ethiopia, this is a proposal to create an RFP that will allow the community to build a corpus of voice and text data sets that will be used to train and expand the SNET translation tooling. This is purely complementary work, so it is meant to align directly with the process and method that the iCog Labs team is using to gather and process data in Ethiopia. The goal of this RFP is to allow community members to identify minority languages within their own regions and work actively to standardize those languages into the SNET language model. It is a community approach to reaching the 7000 languages faster and better.
The RFP will come with strict regulations and standards, as submissions must align with the existing SNET translation structure for data conformity, veracity, and interoperability. Proposers must also use the Leyu.ai platform of the iCog Labs team for collecting these languages. These and other requirements will be mandatory components of the RFP.

Proposal Video

DF Spotlight Day - DFR4 - Ubio Obu - SNET Machine Translation for Minority Languages

3 June 2024
  • Total Milestones: 3
  • Total Budget: $25,000 USD
  • Last Updated: 3 Jun 2024

Milestone 1 - Engaging with the Internal teams and stakeholders

Description

This is where I will hold further discussions with the SNET ML translation team, iCog Labs, and every other party involved on the nitty-gritty details that will shape this RFP.

Deliverables

A transcription of our meeting notes, or a voice recording if possible

Budget

$5,000 USD

Milestone 2 - Draft the RFP

Description

At this point I will document the RFP with all the details needed to make it easy to comprehend and easy to apply for.

Deliverables

A draft of the completed written RFP

Budget

$14,996 USD

Milestone 3 - Proof Reading

Description

The written draft is given to experts to review and proofread, ensuring that it is in line with the acceptable standard.

Deliverables

A reviewed RFP that has passed through proofreading and is ready to be used

Budget

$5,004 USD

Join the Discussion (5)

5 Comments
  • 0
    Emotublockchain
    Jun 9, 2024 | 7:22 PM

    Do you plan to engage and collaborate with native speakers or linguistic experts of minority languages to continuously improve and validate the accuracy of the translation models?

  • 0
    nwobodojerry
    Jun 9, 2024 | 7:18 PM

    What measures will this project take to address potential challenges with translation accuracy, linguistic nuances, and cultural elements specific to minority languages, in order to ensure that the translations are both precise and contextually relevant?

  • 0
    GraceDAO
    May 22, 2024 | 4:59 AM

    How will you collect enough data from these languages? What are you going to do about the languages that do not have a written form?

    • 0
      Ubio Obu
      May 27, 2024 | 2:23 AM

      Hi Grace, thanks for your question. Don't forget that this is for RFP design; when it is voted in, I will give more specifications to whoever applies on what they can apply with. To answer your question, though: the iCog Labs team has a clear methodology for collecting enough data, which includes a platform, tools, and pay-per-data, and the RFP will specify that proposers follow that same method. We will not be starting with the unwritten languages. Of the 7000 languages, 4000 have written forms, so even the race towards the 4000 is a lot. While we work on the languages with written literature, we will be planning for those with oral literature.

      • 0
        GraceDAO
        May 29, 2024 | 7:46 PM

        This is a very helpful clarification, thank you.

Expert Review

Overall

4

  • Feasibility 4
  • Viability 4
  • Desirability 5
  • Usefulness 4
This proposal addresses a highly relevant issue

"If we want to take decentralization to the next level and truly reach all corners of the planet, impacting all social layers in a transversal manner, we must work on minority languages. This proposal addresses a highly relevant issue: the problem of language in the decentralization of Artificial Intelligence. Designing an RFP that aims to work on a translation machine of this type would position SingularityNET and its entire Community at the forefront of social and technological innovation, while also providing an opportunity to bring AI decentralization to all corners of the world. 

Ubio Obu is a recognized member of the Community for his track record and commitment to decentralization and the mission and vision of SingularityNET. His name is synonymous with trust; he would be building upon already existing work, strengthening and enhancing the Community's current efforts. The proposal is solid and consistent, and the feasibility and capability of the team are highly assured."

9 ratings
  • 0
    Gombilla
    Jun 10, 2024 | 10:25 AM

    Overall

    5

    • Feasibility 5
    • Viability 4
    • Desirability 5
    • Usefulness 5
    Promotes standardization of minority languages

    As a linguist myself, I find this very fascinating. I believe this project will promote the preservation and standardization of minority languages, several of which are already on the brink of extinction. This project will also enhance cultural inclusivity by integrating minority languages into the SNET language model, allowing more people to access and benefit from AI technologies in their native languages. Great job!

  • 0
    Max1524
    Jun 10, 2024 | 8:38 AM

    Overall

    3

    • Feasibility 2
    • Viability 3
    • Desirability 3
    • Usefulness 3
    Author should make adjustments to suit his reputation

    I see that Ubio Obu has had a strong reputation for a long time, and this proposal further confirms his credibility. However, I am a bit disappointed by how brief the milestone descriptions are, and the duration of each milestone is not mentioned. This greatly affects feasibility and, more seriously, also somewhat affects Ubio Obu's reputation. I hope Ubio Obu pays attention to my comments and makes appropriate adjustments.

  • 0
    Emotublockchain
    Jun 9, 2024 | 7:36 PM

    Overall

    4

    • Feasibility 4
    • Viability 5
    • Desirability 4
    • Usefulness 5
    SNET's MT Project: Overcoming Linguistic Diversity

    The SNET Machine Translation For Minority Languages is a highly commendable project. An initiative to include minority languages in the realm of machine translation is crucial for ensuring linguistic diversity and promoting inclusivity in AI technologies.

    However, scarcity of data can lead to inaccurate translations and a higher rate of semantic errors. To address this, the project should actively engage with native speakers and linguistic experts to validate datasets. Community involvement is essential not only for data collection but also for ensuring that cultural and contextual nuances are accurately taken into consideration.

    Linguistic diversity within minority languages poses some level of complexity. Dialects, idiomatic expressions, and local variations can significantly affect translation quality. The project could adopt a flexible and adaptive approach, incorporating feedback mechanisms that allow for continuous improvement of the models. Engaging with academic institutions and local communities can provide the necessary linguistic expertise to handle these variations effectively.

    The limitations of current machine translation technology also need to be acknowledged. While AI has made significant strides in recent years, it still struggles with contextually rich and syntactically complex sentences. This can be particularly problematic for minority languages that may have unique grammatical structures and idiomatic expressions. To address this, integrating a hybrid approach that combines rule-based and statistical methods with neural network models could enhance translation accuracy.

    In addition, the ethical implications of deploying AI-driven translations in sensitive contexts must be considered. Misinterpretations in medical, legal, or educational content can have serious consequences. Therefore, the project should incorporate rigorous validation and testing phases before deploying the translation tools in critical applications.

  • 0
    nwobodojerry
    Jun 9, 2024 | 6:59 PM

    Overall

    5

    • Feasibility 4
    • Viability 5
    • Desirability 4
    • Usefulness 5
    Laudable Project With A Comprehensive Approach

    This project is a highly impressive initiative aimed at bridging language barriers through advanced machine translation (MT) technology, led by Ubio Obu, a recognized member of the SNET community. His commitment to decentralization is laudable. The project's focus on minority and undersourced languages, with an ongoing pilot in Ethiopia targeting languages like Amharic and Tigray, underscores its commitment to inclusivity and accessibility. This proposal seeks to amplify these efforts by fostering a community-centric approach that complements the work of the iCog Labs team in Ethiopia. By creating a community corpus of various minority languages, the project aims to enhance the comprehensiveness of SNET's translation tools, reflecting the rich linguistic diversity of our world.

    What sets this project apart is its focus on minority languages, often overlooked by mainstream MT initiatives. By concentrating on these languages, the project addresses a significant gap in the market, ensuring that speakers of minority languages are not left behind in the digital age. Additionally, the collaborative approach involving the community in building the language corpus adds a layer of authenticity and richness to the data, potentially leading to more accurate and culturally sensitive translations.

    By expanding the range of languages covered by SNET's MT capabilities, the project enhances the platform's utility and appeal to a broader audience. This aligns with SNET's mission of democratizing AI and making it accessible to all, regardless of linguistic background. Furthermore, the project's community-driven approach fosters deeper engagement between the Deepfunding community and the SNET core team, promoting collaboration and a sense of shared purpose.

    The SNET Machine Translation For Minority Languages project is a commendable effort to promote linguistic inclusivity through advanced AI technology. Its focus on minority languages and community involvement are its standout features, though careful attention to data quality, community engagement, and scalability will be crucial for its long-term success. By addressing these areas, the project can significantly contribute to the growth and inclusivity of the SNET ecosystem, making advanced AI tools accessible to a wider audience.

    This project's viability hinges on several factors. Firstly, the successful collection and quality of the linguistic data from the community are crucial. Ensuring that the data is representative and comprehensive enough to train robust MT models will be a significant challenge. In addition, the project's success depends on its ability to integrate seamlessly with the existing efforts of the iCog Labs team and the broader SNET core team, requiring effective coordination and communication.

  • 0
    Onize Olie
    Jun 6, 2024 | 6:35 PM

    Overall

    5

    • Feasibility 5
    • Viability 4
    • Desirability 5
    • Usefulness 4
    Community-Driven Language Preservation.

    This proposal should be funded for several compelling reasons. First and foremost, it addresses a critical gap in current machine learning translation efforts, which predominantly focus on popular languages. By expanding the scope to include underrepresented languages, this project aims to preserve cultural heritage and prevent the extinction of lesser-known languages. In the context of developing true Artificial General Intelligence (AGI), understanding a broad spectrum of human languages is essential for capturing the full diversity of human thought and culture.

    The proposal’s approach, which leverages a decentralized and community-driven model, is particularly noteworthy. By engaging communities to identify and standardize minority languages, the project not only accelerates the process of data collection but also ensures that the data is accurate and culturally relevant. This grassroots approach enhances the sustainability and inclusiveness of the initiative, making it more likely to succeed in its ambitious goal of reaching 7000 languages.

    Moreover, the proposal builds on the proven methods of the iCog Labs team in Ethiopia, which adds a layer of credibility and practicality. By aligning with an established methodology and utilizing the Leyu.ai platform, the project can ensure data conformity, veracity, and interoperability. This structured framework is crucial for maintaining high standards and consistency across the diverse range of languages to be included.

    In terms of implementation, several aspects need to be carefully executed. The creation of a Request for Proposal (RFP) that aligns with the existing SNET translation structure is paramount. This RFP must detail the regulations and standards for data collection, ensuring that the process is rigorous and the data collected is of high quality. Additionally, utilizing the Leyu.ai platform for data gathering will be critical in maintaining consistency and integrating seamlessly with current efforts by the iCog Labs team. Clear guidelines and support for communities involved in this initiative will be necessary to empower them and facilitate effective participation.

    Overall, this proposal represents a significant step towards a more inclusive and comprehensive approach to language preservation and machine learning translation. Its success could have far-reaching implications for the development of AGI and the preservation of global linguistic diversity.

  • 0
    Tu Nguyen
    Jun 3, 2024 | 8:14 AM

    Overall

    4

    • Feasibility 3
    • Viability 4
    • Desirability 4
    • Usefulness 4
    SNET Machine Translation For Minority Languages

    The problem this proposal would solve is that most languages will face extinction if not translated into languages that are easy to understand and adapt to AI. Their solution was to create an RFP that would allow the community to create a speech and text dataset that would be used to train and scale the SNET translation engine.
    I think they should clearly share information about the Remostart team because this team plays a very important role in the project.

  • 0
    CLEMENT
    Jun 1, 2024 | 11:56 AM

    Overall

    5

    • Feasibility 5
    • Viability 5
    • Desirability 5
    • Usefulness 5
    This Addresses a need for language preservation

    Hi Ubio. I commend your focus on language accessibility over the past years. Your dedication to this niche is quite commendable. As regards the desirability and usefulness of this RFP, I believe it addresses a critical need for language preservation and accessibility, particularly for minority languages that are often overlooked by mainstream translation tools.

    Additionally, within the SNET scheme of things, this project will contribute by expanding the scope and reach of AI-powered translation services. I can see your objective of incorporating minority languages into the SNET language model. I am sure this will enhance the platform's ability to serve diverse linguistic communities.

    Ultimately, this will not only attract new users from underserved language groups but also enrich the marketplace ecosystem by fostering collaboration and knowledge exchange across linguistic boundaries.

    Kudos to your team!

  • 0
    King Eddie
    May 27, 2024 | 1:33 PM

    Overall

    5

    • Feasibility 5
    • Viability 5
    • Desirability 5
    • Usefulness 4
    A much needed innovation

    The proposal for SNET Machine Translation for Minority Languages, submitted by Ubio, seeks $25,000 to support a community-driven initiative that complements the ongoing efforts of iCog Labs in Ethiopia. The project aims to develop a comprehensive corpus of minority languages, leveraging community input to enhance SingularityNet's (SNET) machine translation capabilities. This initiative addresses the significant gap in AI translation for undersourced languages, promoting cultural heritage and language preservation. The proposal outlines the creation of an RFP to engage the community in collecting voice and text data sets, aligning with SNET's existing translation structures. By fostering a decentralized approach, the project aims to accelerate the inclusion of approximately 7,000 global languages, ensuring broader accessibility and inclusivity for AGI. The team, led by Ubio Obu with support from Remostart professionals, is poised to deliver a standardized and interoperable solution that enhances the SNET language model.

  • 0
    Joseph Gastoni
    May 22, 2024 | 8:17 AM

    Overall

    4

    • Feasibility 4
    • Viability 3
    • Desirability 4
    • Usefulness 4
    a project to build a corpus of minority languages.

    This proposal outlines a community-driven project to build a corpus of minority languages for SingularityNet's machine translation tool. Here's a breakdown of its strengths and weaknesses:

    Feasibility:

    • Moderate-High: The core concept of a community-built corpus is feasible, but requires effective recruitment, quality control, and adherence to data collection standards.
      • Strengths: Leverages the power of the community to gather diverse language data.
      • Weaknesses: Ensuring data quality, consistency, and adherence to SNET's standards across a potentially large and geographically dispersed community requires careful planning.

    Viability:

    • Moderate: Success depends on attracting a large enough community, their dedication to data collection, and effective integration with SNET's existing efforts.
      • Strengths: The proposal offers a potentially valuable resource for SNET's translation tool.
      • Weaknesses: The proposal lacks details on the long-term engagement strategy and the effort required for data processing and integration.

    Desirability:

    • High: For those passionate about language preservation and inclusivity in AI, this could be highly desirable.
      • Strengths: The proposal aligns with SingularityNet's goal of a more inclusive AGI and offers a way for communities to contribute.
      • Weaknesses: The proposal needs to address the potential time commitment and technical barriers for community participation.

    Usefulness:

    • Moderate-High: The project has the potential to significantly improve the diversity and accuracy of SNET's machine translation tool, but its impact depends on the volume and quality of data collected.
      • Strengths: The proposal offers a way to expand the language coverage of SNET's translation tool beyond commercially dominant languages.
      • Weaknesses: The proposal lacks details on how the project will measure the quality and effectiveness of the collected data for machine translation purposes.

    Overall, Remostart's project has a noble goal, but it should focus on:

    • Community recruitment and engagement strategy: Developing a clear plan to attract and retain a large enough community of contributors across diverse minority languages.
    • Data quality control mechanisms: Establishing strict quality control measures to ensure the accuracy, consistency, and adherence to SNET's standards for the collected data.
    • Integration with existing efforts: Ensuring seamless integration of the community-built corpus with SingularityNet's ongoing machine translation project.
    • Impact measurement: Developing a plan to measure the project's impact on the diversity and accuracy of SNET's machine translation tool.

    By addressing these considerations, Remostart's project can become a valuable asset for promoting inclusivity and expanding SingularityNet's machine translation capabilities.

    Here are some strengths of this project:

    • Promotes a community-driven approach to language preservation and inclusivity in AI development.
    • Complements SingularityNet's existing efforts on machine translation for under-resourced languages.
    • Offers a way to expand the language coverage of the translation tool beyond commercially dominant languages.

Summary

Overall Community

4.4

from 9 reviews
  • 5 stars: 5 reviews
  • 4 stars: 3 reviews
  • 3 stars: 1 review
  • 2 stars: 0 reviews
  • 1 star: 0 reviews

Feasibility

4.1

from 9 reviews

Viability

4.2

from 9 reviews

Desirability

4.3

from 9 reviews

Usefulness

4.3

from 9 reviews
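
The summary figures can be cross-checked against the nine individual ratings listed on this page. The sketch below is only an illustration: it assumes the site uses a plain arithmetic mean rounded to one decimal place, which the page does not state, and the names `ratings`, `mean_one_decimal`, `averages`, and `histogram` are hypothetical.

```python
from collections import Counter

# Scores copied from the nine community ratings on this page, in page order
# (Gombilla, Max1524, Emotublockchain, nwobodojerry, Onize Olie,
#  Tu Nguyen, CLEMENT, King Eddie, Joseph Gastoni).
ratings = {
    "Overall":      [5, 3, 4, 5, 5, 4, 5, 5, 4],
    "Feasibility":  [5, 2, 4, 4, 5, 3, 5, 5, 4],
    "Viability":    [4, 3, 5, 5, 4, 4, 5, 5, 3],
    "Desirability": [5, 3, 4, 4, 5, 4, 5, 5, 4],
    "Usefulness":   [5, 3, 5, 5, 4, 4, 5, 4, 4],
}


def mean_one_decimal(scores):
    # Assumption: plain arithmetic mean, rounded to one decimal place.
    return round(sum(scores) / len(scores), 1)


# Average per criterion, keyed by criterion name.
averages = {name: mean_one_decimal(scores) for name, scores in ratings.items()}

# Number of reviews per overall star value, from 5 stars down to 1 star.
counts = Counter(ratings["Overall"])
histogram = {stars: counts.get(stars, 0) for stars in range(5, 0, -1)}

for name, value in averages.items():
    print(f"{name}: {value} from {len(ratings[name])} reviews")
print("Star distribution:", histogram)
```

Under that assumption, the computed averages agree with the figures shown in the summary (4.4 overall, then 4.1, 4.2, 4.3, and 4.3), and the star distribution comes out as five 5-star, three 4-star, and one 3-star review.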
