MeTTa Language Corpus

amcmaster1988
Project Owner

Overview

This proposal outlines an approach for developing a high-quality MeTTa language corpus aligned with the objectives of the SingularityNET Foundation. The aim is to deliver a dataset of 10,000 curated instruction-output pairs over a four-month period. The team brings expertise in natural language processing, corpus development, and AI training, along with extensive experience in Lisp-based languages similar to MeTTa.

RFP Guidelines

Create corpus for NL-to-MeTTa LLM

Proposal Submission (4 days left)
  • Type SingularityNET RFP
  • Total RFP Funding $70,000 USD
  • Proposals 4
  • Awarded Projects n/a
SingularityNET
Aug. 13, 2024

Develop a MeTTa language corpus to enable the training or fine-tuning of an LLM and/or LoRAs aimed at supporting developers by providing a natural language coding assistant for the MeTTa language.

Proposal Description

Proposal Details Locked…

In order to protect this proposal from being copied, all details are hidden until the end of the submission period. Please come back later to see all details.

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

  • Total Milestones 3
  • Total Budget $30,000 USD
  • Last Updated 4 Nov 2024

Milestone 1 - Extraction of MeTTa resources/data structuring

Description

In Month 1, the project will commence with the extraction and analysis of existing MeTTa resources, including official documentation, community-contributed code, and repository data. This phase will involve organizing and structuring the initial dataset to create a comprehensive foundation for the corpus.

Deliverables

  • Comprehensive review of MeTTa documentation, community code, and repositories.
  • Extraction scripts and an initial structured dataset of instruction-output pairs.
  • Preliminary report detailing data sources and initial findings.
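To make the idea of a structured dataset of instruction-output pairs concrete, here is a minimal sketch of one possible record layout, serialized as JSON Lines. The proposal does not specify a schema, so the class name `InstructionOutputPair`, the `source` field, and the JSON Lines format are all assumptions made for illustration, as is the sample MeTTa snippet.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InstructionOutputPair:
    """One corpus record (hypothetical schema, not defined by the proposal)."""
    instruction: str  # natural-language request
    output: str       # MeTTa code that answers the request
    source: str       # provenance label, e.g. a documentation page or repository


def to_jsonl(pairs):
    """Serialize records to JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


def from_jsonl(text):
    """Parse JSON Lines back into records, skipping blank lines."""
    return [
        InstructionOutputPair(**json.loads(line))
        for line in text.splitlines()
        if line.strip()
    ]


# Illustrative record only; not taken from any real corpus.
example = InstructionOutputPair(
    instruction="Define a function that adds one to a number.",
    output="(= (add-one $x) (+ $x 1))",
    source="hand-written illustration",
)

serialized = to_jsonl([example])
restored = from_jsonl(serialized)
```

Keeping a provenance field on every record would let the preliminary report trace each pair back to its data source, though the actual fields are for the project team to decide.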

Budget

$10,000 USD

Milestone 2 - Initial assembly of the corpus

Description

Months 2-3 will focus on developing and rigorously testing data processing scripts for data extraction, conversion, and formatting. During this phase, the initial corpus will be assembled, with automated processes ensuring consistency and readiness for subsequent expansion.

Deliverables

  • Completed and tested data extraction and processing scripts.
  • Preliminary version of the corpus containing structured instruction-output pairs.
  • Interim validation report documenting the results of script testing and early-stage corpus quality.

Budget

$10,000 USD

Milestone 3 - Final validation, comprehensive documentation

Description

In Month 4, the project will enter its final phase, focusing on comprehensive validation of the entire corpus and the completion of detailed documentation. Integration testing will be conducted to ensure compatibility with modern linters and machine learning frameworks. This phase will culminate in the release of the complete corpus and associated open-source code, marking the successful conclusion of the project and enabling future development and applications.
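As a rough illustration of what automated corpus validation might involve, the sketch below applies three cheap checks to each pair: non-empty fields, balanced parentheses in the MeTTa output, and duplicate detection. These particular checks and the function names `parens_balanced` and `validate` are assumptions, not the team's stated method. The parenthesis check assumes double-quoted strings and `;` line comments, and it is only a structural stand-in; it does not replace running the code through an actual MeTTa interpreter or linter.

```python
def parens_balanced(code: str) -> bool:
    """Return True if parentheses balance, ignoring strings and ';' comments."""
    depth = 0
    in_string = False
    in_comment = False
    escaped = False
    for ch in code:
        if in_comment:
            if ch == "\n":
                in_comment = False
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == ";":
            in_comment = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing parenthesis with no matching opener
                return False
    return depth == 0 and not in_string


def validate(pairs):
    """Split (instruction, output) tuples into kept and rejected lists."""
    seen = set()
    kept, rejected = [], []
    for instruction, output in pairs:
        # Whitespace- and case-insensitive key for duplicate detection.
        key = (" ".join(instruction.split()).lower(), " ".join(output.split()))
        if not instruction.strip() or not output.strip():
            rejected.append((instruction, output, "empty field"))
        elif not parens_balanced(output):
            rejected.append((instruction, output, "unbalanced parentheses"))
        elif key in seen:
            rejected.append((instruction, output, "duplicate"))
        else:
            seen.add(key)
            kept.append((instruction, output))
    return kept, rejected


# Illustrative inputs only.
sample = [
    ("Add one", "(= (add-one $x) (+ $x 1))"),
    ("add  one", "(= (add-one $x) (+ $x 1))"),
    ("Broken", "(= (f $x) (+ $x 1)"),
    ("", "(a)"),
]
kept, rejected = validate(sample)
```

Recording a rejection reason for every dropped pair would give the validation report something concrete to summarize, such as counts per failure type.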

Deliverables

  • Fully validated and finalized corpus of 10,000 instruction-output pairs.
  • Comprehensive, version-controlled documentation detailing the corpus creation process, validation steps, and known limitations.
  • Integration testing report showing compatibility with modern linters and machine learning frameworks.
  • Final open-source release package containing the complete corpus, data processing scripts, and associated documentation.

Budget

$10,000 USD
