Prompt-Instruct-Response MeTTa CORPUS

Remo Start
Project Owner

Overview

The methodology used to create a corpus for an LLM is as important as the quality of the dataset itself. We propose to create a corpus for MeTTa using the prompt-instruct-response approach, so that the model learns both the theoretical concepts and the practical aspects of the MeTTa programming language. The prompt-instruct-response approach is a dataset design technique in which the LLM learns not only from the response but also from the instruction and the prompt. As a result, a coding assistant trained on such a corpus should better understand the context around each line in the dataset. We expect this methodology to work best where context, efficiency, and practicality are essential.
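
As a rough illustration of the approach, a single corpus entry might look like the sketch below. The field names (prompt, instruction, response), the helper make_record, and the sample MeTTa snippet are assumptions made for illustration only; the proposal does not publish its schema.

```python
import json

# Hypothetical record layout: these key names are assumptions,
# not the schema defined by the proposal.
def make_record(prompt: str, instruction: str, response: str) -> dict:
    """Bundle one prompt-instruct-response triple into a dictionary."""
    return {"prompt": prompt, "instruction": instruction, "response": response}

record = make_record(
    prompt="Define a function that doubles a number.",
    instruction="Write the solution in MeTTa and show how to call it.",
    response="(= (double $x) (* 2 $x))\n!(double 21)",
)

# Serializing each triple as one JSON object per line keeps entries self-contained.
line = json.dumps(record)
print(line)
```

Keeping the prompt and the instruction as separate fields is what lets a model condition on both, rather than on a single merged input string.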

RFP Guidelines

Create corpus for NL-to-MeTTa LLM

Internal Proposal Review
  • Type SingularityNET RFP
  • Total RFP Funding $70,000 USD
  • Proposals 10
  • Awarded Projects n/a
SingularityNET
Aug. 13, 2024

Develop a MeTTa language corpus to enable the training or fine-tuning of an LLM and/or LoRAs aimed at supporting developers by providing a natural language coding assistant for the MeTTa language.

Proposal Description

Proposal Details Locked…

In order to protect this proposal from being copied, all details are hidden until the end of the submission period. Please come back later to see all details.

  • Total Milestones 3
  • Total Budget $35,000 USD
  • Last Updated 8 Dec 2024

Milestone 1 - Project Setup and Initial Design

Description

Establish the foundational framework for the project by finalizing the methodology, identifying data sources, and developing initial scripts for data extraction and formatting.

Deliverables

Project plan and timeline. List of identified MeTTa resources (e.g., tutorials, GitHub repositories, documentation) to be used as data sources. Initial data extraction and formatting scripts.
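
An initial extraction script could be sketched as follows. It assumes that the source tutorials are Markdown files whose MeTTa examples sit in fenced blocks tagged "metta"; that tagging convention, the function name extract_snippets, and the sample text are all hypothetical, and real sources may need different handling.

```python
import re
from typing import List

TICKS = "`" * 3  # a Markdown code fence marker

# Assumption: MeTTa examples appear in fenced blocks tagged "metta".
FENCE = re.compile(TICKS + r"metta\n(.*?)" + TICKS, re.DOTALL)

def extract_snippets(markdown_text: str) -> List[str]:
    """Return the body of every fenced block tagged as MeTTa, trimmed."""
    return [body.strip() for body in FENCE.findall(markdown_text)]

# Made-up tutorial text used only to exercise the function.
sample = "\n".join([
    "Intro text.",
    TICKS + "metta",
    "(= (inc $x) (+ $x 1))",
    TICKS,
    "More prose.",
    TICKS + "metta",
    "!(inc 2)",
    TICKS,
])

snippets = extract_snippets(sample)
print(snippets)
```

Each extracted snippet would still need a matching prompt and instruction before it becomes a corpus entry.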

Budget

$7,000 USD

Success Criterion

Project plan and timeline approved by the team. At least three MeTTa resources identified and documented. At least ten samples of extracted and formatted data.

Milestone 2 - Corpus Development and Validation

Description

Build the core of the dataset by creating the first 7,000 prompt-instruct-response pairs from existing MeTTa resources and validating their correctness.

Deliverables

A corpus containing 7,000 validated prompt-instruct-response pairs. Validation report ensuring correctness of at least 95% of the dataset. Initial documentation describing data sources, methods for extraction, and validation process.

Budget

$21,000 USD

Success Criterion

Dataset contains 7,000 prompt-instruct-response pairs in the defined JSON structure. Validation confirms that ≥ 95% of outputs are error-free and adhere to MeTTa standards. Initial documentation is reviewed and finalized.
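
One way the ≥ 95% criterion could be measured is sketched below. The required key names are assumptions, and the balanced-parentheses test is only a stand-in: a real validation step would presumably run each response through a MeTTa interpreter, which this sketch does not attempt.

```python
import json
from typing import List

REQUIRED_KEYS = {"prompt", "instruction", "response"}  # assumed field names
THRESHOLD = 0.95  # from the success criterion

def has_valid_structure(line: str) -> bool:
    """True if the line is a JSON object with a non-empty string for every assumed key."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    return all(
        isinstance(record.get(key), str) and record[key].strip() != ""
        for key in REQUIRED_KEYS
    )

def parens_balanced(code: str) -> bool:
    """Placeholder check only; it does not prove the code is valid MeTTa."""
    depth = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def pass_rate(lines: List[str]) -> float:
    """Fraction of lines that pass both the structure and the placeholder check."""
    if not lines:
        return 0.0
    passed = sum(
        1 for line in lines
        if has_valid_structure(line) and parens_balanced(json.loads(line)["response"])
    )
    return passed / len(lines)

# Made-up entries: three well-formed, one with an unclosed parenthesis.
good = json.dumps({"prompt": "p", "instruction": "i", "response": "(+ 1 2)"})
bad = json.dumps({"prompt": "p", "instruction": "i", "response": "(+ 1 2"})
rate = pass_rate([good, good, good, bad])
meets_threshold = rate >= THRESHOLD
print(rate, meets_threshold)
```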

Milestone 3 - Completion, Refinement, and Documentation

Description

Finalize the dataset by synthesizing new data to complete 10,000 pairs, validate the entire corpus, and produce comprehensive documentation for release.

Deliverables

Complete 10,000-pair MeTTa corpus, including synthesized data for underrepresented features. Final validation report ensuring corpus quality and diversity. Fully functional scripts for data generation and validation, released under the MIT License. Comprehensive final documentation, the final JSON file, and an optional CSV version of the corpus.

Budget

$7,000 USD

Success Criterion

Corpus contains exactly 10,000 entries, validated with ≥ 95% correctness. Scripts and dataset are uploaded to a version-controlled repository. Final documentation is reviewed and accessible. The final JSON file is submitted and verified, with an optional CSV file generated and verified (the CSV will be added alongside the JSONL file only if time permits within the 4 months).
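
The entry-count check and the optional CSV export could look roughly like this sketch. The column names and the helper functions are assumptions, and the two sample records are made up; only the 10,000 figure comes from the success criterion.

```python
import csv
import io
import json

FIELDS = ["prompt", "instruction", "response"]  # assumed column names
EXPECTED_ENTRIES = 10_000  # from the success criterion

def entry_count(jsonl_text: str) -> int:
    """Count the non-empty lines, one corpus entry per line."""
    return sum(1 for line in jsonl_text.split("\n") if line.strip())

def jsonl_to_csv(jsonl_text: str) -> str:
    """Convert JSONL text into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in jsonl_text.split("\n"):
        if line.strip():
            writer.writerow(json.loads(line))
    return out.getvalue()

# Made-up records; the second response spans two lines to exercise CSV quoting.
records = [
    {"prompt": "Add two numbers.", "instruction": "Use MeTTa.", "response": "!(+ 1 2)"},
    {"prompt": "Define f.", "instruction": "Use MeTTa.", "response": "(= (f) 1)\n!(f)"},
]
jsonl_text = "\n".join(json.dumps(r) for r in records)

csv_text = jsonl_to_csv(jsonl_text)
rows = list(csv.DictReader(io.StringIO(csv_text)))
is_complete = entry_count(jsonl_text) == EXPECTED_ENTRIES
print(len(rows), is_complete)
```

Reading the CSV back and comparing it with the source records is a cheap way to confirm that multi-line MeTTa responses survive the conversion.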
