Pedagogy MeTTa-NLP Corpus Generation

simuliinc
Project Owner

Expert Rating

n/a

Overview

This proposal outlines the creation of a 20,000-pair MeTTa language corpus for training an AI coding assistant. Instruction-output pairs are generated through a combination of data collection, processing, and synthesis, then validated by both automated checks and human review. The corpus covers six key areas, including arithmetic, functional programming, and AGI-specific tasks. The $35,000, 6-month project delivers the corpus, validation tools, documentation, and a roadmap for future updates. A distinctive aspect is that the extraction/generation model can later be used to validate both the resulting MeTTa LLM and a pedagogical approach.
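The instruction-output pairs described above could be stored as one JSON object per line. The following is a minimal sketch, not the proposal's actual format: the `CorpusPair` class, its field names, and the `AREAS` tuple are hypothetical, and the six area names are inferred from the milestone descriptions because the overview names only three of them.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable

# Assumption: area names inferred from the milestone descriptions;
# the proposal does not list all six in one place.
AREAS = (
    "arithmetic",
    "functional programming",
    "symbolic reasoning",
    "graph operations",
    "AGI-specific tasks",
    "probabilistic models",
)


@dataclass(frozen=True)
class CorpusPair:
    """One natural-language instruction paired with a MeTTa output (hypothetical schema)."""

    instruction: str
    output: str
    area: str

    def __post_init__(self) -> None:
        # Reject records that could not be used for training.
        if self.area not in AREAS:
            raise ValueError(f"unknown area: {self.area!r}")
        if not self.instruction.strip() or not self.output.strip():
            raise ValueError("instruction and output must be non-empty")


def to_jsonl(pairs: Iterable[CorpusPair]) -> str:
    """Serialize pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


example = CorpusPair(
    instruction="Define a function that doubles a number.",
    output="(= (double $x) (* 2 $x))",
    area="functional programming",
)
```

Validating each record at construction time keeps malformed pairs out of the corpus before any MeTTa-level checks run.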

RFP Guidelines

Create corpus for NL-to-MeTTa LLM

Internal Proposal Review
  • Type SingularityNET RFP
  • Total RFP Funding $70,000 USD
  • Proposals 10
  • Awarded Projects n/a
SingularityNET
Aug. 13, 2024

Develop a MeTTa language corpus to enable the training or fine-tuning of an LLM and/or LoRAs aimed at supporting developers by providing a natural language coding assistant for the MeTTa language.

Proposal Description

Proposal Details Locked…

In order to protect this proposal from being copied, all details are hidden until the end of the submission period. Please come back later to see all details.


  • Total Milestones 4
  • Total Budget $35,000 USD
  • Last Updated 7 Dec 2024

Milestone 1 - Fundamental MeTTa Corpus Foundation

Description

Generate and validate the first batch of instruction-output pairs, covering arithmetic operations and functional programming paradigms in MeTTa. Establish an initial validation framework.

Deliverables

  • 6,000-7,000 validated instruction-output pairs
  • Initial extraction/generation model
  • Validation tooling, first version
  • Documentation of processes used

Budget

$10,000 USD

Success Criterion

  • 95% pass rate on automated validation checks
  • Human expert validation of a random 10% sample
  • Successful execution of all code samples
  • Documentation peer reviewed by 2 team members
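The pass-rate and random-sample criteria can be sketched as follows. This is an illustrative sketch under stated assumptions, not the proposal's validation tooling: each pair is assumed to be a dict with hypothetical `instruction` and `output` keys, the automated check is any callable returning `True` for a passing pair, and `balanced_parens` is only a stand-in check, not a MeTTa parser or interpreter.

```python
import random
from typing import Callable, Dict, List

Pair = Dict[str, str]  # hypothetical record shape: {"instruction": ..., "output": ...}


def pass_rate(pairs: List[Pair], check: Callable[[Pair], bool]) -> float:
    """Return the fraction of pairs for which the automated check succeeds."""
    if not pairs:
        return 0.0
    passed = sum(1 for pair in pairs if check(pair))
    return passed / len(pairs)


def sample_for_review(pairs: List[Pair], fraction: float = 0.10, seed: int = 0) -> List[Pair]:
    """Draw a reproducible random sample (10% by default) for human expert review."""
    if not pairs:
        return []
    size = max(1, round(len(pairs) * fraction))
    return random.Random(seed).sample(pairs, size)


def balanced_parens(pair: Pair) -> bool:
    """Stand-in check: parentheses in the output are balanced (not a real MeTTa parser)."""
    depth = 0
    for ch in pair["output"]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

For a batch of 6,000 pairs, a 10% sample is 600 pairs for human review, and a 95% pass rate tolerates at most 300 failing pairs. Fixing the seed makes the reviewed sample reproducible for later audits.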

Milestone 2 - Symbolic & Graph Operations

Description

Develop and validate pairs focused on symbolic reasoning and graph operations. Enhance the validation framework based on lessons learned.

Deliverables

  • Additional 6,000-7,000 validated pairs
  • Improved validation framework
  • Updated extraction/generation model
  • Integration tests for new pairs

Budget

$10,000 USD

Success Criterion

  • 97% pass rate on automated validation
  • Cross-validation by separate model implementations
  • All graph operations verified with test cases
  • Zero conflicts with existing corpus
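The proposal does not define what counts as a conflict with the existing corpus. One plausible reading, assumed here purely for illustration, is that a new pair conflicts when its instruction matches an existing instruction (ignoring case and whitespace) but its MeTTa output differs. The function names below are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

PairTuple = Tuple[str, str]  # (instruction, output)


def normalize_instruction(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different phrasings match."""
    return " ".join(text.lower().split())


def normalize_code(code: str) -> str:
    """Collapse whitespace only; case is preserved because it matters in code."""
    return " ".join(code.split())


def find_conflicts(
    existing: Iterable[PairTuple], new_pairs: Iterable[PairTuple]
) -> List[Tuple[str, str, str]]:
    """Return (new instruction, existing output, new output) for each conflicting pair."""
    seen: Dict[str, str] = {}
    for instruction, output in existing:
        # Keep the first output recorded for each normalized instruction.
        seen.setdefault(normalize_instruction(instruction), output)

    conflicts = []
    for instruction, output in new_pairs:
        key = normalize_instruction(instruction)
        if key in seen and normalize_code(seen[key]) != normalize_code(output):
            conflicts.append((instruction, seen[key], output))
    return conflicts
```

Under this reading, the criterion is met when `find_conflicts` returns an empty list for the Milestone 2 batch checked against the Milestone 1 corpus. A stricter definition could also compare the evaluated results of the two outputs rather than their text.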

Milestone 3 - Advanced AGI Features

Description

Complete the corpus with AGI-specific tasks and probabilistic models while refining overall quality.

Deliverables

  • Final 6,000-7,000 validated pairs
  • Finalized validation system
  • Complete extraction/generation model
  • Comprehensive test suite

Budget

$10,000 USD

Success Criterion

  • 99% pass rate on automated validation
  • Full coverage of specified AGI tasks
  • Successful integration with Hyperon framework
  • All probabilistic models verified accurate

Milestone 4 - Documentation & Tools

Description

Package all tools, create comprehensive documentation, and establish future maintenance protocols.

Deliverables

  • Complete 20,000-pair corpus
  • All source code and tools
  • Comprehensive documentation
  • Tutorial videos and examples

Budget

$5,000 USD

Success Criterion

  • Successful test runs by external developers
  • Documentation covers all major use cases
  • Tools successfully deployed in test environment
  • Positive feedback from user testing
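As a cross-check, the per-milestone figures can be summed against the headline totals. The sketch below copies the budgets and pair ranges from the four milestone sections; it assumes Milestone 4 adds no new pairs because it packages the existing corpus, and the helper names are illustrative.

```python
from typing import List, Tuple

# (name, budget in USD, (min pairs, max pairs)) copied from the milestone sections.
# Assumption: Milestone 4 contributes no new pairs; it packages the complete corpus.
MILESTONES: List[Tuple[str, int, Tuple[int, int]]] = [
    ("Fundamental MeTTa Corpus Foundation", 10_000, (6_000, 7_000)),
    ("Symbolic & Graph Operations", 10_000, (6_000, 7_000)),
    ("Advanced AGI Features", 10_000, (6_000, 7_000)),
    ("Documentation & Tools", 5_000, (0, 0)),
]


def total_budget(milestones: List[Tuple[str, int, Tuple[int, int]]]) -> int:
    """Sum the milestone budgets in USD."""
    return sum(budget for _, budget, _ in milestones)


def pair_range(milestones: List[Tuple[str, int, Tuple[int, int]]]) -> Tuple[int, int]:
    """Sum the lower and upper bounds of the pairs delivered per milestone."""
    low = sum(bounds[0] for _, _, bounds in milestones)
    high = sum(bounds[1] for _, _, bounds in milestones)
    return low, high
```

The budgets sum to $35,000, matching the stated total, and the pair ranges sum to 18,000-21,000, so the 20,000-pair target is reachable only if the three batches average about 6,667 pairs each.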
