Seb Wiechers
Project Owner. Responsible for curating the team, defining workflows, and designing algorithms and requirements. Also responsible for upskilling the human labeling team to a working knowledge of the MeTTa language.
We propose to curate two datasets of (natural language <-> MeTTa) expression pairs: the *silver* dataset, consisting of 20.000 AI-generated, probabilistically verified (NL <-> MeTTa) pairs, and the *gold* dataset, consisting of 10.000 human-labeled, high-quality pairs. The proposed timeline is 4 months. Funding will be used to cover a) compute costs, b) scoring of output pairs, and c) development of algorithms for estimating the probability that a prediction is correct. Our knowledge of the MeTTa language, NLP and linguistics, together with our background in logic and AI and our real-world organizational experience, puts us in a strong position to have a compounding effect on the SNET ecosystem.
Develop a MeTTa language corpus to enable the training or fine-tuning of an LLM and/or LoRAs aimed at supporting developers by providing a natural language coding assistant for the MeTTa language.
1 month: gathering the team, resources and tooling; defining the human labeling workflow; selecting appropriate datasets and preparing the data pipeline in Python (deliverable).
We will open-source a data pipeline that can be used to generate and curate NL <-> MeTTa pairs.
$4,000 USD
2 months: small-scale testing of different generative approaches and development of a 'common sense' algorithm that checks parsed MeTTa expressions against the logical properties of the input statements (deliverable).
We will open-source an algorithm and approach that can be used to perform a 'common sense' test on generated MeTTa statements, provided that the logical relation between the input expressions is known beforehand.
$6,000 USD
3 months: start of the human labeling process, while incrementally using its findings to produce the silver dataset.
By this time we expect to be able to deliver 2.000 gold pairs, as well as 20.000 silver pairs.
$8,500 USD
4 months: delivery of the silver dataset (minimum 20.000 labeled pairs) and the gold dataset (10.000 pairs).
We finish the project, delivering the remaining 8.000 gold pairs.
$8,500 USD
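To illustrate the kind of 'common sense' test named in the 2-month milestone, here is a minimal Python sketch, not the project's actual algorithm. It assumes that two input statements with a known logical relation have each been translated into an S-expression, and it checks by truth-table enumeration whether the translated pair exhibits that relation. The connective names (`and`, `or`, `not`, `implies`), the relation labels, and every function name are hypothetical placeholders chosen for this example; they are not taken from the MeTTa standard library, and the sketch only reads S-expression syntax rather than evaluating real MeTTa semantics.

```python
from itertools import product


def tokenize(text):
    """Split an S-expression string into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one S-expression from a token list (consumed in place)."""
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    return token


def atoms(expr):
    """Collect the propositional symbols used in a parsed expression."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(atoms(arg) for arg in expr[1:]))


def evaluate(expr, env):
    """Evaluate a parsed expression under a truth assignment."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(arg, env) for arg in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return all(vals)
    if op == "or":
        return any(vals)
    if op == "implies":
        return (not vals[0]) or vals[1]
    raise ValueError(f"unsupported connective: {op}")


def relation(src_a, src_b):
    """Classify the logical relation between two expressions."""
    a, b = parse(tokenize(src_a)), parse(tokenize(src_b))
    names = sorted(atoms(a) | atoms(b))
    rows = []
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        rows.append((evaluate(a, env), evaluate(b, env)))
    a_implies_b = all(vb for va, vb in rows if va)
    b_implies_a = all(va for va, vb in rows if vb)
    if a_implies_b and b_implies_a:
        return "equivalent"
    if not any(va and vb for va, vb in rows):
        return "contradiction"
    if a_implies_b:
        return "entails"
    if b_implies_a:
        return "entailed_by"
    return "neutral"


def passes_common_sense(expected, expr_a, expr_b):
    """True if the generated pair shows the relation known beforehand."""
    return relation(expr_a, expr_b) == expected
```

For example, if the input statements are known to contradict each other, `passes_common_sense("contradiction", "rain", "(not rain)")` accepts the generated pair, while a pair such as `"rain"` and `"cold"` would be rejected. Exhaustive enumeration only scales to a handful of symbols; a real check would likely need a more expressive logic and a different decision procedure.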