Synthetic Dataset Generator Using Qwen Models

morgan
Project Owner

Expert Rating

n/a

Overview

This project aims to build a flexible, AI-powered synthetic dataset generator leveraging Qwen 2.5’s robust natural language understanding and structured data generation capabilities. The generator will produce datasets tailored to different levels of complexity (basic, intermediate, and advanced) across domains such as genetics, public health, and socio-economics. The Qwen 2.5 model, known for its support for large-scale datasets and context lengths up to 128K tokens, is well-suited for generating structured outputs in JSON, tables, and other formats.
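As an illustration of the structured JSON output and the three complexity levels mentioned above, one corpus entry could be serialized as a single JSON line. This is a minimal sketch only: the `CorpusRecord` class and its field names (`prompt`, `metta_code`, `complexity`, `domain`) are hypothetical and not taken from the proposal, and the MeTTa snippet is a placeholder example.

```python
import json
from dataclasses import dataclass, asdict

# Complexity levels named in the overview.
COMPLEXITY_LEVELS = ("basic", "intermediate", "advanced")


@dataclass
class CorpusRecord:
    """Hypothetical schema for one natural-language/MeTTa pair."""
    prompt: str        # natural-language description of the task
    metta_code: str    # generated MeTTa code (stored as plain text)
    complexity: str    # one of COMPLEXITY_LEVELS
    domain: str        # free-form domain label (assumed field)

    def __post_init__(self) -> None:
        # Reject records whose level is not one of the three named levels.
        if self.complexity not in COMPLEXITY_LEVELS:
            raise ValueError(f"unknown complexity level: {self.complexity!r}")


def to_jsonl_line(record: CorpusRecord) -> str:
    """Serialize a record to one line of JSON (JSON Lines style)."""
    return json.dumps(asdict(record), ensure_ascii=False)


record = CorpusRecord(
    prompt="Define a function that doubles a number.",
    metta_code="(= (double $x) (* 2 $x))",
    complexity="basic",
    domain="general",
)
line = to_jsonl_line(record)
```

Keeping one record per line makes it straightforward to append, filter, and count examples as the corpus grows across milestones.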

RFP Guidelines

Create corpus for NL-to-MeTTa LLM

Internal Proposal Review
  • Type SingularityNET RFP
  • Total RFP Funding $70,000 USD
  • Proposals 10
  • Awarded Projects n/a
SingularityNET
Aug. 13, 2024

Develop a MeTTa language corpus to enable the training or fine-tuning of an LLM and/or LoRAs aimed at supporting developers by providing a natural language coding assistant for the MeTTa language.

Proposal Description

Proposal Details Locked…

In order to protect this proposal from being copied, all details are hidden until the end of the submission period. Please come back later to see all details.

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

  • Total Milestones 6
  • Total Budget $35,000 USD
  • Last Updated 7 Dec 2024

Milestone 1 - Prototype Development

Description

Develop an initial version of the synthetic code dataset generator, integrating Qwen 2.5 to generate MeTTa examples. We'll use an orchestration framework to check that equivalent code examples in different languages (Python, JavaScript, etc.) produce equal outputs.
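The output-equality check described above could look roughly like the following. This is a sketch under stated assumptions, not the proposal's actual orchestration framework: the helper names `run_command` and `outputs_match` are hypothetical, and two Python one-liners stand in for the per-language runners, since the command used to execute a MeTTa program is not specified in the proposal.

```python
import subprocess
import sys
from typing import Sequence


def run_command(cmd: Sequence[str], timeout: float = 10.0) -> str:
    """Run one program and return its standard output with surrounding whitespace removed.

    Raises subprocess.CalledProcessError if the program exits with a non-zero status.
    """
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout.strip()


def outputs_match(commands: Sequence[Sequence[str]]) -> bool:
    """Return True when every command prints the same output as the first one."""
    outputs = [run_command(cmd) for cmd in commands]
    return all(out == outputs[0] for out in outputs)


# Stand-ins for "the same task in two languages". In practice each entry would
# invoke a different interpreter (Python, Node.js, a MeTTa runner, and so on).
reference = [sys.executable, "-c", "print(sum(range(5)))"]
candidate = [sys.executable, "-c", "print(0 + 1 + 2 + 3 + 4)"]

equivalent = outputs_match([reference, candidate])
```

Comparing stripped standard output is the simplest notion of equality; a real pipeline would likely also need to normalize formatting differences between languages (for example, how each one prints lists or floating-point numbers).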

Deliverables

Script to fine-tune Qwen, plus an open-source coding dataset to be translated to MeTTa

Budget

$5,000 USD

Success Criterion

Fine-tuned Qwen model + human-filtered code dataset

Milestone 2 - Develop 2000

Description

Develop 2000 code examples (MeTTa)

Deliverables

Develop 2000 code examples (MeTTa)

Budget

$6,000 USD

Success Criterion

Working 2000 examples (MeTTa)

Milestone 3 - Develop 2000 - 4000

Description

Develop a further 2000 code examples (MeTTa), bringing the total to 4000

Deliverables

Develop 2000 code examples (MeTTa)

Budget

$6,000 USD

Success Criterion

Working 2000 examples (MeTTa)

Milestone 4 - Develop 4000 - 6000

Description

Develop a further 2000 code examples (MeTTa), bringing the total to 6000

Deliverables

Develop 2000 code examples (MeTTa)

Budget

$6,000 USD

Success Criterion

Working 2000 examples (MeTTa)

Milestone 5 - Develop 6000 - 8000

Description

Develop a further 2000 code examples (MeTTa), bringing the total to 8000

Deliverables

Develop 2000 code examples (MeTTa)

Budget

$6,000 USD

Success Criterion

Working 2000 examples (MeTTa)

Milestone 6 - Develop 8000 - 10000

Description

Develop a further 2000 code examples (MeTTa), bringing the total to 10000

Deliverables

Develop 2000 code examples (MeTTa)

Budget

$6,000 USD

Success Criterion

Working 2000 examples (MeTTa)
