Multi-objective EAs for LLM multiparameter tuning

Luke Mahoney (MLabs)
Project Owner

Overview

Tuning hyperparameters when building an LLM, or any DNN, is a difficult task: the hyperparameters are interdependent with each other, with the data, and with the desired learning outcomes, and even determining acceptable values is a difficult process with little to no guidance. The process is therefore tedious, repetitive, and unlikely to succeed. To address this, we will design, implement, and test an evolutionary algorithm (EA) based solution to the hyperparameter tuning problem, built to be extensible to various choices of hyperparameters and measurable objectives. To demonstrate its efficacy, we will use NanoGPT as a test bed, tuning context window size and vocabulary size, and measuring efficacy by comparing loss.
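
As a sketch of the overall approach, the illustrative loop below evolves (context window size, vocabulary size) pairs against a loss-returning `evaluate` callable. The bounds, truncation selection, and mutation steps are assumptions for exposition only; the actual design is what Milestone 1 will produce.

```python
# Illustrative generational loop; `evaluate(genotype) -> loss` stands in for
# training NanoGPT with the candidate hyperparameters (lower loss is better).
import random

def mutate(genotype):
    ctx, vocab = genotype
    # Perturb one gene, clamped to illustrative bounds.
    if random.random() < 0.5:
        ctx = min(1024, max(64, ctx + random.choice([-64, 64])))
    else:
        vocab = min(50304, max(256, vocab + random.choice([-1024, 1024])))
    return (ctx, vocab)

def run_ea(evaluate, population_size=20, generations=10):
    population = [(random.randrange(64, 1025), random.randrange(256, 50305))
                  for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate)    # lower loss first
        parents = ranked[: population_size // 2]     # truncation selection
        children = [mutate(random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return min(population, key=evaluate)
```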

RFP Guidelines

Evolutionary algorithms for training transformers and other DNNs

Internal Proposal Review
  • Type: SingularityNET RFP
  • Total RFP Funding: $40,000 USD
  • Proposals: 8
  • Awarded Projects: n/a
SingularityNET
Aug. 12, 2024

Explore and demonstrate the use of evolutionary methods (EMs) for training various DNNs including transformer networks. Such exploration could include using EMs to determine model node weights, and/or using EMs to evolve DNN/LLM architectures. Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is an example of one very promising evolutionary method among others.
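
The RFP singles out CMA-ES, so for concreteness here is a minimal sketch of CMA-ES driving such a search via the `cma` package (pip install cma). The encoding of the integer hyperparameters as scaled reals, the bounds, and the `train_nanogpt` callable are illustrative assumptions, not part of the RFP.

```python
# Minimal CMA-ES loop using pycma's ask/tell interface.
import cma

def tune(train_nanogpt):
    """train_nanogpt(context_window, vocab_size) -> validation loss (hypothetical)."""
    def loss(x):
        ctx = max(64, int(round(x[0] * 1024)))       # decode gene 0
        vocab = max(256, int(round(x[1] * 50304)))   # decode gene 1
        return train_nanogpt(ctx, vocab)

    es = cma.CMAEvolutionStrategy([0.25, 0.25], 0.1, {"bounds": [0.01, 1.0]})
    while not es.stop():
        xs = es.ask()                        # sample a population of candidates
        es.tell(xs, [loss(x) for x in xs])   # update mean and covariance
    return es.result.xbest
```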

Proposal Description


  • Total Milestones: 4
  • Total Budget: $40,000 USD
  • Last Updated: 5 Dec 2024

Milestone 1 - Design

Description

We will investigate the literature on EAs to produce implementable designs (or choices) for the following:

- A multi-objective evolutionary selection strategy
- A suitable genotype representation for the hyperparameters we aim to tune
- Suitable evolutionary operations (mutation and crossover)
- An approach to avoid the rebuilding problem
- An approach to avoid the parallelism problem

We will then combine these designs and choices into a single implementable design for an EA; a sketch of one possible genotype and its operators follows this list.
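
As a concrete illustration of what the genotype and operator designs might look like, here is a minimal sketch over the two hyperparameters we target; bounds, step sizes, and names are assumptions, not the finished design.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Genotype:
    context_window: int   # illustrative bounds: 64..1024
    vocab_size: int       # illustrative bounds: 256..50304

def mutate(g: Genotype, rng: random.Random) -> Genotype:
    # Symmetric, bounded perturbation of one gene; both directions are
    # equally likely, keeping the operator fair in the EA sense.
    if rng.random() < 0.5:
        ctx = min(1024, max(64, g.context_window + rng.choice([-64, 64])))
        return Genotype(ctx, g.vocab_size)
    vocab = min(50304, max(256, g.vocab_size + rng.choice([-1024, 1024])))
    return Genotype(g.context_window, vocab)

def crossover(a: Genotype, b: Genotype, rng: random.Random) -> Genotype:
    # Uniform crossover: each gene is drawn from either parent with equal odds.
    return Genotype(rng.choice([a.context_window, b.context_window]),
                    rng.choice([a.vocab_size, b.vocab_size]))
```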

Deliverables

1. A design or choice for a genotype representation for context window size and vocabulary size.
2. A design or choice for evolutionary operations (mutation and crossover) for the genotype representation in 1.
3. A design or choice for a multi-objective evolutionary selection operation.
4. A strategy for avoiding the rebuilding problem.
5. A strategy or approach for parallelising the EA.
6. An overall design combining 1-5 in a single approach.

Budget

$8,800 USD

Success Criterion

1. A combined design for an EA, with appropriate genotype, evolutionary operations, multi-objective selection, and strategies for handling parallelization and rebuilding.
2. All parts of the design from 1 are specified in a document, with an emphasis on implementation considerations.
3. The evolutionary operations are fair (in the EA sense).
4. All design choices are grounded in existing EA work, particularly the multi-objective optimization function.
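
For the multi-objective selection operation, one well-grounded candidate is Pareto-based selection in the style of NSGA-II. Below is a minimal sketch of the dominance check and non-dominated filter such a scheme rests on, assuming every objective is minimised (e.g. validation loss and training time); it is an illustration, not the selected design.

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of non-dominated individuals, i.e. the first-front survivors."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```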

Milestone 2 - Development

Description

Using the design from Milestone 1, we will implement an executable program. This executable will metaheuristically search for choices of context window size and vocabulary size using NanoGPT and the complete works of Shakespeare as the data set. As part of this work, we will design an IPC-like interface that allows our executable to invoke NanoGPT with specific hyperparameters and receive loss information in return; a sketch of such a wrapper follows. Our implementation must allow easy modification of the population size and the number of generations to run, to serve future benchmarking.
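
A minimal sketch of what such a wrapper could look like follows. It assumes a patched NanoGPT trainer: stock NanoGPT accepts `--block_size`-style overrides through its configurator, but derives vocabulary size from the dataset, so the `--vocab_size` flag and the `losses.json` report here are hypothetical additions.

```python
# Hypothetical wrapper around a patched NanoGPT train.py.
import json
import subprocess
import tempfile
from pathlib import Path

def evaluate(block_size: int, vocab_size: int, max_iters: int = 2000) -> float:
    """Train NanoGPT with the given hyperparameters; return validation loss."""
    with tempfile.TemporaryDirectory() as out_dir:
        subprocess.run(
            ["python", "train.py", "config/train_shakespeare_char.py",
             f"--out_dir={out_dir}",
             f"--block_size={block_size}",   # context window size
             f"--vocab_size={vocab_size}",   # hypothetical patched flag
             f"--max_iters={max_iters}"],
            check=True,
        )
        # Assumes the patched trainer writes its final losses as JSON.
        report = json.loads((Path(out_dir) / "losses.json").read_text())
        return report["val_loss"]
```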

Deliverables

1. A working implementation of the Milestone 1 design.
2. A wrapper or testing rig for NanoGPT that allows making requests from outside the program to train a model given our choice of hyperparameters and returning loss information.
3. The implementation from 1 must have easily modifiable population size and generation counts, such as via a CLI option (see the sketch after this list).
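
For deliverable 3, a minimal CLI sketch using argparse; the option names and defaults are illustrative, not final.

```python
import argparse

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="EA-based hyperparameter tuner for NanoGPT")
    parser.add_argument("--population-size", type=int, default=20,
                        help="number of genotypes per generation")
    parser.add_argument("--generations", type=int, default=10,
                        help="number of generations to evolve")
    return parser.parse_args()
```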

Budget

$14,300 USD

Success Criterion

1. The Milestone 1 design is implemented successfully as an executable, and can be run on problem instances.
2. The testing rig for NanoGPT responds correctly to a structured request from outside its own process, returning the loss information in a structured form.
3. The implementation from 1 can be configured (such as via CLI or config file) to modify population size and generations without requiring a rebuild or code changes.

Milestone 3 - Testing and Benchmarking

Description

We will use the executable and modified NanoGPT from Milestone 2 to test and benchmark our solution. In particular, we will verify the following:

- How well we optimize loss relative to the default NanoGPT parameters on the same data set;
- How long each run of our executable takes given the same number of generations;
- How much improvement we gain by adding more generations to our EA;
- How much improvement we gain by increasing the population size of our EA;
- How much parallel resource utilization we gain on a typical run; and
- How many times we require a rebuild, and how many of these (if any) are redundant (a sketch of one way to measure this follows).
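
One simple way to measure rebuilds and redundancy is sketched below: a genotype-keyed cache that counts every repeated evaluation as a redundant rebuild. Names are illustrative; `evaluate` stands for the NanoGPT wrapper from Milestone 2.

```python
from typing import Callable, Dict, Tuple

class RebuildCounter:
    """Wraps an evaluation function; counts actual and redundant rebuilds."""

    def __init__(self, evaluate: Callable[[int, int], float]):
        self._evaluate = evaluate
        self._cache: Dict[Tuple[int, int], float] = {}
        self.rebuilds = 0    # model rebuilds actually performed
        self.redundant = 0   # repeat requests a cache would have avoided

    def loss(self, block_size: int, vocab_size: int) -> float:
        key = (block_size, vocab_size)
        if key in self._cache:
            self.redundant += 1
        else:
            self.rebuilds += 1
            self._cache[key] = self._evaluate(block_size, vocab_size)
        return self._cache[key]
```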

Deliverables

1. Comparison of how well the Milestone 2 executable's tuned hyperparameters optimize loss relative to NanoGPT defaults.
2. A benchmark of the execution time of the Milestone 2 executable given a fixed population size and generation count.
3. Degree of improvement in hyperparameter tuning at different numbers of generations for the Milestone 2 approach.
4. Degree of improvement in hyperparameter tuning with different population sizes for the Milestone 2 approach.
5. Measurement of utilization of parallel resources when running the Milestone 2 executable.
6. Measurement of neural network rebuilds when running the Milestone 2 executable, as well as how many of these are technically redundant.

Budget

$4,400 USD

Success Criterion

1. The Milestone 2 executable outperforms NanoGPT defaults.
2. Parallel saturation of at least 4 cores is obtained.
3. The number of rebuilds is sublinear in the number of evaluations, or at worst linear with an expected factor of less than 1.

Milestone 4 - Optimization

Description

Using the benchmarks and measurements gathered in Milestone 3, we will attempt to improve the performance of the Milestone 2 executable. We will prioritize improvements in the following order:

1. Better utilization of available parallel resources (a sketch of one approach follows this list).
2. Fewer rebuilds of the neural network architecture, with particular emphasis on eliminating redundant rebuilds.
3. Improving the quality of the outcome with fewer generations.
4. Improving the overall runtime or memory use more generally.

We will then implement an optimized version of the Milestone 2 executable and compare it against the same benchmarks used to measure the Milestone 2 executable in Milestone 3.
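
For the first priority, a minimal sketch of evaluating a population across worker processes; the placeholder fitness function stands in for the NanoGPT wrapper, and the worker count is illustrative.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Tuple

def _fitness(genotype: Tuple[int, int]) -> float:
    # Placeholder: the real tuner would run the NanoGPT wrapper here,
    # one training run per worker process.
    block_size, vocab_size = genotype
    return float(block_size * vocab_size)  # stands in for validation loss

def evaluate_population(population: List[Tuple[int, int]],
                        max_workers: int = 4) -> List[float]:
    # Results keep input order, so selection can pair losses with genotypes.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_fitness, population))

if __name__ == "__main__":
    print(evaluate_population([(128, 5000), (256, 20000)]))
```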

Deliverables

1. An optimized version of the Milestone 2 executable, produced on the basis of the Milestone 3 measurements and benchmarks, given the priorities specified above.
2. Comparisons between the Milestone 2 executable and its optimized version from 1, using the Milestone 3 benchmarks. These must show improvements in the optimized version relative to the Milestone 3 baselines.

Budget

$12,500 USD

Success Criterion

The optimized executable outperforms its Milestone 2 equivalent on at least one of the criteria specified above.
