Evolving DNN Architectures

Patrick Nercessian
Project Owner

Status

  • Overall Status

    ⏳ Contract Pending

  • Funding Transferred

    $0 USD

  • Max Funding Amount

    $40,000 USD

Funding Schedule

  • Milestone Release 1: $12,000 USD (Pending, TBD)
  • Milestone Release 2: $8,000 USD (Pending, TBD)
  • Milestone Release 3: $8,000 USD (Pending, TBD)
  • Milestone Release 4: $12,000 USD (Pending, TBD)

Project AI Services

No Service Available

Overview

The goal of this project is to create a framework that uses evolutionary computation to design new neural network architectures for natural language prediction. The idea is to build on the success of transformers while exploring entirely new design possibilities that could outperform them. By simulating evolution—introducing variations, selecting the best-performing models, and iterating—the framework aims to uncover architectures that go beyond what human researchers have created. We aim to represent neural networks as directed acyclic graphs and to perform mutation and crossover on those representations. We also plan to allow mutation and crossover of model hyperparameters.

RFP Guidelines

Evolutionary algorithms for training transformers and other DNNs

Complete & Awarded
  • Type SingularityNET RFP
  • Total RFP Funding $40,000 USD
  • Proposals 8
  • Awarded Projects 1
SingularityNET
Aug. 12, 2024

Explore and demonstrate the use of evolutionary methods (EMs) for training various DNNs including transformer networks. Such exploration could include using EMs to determine model node weights, and/or using EMs to evolve DNN/LLM architectures. Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is an example of one very promising evolutionary method among others.

Proposal Description

Company Name (if applicable)

Vectorial

Project details

Vectorial consists of 5 software, AI, and ML engineers. For 2 years, we have implemented advanced LLM and ML solutions for our customers. Our expertise originates from several advanced degrees in Data Science, Computer Science, and Machine Learning. Our engineers also have working experience at companies such as AWS, Oracle, Home Depot, and Fortune 50 financial corporations. Our President, Patrick Nercessian, conducted very similar research on evolving convolutional neural networks; that work is linked in the Links and References section.

The goal of this project is to create a framework (wrapped in a Hyperon Neural Atomspace) that uses evolutionary computation to design new neural network architectures for natural language prediction. The idea is to build on the success of transformers while exploring entirely new design possibilities that could outperform them. By simulating evolution—introducing variations, selecting the best-performing models, and iterating—the framework aims to uncover architectures that go beyond what human researchers have created.

Rather than evolving from scratch, we can take a cue from nature and evolve from an existing population of organisms (neural network architectures) that are already “fit”. Transformers have revolutionized Natural Language Processing, Computer Vision, Speech and Audio Processing, Generative Speech and Audio, and Generative Image and Video. Many attribute this success to the architecture’s ability to process vast quantities of data in parallel.

Thus, this project aims to start by randomly generating initial populations of standard decoder-only transformers with varying hyperparameters. These transformers would be represented as an intermediate data structure, such as a directed acyclic graph (DAG), in order to allow for variation operators (i.e. mutations and/or recombination). Because the goal is entirely new architectures, we need to ensure a large amount of flexibility with regard to the possible architectural variation operators. However, a naive approach would produce inherently invalid models (e.g. by attempting to multiply a pair of incompatible matrices). Thus, effort will be needed to ensure or promote *valid* variation operators, or possibly a form of post hoc self-healing.
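To make the validity problem concrete, here is a minimal sketch of how a shape-aware DAG representation and a self-healing width mutation might look. All names (`Node`, `is_valid`, `mutate_width`) are hypothetical illustrations, not the project's actual data model:

```python
import random

# Hypothetical sketch: each node records its operation and output width so
# that an invalid mutation can be detected before the model is ever built.
class Node:
    def __init__(self, op, out_dim, inputs=()):
        self.op = op                # e.g. "embed", "linear", "add"
        self.out_dim = out_dim      # output feature width
        self.inputs = list(inputs)  # producer nodes (the DAG's edges)

def is_valid(nodes):
    """Elementwise 'add' nodes need all input widths (and their own) to agree;
    'linear' nodes may project any input width, so they are always shape-safe."""
    for n in nodes:
        if n.op == "add":
            dims = {m.out_dim for m in n.inputs} | {n.out_dim}
            if len(dims) > 1:
                return False
    return True

def mutate_width(nodes, rng):
    """Randomly rescale one node's width, reverting if the graph becomes invalid."""
    target = rng.choice(nodes)
    old = target.out_dim
    target.out_dim = rng.choice([64, 128, 256])
    if not is_valid(nodes):
        target.out_dim = old  # one simple form of post hoc self-healing
    return nodes
```

Rejecting-and-reverting is only one option; repairing the graph (e.g. inserting a projection) would preserve more of the mutation's effect at the cost of extra parameters.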

Further variation operators would take the form of neural network hyperparameters, such as optimizers, learning rates and their schedules/strategies, activation functions, etc.
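A hedged sketch of what hyperparameter-level variation could look like, treating the training configuration as a flat genome (the search space values below are placeholders, not the project's chosen ranges):

```python
import random

# Hypothetical hyperparameter search space; each key is one "gene".
SEARCH_SPACE = {
    "optimizer": ["adamw", "sgd", "adafactor"],
    "activation": ["gelu", "relu", "silu"],
    "lr": [1e-4, 3e-4, 1e-3],
    "lr_schedule": ["cosine", "linear", "constant"],
}

def mutate(genome, rng, rate=0.25):
    """Resample each hyperparameter independently with probability `rate`."""
    return {k: (rng.choice(SEARCH_SPACE[k]) if rng.random() < rate else v)
            for k, v in genome.items()}

def crossover(a, b, rng):
    """Uniform crossover: each gene is inherited from a random parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in a}
```

Unlike architectural operators, these cannot produce structurally invalid models, which is why hyperparameter variation is often the easier half of the search.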

This project will incorporate biases during the evolutionary process to increase the chances of favorable variation operators. For example, one such favorable outcome might be modularized neural networks, since many successful modern neural network architectures are often repeatable modules stacked on top of each other.

To evaluate the evolved architectures, we will implement a pipeline for translating DAG representations into fully operational neural networks. These models will be trained using publicly available datasets, such as a subset of FineWeb’s sample-10BT. Primary evaluation will focus on perplexity, which measures the model's ability to predict test set data accurately.
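Perplexity as used here is the standard next-token metric: the exponential of the mean per-token negative log-likelihood on held-out data. A minimal illustration:

```python
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities the model assigned to each
    ground-truth token in the test set. Lower perplexity is better."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)
```

As a sanity check, a model that guesses uniformly over a vocabulary of size V assigns each token probability 1/V and therefore scores a perplexity of exactly V; any evolved architecture must do substantially better than that baseline to be worth selecting.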

Selection experiments will systematically vary evolutionary hyperparameters, including mutation rates, crossover probabilities, population sizes, and selection pressures. This diversity in configurations will allow for an in-depth exploration of the search space and identification of parameter sets that balance exploration and exploitation.
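Selection pressure in particular has a simple knob. As one illustrative (not prescribed) scheme, tournament selection exposes it as the tournament size `k`: with `k = 1` selection is random drift, while a large `k` strongly favours the current best, trading exploration for exploitation:

```python
import random

def tournament_select(population, fitness, k, rng):
    """Pick the fittest of k randomly sampled individuals
    (lower fitness = better, matching a perplexity-style metric)."""
    contestants = rng.sample(population, k)
    return min(contestants, key=fitness)

def next_generation(population, fitness, k, rng):
    """Fill the next generation by repeated tournaments (with replacement)."""
    return [tournament_select(population, fitness, k, rng)
            for _ in population]
```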

Open Source Licensing

MIT - Massachusetts Institute of Technology License

Links and references

https://www.vectorial.us

https://pjnercessian.wordpress.com/2023/11/11/evolving-convolutional-neural-networks/

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

Group Expert Rating (Final)

Overall

5.0

  • Compliance with RFP requirements 4.8
  • Solution details and team expertise 4.3
  • Value for money 4.8

New reviews and ratings are disabled for Awarded Projects

Overall Community

4.5

from 5 reviews
  • 5 stars: 2
  • 4 stars: 2
  • 3 stars: 0
  • 2 stars: 0
  • 1 star: 0

Feasibility

4.8

from 5 reviews

Viability

4.3

from 5 reviews

Desirability

4.8

from 5 reviews

Usefulness

0

from 5 reviews

5 ratings
  • Expert Review 1

    Overall

    4.0

    • Compliance with RFP requirements 4.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    Focused

    While the proposal is naive compared to modern approaches such as NEAT, the team, correct scope, and clarity lead me to believe it has a high chance of success.

  • Expert Review 2

    Overall

    5.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    Great proposal

Strong alignment with the RFP, backed by a capable team with relevant expertise. Technically robust – a promising candidate for funding.

  • Expert Review 3

    Overall

    5.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    A solid proposal by a team who seems to know the field quite well

    The approach suggested makes a lot of sense... I would have preferred a little more detail on what probabilistic mutation / crossover approaches they are thinking of, but this will in the end be determined experimentally... the interesting question I suppose is will they incorporate dependencies beyond what CMA-ES does...

  • Expert Review 4

    Overall

    4.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 4.0
    • Value for money 0.0

    Solid detailed proposal adhering strictly to the RFP with a few new ideas. Appears to be a solid team.

  • Total Milestones

    4

  • Total Budget

    $40,000 USD

  • Last Updated

    3 Feb 2025

Milestone 1 - Framework Development

Status
😐 Not Started
Description

1. Develop a mechanism to represent neural network architectures (including transformers) as DAGs. 2. Implement hyperparameter mutation and recombination operators. 3. Implement architectural mutation and recombination operators which ensure valid architectures.

Deliverables

The creation of a framework which can create artificial neural networks from DAGs. By training on very small amounts of data, we will also confirm the framework’s ability to generate valid architectures using mutation and recombination.

Budget

$12,000 USD

Success Criterion

success_criteria_1

Link URL

Milestone 2 - Initial Experiments

Status
😐 Not Started
Description

1. Integrate evaluation and selection mechanisms using standard NLP datasets and fitness metrics. 2. Run initial evolutionary experiments. Try different neural network sizes and dataset sizes to generate data about how these experiments and architectures differ across scale.

Deliverables

An updated framework which is feature-complete for evolutionary runs, plus the output of the evolutionary experiments, including perplexity on next-word prediction and other related evolutionary output metrics.

Budget

$8,000 USD

Success Criterion

success_criteria_1

Link URL

Milestone 3 - Refinement of Operators

Status
😐 Not Started
Description

1. Develop advanced mutation and recombination operators incorporating biases toward favorable traits such as modularity or sparsity. 2. Run further experiments using these updated operators including creative population initialization (such as portions of the population having varying subsets or magnitudes of these new biases).

Deliverables

An updated framework which includes these advanced variation operators, aimed at improving the search of the evolutionary algorithm. The output of the evolutionary experiments including perplexity on next-word prediction and other related evolutionary output metrics.

Budget

$8,000 USD

Success Criterion

success_criteria_1

Link URL

Milestone 4 - Identifying and Scaling Final Model

Status
😐 Not Started
Description

1. Scale up the best-performing architectures for larger-scale training runs. 2. Compare the scaled models’ performance against known transformers of similar scale using training/test metrics and benchmark results. 3. Document findings and release codebase with comprehensive instructions and supporting documentation.

Deliverables

Larger-scale training runs to demonstrate the scalability of the best output model architectures from the evolutionary runs, plus the final codebase and a report of our findings over the entire project.

Budget

$12,000 USD

Success Criterion

success_criteria_1

Link URL


