Hierarchical reward functions

Tofara Moyo
Project Owner

Expert Rating

n/a

Overview

A critic will output scalar values that are ordered in stages and used as rewards for the agent. The agent's reward will not be the global reward R but a shaped reward r, where r = R*s + R and s is a scalar output by the critic. The critic will be trained solely on R. The equation is designed to induce a two-step hierarchy of needs in the agent. Because s and R are correlated, the agent reaches maximal values of r only by taking actions that optimize s in a way that ultimately optimizes R. A large s paired with a small R yields a smaller r than a large R paired with a small s, and r is greatest when both are large. Our initial experiments show that this process works. A two-stage network would use r = R*(s1*s2 + s1) + R.
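The two reward equations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the proposal's implementation: the function names and the numeric values are assumptions chosen for the example, and the critic that would produce s, s1, and s2 is not modeled here.

```python
def shaped_reward(R, s):
    """One-stage hierarchy from the overview: r = R*s + R."""
    return R * s + R


def shaped_reward_two_stage(R, s1, s2):
    """Two-stage hierarchy from the overview: r = R*(s1*s2 + s1) + R."""
    return R * (s1 * s2 + s1) + R


# Hypothetical values, used only to compare the three cases in the text.
large_R_small_s = shaped_reward(R=0.9, s=0.1)
small_R_large_s = shaped_reward(R=0.1, s=0.9)
both_large = shaped_reward(R=0.9, s=0.9)

# A large R with a small s beats a small R with a large s,
# and both being large beats either mixed case.
print(small_R_large_s < large_R_small_s < both_large)
```

Swapping the values of R and s changes only the additive R term, since the product R*s is the same in both cases. This is why the case with the larger R gives the larger r.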

RFP Guidelines

Develop a framework for AGI motivation systems

Internal Proposal Review
  • Type SingularityNET RFP
  • Total RFP Funding $40,000 USD
  • Proposals 13
  • Awarded Projects n/a
SingularityNET
Aug. 13, 2024

Develop a modular and extensible framework for integrating various motivational systems into AGI architectures, supporting both human-like and alien digital intelligences. This could be done as a highly detailed and precise specification, or as a relatively simple software prototype with suggestions for generalization and extension.

Proposal Description

Proposal Details Locked…

In order to protect this proposal from being copied, all details are hidden until the end of the submission period. Please come back later to see all details.

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

  • Total Milestones

    3

  • Total Budget

    $30,000 USD

  • Last Updated

    5 Dec 2024

Milestone 1 - Created graph version of hierarchy

Description

We will have created a more complex version of the critic that outputs a graph of values rather than a vector of scalars.

Deliverables

Code that is debugged and ready for training

Budget

$10,000 USD

Success Criterion

Code tested on a small dataset

Milestone 2 - Tested algorithm

Description

We will have tested the algorithm in many scenarios to see how it performs, benchmarking it against the simpler version.

Deliverables

Table of data showing the results of the experiments

Budget

$10,000 USD

Success Criterion

Published results

Milestone 3 - Fine-tuned

Description

We will have fine-tuned the algorithm's hyperparameters, carried out extensive testing, and produced reports.

Deliverables

Reports showing the performance of our algorithm

Budget

$10,000 USD

Success Criterion

Reports

Expert Ratings

Reviews & Ratings

  • Expert Review 1

    Overall

    4.0

    • Compliance with RFP requirements 3.0
    • Solution details and team expertise 5.0
    • Value for money 4.0
    Essential component for motivational frameworks

    The proposal aims to enhance an embryonic, but fundamental approach to inducing hierarchical reward structures in artificial agents based on Markov decision-making and tested on the pendulum with excellent results. The approach has the potential to become an essential component of motivational frameworks given that hierarchical reward structures are the "engine" of motivation, a critical "brick" in the structure on which motivational frameworks for various forms of intelligence can be built. While not addressing the full spectrum of the Call, it is of higher value than other, more "thin air" but comprehensive proposals, given its clarity and potential for seamless implementation. 
