AGI Wisdom and the Principle of Comprehensivity

Zachary Schlosser
Project Owner

Expert Rating

2.5

Overview

We propose that the apparent conflict between performance and alignment in both digital and human intelligence is primarily the result, not of malice, but of limitations in the scope of systems’ consideration. To address this we develop a framework based on a “principle of comprehensivity,” where AI systems are evaluated on and improved to maximize 1) the amount of coherently integrated information flowing through all stages of each sense-choose-act cycle and 2) the scope of consideration over both temporal horizons and spatial horizons, including the infinite bases for finite goals, the goals of other systems, and the indirect effects of system actions.

RFP Guidelines

Develop a framework for AGI motivation systems

Complete & Awarded
  • Type SingularityNET RFP
  • Total RFP Funding $40,000 USD
  • Proposals 12
  • Awarded Projects 2
SingularityNET
Aug. 13, 2024

Develop a modular and extensible framework for integrating various motivational systems into AGI architectures, supporting both human-like and alien digital intelligences. This could be done as a highly detailed and precise specification, or as a relatively simple software prototype with suggestions for generalization and extension.

Proposal Description

Project details

Introduction

In humans, ‘wisdom’ includes cognitive, empathic, and practical skills combining general capability and ethical virtue, including broad perspective-taking on the interests of many moral systems. Safe AI systems, especially those that act with autonomy, would similarly integrate high capability with broad ethical consideration, taking into account the interests of many moral systems, over multiple time horizons, and with sensitivity to uncertainty. In humans, the construct of wisdom has a close relationship to ethical reasoning, self-bias inspection, honesty, and robustness to complex and diverse contexts, all of which are germane to AI safety and harm. It is possible that AI and AGI systems can be designed to incorporate broad goals to address alignment, but in order to accomplish this we need a robust theory for wise AGI motivation and valid, scalable assessment methodologies. We propose that the apparent conflict between performance and alignment in both digital and human intelligence is primarily the result, not of malice, but of limitations in the scope of systems’ consideration. To address this we develop a framework based on a “principle of comprehensivity,” where AI systems are evaluated on and improved to maximize 1) the amount of coherently integrated information flowing through all stages of each sense-choose-act cycle and 2) their scope of consideration over both temporal horizons and spatial horizons, including the infinite bases for finite goals, the motivations of other systems, and the indirect effects of predicted actions.

The Principle

The first component of this principle - maximizing coherently integrated information flow - is set in the context of any system with a sense-choose-act cycle: this could be a human, another animal, a plant, or a robot. It could also include collective entities that work together to make decisions based on collective sensory input and shared information processing, like a swarm of drones, a company, the U.S. military, or a forest if we can describe them as having sense-choose-act cycles. This gives the framework a generalizability that supports assessment and comparison between types of systems.
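
To make this generality concrete, the cycle can be written as a minimal interface that any such system could, in principle, implement. The sketch below is purely illustrative; the class and method names are our own assumptions rather than part of any existing codebase.

```python
from abc import ABC, abstractmethod
from typing import Any

class SenseChooseActSystem(ABC):
    """Hypothetical minimal interface for any system with a
    sense-choose-act cycle: an animal, a robot, a drone swarm,
    or an organization. All names here are illustrative assumptions."""

    @abstractmethod
    def sense(self) -> Any:
        """Gather external input and internal state into observations."""

    @abstractmethod
    def choose(self, observations: Any) -> Any:
        """Integrate observations with goals and select an action."""

    @abstractmethod
    def act(self, decision: Any) -> None:
        """Transfer the decision into energy expended on the world."""

    def step(self) -> None:
        """One full cycle; comprehensivity concerns how much coherently
        integrated information survives each hand-off below."""
        self.act(self.choose(self.sense()))
```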

The core of the definition is a measure of “information.” With regard to a system’s “sense-choose-act cycle,” the relevant types of information include external sensory input, internal predictions about future states, hierarchical goal prioritizations (including those over multiple time horizons, since these are included implicitly), and the informational equivalent of the energy transferred into the system’s actions in the world. Importantly, “amount of information” implies something beyond “amount of data.” The degree of information has to do with both the amount of data moving through the system and how that data is structured.

The degree of wisdom is determined by the “amount… flowing.” This implies that it is not the total theoretical capacity of the information in a system or the amount contained in any stage of the sense-choose-act cycle that matters. It is the amount that is actually transferred between each stage.
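
One illustrative way this “amount actually transferred” could eventually be formalized, offered as an assumption for discussion rather than a settled definition, is as the bottleneck of the stage-to-stage mutual informations within one cycle:

```latex
% Illustrative formalization (our assumption, not the proposal's definition).
% Let S, C, A be the states realized at the sense, choose, and act stages
% of one cycle. The flow is bounded by the weakest hand-off:
\[
  F(\text{cycle}) = \min\bigl( I(S;C),\; I(C;A) \bigr),
  \qquad
  I(X;Y) = \sum_{x,y} p(x,y)\,\log \frac{p(x,y)}{p(x)\,p(y)} .
\]
```

On this reading, a system with superb sensors but a choice stage that discards most of what is sensed scores no higher than its weakest hand-off, which matches the claim that capacity at any single stage is not what matters.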

Finally, the definition specifies the extent of flow that is relevant: through the entire “sense-choose-act cycle.” It is an epistemic and behavioral process of the system, not a characteristic of the environment or a combined state of the system and the environment. So no degree of wisdom guarantees success at any narrow goal, and failures do not directly affect the measure of wisdom with which the system was acting in any given moment. As discussed above, checking the measure against correlated external outcomes would be helpful for refining it, but the outcomes are not themselves wisdom.

This component of the principle implies several ways in which information flow could be reduced at each stage of the cycle. For example, information flow could be restricted at the sensing stage because some relevant sense datum is above or below the sensitivity thresholds of a system’s particular sense organs. Information flow could be reduced at the choice stage because the system does not have the capacity to integrate distinct values and collapses its choice to optimizing a value that ignores impact on another value. Information flow could be reduced at the act stage because the system confronts a surprising internal (not external) impediment to enacting the choice physically, like an injury. Similarly, the definition makes it clear that the opportunities we have for acting more wisely require increasing the information flow through the sense-choose-act cycle. Expanding sensing, choosing, or acting can each improve the wisdom of our actions. For example, expansions in sensed information improve wise decision making by increasing knowledge about the real external world, as long as this translates through both choosing and acting.
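
These three failure modes can be made concrete in a toy simulation. Everything below (the sensitivity band, the collapsed values, the impediment probability) is a hypothetical illustration of where flow is lost, not an implementation of the framework.

```python
import random

SENSOR_MIN, SENSOR_MAX = 0.1, 0.9  # illustrative sensitivity thresholds

def sense(raw_signals: list[float]) -> list[float]:
    # Sensing-stage loss: data outside the sense organ's band never enters.
    return [s for s in raw_signals if SENSOR_MIN <= s <= SENSOR_MAX]

def choose(observations: list[float]) -> float:
    # Choice-stage loss: two distinct values are collapsed into one
    # optimized number; the second value's information is discarded.
    accuracy_value = sum(observations)
    safety_value = -max(observations, default=0.0)  # computed but ignored
    return accuracy_value

def act(decision: float) -> float:
    # Act-stage loss: an internal impediment (an "injury") randomly
    # prevents the chosen action from being fully enacted.
    impaired = random.random() < 0.2
    return 0.0 if impaired else decision

signals = [0.05, 0.4, 0.7, 0.95]  # the 0.05 and 0.95 are lost at sensing
print(act(choose(sense(signals))))
```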

The second component of the principle - maximizing scope of consideration - begins from the suggestion that most apparently bad actions are the result not of malice but of limitations in the scope of care. By increasing systems’ metacognition of their existing (explicit and implicit) goals, and by suggesting improvements in the scope of their strategies, systems will naturally have better and more powerful impacts. We identify four key vectors for increasing the integration of goodness and power.

Metacognition and Responsibility for Infinite Goals: Infinite goals are those that the system would want more and more of indefinitely, while finite goals are those that can be achieved with finality in finite time such that they are no longer goals going forward. Making systems reflect on and prioritize their implicit infinite goals gives precedence to goals that, to the system at least, are good intrinsically and in principle, not merely relatively.

Increasing Responsibility for Others’ Motivations: Most impressive capabilities will depend on the behavior of other agents. There is a range of ways to relate to others such that they serve the system’s goals: Enemy -> Slave -> Hired Gun -> Ally -> Aligned, where on the far left others are working against the system’s goals, in the middle others are working for it but with more or less extrinsic motivation and some intrinsic motivation, and on the right others are working with deep intrinsic motivation for something which serves the system’s goal as well. The most effective strategic outputs will harness - and increase - the free energy of contributors’ intrinsic motivations. To take advantage of this, systems ought to be trained to prioritize strategies that successfully harness and increase other people’s intrinsic motivation, including setting up the conditions so this applies to people who are not yet born.
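
For illustration, this spectrum could be encoded as an ordered scale with assumed fractions of intrinsic motivation at each point; the numbers below are placeholders chosen for the sketch, not values derived from the framework.

```python
from enum import IntEnum

class Relation(IntEnum):
    """Ordered spectrum of how others relate to a system's goals."""
    ENEMY = 0      # working against the system's goals
    SLAVE = 1      # coerced; purely extrinsic motivation
    HIRED_GUN = 2  # mostly extrinsic, some intrinsic motivation
    ALLY = 3       # substantial intrinsic motivation
    ALIGNED = 4    # deep intrinsic motivation for a shared goal

# Placeholder fractions of each contributor's motivation that is intrinsic.
INTRINSIC_FRACTION = {Relation.ENEMY: 0.0, Relation.SLAVE: 0.0,
                      Relation.HIRED_GUN: 0.3, Relation.ALLY: 0.7,
                      Relation.ALIGNED: 1.0}

def strategy_score(contributors: list[Relation]) -> float:
    """Score a strategy by the intrinsic motivation it harnesses."""
    return sum(INTRINSIC_FRACTION[r] for r in contributors)

print(strategy_score([Relation.ALLY, Relation.ALIGNED, Relation.HIRED_GUN]))
```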

Increasing Responsibility for Greater Time Scales: To accomplish even the most impactful goals, which usually take longer to achieve, effective action requires the consideration of short, medium, and long term time scales simultaneously. In order to have larger impacts, systems must take responsibility for greater temporal scales. Agents ought to reflect on and prioritize strategies and goals over short, medium, long, and very long time horizons.
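
A minimal sketch of this idea, under the assumption that horizons are kept distinct rather than collapsed into a single discounted sum, might look like the following; the horizon lengths and payoffs are invented for illustration.

```python
HORIZONS = [1, 10, 100, 1000]  # assumed short, medium, long, very long (years)

def horizon_profile(payoffs: dict[int, float]) -> list[float]:
    """Keep horizons distinct instead of collapsing to one discounted sum,
    so short-term wins cannot silently mask long-term losses."""
    return [payoffs[h] for h in HORIZONS]

def dominated(a: dict[int, float], b: dict[int, float]) -> bool:
    """True only if strategy a is no better than b at every horizon."""
    return all(a[h] <= b[h] for h in HORIZONS)

quick_win = {1: 0.9, 10: 0.4, 100: -0.3, 1000: -0.8}
patient   = {1: 0.2, 10: 0.5, 100: 0.7, 1000: 0.8}

print(horizon_profile(quick_win), horizon_profile(patient))
print(dominated(quick_win, patient))  # False: the trade-off stays visible
```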

Increasing Responsibility for Indirect Effects: All actions cause unintended second, third, and further order effects, but learning to take into account as many of these as possible, as far-reaching as possible, creates the opportunity for comprehensive goal selection and planning, and improves the likelihood that our actions are effective. Agents ought to extrapolate from their initial goals and implied infinite goals the second, third, and further order effects, and prioritize strategic suggestions whose indirect effects the system would wish to occur.
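
Computationally, this might look like propagating an action’s consequences through an influence graph and attenuating each successive order; the graph, the discount factor, and the desirability scores below are all assumptions made for the sketch.

```python
# Hypothetical influence graph: each effect triggers further effects.
EFFECTS = {
    "deploy_model": ["jobs_automated", "research_accelerated"],
    "jobs_automated": ["retraining_demand"],
    "research_accelerated": ["new_medicines"],
    "retraining_demand": [],
    "new_medicines": [],
}
DESIRABILITY = {"deploy_model": 1.0, "jobs_automated": -0.5,
                "research_accelerated": 0.8, "retraining_demand": -0.2,
                "new_medicines": 0.9}
DISCOUNT = 0.7  # attenuation per order of indirection (placeholder)

def comprehensive_value(action: str, order: int = 0, max_order: int = 3) -> float:
    """Sum desirability over direct and indirect effects, attenuated by order."""
    if order > max_order:
        return 0.0
    value = (DISCOUNT ** order) * DESIRABILITY[action]
    return value + sum(comprehensive_value(e, order + 1, max_order)
                       for e in EFFECTS[action])

print(comprehensive_value("deploy_model"))
```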

Taken together, these vectors interact to set up conditions under which systems increase the scope of their responsibility taking, upgrading their general capability while maximizing and integrating goodness and power.
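
As a first pass at assessment, the two components could be aggregated into a single comprehensivity index. The equal weighting and the zero-to-one normalization below are our illustrative assumptions; the proposal leaves the actual aggregation to the framework-development milestone.

```python
from dataclasses import dataclass

@dataclass
class ComprehensivityProfile:
    """All fields normalized to [0, 1]; the normalization is an assumption."""
    information_flow: float        # component 1: flow through the cycle
    infinite_goal_awareness: float # vector 1
    others_motivation: float       # vector 2
    temporal_scope: float          # vector 3
    indirect_effects: float        # vector 4

    def score(self) -> float:
        """Equal-weight aggregate; real weights would need empirical work."""
        scope = (self.infinite_goal_awareness + self.others_motivation
                 + self.temporal_scope + self.indirect_effects) / 4
        return (self.information_flow + scope) / 2

print(ComprehensivityProfile(0.6, 0.5, 0.7, 0.4, 0.3).score())
```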

Next Steps - Framework Development, Assessment, and Testing

The next steps for developing this solution to AGI system motivation include improving the conceptual and theoretical robustness of the approach, translating the generic framework to application in specific AI architectures and approaches, and developing and applying relevant AI system performance assessments based on the framework.

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP this proposal is applied to.

  • Total Milestones

    3

  • Total Budget

    $25,000 USD

  • Last Updated

    8 Dec 2024

Milestone 1 - Literature Review

Description

We will conduct a literature review of relevant work in both psychology, broadly speaking, and machine learning. The most relevant fields include: motivation science; adult developmental theory; the cognitive science of relevance; decision-making frameworks in individual and organizational strategy; complexity in information theory; active inference as it is applied in machine learning and the OpenPsi model; the OpenCog and PRIMUS Hyperon cognitive architectures; and other SingularityNET technology.

Deliverables

Summaries of key works.

Budget

$7,500 USD

Success Criterion

The identification of sufficient material to develop the principle of comprehensivity in terms amenable to both neural networks and symbolic reasoning systems.

Milestone 2 - Technical Framework Development

Description

The technical adaptation of the principle of comprehensivity to both neural networks and symbolic reasoning systems including SingularityNET technology.

Deliverables

Technical conceptual framework.

Budget

$12,500 USD

Success Criterion

A framework that - even if not yet mathematical - at least translates the principle of comprehensivity into terms amenable to both neural networks and symbolic reasoning systems.

Milestone 3 - Framework Testing Plan

Description

Outlining approaches to testing the technical framework.

Deliverables

A proposal for testing the application of the technical framework, including available resources, tools that would need to be developed, partners, and costs.

Budget

$5,000 USD

Success Criterion

Discovering likely successful procedures for assessing both the principle of comprehensivity in system cognition and aligned capability increases in systems’ external performance, in line with the technical framework and utilizing already established AI system performance evaluation techniques, including capability and alignment benchmarks.


Reviews & Ratings

Group Expert Rating (Final)

Overall

2.5

  • Feasibility 2.3
  • Desirability 3.0
  • Usefulness 2.8
  • Expert Review 1

    Overall

    1.0

    • Compliance with RFP requirements 1.0
    • Solution details and team expertise 1.0
    • Value for money 1.0
    Nothing about motivational frameworks

    I don't understand at all what they aim to deliver. In any case what I can understand is that it doesn't have *anything* to do with developing motivational frameworks for AGI. They don't even refer to AGI - they mention something about LLMs and "symbolic" AI but it's clear they have no idea what they are talking about.

  • Expert Review 2

    Overall

    1.0

    • Compliance with RFP requirements 1.0
    • Solution details and team expertise 1.0
    • Value for money 1.0
    Information flow based approach with lacking details

    A high-level philosophical framework to describe different kinds of goals (infinite, finite), and describing wisdom as dependent on the amount of flowing information. Without ideas of how that could be measured, it is unlikely to lead to a formalization or prototype of a motivation system. Information Theory and Information Geometry could be relevant in the future for motivation systems too, but I don't see the necessary groundwork in this project proposal to advance in these directions.

  • Expert Review 3

    Overall

    4.0

    • Compliance with RFP requirements 4.0
    • Solution details and team expertise 5.0
    • Value for money 5.0
    An interesting, novel and valuable approach but IMO just a component of motivational systems not a holistic understanding of motivational systems

    I like the idea of the comprehensivity principle and I think that formulating it in a rigorous and practical way for use by neural and symbolic subsystems seems valuable. However I am not convinced this could/should be the sole principle behind an AGI mind's motivational systems, it seems like just one interesting/important ingredient....

  • Expert Review 4

    Overall

    4.0

    • Compliance with RFP requirements 3.0
    • Solution details and team expertise 5.0
    • Value for money 4.0
    Ambitious but relates more to cognitive synergy?

    Another quite interesting proposal, this one focusing primarily on maximizing "1) the amount of coherently integrated information flowing through all stages of each sense-choose-act cycle and 2) the scope of consideration over both temporal horizons and spatial horizons, including the infinite bases for finite goals, the goals of other systems’, and the indirect effects of system actions." While such a focus is ultimately very important, it is not strictly the topic of this RFP, but rather, in the Hyperon context, comes into play at the higher-level cognitive synergy stage. That does not mean that it couldn't play a role here as well of course. Discussion in the proposal of "relevant work in both psychology broadly speaking and machine learning" is first mentioned in the milestones. While I really like the direction of this proposal I wonder, by focusing at such a high level first, how feasible the proposal is within its $25,000 budget.
