Neurosymbolics from Chaos in Language Models

Rob Freeman
Project Owner

Neurosymbolics from Chaos in Language Models

Status

  • Overall Status

    ⏳ Contract Pending

  • Funding Transferred

    $0 USD

  • Max Funding Amount

    $80,000 USD

Funding Schedule

View Milestones
Milestone Release 1
$15,000 USD Pending TBD
Milestone Release 2
$10,000 USD Pending TBD
Milestone Release 3
$20,000 USD Pending TBD
Milestone Release 4
$20,000 USD Pending TBD
Milestone Release 5
$10,000 USD Pending TBD
Milestone Release 6
$4,999 USD Pending TBD
Milestone Release 7
$1 USD Pending TBD

Project AI Services

No Service Available

Overview

An experiment to test whether symbolic cognitive structure can be made to emerge simply as chaotic attractors in a network of observed language. This experiment extends earlier work which: 1) demonstrated partial structuring of natural language from ad hoc (chaotic?) re-structurings of vectors, and 2) demonstrated spontaneous oscillations in a network of language sequences. The core idea posits that neuro-symbolic integration has eluded us because what we perceive as symbolic order in the world may actually be chaotic attractors over a fundamentally multiply interpretable reality. If that is so, full neuro-symbolic integration may be easy: we just have to embrace the chaos.

RFP Guidelines

Neuro-symbolic DNN architectures

Complete & Awarded
  • Type SingularityNET RFP
  • Total RFP Funding $160,000 USD
  • Proposals 2
  • Awarded Projects 2
SingularityNET
Oct. 4, 2024

This RFP invites proposals to explore and demonstrate the use of neuro-symbolic deep neural networks (DNNs), such as PyNeuraLogic and Kolmogorov Arnold Networks (KANs), for experiential learning and/or higher-order reasoning. The goal is to investigate how these architectures can embed logic rules derived from experiential systems like AIRIS or user-supplied higher-order logic, and apply them to improve reasoning in graph neural networks (GNNs), LLMs, or other DNNs.

Proposal Description

Project details

In 2022 Ben Goertzel commented in a session of the AGI Discussion Forum:

"For ... decades, which is ridiculous, it's been like, OK, I want to explore these chaotic dynamics and emergent strange attractors, but I want to explore them in a very fleshed out system, with a rich representational capability, interacting with a complex world, and then we still haven't gotten to that system ... Of course, an alternative approach could be taken as you've been attempting, of ... starting with the chaotic dynamics but in a simpler setting. ...

This project is an opportunity for the Primus architecture also to take a glance at that alternative approach and explore its possibilities.

In more detail

In a major departure from all work to date, we assume natural language grammar is chaotic: it cannot be abstracted, only generated constantly anew.

To perform this constant, dynamic expansion of grammatical structure, we encode large bodies of text from a language in a network of word sequences. This is easy to do:
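For concreteness, here is a minimal sketch (not the project's actual codebase) of encoding a corpus as a weighted network of word-to-word transitions; names such as build_sequence_network are illustrative assumptions:

    # Minimal sketch: encode a corpus as a network of observed word sequences.
    # Plain dictionaries stand in for whatever connectivity format the chosen
    # neurosimulator actually uses.
    from collections import defaultdict

    def build_sequence_network(sentences):
        """Return {word: {next_word: count}} for observed word-to-word transitions."""
        edges = defaultdict(lambda: defaultdict(int))
        for sentence in sentences:
            tokens = sentence.lower().split()
            for left, right in zip(tokens, tokens[1:]):
                edges[left][right] += 1
        return edges

    corpus = ["keep them apart", "keep us apart", "stay far away"]
    network = build_sequence_network(corpus)
    print(dict(network["keep"]))   # {'them': 1, 'us': 1}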

 

Then we seek to cluster grammatical patterns, constantly anew, in real time, by setting the network oscillating and seeing which parts of the network synchronize their oscillations.  

The network should generate a raster plot revealing words and groups of words which share similar contexts.

The information of shared context will be coded automatically in the network of word sequences from the language. Words and word groups which share contexts will be more tightly connected, and tend to synchronize their oscillations.
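As an illustration of the intended dynamics, the sketch below couples simple Kuramoto-style phase oscillators through these connection weights; the specific oscillator model is an assumption, standing in for the spiking neurosimulator the project would actually use:

    # Minimal sketch: phase oscillators coupled by the word-sequence network.
    # Words with shared contexts couple more strongly and should tend to phase-lock.
    import numpy as np

    def simulate_phases(coupling, steps=2000, dt=0.01, seed=0):
        """coupling: (n, n) array of connection strengths between word nodes."""
        rng = np.random.default_rng(seed)
        n = coupling.shape[0]
        freqs = rng.normal(1.0, 0.1, n)            # natural frequencies
        phases = rng.uniform(0, 2 * np.pi, n)
        history = np.empty((steps, n))
        for t in range(steps):
            # d(theta_i)/dt = omega_i + sum_j K_ij * sin(theta_j - theta_i)
            diffs = phases[None, :] - phases[:, None]
            phases = phases + dt * (freqs + (coupling * np.sin(diffs)).sum(axis=1))
            history[t] = phases % (2 * np.pi)
        return history   # each row is one time step: a phase "raster" over word nodes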

 

Further, nested hierarchies of such synchronized groups of nodes might be expected to form trees which reveal meaningful language structure:
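A minimal sketch of how such nested synchronized groups might be read off as a tree, assuming the phase history from the oscillator sketch above and standard hierarchical clustering (the project may instead read hierarchy directly off the raster plot):

    # Sketch: turn pairwise synchrony into a tree over word nodes.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, to_tree
    from scipy.spatial.distance import squareform

    def synchrony_matrix(phase_history):
        """Pairwise phase-locking value between oscillators (1 = fully synchronized)."""
        z = np.exp(1j * phase_history)                    # shape (steps, n_nodes)
        return np.abs(z.conj().T @ z) / phase_history.shape[0]

    def synchrony_tree(phase_history):
        distance = 1.0 - synchrony_matrix(phase_history)  # more synchrony = closer
        np.fill_diagonal(distance, 0.0)
        links = linkage(squareform(distance, checks=False), method="average")
        return to_tree(links)   # binary tree; leaf i is node i of the word network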

An earlier vector implementation of this chaotic, dynamic structuring idea did generate meaningful hierarchies comparable to parse structure.

 

This is initially attempted for natural language as a simplified problem. But it is expected the underlying principles will extend to other perceptual and general concept formation processes.

For instance, vision may be linearized as a sequence of fixations (saccades?) and objects identified in similar ways.

The approach may apply generally to all sensory data processed in a sequential way, for example with feedback learning in the AIRIS system.

Further discussion

The success of Large Language Models over the last few years has astounded the world. But people are mystified by their lack of structure, and this seems connected with a broader problem: a lack of "truthiness".

These systems also puzzle with their enormous size, data, training, and power requirements.

Most proposed solutions are along the lines that we need to add structure using some other kind of data, typically characterized as "world knowledge": visual, multi-media, physical, or psychological "priors".

The thesis of this project is that there is no Holy Grail of fixed or "primitive" semantic structure to be found in other data. Instead, we should take the very lack of structure exposed by language models as hinting at a solution rather than a problem, and solve the structure problem by better understanding what language models are trying to tell us, only then expanding the solution to other data.
 
Instead of lamenting the lack of symbolic structure in language models, one way to look at the history of AI is to see it as a constant movement away from stable structure. The progress of AI has run from symbolic models and logic, to statistical models, to supervised distributed models, to unsupervised distributed models, and most recently to unsupervised, distributed, sequence prediction (language) models. So progress has been constantly towards less structure. Until with language models, finally, all attempt at structure is abandoned completely, in favour of prediction alone. And, surprise surprise, ignoring structure completely, performs best of all.

This seems a paradox.

Somehow, addressing the language problem has led us to a cognitive modeling that has the least structure, but works better for it.
 
What is it that the language modeling problem is trying to tell us?

I think the answer is that the language modeling problem is trying to tell us that cognitive structure is not the finite, stable set of primitives we have imagined it to be. It is actually attractors in a chaotic dynamical system. The very largeness of language models is telling us that we seem to have no structure because we have a chaotic abundance of it, all crushed in together, opaquely. And we are burning a hole in the planet attempting to expand it all in one massive training step, when actually the way to generate it clearly and cheaply is to allow the chaotic system to respond naturally to context from moment to moment, and to generate its complexity spread over time, expressed as creativity, instead of all crushed in together and simplified because of it.

In a word, I think the history of AI has been a movement away from stable structure because cognitive structure is chaotic.

The solution to this would not be limited to language. But as with Large Language Models, the path to modeling it may still lie most naturally through the language problem.

This project proposes that we fully explore this chaotic character of cognitive structure exposed by language, in the expectation that the same techniques will be relevant for broader cognition: the question of what "objects" are, what "truth" is, vision, balance, and indeed reason.

In the context of the Hyperon system, particularly one bridging deep neural nets and symbolic reasoning, this approach might be said to interpose itself even below an "atom" space: not just interfacing with an atom space, but ever constructing and reconstructing an atom space.

Everything else in the reasoning system might remain the same. It might still perform its pattern matching operations, or any kind of formal reasoning over the resolution of perceptual space into "atoms", with the constantly reconstructed atom space assisting it by resolving the world into "objects" relevant to one question or another, in a sense posing "better questions".

In practical terms it is suggested these benefits might be obtained in the simplest way possible: by essentially using the same "learning" principles that are used in LLMs now, and "learning" the same patterns, but now anticipating that the patterns learned might be chaotic, and finding them from moment to moment at run time rather than in one massive training phase.

We might do this easily, because language models lead us to conceive the structuring problem as one of cause-and-effect prediction. This contrasts with earlier iterations of neural networks, tied first to fixed external patterns with supervised learning, and then to fixed internal patterns with unsupervised learning. LLMs instead learn predictive symmetries. And predictive symmetries are not only free to vary so long as they continue to predict, but, as symmetries, they don't require an estimation of error at all. We can find predictive symmetries without back propagation.

And actually a natural way to do this may be by some kind of "vibration analysis", vibrations being an excellent way to detect symmetries.
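As a toy illustration of the kind of predictive symmetry meant here, found by counting shared contexts rather than by back-propagation (the proposal itself envisages finding such groupings dynamically, through oscillations):

    # Sketch: two words are "predictively symmetric" to the extent they are
    # predicted by, and predict, the same neighbouring words.
    from collections import defaultdict

    def context_profiles(sentences, window=1):
        profiles = defaultdict(set)
        for sentence in sentences:
            tokens = sentence.lower().split()
            for i, word in enumerate(tokens):
                left = tuple(tokens[max(0, i - window):i])
                right = tuple(tokens[i + 1:i + 1 + window])
                profiles[word].add((left, right))
        return profiles

    def symmetry(word_a, word_b, profiles):
        a, b = profiles[word_a], profiles[word_b]
        return len(a & b) / max(1, len(a | b))   # Jaccard overlap of shared contexts

    sents = ["keep them apart", "hold them apart", "keep us close"]
    p = context_profiles(sents)
    print(symmetry("keep", "hold", p))   # 0.5: both occur in the context "_ them ..."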

So, concretely, this project proposes that we attempt to merge neuro and symbolic in the Hyperon system by replacing backprop in existing LLMs and instead finding the same predictive symmetries, only in real time, changing (chaotically) from moment to moment, using some kind of oscillatory "vibration analysis". The result would form a new network of "atoms", also changing from moment to moment (and a basis for combining/merging atoms), to be used by Hyperon to perform the graph analysis and pattern matching it is naturally adapted to do as a graph-based reasoning system.

Open Source Licensing

Custom

Links and references

Most relevant portion of my presentation at AGI-21:

Vector Parser - Cognition a compression or expansion of the world? - AGI-21 Contributed Talks https://youtu.be/0FmOblTl26Q?t=2891

 Five minute summary for “Q&A” session of AGI-21, Online Contributed Talks:

 
Recent podcast discussion:
 
 
Link to Github project:
 
https://discourse.numenta.org/t/the-coding-of-longer-sequences-in-htm-sdrs/10597/27


Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

Group Expert Rating (Final)

Overall

5.0

  • Compliance with RFP requirements 3.6
  • Solution details and team expertise 3.6
  • Value for money 3.8

New reviews and ratings are disabled for Awarded Projects

Overall Community

3.6

from 6 reviews
  • 5
    1
  • 4
    3
  • 3
    0
  • 2
    0
  • 1
    1

Feasibility

3.6

from 6 reviews

Viability

3.6

from 6 reviews

Desirability

3.8

from 6 reviews

Usefulness

0

from 6 reviews

Sort by

6 ratings
  • Expert Review 1

    Overall

    4.0

    • Compliance with RFP requirements 3.0
    • Solution details and team expertise 3.0
    • Value for money 0.0

  • Expert Review 2

    Overall

    5.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    Exciting research proposal

    Bold vision and alignment with the RFP’s goals. Lack of detailed technical and experimental plans are concerns. Given the credibility of the researcher this project could be a high-risk, high-reward candidate for funding.

  • Expert Review 3

    Overall

    1.0

    • Compliance with RFP requirements 1.0
    • Solution details and team expertise 1.0
    • Value for money 0.0
    Worrisome interpretation of key DL-based NLP results

    "Until with language models, finally, all attempt at structure is abandoned completely, in favour of prediction alone. And, surprise surprise, ignoring structure completely, performs best of all." This seems to be a misinterpretation of key results, since attention heads allow to learn multiple structures that matter for an output (e.g. relevant prior words in a sentence to predict a target word), that is part of why they work so well. This approach which dismisses Transformer models seems suspicious to me, and due to the further lack of technical details I gave a low rating. If it would build on top of Transformer it could be a different story, for instance it was quite relevant (but only in light of Transformers and the LLM's) what they found in Bhargava, A., Witkowski, C., Looi, S. Z., & Thomson, M. (2023). What's the Magic Word? A Control Theory of LLM Prompting. arXiv preprint arXiv:2310.04444.

  • Expert Review 4

    Overall

    4.0

    • Compliance with RFP requirements 4.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    A creative and solid proposal, but in the direction of some very specific (excellent) NLP R&D rather than a general purpose neural-symbolic learning method. But the method could possibly extend beyond language too.

    Freeman is a known entity and a serious researcher and in his own way a visionary thinker. He has been brewing these ideas for a long time. It would be cool to see him experiment with his creative directions in Hyperon. OTOH in the first instance it's not a general-purpose neural-symbolic learning method, it's a novel computational-linguistics idea with deep cognitive/philosophical underpinnings. It might have implications/value beyond language as well.

  • Expert Review 5

    Overall

    4.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 5.0
    • Value for money 0.0

    Interesting idea with potential for complex systems emergence. Con: unproven and speculative.

  • Total Milestones

    7

  • Total Budget

    $80,000 USD

  • Last Updated

    3 Feb 2025

Milestone 1 - Get a suitable neurosimulator codebase working

Status
😐 Not Started
Description

Get the sequence network oscillating and identify more tightly clustered subnetworks. Select a platform:

Option 1: Get clustering working in a sequence network with the old neurosimulator code from this paper: https://ncbi.nlm.nih.gov/pmc/articles/PMC3390386/ (code here: https://modeldb.science/144502).

Option 2: Alternatively, extend the already existing implementation of an oscillating sequence network in the Github project: https://discourse.numenta.org/t/the-coding-of-longer-sequences-in-htm-sdrs/10597/27

Which option is chosen might depend on the experience of the coders hired for the task. Sketching a one-month timeline to allow for project start-up and admin wrinkles.

Deliverables

Project platform established.

Budget

$15,000 USD

Success Criterion

Basic baseline neurosimulator code base selected and working.

Link URL

Milestone 2 - Get neurosimulator oscillating for corpus network

Status
😐 Not Started
Description

Get the neurosimulator oscillating in a network formed from a moderately sized language corpus. Option 1: the Brown Corpus. Option 2: the British National Corpus. Also sketching a timeline of one month for this. It might not be too hard, but allowing some time for problems. The likely problems are issues with the size of a corpus network; it might require hardware upgrades or adjustments.
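For concreteness, one way to feed the Brown Corpus into the word-sequence network, assuming NLTK and the build_sequence_network sketch given earlier (the real pipeline may stream text into the neurosimulator directly):

    # Sketch: load the Brown Corpus and build the word-sequence network.
    import nltk
    nltk.download("brown", quiet=True)
    from nltk.corpus import brown

    sentences = (" ".join(sent) for sent in brown.sents())
    network = build_sequence_network(sentences)
    print(len(network), "distinct words with outgoing transitions")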

Deliverables

Input language connectivity for a sufficiently large language sample.

Budget

$10,000 USD

Success Criterion

Codebase working for language data of sufficiently large size. We may not be able to identify skip vectors yet, but issues with the size of the corpus should be solved, at least enough to enable initial experiments.

Link URL

Milestone 3 - Identify basic skip vectors in corpus network

Status
😐 Not Started
Description

This is the first major research question, and it may be tricky: it is the first major step not yet achieved with other code and data. Identifying equivalents to skip vectors in the corpus network might be achieved using an interface in the form of a raster plot and some kind of "rheostat" input power variation, and perhaps input source variation (different parts of the network corresponding to different input sentences). The assumption is that by varying the driving oscillation we should be able to observe the raster plot, identify clusters of network nodes that are synchronizing, and equate these to skip vectors over the words associated with the nodes. If we can achieve this I will assess it as a major success. At this stage the skip vectors would be limited to single words. Estimating 1-2 months for this step, but if it took longer that would not be a problem, because this is the major stumbling block to further progress and achieving it would be a major achievement.
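A rough sketch of what "identifying clusters in the raster plot" could look like programmatically, using a simple spike-train correlation threshold (an assumption; the milestone may instead rely on interactive inspection while varying driving power):

    # Sketch: group word nodes whose spike trains are strongly correlated,
    # and read each group off as a skip-vector candidate.
    import numpy as np

    def synchronized_groups(spike_raster, words, threshold=0.8):
        """spike_raster: (time, n_nodes) 0/1 array; words: node labels."""
        corr = np.corrcoef(spike_raster.T)          # (n_nodes, n_nodes) correlations
        groups, assigned = [], set()
        for i in range(len(words)):
            if i in assigned:
                continue
            members = [j for j in range(len(words))
                       if j not in assigned and corr[i, j] >= threshold]
            assigned.update(members)
            if len(members) > 1:
                groups.append([words[j] for j in members])
        return groups   # e.g. [["keep", "stay", "move"], ...] as skip-vector candidates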

Deliverables

Word skip vector equivalents identified using raster plot of oscillations in network of language corpus.

Budget

$20,000 USD

Success Criterion

Some identifiable mapping of raster plot patterns to skip vectors.

Link URL

Milestone 4 - Extend skip vectors to "tokens" of varying length

Status
😐 Not Started
Description

Re. what aspects of skip vectors to preserve: I think only their predictive properties. I actually want not so much to preserve them as to go beyond them. I want to go beyond them because I think they will change; I think the fact that they change (are dependent on context) is what is holding up language models. I also want to go beyond them because of, and related to, exactly the "arbitrarily far apart" aspect above: allowing them to change as chaotic attractors should allow us to scrunch up or pull apart sub-sequences at will. For example, the "keep"->"apart" based skip vector, which might consist of words like {keep stay move...}, should also be able (at a higher energy of synchronization?) to break down into k->a, k->p, ... and then also k-(arp)-a ... skip vectors, and so overcome the "token" problem which so besets the LLM folks. And in the same way crunch down even more, so we get skip vectors for pairs of words and longer. So we'll finally be able to get hierarchy and logical/parse structure. (Though in detail, going below words into letters, skip vectors/shared context as such probably only influence some broad phonetic effects at the sub-word level, and "letters" or phonemes are probably ruled more by things other than skip-vector shared context. But above the word level this should enable us to find phrases etc. for the first time: structure you can use to build an algebra and reason over.)
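As a loose analogy for how skip vectors might extend to "tokens" of varying length, the sketch below merges the most frequent adjacent word pair into a composite node, after which the synchrony analysis would be re-run over the enlarged node set; frequency-driven merging is an assumption here, whereas the proposal envisages synchrony-driven (chaotic) merging at run time:

    # Sketch: merge the most frequent adjacent word pair into a composite "token".
    from collections import Counter

    def merge_top_pair(sentences):
        pair_counts = Counter()
        for sentence in sentences:
            tokens = sentence.split()
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            return sentences, None
        (a, b), _ = pair_counts.most_common(1)[0]
        merged = [s.replace(f"{a} {b}", f"{a}_{b}") for s in sentences]
        return merged, (a, b)

    sents = ["keep them apart", "keep them close", "stay away"]
    sents, pair = merge_top_pair(sents)
    print(pair, sents[0])   # ('keep', 'them') keep_them apart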

Deliverables

Skip vectors for "tokens" of variable length. Key to finding hierarchy and ultimately symbolic structure.

Budget

$20,000 USD

Success Criterion

Some identifiable mapping of raster plot patterns to skip vectors of different lengths.

Link URL

Milestone 5 - Display hierarchical structure

Status
😐 Not Started
Description

If we manage to find skip vectors for sequences of varying length, this should encode an implicit hierarchy in a representation space. This step might be as simple as coding an interface to the raster plot which would graphically represent the hierarchies associated with differential clustering between words and word groups as the driving oscillation power varies.
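The display itself could be as simple as a dendrogram over synchrony-derived distances, assuming matplotlib/scipy and the synchrony_matrix sketch given earlier:

    # Sketch: display the implicit hierarchy as a dendrogram over word nodes.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import linkage, dendrogram
    from scipy.spatial.distance import squareform

    def plot_hierarchy(phase_history, words):
        distance = 1.0 - synchrony_matrix(phase_history)
        np.fill_diagonal(distance, 0.0)
        links = linkage(squareform(distance, checks=False), method="average")
        dendrogram(links, labels=words)
        plt.title("Hierarchy induced by oscillator synchrony")
        plt.show()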

Deliverables

Graphical or other display of (symbolic) hierarchy in representation space.

Budget

$10,000 USD

Success Criterion

Some identifiable mapping between patterns in raster plots and meaningful (symbolic) hierarchy.

Link URL

Milestone 6 - Hyperon integration under "Atom Space".

Status
😐 Not Started
Description

This is more of a stretch goal. The problems we might face getting the basic underlying principle working may well absorb all our time. But if we do manage to achieve this, the next step would be to explicitly integrate the symbolic/hierarchical structure found with representation in the Hyperon Atom Space. This might be seen as something of the reverse of a solution like PyNeuraLogic: instead of implementing logic rules in weights or networks, it would use networks to emerge logic relations. The idea is that these relations have eluded us up to now, continue to elude "neuro-" efforts, and particularly elude Large Language Models, because the structure they seek to "learn" is inherently dynamic and changing. If we embrace the change, clear structure will emerge (though it will change from context to context). This should not only be a mapping to a hypergraph representation between "atoms"; it would be seen as generating both the representation and the interactions/combinations between "atoms". The idea is that the only reason there has historically been a tension between neuro- and symbolic is because structure is chaotic and constantly changing. Embrace the chaos and we should get natural neuro-symbolic representations.
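A minimal sketch of what the hand-off to an Atom Space could look like: serializing discovered clusters as MeTTa-style expressions for later import (the expression shape and predicate names are illustrative assumptions only; the actual Atom Space schema would be agreed with the Hyperon team):

    # Sketch: export synchrony clusters as MeTTa-style atoms for an Atom Space.
    def clusters_to_metta(groups):
        lines = []
        for i, members in enumerate(groups):
            cluster = f"cluster-{i}"
            for word in members:
                lines.append(f"(member {word} {cluster})")
        return "\n".join(lines)

    print(clusters_to_metta([["keep", "stay", "move"]]))
    # (member keep cluster-0)
    # (member stay cluster-0)
    # (member move cluster-0)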

Deliverables

Hyperon Atom Space integration to enable actual reasoning over neurosymbolic representations found using the basic method. Basically the deliverable envisaged is a complete solution to the tension between neuro- and symbolic representations, which is revealed to have existed because cognitive structure consists of chaotic attractors and constantly changes. So the deliverable is a complete solution to neurosymbolic representation in AI. (With a pathway to other deliverables of chaos: hints at creativity - constant novelty; consciousness - chaos "larger than itself"; and free will - only the chaotic system fully predicts its own evolution, and even its creator cannot fully predict what it will do/discover.) This should be extensible to other experiential learning frameworks. It is not imagined that this dynamic structuring is limited to language; language would just be the simplest example. In particular it might be immediately applicable to other experiential learning systems like AIRIS.

Budget

$4,999 USD

Success Criterion

Establish mapping between hierarchies of meaningful "objects" (clusters) in raster plots, and Hyperon Atom Space.

Link URL

Milestone 7 - Extend symbolic hierarchy to LLM prediction task

Status
😐 Not Started
Description

Really getting "stretch" for this project but if budget and time were remaining after an unexpectedly easy accomplishment of the key earlier research milestones a natural next step would be to relate the dynamic skip vectors (incorporating a sense of "attention" because dynamic according to length) to the prediction task which is the one at which current LLMs excel.

Deliverables

Beat LLMs at their own prediction benchmarks, but do it with an exposed neurosymbolic hierarchical structure under a Hyperon atom space, enabling Hyperon to influence sequence completion or other behavioural "predictions" by applying activation differentially to the hierarchical, variable-length "token" skip vectors associated with input sentences (and eventually other sensory data).

Budget

$1 USD

Success Criterion

Demonstrate a result against an existing Large Language Model prediction benchmark or assessment criterion. Success++ would be to already beat such a benchmark. But LLMs have billions invested and thousands of engineers optimizing them, so beating their benchmarks in the narrow domains where they excel is unlikely to be achievable immediately. As a demonstration of principle, simply demonstrating performance against a benchmark, to be refined later, would be success enough at this early stage.

Link URL
