Autograph: Factoring Knowledge Graphs into Frames


Status

  • Overall Status

    ⏳ Contract Pending

  • Funding Transferred

    $0 USD

  • Max Funding Amount

    $60,000 USD

Funding Schedule

View Milestones
  • Milestone Release 1: $30,000 USD (Pending, TBD)
  • Milestone Release 2: $20,000 USD (Pending, TBD)
  • Milestone Release 3: $10,000 USD (Pending, TBD)

Project AI Services

No Service Available

Overview

There are many large, generic semantic graphs (e.g. WikiData, DBpedia) alongside a growing number of domain-specific ones. LLMs offer a quick path to KG extraction but introduce several problems: inconsistency (erratic, non-deterministic entity resolution), inaccuracy (missing or hallucinated predicates), context loss (unstable frames of reference), misplaced confidence (equal confidence in valid and invalid conclusions), and speed (impractical at scale). We are developing an AGI-focused KG tooling suite that tackles these challenges through four modules: Context Identification, Entity Management, Predicate Management, and Confidence Management. This proposal addresses Context Identification for efficient framing of noisy KGs.

RFP Guidelines

Advanced knowledge graph tooling for AGI systems

Complete & Awarded
  • Type SingularityNET RFP
  • Total RFP Funding $350,000 USD
  • Proposals 39
  • Awarded Projects 5
SingularityNET
Apr. 16, 2025

This RFP seeks the development of advanced tools and techniques for interfacing with, refining, and evaluating knowledge graphs that support reasoning in AGI systems. Projects may target any part of the graph lifecycle — from extraction to refinement to benchmarking — and should optionally support symbolic reasoning within the OpenCog Hyperon framework, including compatibility with the MeTTa language and MORK knowledge graph. Bids are expected to range from $10,000 - $200,000.

Proposal Description

Our Team

MLabs AI sees AGI as a key strategic goal, and our senior technical team already devote a portion of their time to its pursuit. We see our collaborations with SingularityNET as an important part of this endeavor.

Company Name (if applicable)

MLabs LTD

Project details

There are a number of large, generic semantic graphs (e.g. WikiData, DBpedia, BabelNet), with new domain-specific knowledge graphs appearing every month. In addition to these more traditional graphs, there has recently been a move towards using LLMs to produce knowledge graphs.

While this capability provides a useful shortcut to a difficult problem, there are several problems with using LLMs for knowledge graph extraction:

  • inconsistency - LLMs often fail to correctly resolve entities, and the output is non-deterministic
  • accuracy - they can miss valid predicates, or (more often) hallucinate invalid predicates
  • context - LLMs often fail to maintain a frame of reference specific to the knowledge domain
  • confidence - LLMs display the same level of confidence about valid and invalid conclusions
  • speed - knowledge graph extraction from even a single website can take many minutes, and is impractical for very large knowledge graphs

Our KG tooling roadmap assumes that most of the structured graph information being operated on will be noisy, of variable quality, and not subject to ontological best practices. It is likely to have been created manually, using ad hoc processes, or from the output of an LLM.

We have designed requirements that specifically address the concerns listed above. Our tooling needs to maintain a suitable frame of reference (and to identify multiple frames of reference in large KGs), resolve entities contextually, quantify the confidence of relationships, and operate efficiently at the very large computational scales that will be required in AGI systems. There are four types of tooling:

  • Context Identification
  • Entity Management
  • Predicate Management
  • Confidence Management

This is one of two KG tooling projects submitted to the current RFP and concentrates on context identification. We will open-source not only the tooling developed during the project but also the related tooling we have built so far. We will continue to develop these tools as an important component of our AGI roadmap and may deliver further enhancements to the SingularityNET community too.

Frame of Reference (Context) Identification

AGI systems will have to deal with multiple frames of reference, with different entities, predicates and conceptual nuances operating in each frame. When applied to knowledge graphs this can be reduced to three main tasks:

  • Discovering sub-graphs
  • Identifying frame-of-reference entities
  • Assigning predicates to frames of reference

Discovering sub-graphs

Graph partitioning is an NP-hard problem, so optimal sub-graph discovery for the large KGs that will be in operation is likely to remain technically infeasible on current hardware. There are a number of approximate methods for graph partitioning that scale suitably for large graphs, but they are still computationally expensive for KGs of the size we are dealing with (many millions to hundreds of millions of nodes).

MLabs has proprietary IP in the form of a novel algorithm for efficient, approximate partitioning of extremely large sparse graphs. The method is based on Szemerédi’s Regularity Lemma and exploits the same divide-and-conquer structure as MergeSort. It has thus far remained an MLabs trade secret. We intend to open-source this algorithm, which is guaranteed to run in O(n log n) and produces partitions that compare favorably with those found by eigen-decomposition of the graph Laplacian, while running orders of magnitude more quickly. The same algorithm can be used in recommender systems with trivial modification; MLabs will not restrict its use by the SingularityNET community for either purpose.
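For comparison, the sketch below shows the spectral baseline mentioned above (bisection by the sign of the Fiedler vector of the graph Laplacian). It is illustrative only and is not the MLabs algorithm; dense eigen-decomposition scales roughly cubically with the number of nodes, which is the cost the O(n log n) method is intended to avoid.

```python
# Illustrative spectral-bisection baseline (not the proprietary MLabs algorithm):
# split a graph into two blocks using the sign of the Fiedler vector of its Laplacian.
import numpy as np
import scipy.sparse as sp

def spectral_bisect(adjacency: sp.csr_matrix) -> np.ndarray:
    """Return a 0/1 block label per node from the Fiedler vector."""
    degrees = np.asarray(adjacency.sum(axis=1)).ravel()
    laplacian = sp.diags(degrees) - adjacency
    # Dense eigendecomposition for clarity; million-node graphs need iterative
    # sparse solvers, which is exactly why the spectral route is so expensive.
    _, eigvecs = np.linalg.eigh(laplacian.toarray())
    fiedler = eigvecs[:, 1]            # eigenvector of the second-smallest eigenvalue
    return (fiedler >= 0).astype(int)

# Tiny example: two triangles {0,1,2} and {3,4,5} joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rows = [u for u, v in edges] + [v for u, v in edges]
cols = [v for u, v in edges] + [u for u, v in edges]
adj = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(6, 6))
print(spectral_bisect(adj))   # expected: one label for {0,1,2}, the other for {3,4,5}
```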

We will demonstrate the efficacy of frame of reference identification using our algorithm on a KG based on WikiData. We will reuse this dataset for other aspects of the project.

Frame-of-Reference Entities

Word co-occurrence graphs have long been known to possess the small-world property; they have a high clustering coefficient and can be divided into cliques. We believe that this is important for AGI efficiency as it enables analysis that is local to a frame of reference - it is worth noting that the connections between neurons in the cortex have a topology consistent with this observation. We have established that large knowledge graphs have similar properties and that nodes can be assigned to frames of reference using their frame linkage statistics. This process can be initialised using the sub-graph results from above to ensure that the assignment is computationally efficient.
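As an illustration of what such frame linkage statistics could look like (the particular statistic here, the fraction of a node's edges that land in each block from the sub-graph step, is our assumption for the sketch), soft frame scores can be initialised as follows:

```python
# Hypothetical sketch: initialise soft frame membership from linkage statistics,
# taken here to be the fraction of a node's edges that land in each block found
# by the sub-graph discovery step. The exact statistic is an assumption.
from collections import Counter, defaultdict

def frame_membership(edges, block_of):
    """edges: iterable of (u, v) pairs; block_of: dict mapping node -> initial block id."""
    counts = defaultdict(Counter)
    for u, v in edges:
        counts[u][block_of[v]] += 1
        counts[v][block_of[u]] += 1
    membership = {}
    for node, c in counts.items():
        total = sum(c.values())
        membership[node] = {frame: n / total for frame, n in c.items()}
    return membership

# Tiny example: a chain spanning two initial blocks A and B.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
block_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(frame_membership(edges, block_of))
# nodes 2 and 3 (at the boundary) receive split membership across A and B
```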

Frame-of-Reference Predicates

One of the statistics that helps us assign entities to frames of reference is the type of predicate linking the entities in the frame. While some predicates appear to be universal (instance_of, subclass_of and so on), there are a large number of predicates that are specific to a subset of frames. For example, integral_of is specific to the algebra frame, while subsidiary_of is specific to the corporate law frame. The assignment of predicates to frames and of entities to frames is therefore a joint inference task, for which we employ Expectation-Maximisation to find a (locally) optimal assignment matrix.

Assignment of entities and predicates will be demonstrated using the same WikiData KG as in the previous activity.
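One plausible way to cast this joint assignment as Expectation-Maximisation is sketched below. The generative model (a latent frame per triple, with per-frame entity and predicate emission tables shared across head and tail roles) is our assumption for illustration, not necessarily the exact formulation that will be delivered.

```python
# Assumed EM formulation for joint entity/predicate-to-frame assignment:
#   P(triple, f) = pi[f] * P(head | f) * P(pred | f) * P(tail | f)
# E-step computes soft frame responsibilities per triple; M-step re-estimates
# the mixing weights and emission tables from the resulting soft counts.
import numpy as np

def em_frames(triples, n_entities, n_predicates, n_frames, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    pi = np.full(n_frames, 1.0 / n_frames)
    ent = rng.dirichlet(np.ones(n_entities), size=n_frames)     # P(entity | frame)
    prd = rng.dirichlet(np.ones(n_predicates), size=n_frames)   # P(predicate | frame)
    h = np.array([tr[0] for tr in triples])
    p = np.array([tr[1] for tr in triples])
    t = np.array([tr[2] for tr in triples])
    for _ in range(iters):
        # E-step: responsibility of each frame for each triple.
        r = pi[None, :] * ent[:, h].T * prd[:, p].T * ent[:, t].T
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and emission tables from soft counts.
        pi = r.mean(axis=0)
        ent_counts = np.zeros((n_frames, n_entities))
        prd_counts = np.zeros((n_frames, n_predicates))
        for f in range(n_frames):
            np.add.at(ent_counts[f], h, r[:, f])
            np.add.at(ent_counts[f], t, r[:, f])
            np.add.at(prd_counts[f], p, r[:, f])
        ent = ent_counts / ent_counts.sum(axis=1, keepdims=True)
        prd = prd_counts / prd_counts.sum(axis=1, keepdims=True)
    return pi, ent, prd

# Tiny example: entities 0-1 linked by predicate 0, entities 2-3 by predicate 1.
triples = [(0, 0, 1), (1, 0, 0), (2, 1, 3), (3, 1, 2)]
pi, ent, prd = em_frames(triples, n_entities=4, n_predicates=2, n_frames=2)
print(prd.round(2))   # each predicate should concentrate on a single frame
```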

Open Source Licensing

MIT - Massachusetts Institute of Technology License

Background & Experience

Dr Bedworth is the Chief Scientist of MLabs AI, with over 40 years of experience. He co-authored a 2000 paper on AGI with psychologist Carl Frankel. He worked with Nobel Laureate Geoff Hinton on Boltzmann machines, and his work on Bayesian probability received worldwide recognition. Two of his patents were acquired by Apple for use in Siri.

Ibrahim received his Master's degree in Computer Engineering and Machine Intelligence from the American University of Beirut. He specializes in optimization, search algorithms, and deep learning. He has worked at MLabs AI on novel approaches to deep learning and is currently leading the LLM workflow.

Dr Sarma received her Doctorate in Data Science and Artificial Intelligence from the Indian Institute of Technology. She specializes in NLP and reinforcement learning. She has built context embedding models and LLMs for a number of languages, and is currently a member of the NLP team at MLabs AI, as well as handling our reinforcement learning activities.

Links and references

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

Reviews & Rating

New reviews and ratings are disabled for Awarded Projects


  • Total Milestones

    3

  • Total Budget

    $60,000 USD

  • Last Updated

    28 Aug 2025

Milestone 1 - Block Factorization of WikiData Knowledge Graphs

Status
😐 Not Started
Description

In this milestone we will ingest a JSON dump of WikiData, selecting a suitable subset of the available properties, and use it to create a test knowledge graph. We will write a production-quality implementation of the block factorization algorithm and test it on the resulting WikiData knowledge graph. We will report the accuracy and computational efficiency of the decomposition and provide instructions on how to use the software for factorizing large graphs or as a recommender system.
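As a rough sketch of the kind of ingestion involved (the file name and property subset below are placeholders, not committed choices), the snippet streams the WikiData JSON dump line by line and keeps only item-valued claims for a chosen set of properties:

```python
# Hedged sketch of WikiData ingestion: the dump is a JSON array with one entity
# per line, so it can be streamed without loading everything into memory.
# File name and property subset are placeholders.
import gzip
import json

KEEP_PROPERTIES = {"P31", "P279", "P361"}   # instance of, subclass of, part of

def iter_triples(dump_path: str):
    with gzip.open(dump_path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip().rstrip(",")
            if line in ("[", "]", ""):
                continue                      # skip the enclosing array brackets
            entity = json.loads(line)
            subject = entity.get("id")
            for prop, claims in entity.get("claims", {}).items():
                if prop not in KEEP_PROPERTIES:
                    continue
                for claim in claims:
                    value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
                    if isinstance(value, dict) and "id" in value:
                        yield subject, prop, value["id"]   # keep item-valued claims only

# for s, p, o in iter_triples("latest-all.json.gz"):
#     ...  # add the edge to the test knowledge graph
```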

Deliverables

Software for efficient block factorization of large knowledge graphs, tooling for ingestion of WikiData, and details of the results of factorizing the WikiData knowledge graph.

Budget

$30,000 USD

Success Criterion

Fully operational, general-purpose graph factorization software, with a demonstration on WikiData, and accompanying documentation.

Link URL

Milestone 2 - Assign Entities & Predicates to Reference Frames

Status
😐 Not Started
Description

This milestone will probabilistically assign entities, and the predicates associated with those entities, to the frames of reference determined in the first milestone. A specific entity can be assigned to more than one frame of reference: the degree of belonging to each frame is constrained to be at most 1, but there is no constraint on the sum of an entity's assignment values across frames. Assigning entities to more than one frame of reference provides the link between the different elements of the KG. Assignment of entities will be demonstrated on the same WikiData KG used earlier.
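To make the assignment constraint concrete (the entities, frames, and values below are invented purely for illustration), a membership matrix of this kind has one row per entity and one column per frame, with every entry in [0, 1] but no requirement that rows sum to one:

```python
# Invented example of a valid membership matrix: each entry lies in [0, 1],
# but rows (entities) need not sum to 1, so an entity can belong strongly to
# several frames at once or only weakly to any of them.
import numpy as np

frames = ["algebra", "corporate_law", "geography"]
entities = ["integral", "subsidiary", "Switzerland"]
membership = np.array([
    [0.9, 0.0, 0.0],   # "integral" sits almost entirely in the algebra frame
    [0.1, 0.8, 0.3],   # "subsidiary" is mostly corporate law, weakly elsewhere
    [0.0, 0.4, 0.9],   # "Switzerland" spans corporate law and geography
])
assert ((membership >= 0) & (membership <= 1)).all()   # per-entry constraint only
print(membership.sum(axis=1))                          # row sums are unconstrained
```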

Deliverables

Software for entity assignment of factorized knowledge graphs, tooling for ingestion of factorized graphs, and details of the results of assignment for the WikiData knowledge graph.

Budget

$20,000 USD

Success Criterion

Fully operational entity assignment software, with a demonstration on WikiData and accompanying documentation.

Link URL

Milestone 3 - Frame of Reference Evaluation and Final Report

Status
😐 Not Started
Description

Factorization into context-driven frames of reference enables more accurately guided and more efficient traversal of large KGs for specific tasks. We believe this introduces a focus mechanism that removes the superfluous information contained in large KGs, a mechanism that will be needed when using AGI-scale KGs for real-world tasks. This milestone will evaluate the quality of that focus.

Deliverables

Detailed report of experimental findings on the factorization and entity assignment of large KGs, with efficiency and accuracy assessments determined using WikiData and other KGs as appropriate.

Budget

$10,000 USD

Success Criterion

Delivery of final report.

Link URL

