MeTTaPedia

Nimrod Busany
Project Owner

Expert Rating

n/a
  • Proposal for BGI Nexus 1
  • Funding Request $50,000 USD
  • Funding Pools Beneficial AI Solutions
  • Total 4 Milestones

Overview

We propose “MeTTaPedia,” an open-source effort to parse DBpedia and encode it in the MeTTa language as a factual basis for Large Language Models (LLMs). By requiring LLM responses to cite MeTTa-formatted statements, we aim to reduce hallucinations, make AI outputs verifiable, and mitigate misinformation. The curated knowledge graph will be freely accessible, improving trust in AI, supporting knowledge workflows, and laying a foundation for responsible, beneficial AI.

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

MeTTaPedia supports the BGI Nexus mission by encoding a verified knowledge base into MeTTa statements. This resource empowers users and AI to reduce misinformation and build trust, forming a foundation for community-driven AI and a more equitable future.

Our Team

Nimrod Busany is an AI Research Scientist and Manager at Accenture Labs with 15+ years in formal methods, semantic web, and advanced ML. He leads a team that develops award-winning Generative AI technologies with a strong patent pipeline and industry collaborations.

Gil Rosenblum is a Data & AI Product Manager at Accenture Labs with a background in law and engineering, effectively bridging technical and business realms to drive impactful AI products.

AI services (New or Existing)

MeTTaPedia Search Engine

Type

New AI service

Purpose

Providing access to relevant MeTTa statements

AI inputs

Natural language search query

AI outputs

Top K matching statements

Company Name (if applicable)

NA

The core problem we are aiming to solve

Modern LLMs frequently produce unverified or inaccurate statements, often called “hallucinations,” due to the opaque nature of their internal representations. This undermines trust, fosters misinformation, and impedes knowledge validation. Our proposal tackles this problem by providing a clearly structured, formal knowledge base in MeTTa, enabling LLMs to cross-reference factual statements, reduce errors, and offer transparent, evidence-backed responses.

 

Our specific solution to this problem

Our solution begins with designing a mapping language, inspired by R2RML, that systematically transforms RDF+OWL statements into MeTTa form. This step includes defining an open, well-documented specification for handling classes, properties, and relationships within DBpedia. We then implement a parser that reads DBpedia’s triples, applies the mapping rules, and outputs a structured set of MeTTa statements. To ensure scalability and quality, we will incorporate testing and validation procedures that compare the transformed MeTTa knowledge base against reference subsets of DBpedia. Finally, we will open-source the parser and knowledge base in a version-controlled repository, with a change management protocol for continuous updates and community contributions. This approach gives others a clear, maintainable way to enrich the MeTTaPedia resource and use it in AI-driven projects.
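
To make the triple-to-MeTTa step concrete, the following is a minimal sketch rather than the project's actual mapping language. It assumes a predicate-first expression form `(predicate subject object)`, a hypothetical two-entry prefix table, and simple N-Triples input; none of these choices are fixed by the proposal. Blank nodes are not handled, and language tags and datatypes are dropped for brevity.

```python
import re
from typing import Optional

# Hypothetical prefix table; the mapping specification would define the real one.
PREFIXES = {
    "http://dbpedia.org/resource/": "dbr",
    "http://dbpedia.org/ontology/": "dbo",
}

# One simple N-Triples line: <s> <p> <o> .  or  <s> <p> "literal"[@lang | ^^<type>] .
TRIPLE_RE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+'
    r'(?:<([^>]+)>|"((?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]+>)?)'
    r'\s*\.\s*$'
)

# Local names that are safe to emit as a bare symbol (no spaces or parentheses).
SAFE_LOCAL = re.compile(r"[A-Za-z0-9_.\-]+")


def iri_to_atom(iri: str) -> str:
    """Shorten a known IRI to prefix:local; otherwise keep it as a quoted string."""
    for base, prefix in PREFIXES.items():
        if iri.startswith(base):
            local = iri[len(base):]
            if SAFE_LOCAL.fullmatch(local):
                return f"{prefix}:{local}"
    return f'"{iri}"'


def triple_to_metta(line: str) -> Optional[str]:
    """Convert one N-Triples line to an assumed MeTTa form, or None if unparsed."""
    match = TRIPLE_RE.match(line.strip())
    if match is None:
        return None  # comments, blank nodes, and malformed lines are skipped
    subj, pred, obj_iri, literal = match.groups()
    # Language tags and datatypes are dropped in this sketch.
    obj = iri_to_atom(obj_iri) if obj_iri is not None else f'"{literal}"'
    return f"({iri_to_atom(pred)} {iri_to_atom(subj)} {obj})"


line = ("<http://dbpedia.org/resource/France> "
        "<http://dbpedia.org/ontology/capital> "
        "<http://dbpedia.org/resource/Paris> .")
print(triple_to_metta(line))
```

Resource names that contain parentheses or other unsafe characters fall back to a quoted IRI here, so the output stays a single well-formed expression; a real specification might choose a different escaping rule.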

 

Project details

1. Technical Approach

  1. Discovery and Data Selection

    • We will begin by determining which portions of the DBpedia dataset are most relevant for an initial release, likely favoring broad coverage of classes and properties. This gives us a representative subset before we scale to all DBpedia facts.
  2. Mapping Language (RDF+OWL to MeTTa)

    • Inspired by R2RML (which maps relational data to RDF), we will define an open, well-documented mapping language that transforms RDF+OWL constructs (classes, properties, relations, etc.) into MeTTa statements.
    • The mapping syntax will capture hierarchical relationships, domain/range constraints, and possibly align with other ontologies like Hyperseed-1 for future interlinking.
  3. Parser Implementation

    • We then develop a parser that systematically reads DBpedia’s RDF triple store, applies the mapping rules, and outputs the resulting MeTTa statements.
    • The parser will be modular, allowing for easy updates as DBpedia or MeTTa evolve. We will incorporate validation checks to ensure the correctness and consistency of generated statements.
  4. Version Control and Change Management

    • To ensure robust, ongoing updates, we will store the resulting MeTTa knowledge base in a public, version-controlled repository (e.g., GitHub).
    • We will define a clear change management protocol enabling incremental updates, community contributions, and structured discussions around each revision. This protocol will allow the knowledge base to grow organically while maintaining reliability and traceability.
  5. Scalability and Testing

    • Our pipeline will be designed for scalability, allowing batch processing of millions of DBpedia triples.
    • We will perform cross-validation against reference subsets of DBpedia or known ontologies to confirm that our transformations preserve semantic integrity.
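
The batch processing and cross-validation steps above could look roughly like the sketch below. It is illustrative only: triples are plain tuples of already-shortened symbols, `to_metta` and `from_metta` are hypothetical helpers using an assumed predicate-first form, and the check is a simple round trip that reports reference triples missing from the converted output.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def batched(items: Iterable[Triple], size: int) -> Iterator[List[Triple]]:
    """Yield lists of at most `size` triples without loading everything at once."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def to_metta(triple: Triple) -> str:
    """Assumed output form: (predicate subject object)."""
    subject, predicate, obj = triple
    return f"({predicate} {subject} {obj})"


def from_metta(statement: str) -> Triple:
    """Inverse of to_metta for symbols without spaces or parentheses."""
    predicate, subject, obj = statement[1:-1].split(" ")
    return (subject, predicate, obj)


def convert_in_batches(triples: Iterable[Triple], size: int) -> Iterator[List[str]]:
    """Convert a stream of triples batch by batch."""
    for batch in batched(triples, size):
        yield [to_metta(t) for t in batch]


def cross_validate(reference: Iterable[Triple], statements: Iterable[str]) -> Set[Triple]:
    """Return the reference triples that cannot be recovered from the output."""
    recovered = {from_metta(s) for s in statements}
    return set(reference) - recovered


reference = [
    ("dbr:France", "dbo:capital", "dbr:Paris"),
    ("dbr:Germany", "dbo:capital", "dbr:Berlin"),
    ("dbr:Marie_Curie", "dbo:birthPlace", "dbr:Warsaw"),
]
batches = list(convert_in_batches(reference, size=2))
statements = [s for batch in batches for s in batch]
print(cross_validate(reference, statements))
```

An empty result means every reference triple survived the round trip; a non-empty set lists exactly what was lost, which is easier to act on than a single pass/fail flag.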

2. Alignment with Other Initiatives

  • Hyperseed-1 (Ben Goertzel)
    Our MeTTa knowledge base can incorporate or cross-reference Hyperseed-1’s “core ontology” elements, making it easier for an early-stage OpenCog Hyperon mind (and similarly structured AI systems) to reason with validated statements.

  • MettaMotto (Alexey Potapov)
    MettaMotto enables calling LLMs from within MeTTa scripts. By integrating our curated MeTTa knowledge base, MettaMotto can leverage structured facts for context, thereby reducing hallucinations and boosting reasoning accuracy.

  • Create Corpus for NL-to-MeTTa LLM (SingularityNET)
    MeTTaPedia will yield a massive corpus of (Natural Language → MeTTa) pairs, invaluable for training or fine-tuning an AI-powered coding assistant. Developers can build LLMs that generate MeTTa code accurately, using real-world DBpedia facts.

  • Knowledge Graph Workflows (Robert Haas)
    The Knowledge Graph Workflows project focuses on loading external knowledge graphs into MeTTa, which aligns well with our ETL approach for DBpedia. MeTTaPedia will provide a ready-to-use, consistent dataset that helps test new MeTTa scalability solutions like MORK or PLN.

  • MetaCOrpus (iliachry)
    By converting DBpedia into a MeTTa corpus, we complement other corpus-building efforts that aim to expand the MeTTa ecosystem. Our structured approach and change management protocols can serve as a blueprint for further corpus expansions.


3. Project Deliverables

  1. Mapping Specification

    • Public documentation detailing the rules, mappings, and syntax for converting RDF+OWL constructs into MeTTa statements.
  2. Parser Codebase

    • An open-source parser that ingests DBpedia triple dumps and outputs validated MeTTa facts.
    • Includes robust logging, error handling, and a test suite.
  3. Initial MeTTa Knowledge Base

    • A publicly accessible repository containing a curated portion of DBpedia in MeTTa form.
    • Well-organized directories, each with domain-specific or ontology-based segments of data.
  4. Version-Controlled Repository & Protocol

    • A GitHub repository (or similar) with clear guidelines on how to propose changes, conduct reviews, and merge updates.
    • Community-driven workflow ensuring continuous improvements and expansions.
  5. Documentation & Tutorials

    • Comprehensive “Getting Started” guides, architecture diagrams, and usage examples to help new contributors and integrators.
    • Tutorials on how to query and leverage the MeTTa knowledge base for AI workflows.

4. Future Directions

  • Scalable Expansion
    After the initial release, we plan to expand coverage to the full DBpedia dataset, incorporating domain-specific subsets or specialized ontologies in areas like biomedicine or finance.

  • Ontological Alignment
    We will explore aligning the resulting MeTTa statements with existing upper ontologies, including Hyperseed-1, to facilitate cross-project integrations.

  • Atomspace Search Engine
    To improve data discoverability and accessibility (FAIR), we will explore and propose an AI search service that retrieves top matching statements from the large MeTTaPedia atomspace.

  • LLM Integration
    As the knowledge base matures, it can be plugged into LLM-based applications or MettaMotto workflows, ensuring that real-world knowledge is validated and consistent.

  • Community Contributions
    By establishing a transparent process for contributions, we invite academic researchers, data scientists, and open-source enthusiasts to enhance coverage, patch errors, or create new domain mappings.
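
As an illustration of the search service mentioned under "Atomspace Search Engine" (natural language query in, top K matching statements out), here is a deliberately simple sketch. It ranks statements by token overlap with the query; this scoring is an assumption made for brevity, and a real service would more likely use an index or embeddings. The function names are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens, with camelCase names split apart."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return {token.lower() for token in re.findall(r"[A-Za-z0-9]+", spaced)}


def top_k(query: str, statements: List[str], k: int = 3) -> List[str]:
    """Return up to k statements sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    scored = []
    for statement in statements:
        overlap = len(query_tokens & tokenize(statement))
        if overlap:
            scored.append((overlap, statement))
    # Stable sort: ties keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [statement for _, statement in scored[:k]]


statements = [
    "(dbo:capital dbr:France dbr:Paris)",
    "(dbo:capital dbr:Germany dbr:Berlin)",
    "(dbo:birthPlace dbr:Marie_Curie dbr:Warsaw)",
]
print(top_k("capital of France", statements, k=2))
```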


5. Impact and Relevance

MeTTaPedia stands to become a foundational, open-source resource for mitigating AI hallucinations, verifying facts, and extending the broader MeTTa and Hyperon ecosystem. By providing a bridge from widely used semantic web standards (RDF+OWL) to the symbolic and formal nature of MeTTa, we reinforce the BGI vision of fostering beneficial AI and open collaboration.

Ultimately, this project supports responsible AI by delivering more trustworthy, evidence-backed outputs, thereby reducing misinformation, empowering innovative applications, and paving the way for a truly collective intelligence future in line with the BGI Nexus mission.

Needed resources

We require compute and storage resources, along with an additional MeTTa developer and a data engineer, to build a scalable, robust solution.

 

Existing resources

Our project uses DBpedia as its primary data source, and we plan to incorporate existing MeTTa meta-ontologies to enhance our solution.

 

Open Source Licensing

MIT License

Links and references

Our initiative aligns with several existing SingularityNET projects focused on decentralized, secure, and collaborative AI solutions. With scalable compute and storage resources, our solution can enhance these initiatives by improving the efficiency, robustness, and overall impact of AI-driven applications.

Was there any event, initiative or publication that motivated you to register/submit this proposal?

A personal referral

  • Total Milestones

    4

  • Total Budget

    $50,000 USD

  • Last Updated

    23 Feb 2025

Milestone 1 - Discovery, Planning, and Mapping Specification

Description

  • Discovery & Planning: Select initial DBpedia subsets to convert, assess technical requirements, and define an R2RML-inspired mapping strategy for transforming RDF+OWL into MeTTa statements.
  • Architecture & Roadmap: Finalize overall project structure, specify data flow, and prepare a short proof of concept with a small DBpedia subset.

Deliverables

  • Mapping Specification Document: A reference detailing how each RDF and OWL construct (e.g., classes, properties, restrictions) will map to MeTTa syntax.
  • Initial Proof of Concept: A small but fully functional dataset demonstrating how the parser will eventually work at scale.

Budget

$10,000 USD

Success Criterion

A complete, reviewed mapping specification with no major open issues. Confirmation that a small subset of DBpedia can be transformed into valid MeTTa statements with consistent testing results.
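
One way to check that converted statements are at least structurally valid is sketched below. It is an assumption-laden simplification: it only verifies that an expression starts with an opening parenthesis, has balanced parentheses, and closes every string literal. It does not implement the full MeTTa grammar, and `is_well_formed` is a hypothetical helper name.

```python
def is_well_formed(statement: str) -> bool:
    """Structural check only: balanced parentheses and closed string literals."""
    text = statement.strip()
    if not text.startswith("("):
        return False
    depth = 0
    in_string = False
    escaped = False
    for char in text:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is taken literally
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
            continue                 # parentheses inside strings do not count
        if char == '"':
            in_string = True
        elif char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False         # closing parenthesis without an opener
    return depth == 0 and not in_string


print(is_well_formed("(dbo:capital dbr:France dbr:Paris)"))
print(is_well_formed("(dbo:capital dbr:France"))
```

A check like this is cheap enough to run on every generated statement, leaving deeper semantic validation to the comparison against reference data.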

Milestone 2 - Parser

Description

  • Parser Implementation: Develop the core parser to systematically read DBpedia’s RDF triple store and convert data into MeTTa using the mapping specification.
  • Testing & Validation Framework: Implement automated tests (unit/integration) and a validation pipeline to compare MeTTa outputs against known reference data.

Deliverables

  • Open-Source Parser: Publicly accessible codebase (e.g., GitHub) including scripts, README, and basic usage examples.
  • Test Suite & Validation Reports: Automated testing framework for functionality, performance, and consistency of transformed data.

Budget

$15,000 USD

Success Criterion

A functioning parser that can process a medium-scale portion of DBpedia without errors. All unit and integration tests pass, and initial validation checks confirm the correctness of the MeTTa statements.

Milestone 3 - MeTTa Knowledge Base Construction

Description

  • Scaling Up & Initial Release: Expand the parser to handle the full (or large-scale) DBpedia dataset, applying any optimizations for performance.
  • Comprehensive Validation: Conduct thorough coverage and consistency checks across domain subsets to ensure high-quality MeTTa statements.

Deliverables

  • Public MeTTa Knowledge Base (Beta): A sizable portion (or potentially all) of DBpedia in MeTTa form, hosted in a version-controlled repository with an accessible directory structure.
  • Validation Logs & QA Results: Detailed reports on consistency checks, error rates, and coverage metrics, shared openly for community feedback.

Budget

$15,000 USD

Success Criterion

At least one large-scale release of the MeTTa knowledge base is publicly accessible, demonstrating acceptable performance and minimal errors. User or community tests reveal no critical data consistency issues that block practical usage.
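
The error rates and coverage metrics named in the deliverables could be computed along the lines of the sketch below. The input shape (one `(predicate, converted_ok)` pair per source triple) and the report fields are assumptions chosen for illustration, not a format defined by the proposal.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def qa_report(results: Iterable[Tuple[str, bool]]) -> Dict[str, object]:
    """Summarize conversion outcomes overall and per predicate."""
    seen: Counter = Counter()
    converted: Counter = Counter()
    for predicate, ok in results:
        seen[predicate] += 1
        if ok:
            converted[predicate] += 1
    total = sum(seen.values())
    done = sum(converted.values())
    return {
        "total": total,
        "converted": done,
        # Share of source triples that failed to convert.
        "error_rate": (total - done) / total if total else 0.0,
        # Fraction of triples converted for each predicate.
        "coverage_by_predicate": {p: converted[p] / n for p, n in seen.items()},
    }


results = [
    ("dbo:capital", True),
    ("dbo:capital", True),
    ("dbo:birthPlace", False),
    ("dbo:birthPlace", True),
]
report = qa_report(results)
print(report)
```

Breaking coverage down by predicate makes it easier to see which mapping rules are responsible for failures than a single global percentage would.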

Milestone 4 - Publishing

Description

  • Documentation & Community Onboarding: Finalize detailed usage guides, tutorials, and best practices for contributing.
  • Change Management & Future Extensions: Establish processes and guidelines for version control, incremental updates, and integration with related initiatives (e.g., Hyperseed-1, MettaMotto, NL-to-MeTTa).

Deliverables

  • Comprehensive Documentation: Quick-start guides, architecture diagrams, contributor guidelines, and tutorials demonstrating real-world usage (e.g., how to integrate with MettaMotto or incorporate Hyperseed-1 concepts).
  • Community Feedback & Governance Model: Clear protocols for proposing new data sets or updates, reviewing community pull requests, and merging improvements.

Budget

$10,000 USD

Success Criterion

A stable, well-documented repository that new users can clone and use with minimal setup. At least one successful round of community feedback (pull requests, issues raised, etc.) is integrated, demonstrating a functional governance workflow.
