GraphRAG, LLM-derived Knowledge Graph for RAG

Anthony Oliko
Project Owner

GraphRAG, LLM-derived Knowledge Graph for RAG

Funding Requested

$115,000 USD

Expert Review
0
Community
4.7 (6)

Overview

We propose that the data and relationships of the foundational Knowledge Graph be derived by an LLM after it ingests content in multiple formats from APIs. This leads us to GraphRAG: LLM-derived Knowledge Graphs, that is, Knowledge Graphs generated by AI. Recent studies have shown that using an LLM-derived Knowledge Graph for RAG operations produces better results than baseline RAG. Our goal is to use the knowledge and resources available from our DFR3 RFP8-funded project to create a foundational Knowledge Graph that "not only structures Deep Funding data for immediate needs but also sets the groundwork for scalable expansion to encompass the entire SingularityNET ecosystem."
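For illustration, the sketch below shows the core GraphRAG idea in miniature: ingested content is chunked, an LLM is prompted to derive (subject, relation, object) triples, and those triples become the nodes and edges of the Knowledge Graph. The prompt wording and the call_llm helper are placeholders for whatever open-source LLM tooling we adopt, not a committed implementation.

```python
# Minimal GraphRAG sketch: prompt an LLM to derive triples from a text chunk.
# EXTRACTION_PROMPT and call_llm are placeholders, not the project's actual code.
import json

EXTRACTION_PROMPT = (
    "Extract knowledge-graph triples from the text below. "
    "Return only a JSON list of [subject, relation, object] triples.\n\nText:\n{text}"
)

def call_llm(prompt: str) -> str:
    """Stand-in for a call to an open-source LLM; returns a canned response here."""
    return '[["Trenches AI", "SUBMITTED", "GraphRAG proposal"]]'

def extract_triples(chunk: str) -> list[tuple[str, str, str]]:
    """Derive (subject, relation, object) triples for one ingested chunk."""
    raw = call_llm(EXTRACTION_PROMPT.format(text=chunk))
    return [tuple(t) for t in json.loads(raw)]

if __name__ == "__main__":
    for s, r, o in extract_triples("Trenches AI submitted the GraphRAG proposal."):
        print(f"({s}) -[{r}]-> ({o})")
```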

Proposal Description

Company Name (if applicable)

Trenches AI

Our specific solution to this problem

We will leverage open-source LLM and vector-embedding tools, Django, Amazon Neptune, GraphQL, and AWS.

Key components and specific functions:

  1. Graph Database (Amazon Neptune, optional): As the backbone of the architecture, it will store all relational data and provide the infrastructure for complex queries and data relationships.

  2. Data Ingestion Component: Handles efficient data extraction and loading from diverse sources into the graph database; it will also handle the dynamic and scheduled updates needed to keep the KG current and valuable.

  3. CRUD Functionality Component: This is for the essential operations on the data (create, read, update, delete), making the KG a dynamic tool that evolves with the network's activities and insights.

  4. Search and Query Interface: Integrated via GraphQL, this interface handles data retrieval and is crucial for supporting complex analytical tasks and user queries.

  5. Integration and API Management: Ensures that the KG can securely and efficiently interact with external applications through well-documented and managed APIs.

  6. Deployment and Hosting on AWS: By utilizing AWS services like EC2, Elastic Load Balancer, and potentially RDS/Aurora, the architecture guarantees reliability, scalability, and high availability.

  7. Maintenance and Monitoring: We will use AWS CloudWatch and other tools for continuous monitoring and maintenance to keep system performance optimized.

  8. Documentation and Developer Support: To accelerate adoption and ease of integration.

Project details

Trenches AI was funded in Deep Funding Round 3 RFP-8 to build "Phase 1 Of Huggingface Datasets Library To SingularityNET Pipeline", knowing full well that it was only a piece of a much larger pie: building tools for Knowledge Graph and LLM integration.

 

The project involved building a pipeline of queryable endpoints for uniformly assessing, downloading, and querying knowledge graphs on Huggingface through its Python library.

Phase 1 was completed by implementing and querying multiple knowledge graph datasets.

There are multiple points of overlap between this project and what we are trying to build in this RFP, especially in the areas of existing Graph Database infrastructure and its import tools. We will be making use of the existing tools and resources to speed up the development process.



Knowledge Graph System Architecture

1. Data Ingestion Component

Purpose: Handles the extraction, transformation, and loading (ETL) of data from various sources into the Knowledge Graph.

Technologies:

  • Python scripts for extracting data from the Deep Funding voting portal and other potential data sources.
  • Vector embedding tools to convert ingested data into workable chunks.
  • An LLM to perform reasoning operations on the embedded data.
  • Amazon Neptune Import Tools for efficient data loading.
  • Django for managing ingestion workflows and administration interfaces.

Features:

  • Scheduled ingestion jobs.
  • Data validation and cleaning to ensure the integrity of the graph.
  • API endpoints for dynamic data ingestion.
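As a rough sketch of how this component could feed Amazon Neptune, the script below pulls records from a placeholder portal endpoint and writes them out in Neptune's Gremlin bulk-load CSV format. The endpoint URL, JSON field names, and vertex/edge labels are illustrative assumptions, not the real Deep Funding API.

```python
# Illustrative ingestion sketch: fetch records from a placeholder portal endpoint
# and write them in Amazon Neptune's Gremlin bulk-load CSV format.
import csv
import requests

PORTAL_URL = "https://example.org/api/proposals"  # placeholder endpoint

def fetch_proposals() -> list[dict]:
    resp = requests.get(PORTAL_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()

def write_bulk_load_csvs(proposals: list[dict]) -> None:
    """Emit vertex and edge CSVs that the Neptune bulk loader can ingest from S3."""
    with open("vertices.csv", "w", newline="") as vf, \
         open("edges.csv", "w", newline="") as ef:
        vertices, edges = csv.writer(vf), csv.writer(ef)
        vertices.writerow(["~id", "~label", "name:String"])
        edges.writerow(["~id", "~from", "~to", "~label"])
        for p in proposals:
            # Duplicate proposer rows are left for the loader to merge in this sketch.
            vertices.writerow([f"proposal-{p['id']}", "Proposal", p["title"]])
            vertices.writerow([f"proposer-{p['proposer']}", "Proposer", p["proposer"]])
            edges.writerow([f"submitted-{p['id']}",
                            f"proposer-{p['proposer']}",
                            f"proposal-{p['id']}",
                            "SUBMITTED"])

if __name__ == "__main__":
    write_bulk_load_csvs(fetch_proposals())
```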

 

2. Graph Database (Amazon Neptune)

Purpose: Core storage and processing unit for all graph-based operations.

Features:

  • Storage of entities such as documentation, proposers, proposals, voters, votes, and their relationships.
  • Scalable schema to accommodate future expansion across the SingularityNET ecosystem.
  • Indexing and querying mechanisms optimized for graph traversal.
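To make graph traversal concrete, here is a minimal query sketch using gremlinpython against a Neptune endpoint; the endpoint address, the "Voter"/"VOTED_ON" labels, and the property names are assumptions chosen for illustration.

```python
# Minimal Gremlin traversal sketch against Neptune using gremlinpython.
# Endpoint, labels, and properties are illustrative assumptions.
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

NEPTUNE_WS = "wss://your-neptune-endpoint:8182/gremlin"  # placeholder address

def proposals_voted_on_by(voter_name: str) -> list[str]:
    """Return titles of proposals that a given voter has voted on."""
    conn = DriverRemoteConnection(NEPTUNE_WS, "g")
    try:
        g = traversal().withRemote(conn)
        return (g.V().hasLabel("Voter").has("name", voter_name)
                 .out("VOTED_ON").values("title").toList())
    finally:
        conn.close()
```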


3. CRUD Functionality Component

Purpose: Manages the creation, reading, updating, and deletion of data within the Knowledge Graph.

Technologies:

  • Django as the backend framework for routing and handling CRUD operations.
  • Amazon Neptune query languages (e.g. Gremlin) for interacting with the graph database.

Features:

  • Secure API endpoints for CRUD operations.
  • Integration with Django's user authentication and authorization to manage permissions.
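A hedged sketch of how the "create" part of CRUD might look as a Django view, delegating the write to a Gremlin traversal; the get_traversal helper module and the Proposal properties are hypothetical placeholders rather than existing project code.

```python
# Hypothetical Django view for the "create" part of CRUD; get_traversal is an
# assumed project helper that returns a Gremlin traversal source bound to Neptune.
import json

from django.contrib.auth.decorators import login_required
from django.http import JsonResponse
from django.views.decorators.http import require_http_methods

from .graph import get_traversal  # assumed helper, not an existing module

@login_required
@require_http_methods(["POST"])
def create_proposal(request):
    """Create a Proposal vertex from a JSON payload; permissions come from Django auth."""
    payload = json.loads(request.body)
    g = get_traversal()
    vertex = (g.addV("Proposal")
                .property("title", payload["title"])
                .property("requested_usd", payload["requested_usd"])
                .next())
    return JsonResponse({"id": str(vertex.id)}, status=201)
```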

 

4. Search and Query Interface

Purpose: Provides robust search and query capabilities across the Knowledge Graph.

Technologies:

  • GraphQL interface for flexible and efficient data retrieval.
  • Django to expose GraphQL endpoints.

Features:

  • Support for complex queries to navigate and interrogate the Knowledge Graph.
  • Optimizations for performance in graph traversal and data retrieval.
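As an illustrative sketch (not the final schema), a GraphQL type and query could be exposed with graphene; in the real system the resolver would delegate to the Neptune traversal layer, while here it returns a canned result so the example stays self-contained.

```python
# Illustrative GraphQL schema using graphene; types, fields, and the canned
# resolver result are placeholders for what would be backed by Neptune traversals.
import graphene

class ProposalType(graphene.ObjectType):
    id = graphene.ID()
    title = graphene.String()
    requested_usd = graphene.Float()

class Query(graphene.ObjectType):
    proposals_by_proposer = graphene.List(
        ProposalType, proposer=graphene.String(required=True))

    def resolve_proposals_by_proposer(root, info, proposer):
        # The real resolver would run a Gremlin/openCypher query against Neptune.
        return [ProposalType(id="proposal-1", title="GraphRAG", requested_usd=115000.0)]

schema = graphene.Schema(query=Query)

if __name__ == "__main__":
    result = schema.execute(
        '{ proposalsByProposer(proposer: "Trenches AI") { title requestedUsd } }')
    print(result.data)
```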

 

5. Integration and API Management

Purpose: Manages how external applications interact with the Knowledge Graph.

Technologies:

  • Django REST framework for API management.
  • AWS API Gateway for secure API exposure.

Features:

  • Standardized API documentation using Swagger or similar tools.
  • Rate limiting, access control, and logging for API usage.
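For example, authentication, permissions, and rate limiting could be configured centrally through Django REST framework settings along the lines below; the specific classes and rates are placeholders, not decided values.

```python
# Sketch of Django REST framework settings for authentication, permissions, and
# rate limiting; the chosen classes and rates are placeholders, not decided values.
REST_FRAMEWORK = {
    "DEFAULT_AUTHENTICATION_CLASSES": [
        "rest_framework.authentication.TokenAuthentication",
    ],
    "DEFAULT_PERMISSION_CLASSES": [
        "rest_framework.permissions.IsAuthenticated",
    ],
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.UserRateThrottle",
    ],
    "DEFAULT_THROTTLE_RATES": {"user": "1000/day"},
}
```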

 

6. Deployment and Hosting

Purpose: Ensures the application is securely and reliably hosted with scalable resources.

Platform:

  • AWS Elastic Compute Cloud (EC2) for hosting Django and Amazon Neptune instances.
  • AWS Elastic Load Balancer to manage incoming traffic and ensure high availability.
  • AWS RDS/Aurora for relational data storage needs if required.

 

7. Maintenance and Monitoring

Purpose: Provides ongoing support and real-time monitoring of the system's health.

Technologies:

  • AWS CloudWatch for monitoring system performance and logs.
  • Mattermost for team communications and issue tracking.

Features:

  • Regular updates and patch management.
  • Performance tuning and anomaly detection.

 

8. Documentation and Developer Support

Purpose: Ensures comprehensive documentation and support for developers and end-users.

Technologies:

  • Sphinx or ReadTheDocs for maintaining and hosting project documentation.

Features:

  • Detailed API documentation and user manuals.

  • Developer guides and code examples for extending and integrating with the KG.

 

This architecture provides a scalable, flexible, and robust foundation for developing the Knowledge Graph as specified in the RFP, leveraging the power of LLMs, vector embeddings, Django, Amazon Neptune, and AWS to meet and exceed the project requirements.

 

Our team

  1. Andria Ez (LinkedIn), Project Manager, Community Lead
  2. Anthony Oliko (LinkedIn), Project Lead, Programmer
  3. Quyum Kehinde, Programmer
  4. Ibrahim Abdulmalik (LinkedIn), Programmer, DevOps

What we still need besides budget?

Yes

Describe the resources you still need

We have proposed Amazon Neptune as the Graph Database for this project because of its ease of use for the job at this point in time. The RFP proposer has specified that using any of SingularityNET's partners, such as Ocean Protocol or ICP, for hosting would be a bonus.

With that in mind, if we are able to find an individual or resources that can help in making use of Ocean or ICP for hosting the Graph Database, it would be a great advantage and we are willing to switch to the preferred partner.

Existing resources we will leverage for this project

Yes

Description of existing resources

DFR3-RFP8: Phase 1 Of Huggingface Datasets Library To SingularityNET Pipeline

Open Source Licensing

Apache

Describe license details and, if applicable, list any components that are not subject to this license.

Copyright (C) 2023 Trenches AI

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Additional videos

GraphRAG explanation

Proposal Video

Placeholder for Spotlight Day pitch presentations. Videos will be added by the DF team when available.

  • Total Milestones

    7

  • Total Budget

    $115,000 USD

  • Last Updated

    20 May 2024

Milestone 1 - Project Kickoff and Initial Setup

Description

This phase focuses on establishing the project infrastructure, setting up development environments, and the initial planning and design of the Knowledge Graph system.

Deliverables

Project plan detailing timelines, resources, and requirements. Setup of development environments and necessary toolchains. Completion of the system architecture design.

Budget

$10,000 USD

Milestone 2 - Development of Data Ingestion Component

Description

Development of data ingestion scripts and mechanisms to efficiently import data from the Deep Funding portal and other approved data sources into the Knowledge Graph.

Deliverables

Functional data ingestion module. Documentation on data formats, source integration, and update schedules.

Budget

$15,000 USD

Milestone 3 - CRUD Functionality and Graph Database Setup

Description

Setting up Amazon Neptune graph database schemas, developing CRUD functionalities, and ensuring data integrity and security.

Deliverables

Fully operational CRUD operations integrated with Amazon Neptune. Secure database setup with performance benchmarks.

Budget

$20,000 USD

Milestone 4 - Search and Query Interface Implementation

Description

Implementing and testing the GraphQL-based search and query interface to ensure it meets the specified requirements and performance metrics.

Deliverables

Implementation of the GraphQL API. Comprehensive API documentation and testing results.

Budget

$15,000 USD

Milestone 5 - Integration, API Management & Initial Deployment

Description

Final integration of components, setting up API management, and the initial deployment of the Knowledge Graph on AWS. This includes preliminary stress testing and security assessments.

Deliverables

Deployment of the complete system on AWS. API gateway setup with detailed access control. Initial stress testing and security report.

Budget

$15,000 USD

Milestone 6 - Docs, Maintenance Planning, Hosting and Launch

Description

Creating detailed user and developer documentation, establishing a maintenance and monitoring plan, and the official launch of the Knowledge Graph system.

Deliverables

Comprehensive system documentation. Hosting, maintenance, and monitoring plans for 1 year. Official launch announcement and system go-live.

Budget

$20,000 USD

Milestone 7 - Integration with SNET as a 'knowledge node'

Description

Integrate the Knowledge Graph as a 'knowledge node' within the SingularityNET ecosystem.

Deliverables

Knowledge Graph integrated as a 'knowledge node' within the SingularityNET ecosystem.

Budget

$20,000 USD

Join the Discussion (3)

Sort by

3 Comments
  • 0
    Jan Horlings
    May 19, 2024 | 9:36 AM

    Is this proposal compliant with the actual RFP: https://deepfunding.ai/rfp/content-knowledge-graph/ ?

    If not, please add it to another pool such as Miscellaneous or, in case you are utilizing/creating an AI service, to 'new services.'

    • 0
      Anthony Oliko
      May 19, 2024 | 11:55 AM

      Thank you Jan, I went back and read through both the proposal and the rules and guidelines for this pool; everything is in order. Regards.

      • 0
        Jan Horlings
        May 19, 2024 | 1:53 PM

        Thanks Anthony, at first glance it looked a bit like a general KG proposal, but with this confirmation, I'm happy to leave it to the review circle. Good luck!

Reviews & Rating

Sort by

6 ratings
  • 0
    mivh1892
    May 14, 2024 | 11:41 AM

    Overall

    5

    • Feasibility 5
    • Viability 5
    • Desirability 4
    • Usefulness 5
    Building a Knowledge Graph for SingularityNET

    The proposed project aims to develop a Knowledge Graph (KG) system that leverages Large Language Models (LLMs) to extract and analyze data from various sources, providing a comprehensive understanding of the SingularityNET ecosystem. The KG will serve as a central repository of structured data, enabling advanced search, query, and analysis capabilities.

    Feasibility:

    • Technical Expertise: The project team possesses the necessary expertise in LLM technology, graph databases, and software development to execute the proposed plan.
    • Open-Source Tools: The project utilizes readily available open-source tools such as Django, Amazon Neptune, and GraphQL, reducing development costs and ensuring maintainability.
    • Existing Resources: The team can leverage existing resources from their previous DFR3-RFP8 project, including codebases and infrastructure components.

    Sustainability:

    • Market Demand: The demand for AI-powered knowledge management solutions is growing, particularly in the context of large-scale data ecosystems like SingularityNET.
    • Scalability: The proposed architecture is designed to be scalable, allowing for the integration of additional data sources and the handling of increasing data volumes.
    • Open Standards: The use of open standards and protocols, such as GraphQL and Apache, facilitates integration with other systems and promotes long-term sustainability.

    Desirability:

    • Problem-Solving: The KG addresses the need for a structured and centralized repository of knowledge within the SingularityNET ecosystem, improving data accessibility and analysis capabilities.
    • Advanced Features: The integration of LLM technology enables advanced features like natural language search, relationship identification, and semantic reasoning.
    • Developer-Friendly: The project emphasizes developer documentation and support, encouraging adoption and contributions from the SingularityNET community.

    Usefulness:

    • Enhanced Data Understanding: The KG provides a deeper understanding of the relationships and patterns within the SingularityNET ecosystem, enabling informed decision-making.
    • Improved Search and Discovery: Advanced search capabilities facilitate the discovery of relevant information and connections across the ecosystem.
    • Support for AI Applications: The KG serves as a foundation for building AI applications that can utilize its structured data and relationships for tasks like recommendation, anomaly detection, and knowledge-based reasoning.

    Additional Considerations:

    • Exploration of Ocean Protocol and ICP: While Amazon Neptune is proposed as the graph database, investigating the potential of using Ocean Protocol or ICP for decentralized storage could be beneficial.
    • Community Engagement: Continuous engagement with the SingularityNET community is crucial to gather feedback, ensure alignment with project goals, and foster a collaborative development environment.

  • 0
    BlackCoffee
    May 12, 2024 | 2:40 PM

    Overall

    5

    • Feasibility 5
    • Viability 4
    • Desirability 4
    • Usefulness 5
    Proposal worthy of expert level

    A good proposal written by a reputable team; I think it would not be an exaggeration to call the team experts. That said, the milestones need much more detail on the work to be done to be worthy of an expert's recommendation, with the corresponding amount of money attached to each piece of work. The more meticulous you are, the more trust you will gain from the community; it comes down to clear transparency about the use of funds.
    For the rest, I am satisfied with the feasibility, desirability, and usefulness presentations. The chance of survival is high if the team continues to do well and receives support from the community.

  • 0
    TrucTrixie
    May 10, 2024 | 11:18 AM

    Overall

    5

    • Feasibility 5
    • Viability 5
    • Desirability 5
    • Usefulness 5
    High-tech technical applications

    I predict this proposal will gain a lot of attention from the community because of the benefits it brings, the technology integrated in it and the amount of funding requested ($120,000 is a relatively large amount for a proposal of Deepfunding).
    Building a foundational knowledge map is a big and macro idea - it differs from other proposals in scope. Of course, to do so, it must also apply many complex technologies, be highly technical and picky about readers.
    The beauty here is the suggestion of knowing how to take advantage of existing technologies - that is, knowing how to use other people's things in a legal way to build your own.
    Regarding the great usefulness and feasibility, I won't mention much - I just want to mention that the implementation of technical measures must be very careful to maintain good viability.

    Anthony Oliko
    May 11, 2024 | 7:22 PM
    Project Owner

    Thanks for the review TrucTrixie. We are taking another look at the milestones and deliverables to make them more detailed, and also taking into consideration the technical dependencies and viability with respect to Deep Funding goals.

  • 0
    1993 Taiwoan
    May 10, 2024 | 2:58 AM

    Overall

    4

    • Feasibility 4
    • Viability 4
    • Desirability 4
    • Usefulness 4
    Knowledge Graph created by AI

    This is a GraphRAG project. A knowledge graph derived by an LLM for RAG operations will produce better results than basic RAG, and the project uses the knowledge and resources available from the DFR3 RFP8-funded project to create a foundational Knowledge Graph.

    Feasibility

    • essential operations on data (create, read, update, delete), making KG a dynamic engine that evolves with network activities and insights.
    • Trenches AI was funded in Deep Funding Round 3 RFP-8 to build "Phase 1 Of Huggingface Datasets Library To SingularityNET Pipeline".
    • Develop scripts and data ingestion mechanisms to efficiently import data from the Deep Funding portal and other approved data sources into the Knowledge Graph.
    • Integrate the Knowledge Graph as a 'knowledge node' in the SingularityNET ecosystem.

    Viability

    • Graph database (Amazon Neptune) core storage and processing unit for all graph-based operations.
    • The schema is scalable to accommodate future expansion across the SingularityNET ecosystem.
    • Stores entities such as documents, proposers, proposals, voters, votes, and their relationships.
    • CRUD functionality component Integrates with Django's user authentication and authorization for permission management.

    Desirability

    • In my opinion, the project has great potential because this architecture provides a powerful, flexible, and extensible platform to develop a Knowledge Graph as specified in the RFP, leveraging the power of Django, Amazon Neptune, and AWS to meet and exceed the project requirements.

    Usefulness

    • With an experienced team, I believe the project will help the SingularityNET ecosystem develop strongly and the community grow stronger.
    • The technologies the project brings will benefit everyone who participates in and knows about Deep Funding.

  • 0
    Max1524
    May 9, 2024 | 11:08 AM

    Overall

    5

    • Feasibility 5
    • Viability 5
    • Desirability 4
    • Usefulness 5
    Member identity, time and duration

    A proposal written by a highly qualified professional always carries a certain guarantee of credibility. I don't have many comments; of course it has high feasibility, viability, desirability, and usefulness.
    I only have comments on the disclosure of members' identities (the more public, the better for information transparency), and milestones should include implementation timing and duration (I'm judging based on the current presentation).

  • -1
    Joseph Gastoni
    May 8, 2024 | 2:01 PM

    Overall

    4

    • Feasibility 4
    • Viability 3
    • Desirability 3
    • Usefulness 4
    It has a strong foundation and addresses a need

    Trenches AI's Content Knowledge Graph proposal has a strong foundation and addresses a critical need within the SingularityNET ecosystem.

    Feasibility:

    • High: The project leverages well-established technologies (Django, Amazon Neptune, GraphQL) with a clear technical architecture.
    • Existing infrastructure and experience from a previous Deep Funding project can expedite development.

    Viability:

    • High: The project directly addresses the RFP requirements and has a clear funding source (Deep Funding).
    • The proposed architecture promotes scalability for future expansion within the SingularityNET ecosystem.

    Desirability:

    • High: A well-designed Content Knowledge Graph can be highly desirable for various stakeholders within SingularityNET.
    • It can facilitate data exploration, analysis, and decision-making for Deep Funding projects and potentially other applications.

    Usefulness:

    • High: A well-structured Knowledge Graph can be a valuable tool for organizing and interlinking data related to Deep Funding, proposals, and SingularityNET ecosystem participants.
    • It can improve data searchability, analysis, and potentially lead to new insights and discoveries.

    Besides, the project should consider:

    • Clear definition of data sources and quality control processes for data ingestion is important.
    • A well-defined plan for user access, authentication, and authorization is necessary.

    Here are some strengths of this project:

    • Leverages existing infrastructure and expertise from a previous Deep Funding project.
    • Proposes a scalable and flexible architecture using established technologies.
    • Directly addresses the requirements outlined in the RFP.

    Here are some challenges to address:

    • Ensuring data quality and consistency throughout the ingestion process from various sources.
    • Defining clear user roles and access levels for interacting with the Knowledge Graph.
    • Developing a comprehensive documentation and support system for developers and end-users.

    By addressing these challenges and focusing on a well-defined data governance strategy, Trenches AI's Content Knowledge Graph can become a valuable asset for the SingularityNET ecosystem.

     

Summary

Overall Community

4.7

from 6 reviews
  • 5 stars: 4
  • 4 stars: 2
  • 3 stars: 0
  • 2 stars: 0
  • 1 star: 0

Feasibility

4.7

from 6 reviews

Viability

4.3

from 6 reviews

Desirability

4

from 6 reviews

Usefulness

4.7

from 6 reviews