Unique Knowledge Graph for Source and Content Reliability

Dominik Tilman
Project Owner

Funding Awarded

$40,000 USD


Status

  • Overall Status

    🛠️ In Progress

  • Funding Transferred

    $34,880 USD

  • Max Funding Amount

    $40,000 USD

Funding Schedule

Milestone Release 1: $4,800 USD, Transfer Complete, 22 Dec 2023
Milestone Release 2: $12,160 USD, Transfer Complete, 30 Dec 2023
Milestone Release 3: $11,200 USD, Transfer Complete, 07 Mar 2024
Milestone Release 4: $6,720 USD, Transfer Complete, 02 May 2024
Milestone Release 5: $3,040 USD, Pending, TBD
Milestone Release 6: $2,080 USD, Pending, TBD

Status Reports

May. 7, 2024

Status
🙂 Pretty good
Summary

- Finalized Query Interface
- Optimized NER

Overview

TrustLevel proposes a comprehensive solution to assess the reliability and objectivity of textual information across various domains. Their approach involves integrating diverse data sources, structured knowledge graph development, and multifaceted analysis. By doing so, they aim to provide users with a powerful tool for evaluating text objectivity.

The project consists of several key components, including sentiment analysis, named entity recognition, clickbait detection, source reputation analysis, and the creation of a structured knowledge graph. Users will have access to a query interface that allows them to interact with the knowledge graph and perform complex analyses.

TrustLevel outlines a detailed budget breakdown for the project, divided into five milestones, spanning 14 weeks. Each milestone focuses on specific aspects of the project, such as system architecture design, knowledge graph development, text analysis integration, query interface development, and final documentation.

To mitigate potential risks, TrustLevel emphasizes proactive communication, compatibility analysis, and alternative APIs/sources in case of technical challenges or data quality issues. They also provide optional tasks to ensure project completion even if the scope or timing becomes a concern.

Overall, TrustLevel's proposal demonstrates a well-structured plan to address the RFP's goals by leveraging a knowledge graph and various text analysis tools to assess text objectivity comprehensively.

Proposal Description

Company Name

TrustLevel

Service Details

Our approach empowers users to make informed judgments about text reliability across various domains, from news to product reviews and legal cases. By creating a unique knowledge graph, we align with the RFP’s goal. Our solution's uniqueness lies in its integration of diverse data sources, a structured knowledge graph, and multifaceted analysis. Together, these elements enable a holistic assessment of text objectivity, making it a powerful tool for users seeking reliable information.

Solution Description

A: Overview

Knowledge graphs play a crucial role in improving the capabilities of LLMs: they provide structured, semantically rich data that helps these models better understand context and generate outputs. Building on this, we propose to develop a knowledge graph-based approach to assess the objectivity of textual information.

In the context of assessing objectivity, knowledge base query support would allow users to ask questions about the objectivity of texts, retrieve relevant information from the knowledge graph, and perform advanced analyses to make informed judgments. 

For example, a user could query the knowledge base to find all articles with a negative sentiment that mention a specific person, and then further analyze the source reputation of those articles to assess potential bias.

Our solution for assessing the objectivity of textual information:

  1. We systematically apply various text analysis APIs.

  2. We build a knowledge graph to represent the curated content and estimate a composite trustworthiness score.

  3. We develop a query interface that allows users to interact with our knowledge graph-based approach. 

B: Knowledge base query support

Our knowledge base query support enables users to leverage the structured information within the knowledge graph to conduct complex analyses and evaluations of textual data, contributing to the assessment of objectivity in a more sophisticated and data-driven manner.

  • Querying for Information: Users can submit queries to the knowledge base to request specific information, such as sentiment analysis results, named entities, clickbait probability, or source reputation assessments.

  • Filtering and Sorting: Users can refine their queries by applying filters and sorting options to obtain more precise results. For example, they could filter articles with a positive sentiment, sort entities by relevance, or filter out articles from sources with a low reputation score.

  • Cross-Referencing Data: Users can perform cross-referencing of data within the knowledge base. For instance, they might want to find all articles with a negative sentiment that mention specific persons or products.
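The cross-referencing example above (negative-sentiment articles that mention a specific person, then ranked by source reputation) can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the project's actual query interface: the `Article` class, the field names, the source names, and the reputation values are all hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article record; field names are assumptions for this sketch."""
    article_id: str
    source: str
    sentiment: str  # "negative", "neutral", or "positive"
    entities: set[str] = field(default_factory=set)


def negative_articles_mentioning(articles: list[Article], person: str) -> list[Article]:
    """Filter step: keep negative-sentiment articles that mention the given person."""
    return [a for a in articles if a.sentiment == "negative" and person in a.entities]


def with_source_reputation(articles: list[Article], reputation: dict[str, float]):
    """Cross-reference step: pair each article with its source's reputation score.

    Results are sorted from lowest to highest reputation; sources without a
    known score are placed last.
    """
    scored = [(a, reputation.get(a.source)) for a in articles]
    return sorted(scored, key=lambda p: (p[1] is None, p[1] if p[1] is not None else 0.0))


# Made-up sample data for illustration only.
ARTICLES = [
    Article("a1", "example-daily.test", "negative", {"John Smith", "Washington D.C."}),
    Article("a2", "example-weekly.test", "positive", {"John Smith"}),
    Article("a3", "example-blog.test", "negative", {"John Smith"}),
    Article("a4", "example-daily.test", "negative", {"Jane Doe"}),
]
REPUTATION = {"example-daily.test": 0.8, "example-blog.test": 0.3}

hits = negative_articles_mentioning(ARTICLES, "John Smith")
ranked = with_source_reputation(hits, REPUTATION)
for article, score in ranked:
    print(article.article_id, article.source, score)
```

In a graph database the same question would typically be expressed as a single pattern-matching query; the two-step Python version only makes the filter and cross-reference stages explicit.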

C: Implementation details 

To create a knowledge graph to measure and elaborate on the objectivity of textual information we will use the outputs of various text analysis APIs in a multi-step process. Below, we'll outline the approach with specific tasks, practical steps, and examples for each of the components:

Note: The input data for TrustLevel’s APIs is based on open datasets.

 

1. Sentiment Analysis:

  • Task: Determine the polarity of an article (negative, neutral, positive).

  • API: TrustLevel sentiment analysis API

  • Steps:

    • Extract the text from the article.

    • Send the text to the sentiment analysis API.

    • Receive the sentiment score.

    • Classify the sentiment as negative, neutral, or positive based on the score.

  • Example: For an article about a recent product launch, the sentiment analysis might classify it as "positive."
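The final classification step can be sketched as follows. This is an illustration under stated assumptions: it assumes the sentiment score is a float in the range [-1, 1], and the neutral band of ±0.05 is an arbitrary example value. The actual response format and thresholds of the TrustLevel sentiment analysis API are not specified here.

```python
def classify_sentiment(score: float, neutral_band: float = 0.05) -> str:
    """Map a numeric sentiment score to a polarity label.

    Assumes `score` lies in [-1, 1]; `neutral_band` is an illustrative
    threshold, not a value taken from the TrustLevel API.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of range [-1, 1]: {score}")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Example: a product-launch article with a clearly positive score.
print(classify_sentiment(0.62))
```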

2. Text Analysis for Factual Information:

  • Task: Extract factual information like location, persons, products, dates, etc.

  • API: Use TrustServista Entities API

  • Steps:

    • Parse the article text.

    • Apply NER API to identify named entities.

    • Categorize entities into predefined categories (location, person, product, date, etc.).

  • Example: An article about a political event might extract entities like "Washington D.C." (location), "John Smith" (person), and "2023-09-02" (date).
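The categorization step might look like the sketch below. The input format (a list of dictionaries with `text` and `label` keys) and the label names in `CATEGORY_MAP` are assumptions made for illustration; they are not the documented response format of the TrustServista Entities API.

```python
from collections import defaultdict

# Hypothetical mapping from raw NER labels to the predefined categories.
CATEGORY_MAP = {
    "LOC": "location",
    "GPE": "location",
    "PER": "person",
    "PRODUCT": "product",
    "DATE": "date",
}


def categorize_entities(raw_entities: list[dict]) -> dict[str, list[str]]:
    """Group entity texts by category, dropping duplicates and keeping order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for ent in raw_entities:
        category = CATEGORY_MAP.get(ent["label"], "other")
        if ent["text"] not in grouped[category]:
            grouped[category].append(ent["text"])
    return dict(grouped)


# Entities from the political-event example above, in the assumed input format.
raw = [
    {"text": "Washington D.C.", "label": "GPE"},
    {"text": "John Smith", "label": "PER"},
    {"text": "2023-09-02", "label": "DATE"},
    {"text": "John Smith", "label": "PER"},  # repeated mention
]
print(categorize_entities(raw))
```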

3. Clickbait Detection:

  • Task: Determine the probability that a text is promotional (clickbait).

  • API: Use TrustLevel Clickbait API

  • Steps:

    • Preprocess the text and extract relevant features.

    • Use the trained model.

    • Predict the probability of the text being clickbait.

  • Example: An article with phrases like "You won't believe what happened next!" might be flagged as potential clickbait.
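To make the feature-extraction and prediction steps concrete, here is a toy sketch. It is not the TrustLevel Clickbait API or its trained model: the phrase list, the three features, and the hand-set weights are illustrative assumptions that stand in for a real trained classifier.

```python
import math
import re

# Illustrative phrase list; a real model would learn such signals from data.
CLICKBAIT_PHRASES = ("you won't believe", "what happened next", "this one trick")

# Hand-set weights standing in for a trained model (assumption, not real values).
ILLUSTRATIVE_WEIGHTS = {"phrase_hits": 1.5, "exclamations": 0.5, "all_caps_ratio": 2.0}
ILLUSTRATIVE_BIAS = -2.0


def extract_features(text: str) -> dict[str, float]:
    """Preprocess the text and compute a few simple surface features."""
    lowered = text.lower()
    words = re.findall(r"[A-Za-z']+", text)
    caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    return {
        "phrase_hits": float(sum(p in lowered for p in CLICKBAIT_PHRASES)),
        "exclamations": float(text.count("!")),
        "all_caps_ratio": caps / max(len(words), 1),
    }


def clickbait_probability(text: str) -> float:
    """Combine the features with a logistic function to get a value in (0, 1)."""
    feats = extract_features(text)
    z = ILLUSTRATIVE_BIAS + sum(ILLUSTRATIVE_WEIGHTS[k] * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


print(round(clickbait_probability("You won't believe what happened next!"), 3))
print(round(clickbait_probability("The council approved the budget on Tuesday."), 3))
```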

4. Source Reputation and Bias Analysis:

  • Task: Determine potential biases of the source/author.

  • Approach: We combine our input data on source reputations and author profiles with NLP (the ChatGPT API) to analyze the text for political content, and derive a score that flags potential political bias.

  • Steps:

    • Maintain a database of sources and authors with reputation scores.

    • Analyze the historical content from the source or author.

    • Look for patterns, political affiliations, or biases in their previous work.

  • Example: A source known for its conservative bias might be flagged as potentially biased in favor of certain political viewpoints.

5. Creating the Knowledge Graph:

  • Store the results of each analysis (sentiment, entities, clickbait probability, source reputation) in a structured format (JSON).

  • Relationship Extraction: Process the results by linking them using a unique ID. Identify and extract complex relationships between entities mentioned in the text. For example, if a document mentions a person and a location, establish the relationship between them.

  • Knowledge Graph Construction:

    • Organize the data into nodes (articles, entities, sources/authors) and edges (relationships between articles and entities, sources/authors and articles).

    • Populate the knowledge graph with nodes representing entities and edges representing relationships. Attach attributes like sentiment scores and clickbait probabilities to nodes or edges.

  • Visualization and Query:

    • Use a graph database like Neo4j or a triple-store database to build and query the knowledge graph.

    • Visualize and locally query the knowledge graph to check for potential corrections.

    • Deploy the KG to a cloud service and build a query interface to explore and analyze the structured information.
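The construction steps above can be illustrated with a small sketch that turns JSON analysis results into nodes and edges. The JSON field names, the node ID scheme, and the relationship names (`PUBLISHED`, `MENTIONS`) are hypothetical. The graph is held in plain Python structures here, whereas the proposal names Neo4j or a triple store for the real system.

```python
import json

# Made-up analysis results in an assumed JSON layout.
RESULTS_JSON = """
[
  {"id": "a1", "source": "example-daily.test", "sentiment": "positive",
   "clickbait_probability": 0.1,
   "entities": [{"text": "John Smith", "category": "person"},
                {"text": "Washington D.C.", "category": "location"}]},
  {"id": "a2", "source": "example-daily.test", "sentiment": "negative",
   "clickbait_probability": 0.7,
   "entities": [{"text": "John Smith", "category": "person"}]}
]
"""


def build_graph(records: list[dict]):
    """Return (nodes, edges): nodes map an ID to attributes, edges are triples."""
    nodes: dict[str, dict] = {}
    edges: set[tuple[str, str, str]] = set()
    for rec in records:
        article_id = f"article:{rec['id']}"
        # Analysis scores are attached to the article node as attributes.
        nodes[article_id] = {
            "type": "article",
            "sentiment": rec["sentiment"],
            "clickbait_probability": rec["clickbait_probability"],
        }
        source_id = f"source:{rec['source']}"
        nodes.setdefault(source_id, {"type": "source"})
        edges.add((source_id, "PUBLISHED", article_id))
        for ent in rec["entities"]:
            entity_id = f"entity:{ent['text']}"
            nodes.setdefault(entity_id, {"type": "entity", "category": ent["category"]})
            edges.add((article_id, "MENTIONS", entity_id))
    return nodes, edges


nodes, edges = build_graph(json.loads(RESULTS_JSON))
print(len(nodes), "nodes,", len(edges), "edges")
```

Because entity and source nodes are keyed by a stable ID, an entity mentioned in several articles becomes a single shared node, which is what makes cross-article queries possible.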

6. Scoring Reliability and Trustworthiness of a Text:

  • Knowledge Graph Integration:

    • Utilize the knowledge graph to cross-reference information in the text. Verify facts, relationships, and historical context.

    • Identify any discrepancies or inconsistencies between the text and trusted sources. Inaccurate or unsupported claims in the text can lower trustworthiness.

  • Composite Score Calculation:

    • Assign weights to each of the factors based on their importance in determining trustworthiness (e.g., sentiment might have a lower weight compared to source reputation).

    • Calculate a composite trustworthiness score by combining the scores from each factor, considering the weights.

    • Apply the composite trustworthiness score to the text.

  • Optional tasks (dependent on the remaining capacity for the project):

    1. Classification and Threshold Determination:

    Establish a threshold for the composite trustworthiness score that separates trustworthy and untrustworthy texts.

    The threshold should be determined based on the specific context and application. For example, a news aggregator might have stricter criteria than a general information repository.

    2. Visualization:

    Provide users with a visualization or report that shows the trustworthiness score and factors contributing to it.

    Highlight areas of concern, such as biased sources or high clickbait probability.
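The composite score calculation and the optional threshold step can be sketched as a weighted average. The factor names, weights, and the 0.6 threshold below are illustrative assumptions, not values from the project. The sketch also assumes every factor has already been normalized to [0, 1] with higher meaning more trustworthy (for example, using 1 minus the clickbait probability).

```python
def composite_trust_score(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of factor scores, each expected in [0, 1]."""
    if set(factors) != set(weights):
        raise ValueError("factors and weights must have the same keys")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"factor {name!r} out of range [0, 1]: {value}")
    return sum(factors[name] * weights[name] for name in factors) / total_weight


def is_trustworthy(score: float, threshold: float = 0.6) -> bool:
    """Optional classification step; the threshold depends on the application."""
    return score >= threshold


# Illustrative inputs: source reputation is weighted more heavily than sentiment.
factors = {"sentiment_neutrality": 0.5, "non_clickbait": 0.9, "source_reputation": 0.8}
weights = {"sentiment_neutrality": 1.0, "non_clickbait": 2.0, "source_reputation": 3.0}

score = composite_trust_score(factors, weights)
print(round(score, 3), is_trustworthy(score))
```

Dividing by the sum of the weights keeps the result in [0, 1] regardless of how the weights are scaled, so a stricter application such as a news aggregator only needs to raise the threshold.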

D: Potential real-world use cases which could utilize the information from this KG/KB:

1. Fake News Detection and Source Reliability Assessment:

  • Use Case: Identification and assessment of reliability of news articles and sources.

  • Advanced Reasoning: Analysis of sentiment, named entities, and historical patterns within a KG/KB to determine if a source or article exhibits suspicious bias or inconsistencies. Cross-reference with the KG/KB to check the historical reliability of the source and its previous reporting.

2. Investigative Journalism Support:

  • Use Case: Assistance for investigative journalists in uncovering hidden relationships, potential conflicts of interest, and biases in complex news stories.

  • Advanced Reasoning: Application of graph analytics on the KG/KB to uncover hidden connections between individuals, organizations, and events mentioned in articles. Detect patterns that may indicate undue influence or corruption.

3. Scientific Research Validation:

  • Use Case: Validation of scientific claims and research findings by assessing the objectivity and credibility of academic papers and studies.

  • Advanced Reasoning: Analysis of references and citations in academic papers against the KG/KB to assess the credibility of the sources cited. Evaluate the sentiment of discussions around the research topic in academic communities.

4. Product Reviews:

  • Use Case: Validation of product reviews.

  • Advanced Reasoning: Comparison of references and citations from the same source across different platforms with those of other sources, using the KG/KB to assess the credibility of the sources.

5. Legal Case Precedent Evaluation:

  • Use Case: Helping legal professionals to assess the relevance and objectivity of legal precedents in complex cases.

  • Advanced Reasoning: Analysis of legal documents, including court rulings and legal commentary, using sentiment analysis to understand the public perception of specific cases. Cross-referencing with KG/KB can help to determine the historical context and broader legal implications.

E: Justification of uniqueness

Our approach to assessing the reliability and trustworthiness of textual information using a knowledge graph and various text analysis tools offers several unique aspects:

  • Integration of Multiple Data Sources: Our approach involves gathering data from diverse sources, curating it, and then integrating it into a knowledge graph. This combination of data from various domains and sources allows for a more comprehensive assessment of text objectivity.

  • Structured Knowledge Graph: The use of a structured knowledge graph enables the representation of relationships between entities, which can be crucial for assessing objectivity. For example, you can trace the historical affiliations of sources, making it easier to identify potential biases.

  • Multi-Faceted Analysis: Our approach combines sentiment analysis, entity recognition, clickbait detection, source reputation analysis and knowledge graph integration. This multifaceted analysis provides a holistic view of text objectivity, considering not only the content but also the context and credibility of the sources.

F: Potential outlook and future steps

1. Integration with Language Models and Dialogue Systems: The knowledge graph and query interface can be made compatible with generative language models and dialogue systems. This integration will allow these AI components to access and interact with the knowledge graph's data.

2. Domain and Multi-Domain Representation: We can create unique domains or multi-domain representations within the knowledge graph. This can involve categorizing and tagging content based on specific themes or topics to make it more accessible to users.

3. Deployment on the AI Platform: We can deploy the knowledge graph, knowledge base, and query interface on the SNET AI marketplace, ensuring scalability, reliability, and security.

4. Further development of the TrustLevel KG/KB: The system can support advanced reasoning and analysis, such as identifying patterns or anomalies in sentiment across different sources or over time, or assessing the credibility of sources based on historical data.

Milestone & Budget

The proposal is divided into five milestones. The estimated time frame is 14 weeks.

Please check the Google Spreadsheet with a more detailed budget breakdown:

Milestone 1: Project Initiation and Architecture Design

Description:

The primary focus is on setting the project's foundation by designing the system architecture. This includes planning for scalability, data security, and data privacy. Additionally, we will select the appropriate technology stack for the implementation.

Deliverables:

  • System architecture design document: A comprehensive plan describing the overall architecture, including the structure of the knowledge graph, data storage mechanisms, and integration points for text analysis APIs and other components.

Timeline:

  • Weeks 1-3

Budget:

  • $4,800.00 USD

Milestone 2: Knowledge Graph Development

Description:

This phase focuses on gathering and curating data from various sources. The data will undergo cleaning and structuring to ensure consistency and quality. Data pipelines will be established to facilitate continuous data ingestion. The primary deliverable of this milestone is the creation of the knowledge graph, including defining the schema and ontology, and setting up the knowledge graph database.

Deliverables:

  • Curated dataset: A dataset collected from diverse sources, cleaned, and structured for consistency.
  • Data pipelines: Established pipelines to enable continuous data ingestion into the knowledge graph.
  • Knowledge graph schema and ontology: A defined structure for the knowledge graph, including entities, attributes, and relationships.
  • Populated knowledge graph database: The knowledge graph database populated with curated data, ready for further analysis.
  • If necessary, employ additional NLP and ML techniques to refine the query and scoring results.

Timeline:

  • Weeks 4-6

Budget:

  • $12,160.00 USD

Milestone 3: Text Analysis Integration and Source Reputation Analysis

Description:

In this phase, we integrate text analysis APIs, such as sentiment analysis, named entity recognition (NER), and clickbait detection, into the system. Additionally, we implement algorithms for assessing the reputation of sources. This includes analyzing historical source data and developing author analysis modules to track contributions over time.

Deliverables:

  • Integrated text analysis APIs: APIs integrated into the system for sentiment analysis, NER, and clickbait detection, enabling the extraction of valuable information from text.
  • Source reputation analysis algorithms: Algorithms for assessing the credibility and reputation of sources, which will be used to enhance the trustworthiness assessment of texts.
  • Author analysis module: A module for analyzing author profiles and tracking their historical contributions, providing insights into potential biases and expertise.
  • Optional: Historical source database: A database containing historical source information, allowing for reference and analysis of source reliability.

Timeline:

  • Weeks 7-10

Budget:

  • $11,200.00 USD

Milestone 4: Query Interface Development

Description:

In this phase, we focus on developing a query interface that supports advanced querying capabilities. Additionally, if necessary, machine learning models will be integrated into the system to enhance tasks like clickbait detection and sentiment analysis. Real-time update and feedback mechanisms will be implemented to continuously improve the models.

Deliverables:

  • User-friendly query interface: An intuitive interface that enables users to perform complex queries and customize their information retrieval.
  • Integrated machine learning models to improve the accuracy of tasks like clickbait detection and sentiment analysis.

Timeline:

  • Weeks 11-13

Budget:

  • $6,720.00 USD

Milestone 5: Open Tasks & Final Documentation

Description:

In the final phase, we ensure the reliability of the system. We have planned additional time in case individual building blocks need further improvement. Finally, we will make the results and the complete documentation available to the community in the close-out report.

Deliverables:

  • Functional KG/KB
  • Final documentation
  • Close-out report

Timeline:

  • Week 14

Budget:

  • $3,040.00 USD

Unrelated budget allocation:

API Calls / Hosting: $2,080.00 USD

Budget Summary: 

Total requested amount: $40,000 USD in AGIX

  • Milestone 1: $4,800 USD
  • Milestone 2: $12,160 USD
  • Milestone 3: $11,200 USD
  • Milestone 4: $6,720 USD
  • Milestone 5: $3,040 USD
  • Hosting/API Call Usage: $2,080 USD
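The figures above can be cross-checked with a few lines of Python. The amounts are taken directly from the budget summary and the status section; the variable names are chosen only for this check.

```python
# Amounts in USD, as listed in the budget summary.
BUDGET_USD = {
    "Milestone 1": 4_800,
    "Milestone 2": 12_160,
    "Milestone 3": 11_200,
    "Milestone 4": 6_720,
    "Milestone 5": 3_040,
    "Hosting/API Call Usage": 2_080,
}
TOTAL_REQUESTED_USD = 40_000

total = sum(BUDGET_USD.values())

# Milestones 1 to 4 are the releases marked "Transfer Complete".
transferred = sum(BUDGET_USD[f"Milestone {i}"] for i in range(1, 5))

for name, amount in BUDGET_USD.items():
    print(f"{name}: ${amount:,} ({amount / total:.1%})")
print(f"Total: ${total:,}, transferred so far: ${transferred:,}")
```

The six items sum to the requested $40,000, and the first four milestones sum to the $34,880 reported as transferred.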

Long Description

Request for Proposal Pool

RFP2: Unique knowledge graphs or knowledge bases coupled with clear user stories

Funding Amount

$40,000 USD

Our Solution

A: Overview

Since knowledge graphs play a crucial role in improving the capabilities of LLMs by providing structured, semantically rich data that helps these models to better understand context and generate outputs, we propose to develop an approach to assess the objectivity of textual information.

In the context of assessing objectivity, knowledge base query support would allow users to ask questions about the objectivity of texts, retrieve relevant information from the knowledge graph, and perform advanced analyses to make informed judgments. 

For example, a user could query the knowledge base to find all articles with a negative sentiment that mention a specific person, and then further analyze the source reputation of those articles to assess potential bias.

Our solution for assessing the objectivity of textual information:

  1. We apply systematically various text analysis APIs.

  2. We build a knowledge graph to represent the curated content and estimate a composed trustworthiness score .

  3. We develop a query interface that allows users to interact with our knowledge graph-based approach. 

B: Knowledge base query support

Our knowledge base query support enables users to leverage the structured information within the knowledge graph to conduct complex analyses and evaluations of textual data, contributing to the assessment of objectivity in a more sophisticated and data-driven manner.

  • Querying for Information: Users can submit queries to the knowledge base to request specific information, such as sentiment analysis results, named entities, clickbait probability, or source reputation assessments.

  • Filtering and Sorting: Users can refine their queries by applying filters and sorting options to obtain more precise results. For example, they could filter articles with a positive sentiment, sort entities by relevance, or filter out articles from sources with a low reputation score.

  • Cross-Referencing Data: Users can perform cross-referencing of data within the knowledge base. For instance, they might want to find all articles with a negative sentiment that mention specific persons or products.

C: Implementation details 

To create a knowledge graph to measure and elaborate on the objectivity of textual information we will use the outputs of various text analysis APIs in a multi-step process. Below, we'll outline the approach with specific tasks, practical steps, and examples for each of the components:

Note: Input data of TrustLevel‘s APIs are based on open datasets:

 

1. Sentiment Analysis:

  • Task: Determine the polarity of an article (negative, neutral, positive).

  • API: TrustLevel sentiment analysis API

  • Steps:

    • Extract the text from the article.

    • Send the text to the sentiment analysis API.

    • Receive the sentiment score.

    • Classify the sentiment as negative, neutral, or positive based on the score.

  • Example: For an article about a recent product launch, the sentiment analysis might classify it as "positive."

2. Text Analysis for Factual Information:

  • Task: Extract factual information like location, persons, products, dates, etc.

  • API: Use TrustServista Entities API

  • Steps:

    • Parse the article text.

    • Apply NER API to identify named entities.

    • Categorize entities into predefined categories (location, person, product, date, etc.).

  • Example: An article about a political event might extract entities like "Washington D.C." (location), "John Smith" (person), and "2023-09-02" (date).

3. Clickbait Detection:

  • Task: Determine the probability if a text is promotional (clickbait).

  • API: Use TrustLevel Clickbait API

    • Preprocess the text and extract relevant features.

    • Use the trained model.

    • Predict the probability of the text being clickbait.

  • Example: An article with phrases like "You won't believe what happened next!" might be flagged as potential clickbait.

4. Source Reputation and Bias Analysis:

  • Task: Determine potential biases of the source/author.

  • Approach: Using our InputData of source reputations and author profiles and using NLP (API of ChatGPT) to analyze the text for political content and we create a score to mark it as a potential political bias.

  • Steps:

    • Maintain a database of sources and authors with reputation scores.

    • Analyze the historical content from the source or author.

    • Look for patterns, political affiliations, or biases in their previous work.

  • Example: A source known for its conservative bias might be flagged as potentially biased in favor of certain political viewpoints.

5. Creating the Knowledge Graph:

  • Store the results of each analysis (sentiment, entities, clickbait probability, source reputation) in a structured format (JSON).

  • Relationship Extraction: Process the results by linking them using a unique ID. Identify and extract complex relationships between entities mentioned in the text. For example, if a document mentions a person and a location, establish the relationship between them.

  • Knowledge Graph Construction:

    • Organize the data into nodes (articles, entities, sources/authors) and edges (relationships between articles and entities, sources/authors and articles).

    • Populate the knowledge graph with nodes representing entities and edges representing relationships. Attach attributes like sentiment scores and clickbait probabilities to nodes or edges.

  • Visualization and Query:

    • Use a graph database like Neo4j or a triple-store database to build and query the knowledge graph.

    • Visualize and locally query the knowledge graph to check for potential corrections

    • Deploy the KG to a cloud service and build a query  interface to explore and analyze the structured information

6. Scoring Reliability and Trustworthiness of a Text:

  • Knowledge Graph Integration:

    • Utilize the knowledge graph to cross-reference information in the text. Verify facts, relationships, and historical context.

    • Identify any discrepancies or inconsistencies between the text and trusted sources. Inaccurate or unsupported claims in the text can lower trustworthiness.

  • Composite Score Calculation:

    • Assign weights to each of the factors based on their importance in determining trustworthiness (e.g., sentiment might have a lower weight compared to source reputation).

    • Calculate a composite trustworthiness score by combining the scores from each factor, considering the weights.

    • Apply the composite trustworthiness score to the text.

  • Optional tasks (dependent on the remaining capacity for the project):

    1. Classification and Threshold Determination:

    Establish a threshold for the composite trustworthiness score that separates trustworthy and untrustworthy texts.

    The threshold should be determined based on the specific context and application. For example, a news aggregator might have stricter criteria than a general information repository.

    2. Visualization:

    Provide users with a visualization or report that shows the trustworthiness score and factors contributing to it.

    Highlight areas of concern, such as biased sources or high clickbait probability.

D: Potential real world use cases which could utilize the information from this KG/KB:

1. Fake News Detection and Source Reliability Assessment:

  • Use Case: Identification and assessment of reliability of news articles and sources.

  • Advanced Reasoning: Analysis of sentiment, named entities, and historical patterns within a KG/KB to determine if a source or article exhibits suspicious bias or inconsistencies. Cross-reference with the KG/KB to check the historical reliability of the source and its previous reporting.

2. Investigative Journalism Support:

  • Use Case: Assistance for investigative journalists in uncovering hidden relationships, potential conflicts of interest, and biases in complex news stories.

  • Advanced Reasoning: Application of graph analytics on the KG/KB to uncover hidden connections between individuals, organizations, and events mentioned in articles. Detect patterns that may indicate undue influence or corruption.

3. Scientific Research Validation:

  • Use Case: Validation of scientific claims and research findings by assessing the objectivity and credibility of academic papers and studies.

  • Advanced Reasoning: Analysis of references and citations in academic papers against the KG/KB to assess the credibility of the sources cited. Evaluate the sentiment of discussions around the research topic in academic communities.

4. Product Reviews:

  • Use Case: Validation of product reviews.

  • Advanced Reasoning: Analysis of references and citations from the same source on different platforms in comparison to other sources, thereby using the KG/KB to assess the credibility of the sources. 

5. Legal Case Precedent Evaluation:

  • Use Case: Helping legal professionals to assess the relevance and objectivity of legal precedents in complex cases.

  • Advanced Reasoning: Analysis of legal documents, including court rulings and legal commentary, using sentiment analysis to understand the public perception of specific cases. Cross-referencing with KG/KB can help to determine the historical context and broader legal implications.

E: Justification of uniqueness

Our approach to assessing the reliability and trustworthiness of textual information using a knowledge graph and various text analysis tools offers several unique aspects:

  • Integration of Multiple Data Sources: Our approach involves gathering data from diverse sources, curating it, and then integrating it into a knowledge graph. This combination of data from various domains and sources allows for a more comprehensive assessment of text objectivity.

  • Structured Knowledge Graph: The use of a structured knowledge graph enables the representation of relationships between entities, which can be crucial for assessing objectivity. For example, you can trace the historical affiliations of sources, making it easier to identify potential biases.

  • Multi-Faceted Analysis: Our approach combines sentiment analysis, entity recognition, clickbait detection, source reputation analysis and knowledge graph integration. This multifaceted analysis provides a holistic view of text objectivity, considering not only the content but also the context and credibility of the sources.

F: Potential outlook and future steps

1.  Integration with Language Models and Dialogue Systems: The knowledge graph and query interface can be made compatible with generative language models and dialogue systems. This integration will allow these AI components to access and interact with the knowledge graph's data.

2. Domain and Multi Domain Representation: We can create unique domains or multi domain representations within the knowledge graph. This can involve categorizing and tagging content based on specific themes or topics to make it more accessible to users.

3. Deployment on the AI Platform: We can deploy the knowledge graph, knowledge base, and query interface on the  SNET AI marketplace, ensuring scalability, reliability, and security.

4. Further development of the TrustLevel KG/KB: The system can support advanced reasoning and analysis, such as identifying patterns or anomalies in sentiment across different sources or over time, or assessing the credibility of sources based on historical data.

Our Project Milestones and Cost Breakdown

The proposal is divided into five milestones. The estimated time frame is 14 weeks.

Please check the Google Spreadsheet with a more detailed budget breakdown:

Milestone 1: Project Initiation and Architecture Design

Description:

The primary focus is on setting the project's foundation by designing the system architecture. This includes planning for scalability, data security, and data privacy. Additionally, we will select the appropriate technology stack for the implementation.

Deliverables:

  • System architecture design document: A comprehensive plan describing the overall architecture, including the structure of the knowledge graph, data storage mechanisms, and integration points for text analysis APIs and other components.

Timeline:

  • Weeks 1-3

Budget:

  • $4,800.00 USD

Milestone 2: Knowledge Graph Development

Description:

This phase focuses on gathering and curating data from various sources. The data will undergo cleaning and structuring to ensure consistency and quality. Data pipelines will be established to facilitate continuous data ingestion. The primary deliverable of this milestone is the creation of the knowledge graph, including defining the schema and ontology, and setting up the knowledge graph database.

Deliverables:

  • Curated dataset: A dataset collected from diverse sources, cleaned, and structured for consistency.
  • Data pipelines: Established pipelines to enable continuous data ingestion into the knowledge graph.
  • Knowledge graph schema and ontology: A defined structure for the knowledge graph, including entities, attributes, and relationships.
  • Populated knowledge graph database: The knowledge graph database populated with curated data, ready for further analysis.
  • If necessary, employ additional NLP and ML techniques to refine the query and scoring results.
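
A schema of entities, attributes, and relationships can be pictured as a set of subject-predicate-object triples. The sketch below is a toy in-memory version, assuming hypothetical predicate names such as `affiliated_with`; it is not the project's actual schema or database.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Toy triple store: subject -> list of (predicate, object)."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._edges[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked from `subject` via `predicate`."""
        return [o for p, o in self._edges.get(subject, []) if p == predicate]

    def affiliation_chain(self, entity):
        """Follow 'affiliated_with' edges breadth-first, skipping repeats."""
        visited = {entity}
        order = []
        queue = deque([entity])
        while queue:
            current = queue.popleft()
            for neighbour in self.objects(current, "affiliated_with"):
                if neighbour not in visited:
                    visited.add(neighbour)
                    order.append(neighbour)
                    queue.append(neighbour)
        return order


# Hypothetical entities, for illustration only.
kg = KnowledgeGraph()
kg.add("Author A", "affiliated_with", "Outlet X")
kg.add("Outlet X", "affiliated_with", "Publisher Y")
kg.add("Author A", "wrote", "Article 1")
chain = kg.affiliation_chain("Author A")
```

Here `chain` lists the outlet first and then its publisher, which is the kind of affiliation trace the knowledge graph is meant to support.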

Timeline:

  • Weeks 4-6

Budget:

  • $12,160.00 USD

Milestone 3: Text Analysis Integration and Source Reputation Analysis

Description:

In this phase, we integrate text analysis APIs, such as sentiment analysis, named entity recognition (NER), and clickbait detection, into the system. Additionally, we implement algorithms for assessing the reputation of sources. This includes analyzing historical source data and developing author analysis modules to track contributions over time.

Deliverables:

  • Integrated text analysis APIs: APIs integrated into the system for sentiment analysis, NER, and clickbait detection, enabling the extraction of valuable information from text.
  • Source reputation analysis algorithms: Algorithms for assessing the credibility and reputation of sources, which will be used to enhance the trustworthiness assessment of texts.
  • Author analysis module: A module for analyzing author profiles and tracking their historical contributions, providing insights into potential biases and expertise.
  • Optional: Historical source database: A database containing historical source information, allowing for reference and analysis of source reliability.
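
One simple way to turn historical source data into a reputation value is a smoothed ratio of reliable items to all items, so that sources with little history stay close to a neutral prior. The formula, the parameter names, and the default values below are assumptions for this sketch, not the algorithm used in the project.

```python
def source_reputation(reliable, total, prior=0.5, prior_weight=10):
    """Smoothed share of reliable items; `prior` dominates when history is short.

    reliable      number of past items judged reliable (hypothetical input)
    total         number of past items assessed
    prior         assumed reputation of a source with no history
    prior_weight  how many pseudo-observations the prior counts for
    """
    if reliable < 0 or total < 0 or reliable > total:
        raise ValueError("expected 0 <= reliable <= total")
    return (reliable + prior * prior_weight) / (total + prior_weight)


new_source = source_reputation(0, 0)            # no history yet
established_source = source_reputation(90, 90)  # long, clean history
```

A source without history gets the prior (0.5), while 90 reliable items out of 90 yields 0.95 under these default settings.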

Timeline:

  • Weeks 7-10

Budget:

  • $11,200.00 USD

Milestone 4: Query Interface Development

Description:

In this phase, we focus on developing a query interface that supports advanced querying capabilities. Additionally, if necessary, machine learning models will be integrated into the system to enhance tasks like clickbait detection and sentiment analysis. Real-time update and feedback mechanisms will be implemented to continuously improve the models.

Deliverables:

  • User-friendly query interface: An intuitive interface that enables users to perform complex queries and customize their information retrieval.
  • Integrated machine learning models to improve the accuracy of tasks like clickbait detection and sentiment analysis.
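
The kind of filtering a query interface might offer can be sketched as a function that applies only the filters a user supplies. The record fields, the `query` function, and its parameters are hypothetical; the proposal does not describe the interface's actual query language.

```python
from dataclasses import dataclass, field


@dataclass
class ArticleRecord:
    """Hypothetical record shape; the real schema is not given in the proposal."""
    title: str
    source: str
    clickbait: float                      # assumed probability in 0..1
    entities: set = field(default_factory=set)


def query(records, source=None, entity=None, max_clickbait=None):
    """Return the records that satisfy every filter that was supplied."""
    results = []
    for r in records:
        if source is not None and r.source != source:
            continue
        if entity is not None and entity not in r.entities:
            continue
        if max_clickbait is not None and r.clickbait > max_clickbait:
            continue
        results.append(r)
    return results


# Made-up sample data.
records = [
    ArticleRecord("Report on rates", "Outlet X", 0.1, {"Central Bank"}),
    ArticleRecord("You won't believe this", "Outlet X", 0.9, {"Central Bank"}),
    ArticleRecord("Market update", "Outlet Z", 0.2, {"Stock Exchange"}),
]
hits = query(records, source="Outlet X", max_clickbait=0.5)
```

In this example `hits` contains only the first record, since the second one exceeds the clickbait limit and the third comes from a different source.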

Timeline:

  • Weeks 11-13

Budget:

  • $6,720.00 USD

Milestone 5: Open Tasks & Final Documentation

Description:

In the final phase, we ensure the reliability of the system, and we have planned additional time in case individual building blocks need further improvement. Finally, we will make the results and the entire documentation available to the community in the close-out report.

Deliverables:

  • Functional KG/KB
  • Final documentation
  • Close-out report

Timeline:

  • Week 14

Budget:

  • $3,040.00 USD

Unrelated budget allocation:

API Calls / Hosting: $2,080.00 USD

Budget Summary: 

Total requested amount: $40,000 USD in AGIX

  • Milestone 1: $4,800 USD
  • Milestone 2: $12,160 USD
  • Milestone 3: $11,200 USD
  • Milestone 4: $6,720 USD
  • Milestone 5: $3,040 USD
  • Hosting/API Call Usage: $2,080 USD
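
As a quick arithmetic check, the listed amounts can be summed and expressed as shares of the total. The snippet uses only the figures from the list above; the variable names are chosen for this example.

```python
# Budget items in USD, copied from the summary above.
budget = {
    "Milestone 1": 4_800,
    "Milestone 2": 12_160,
    "Milestone 3": 11_200,
    "Milestone 4": 6_720,
    "Milestone 5": 3_040,
    "Hosting/API Call Usage": 2_080,
}

total = sum(budget.values())

# Share of each item as a percentage of the total, rounded to one decimal.
shares = {name: round(100 * amount / total, 1) for name, amount in budget.items()}

for name, amount in budget.items():
    print(f"{name}: ${amount:,} USD ({shares[name]}%)")
print(f"Total: ${total:,} USD")
```

The items add up to exactly $40,000 USD, matching the total requested amount.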

Risk and Mitigation

1. Technical Integration Challenges:

  • Risk: Integrating various components (text analysis APIs, knowledge graph, machine learning models) can introduce compatibility and technical complexities.

  • Mitigation: We will perform a compatibility analysis before integration and mainly use established, well-known APIs. Furthermore, we will engage proactively with the remaining API providers, use a modular and scalable architecture, and keep alternative APIs/sources available in case an integration does not work as planned.

2. Risk of Project Scope and Schedule:

  • Risk: There is a risk that we have misjudged the scope of the project, both in terms of the development/integration work and the timing. If the scope turns out to be smaller than estimated and we still have capacity available, we have listed optional tasks and features that we would develop and implement.

  • Mitigation: We have already marked some further features as optional so that we can deliver the core of the project, which is an MVP, in any case. Furthermore, we want to work very closely with the SNET/DeepFunding team and will communicate project difficulties directly to avoid wasting time and resources.

3. Data Quality Risk:

  • Risk: Data quality issues, unpredictable data from diverse sources, and source unavailability or format changes are some concerns.

  • Mitigation: We conduct data quality assessments early and have already done so for 3 of the 4 APIs. Furthermore, we have already identified alternative APIs in case we are not satisfied with the quality.

Our Team

Team Lead: Dominik Tilman 

Dominik is the founder of TrustLevel and will be the lead on this proposal. He has been active in the blockchain scene for several years and has been deeply involved in Cardano's Project Catalyst (the counterpart to Deep Funding) since the beginning. He has 15+ years of experience in innovation management and company building. With his company CONU21, he mainly advises startups and actively helps them in the founding phase to develop the right business model and market their ideas successfully.

TrustLevel's AI & Blockchain Experts and Data Scientists:

Thomas Zuchtriegel: 15+ years in building teams, building products and building companies. Deep passion for emerging technologies to create innovative digital experiences. Co-founder Corticore (make pro athletes better using AI & VR), Co-founder bluquist (AI-driven talent & potential platform), managing partner metaverse GmbH (Innovation consultancy).

 

Dr. Margarita Diaz Cortes: 10+ years in AI research, Optimization algorithms, Machine Learning, Computer Vision, and applications in Engineering and Medicine; NLP techniques including Ontologies and LLMs.

 

Sergey K.: Several years of experience in web3 development; former student assistant in the field of quantum supervised ML.

 

Iddo Lev (PhD): 20+ years of research and engineering experience in AI, NLP, Knowledge Representation, Logic.

 

Ohad Koren: 10+ years of experience as a full stack developer and data scientist.

 

Related Links

 

Proposal Video

Unique Knowledge Graph for Source and Content Reliability - #DeepFunding IdeaFest Round 3

26 September 2023
  • Total Milestones

    6

  • Total Budget

    $40,000 USD

  • Last Updated

    3 Jun 2024

Milestone 1 - Project Initiation and Architecture Design

Status
😀 Completed
Description

The primary focus is on setting the project's foundation by designing the system architecture. This includes planning for scalability, data security, and data privacy. Additionally, we will select the appropriate technology stack for the implementation.

Deliverables

Budget

$4,800 USD

Milestone 2 - Knowledge Graph Development

Status
😀 Completed
Description

This phase focuses on gathering and curating data from various sources. The data will undergo cleaning and structuring to ensure consistency and quality. Data pipelines will be established to facilitate continuous data ingestion. The primary deliverable of this milestone is the creation of the knowledge graph, including defining the schema and ontology, and setting up the knowledge graph database.

Deliverables

Budget

$12,160 USD

Milestone 3 - Text Analysis Integration and Source Reputation Analysis

Status
😀 Completed
Description

In this phase, we integrate text analysis APIs, such as sentiment analysis, named entity recognition (NER), and clickbait detection, into the system. Additionally, we implement algorithms for assessing the reputation of sources. This includes analyzing historical source data and developing author analysis modules to track contributions over time.

Deliverables

Budget

$11,200 USD

Milestone 4 - Query Interface Development

Status
😀 Completed
Description

In this phase, we focus on developing a query interface that supports advanced querying capabilities. Additionally, if necessary, machine learning models will be integrated into the system to enhance tasks like clickbait detection and sentiment analysis. Real-time update and feedback mechanisms will be implemented to continuously improve the models.

Deliverables

Budget

$6,720 USD

Milestone 5 - Open Tasks & Final Documentation

Status
🧐 In Progress
Description

In the final phase, we ensure the reliability of the system, and we have planned additional time in case individual building blocks need further improvement. Finally, we will make the results and the entire documentation available to the community in the close-out report.

Deliverables

Budget

$3,040 USD


Milestone 6 - Unrelated budget allocation

Status
😐 Not Started
Description

API Calls / Hosting

Deliverables

Budget

$2,080 USD

