Proposal Description
Solution Description
Overview
Photrek will collaborate closely with the SingularityNET teams developing Hyperion OpenCog and Neuro Symbolic LLMs, both to review the role of Probabilistic Logic Networks (PLN) in the current architecture for knowledge graphs (KGs) and to determine how Photrek’s expertise in managing uncertainty through modeling of relative risk aversion can enhance that architecture.
Photrek will develop an architectural design that addresses two levels of uncertainty management, which can then be applied recursively to multiple levels. At the graph level, Photrek will work to ensure that Hyperion OpenCog includes a measure of confidence for each triplet fact in a KG, so that an uncertain triplet becomes a quartet (subject, predicate, object, confidence). Uncertain knowledge graphs (UKGs) often specify a confidence rather than a probability because investigators have uncovered limitations in traditional applications of probability theory. Photrek will show that confidence can and should be grounded in probability, and that modeling relative risk aversion can overlay upper and lower bounds on a probability estimate. The upper bound is determined by tolerating additional relative risk in the modeling of nonlinear correlation; for instance, if two input probabilities are independent and thus multiply to give the joint probability, the upper bound would be determined by the arithmetic average. The lower bound is determined by an aversion to relative risk; in the example of independent inputs, it would be determined by computing the generalized mean with a power of -2/3. With the traditional probability and its relative risk bounds, a wider variety of uncertainty computations can be applied to KG inference than traditional methods afford.
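As a rough numerical illustration of the bounds described above, the sketch below computes the generalized (power) mean of two independent probabilities for three powers: 1 (the arithmetic average, used for the upper bound), 0 (the geometric mean, which corresponds to multiplying the inputs), and -2/3 (used for the lower bound). The function name, the input values, and the choice to compare per-input averages rather than the raw product are assumptions made for illustration; this is not Photrek's implementation.

```python
import math


def generalized_mean(probs, power):
    """Power mean of a list of probabilities in (0, 1].

    power = 1 gives the arithmetic mean; power = 0 is treated as the
    limiting case, the geometric mean.
    """
    if not probs or any(p <= 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in (0, 1]")
    n = len(probs)
    if power == 0:
        # Limit of the power mean as power -> 0: the geometric mean.
        return math.exp(sum(math.log(p) for p in probs) / n)
    return (sum(p ** power for p in probs) / n) ** (1.0 / power)


# Hypothetical confidences of two independent facts (illustrative values).
inputs = [0.8, 0.5]

upper = generalized_mean(inputs, 1.0)         # tolerant of relative risk
accurate = generalized_mean(inputs, 0.0)      # product rule on a per-input scale
lower = generalized_mean(inputs, -2.0 / 3.0)  # averse to relative risk

print(f"lower={lower:.4f} accurate={accurate:.4f} upper={upper:.4f}")
```

Because the power mean is non-decreasing in its power, the three values are always ordered lower ≤ accurate ≤ upper, which is what lets them act as bounds around the traditional estimate.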
At the macro level, Photrek will carry out a prototype demonstration of using a UKG in conjunction with static KGs and LLMs. The UKG will serve as a dynamic, temporal source of information. Whereas a KG has one triplet representing a fact that is assumed to be true, the UKG will draw upon the properties of probabilistic graphs and Markov chains, which represent the uncertainty about many possible relationships. For example, suppose a UKG is seeking to model Julia's current employer, but her LinkedIn profile lists two current employers and has not been updated in over a year. We may then have three potential subject-object relationships linked by the predicate employedBy: {{Julia, Employer A}, {Julia, Employer B}, {Julia, Employer Unknown}}. To each of these facts, we would assign a probability based on the known evidence.
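The employer example can be sketched as a small data structure. The class name, helper functions, and probability values below are hypothetical, and treating the three alternatives as mutually exclusive (so that their probabilities sum to one) is an assumption of this sketch rather than something the proposal specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainFact:
    """A quartet: (subject, predicate, object, probability)."""
    subject: str
    predicate: str
    obj: str
    probability: float


# Hypothetical probabilities; in practice they would be estimated from evidence
# such as the age of the profile and the listed employment dates.
candidates = [
    UncertainFact("Julia", "employedBy", "Employer A", 0.5),
    UncertainFact("Julia", "employedBy", "Employer B", 0.3),
    UncertainFact("Julia", "employedBy", "Employer Unknown", 0.2),
]


def check_normalized(facts, tol=1e-9):
    """Return the total probability, raising if it is not (close to) one."""
    total = sum(f.probability for f in facts)
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    return total


def most_probable(facts):
    """Return the candidate fact with the highest probability."""
    return max(facts, key=lambda f: f.probability)


check_normalized(candidates)
print(most_probable(candidates))
```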
Implementation Details
Assigning accurate probabilities to uncertain events is extremely challenging. The Photrek technical leadership has collaborated for several decades on a variety of projects designing decision-making systems for complex environments. We have developed a suite of methods to design, analyze, and assess the accuracy, robustness, and decisiveness of uncertain decisions. For this UKG project, we will ground each estimated probability in the fundamentals of probability theory, including Bayesian reasoning and information-theoretic metrics. Because estimates of probabilities are themselves uncertain, we will supplement each estimate with upper- and lower-bound probabilities.
The bounds will be based on a measure of the degrees of freedom underpinning the estimate. In statistics, the degrees of freedom are a measure of the amount of data available to estimate a distribution. From the degrees of freedom, we can derive a relative risk aversion, r, which is used to estimate the upper- and lower-bound probabilities.
Positive values of r (aversion) dampen the certainty of the distribution, while negative values of r (tolerance) make the distribution more confident. As a result, the risk-averse probability is a lower bound for the largest probability in the distribution but an upper bound for the smallest. Because this labeling can be counterintuitive, the project will include conversations with the community about the best way to label and explain the methodology, its results, and its usage.
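The qualitative behavior described above can be illustrated with a short sketch. The specific transformation used here, raising each probability to the power (1 - r) and renormalizing, is an assumed stand-in chosen only because it reproduces that behavior; it should not be read as the formula Photrek will use, and the mapping from degrees of freedom to r is deliberately left out.

```python
def risk_adjusted(probs, r):
    """Reshape a distribution by a relative risk aversion r (ASSUMED form).

    Each probability is raised to the power (1 - r) and the result is
    renormalized. r > 0 flattens the distribution; r < 0 sharpens it.
    """
    if r >= 1.0:
        raise ValueError("this sketch requires r < 1")
    if any(p <= 0.0 for p in probs):
        raise ValueError("probabilities must be positive")
    weights = [p ** (1.0 - r) for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-outcome estimate.
estimate = [0.7, 0.2, 0.1]

averse = risk_adjusted(estimate, 0.5)     # dampened certainty
tolerant = risk_adjusted(estimate, -0.5)  # increased confidence

for name, dist in (("averse", averse), ("accurate", estimate), ("tolerant", tolerant)):
    print(name, [round(p, 3) for p in dist])
```

In this sketch the averse value sits below the estimate for the most likely outcome and above it for the least likely one, which is the labeling subtlety the paragraph above raises.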
For inference tasks, the user can select the upper, accurate, or lower probability depending on the relative risk aversion they want to apply to their decision process. Some circumstances call for a cautious estimate, while others call for a decisive one.
Knowledge graphs and probabilistic graphs have developed as somewhat independent fields. Our proposal is to use the foundation of probabilistic graphs to build a knowledge graph that includes uncertainty within its construction. Furthermore, probabilistic programming has advanced the methodologies of Bayesian reasoning to include advanced programming techniques. We will start the project by reviewing:
- Scientific papers with code on the implementation of UKGs;
- The current status of SingularityNET's Probabilistic Logic Networks (PLNs);
- Suitable probabilistic programming languages, particularly those available in Python, such as PyMC and ProbLog.
For the prototype experiment, we will limit the scope of the graph to 5 to 50 triplet facts, each with a confidence, and 2 to 3 layers of relationships. The larger corpus of knowledge will be accessed via static KGs and/or LLMs.
Use Case Prototype
For our Community Contribution Score project during DF2, Photrek created a relational database of the contributions that SingularityNET community members have made to Deep Funding. Examples of contributions include creating a proposal, commenting on a proposal, and voting. For comment submissions in particular, an issue arises regarding the originality and value of a comment for the community. Comments could, for instance, be classified as “low”, “neutral”, or “high” value, each with a degree of uncertainty.
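One simple way to attach a degree of uncertainty to such a classification is a Bayesian update over the three labels. The sketch below assumes, hypothetically, that each comment receives a handful of reviewer ratings and uses a symmetric Dirichlet prior; the rating source, the identifiers, and the prior are illustrative assumptions, not part of the proposal.

```python
LABELS = ("low", "neutral", "high")


def posterior_mean(counts, prior=1.0):
    """Posterior mean of a Dirichlet-categorical model with a symmetric prior.

    counts maps each label to the number of ratings it received; prior is
    the pseudo-count added to every label.
    """
    total = sum(counts.get(label, 0) + prior for label in LABELS)
    return {label: (counts.get(label, 0) + prior) / total for label in LABELS}


# Hypothetical ratings for a single comment.
ratings = {"low": 1, "neutral": 2, "high": 5}
probs = posterior_mean(ratings)

# Store the most probable label as a quartet with its probability.
best = max(probs, key=probs.get)
fact = ("comment-123", "hasValue", best, probs[best])
print(fact)
```

With few ratings the prior keeps the probabilities away from 0 and 1, so sparsely rated comments are stored with correspondingly less certainty.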
In discussions with SingularityNET and its community, we will review the value of a demonstration in which a UKG is used to store information about the relative value of comments submitted to the SWAE Deep Funding website. Based on that review we will refine and then develop a demonstration of UKGs for storing uncertain information about community contributions.
Risks and Mitigation Plan
This effort entails both theoretical development and proof-of-concept implementation, each of which presents technical challenges. The key risks and associated mitigation strategies are as follows.
- Select appropriate LLM source: Photrek's LLM experience includes both (1) open-source LLMs (locally downloadable, architecturally modifiable, and capable of fine-tuning) and (2) API-based LLMs (server-hosted, with token-based pricing for prompts). While Photrek anticipates becoming acquainted with SingularityNET LLMs throughout this effort, the contingency, should API-based models pose interfacing constraints, is to independently demonstrate proof-of-concept UKG functionality via open-source LLMs.
- Select suitable expression of probability: Should complications arise in implementing descriptive uncertainty characterization capabilities (e.g., upper and lower probability bounds) within the UKG, the contingency is to instead estimate scalar probabilistic accuracy.
- KG/LLM integration: Should interfacing complications arise in processing LLM outputs into the static KG, the fallback plan is to demonstrate operability of individual components while documenting challenges and recommendations.
Related Links
1. Nelson, K. P. Reduced Perplexity: A Simplified Perspective on Assessing Probabilistic Forecasts. in Advances in Info-Metrics: Information and Information Processing across Disciplines (eds. Chen, M., Dunn, J. M., Golan, A. & Ullah, A.) (Oxford University Press, 2020). doi:10.1093/oso/9780190636685.003.0012.
2. Wang, J., Nie, K., Chen, X. & Lei, J. SUKE: Embedding Model for Prediction in Uncertain Knowledge Graph. IEEE Access 9, 3871–3879 (2021).
3. Goertzel, B., Iklé, M., Goertzel, I. F. & Heljakka, A. Probabilistic Logic Networks: A Comprehensive Framework for Uncertain Inference. (Springer Science & Business Media, 2008).
4. Li, T. et al. Embedding Uncertain Temporal Knowledge Graphs. Mathematics 11, 775 (2023).
5. Cowen-Rivers, A. I. et al. Neural Variational Inference For Estimating Uncertainty in Knowledge Graph Embeddings. Preprint at https://doi.org/10.48550/arXiv.1906.04985 (2019).
6. Yang, S., Tang, R., Zhang, Z. & Li, G. Uncertain Knowledge Graph Embedding: a Natural and Effective Approach. J. Phys.: Conf. Ser. 1824, 012002 (2021).
7. Wang, J., Wu, T. & Zhang, J. Incorporating Uncertainty of Entities and Relations into Few-Shot Uncertain Knowledge Graph Embedding. in Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy (eds. Sun, M. et al.) 16–28 (Springer Nature, 2022). doi:10.1007/978-981-19-7596-7_2.
8. Lu, G., Zhang, H., Qin, K. & Du, K. A causal-based symbolic reasoning framework for uncertain knowledge graphs. Computers and Electrical Engineering 105, 108541 (2023).
9. Yang, X. & Wang, N. A confidence-aware and path-enhanced convolutional neural network embedding framework on noisy knowledge graph. Neurocomputing 545, 126261 (2023).
10. Zhang, J., Wu, T. & Qi, G. Gaussian Metric Learning for Few-Shot Uncertain Knowledge Graph Completion. in Database Systems for Advanced Applications (eds. Jensen, C. S. et al.) 256–271 (Springer International Publishing, 2021). doi:10.1007/978-3-030-73194-6_18.
11. Chen, X., Chen, M., Shi, W., Sun, Y. & Zaniolo, C. Embedding Uncertain Knowledge Graphs. Proceedings of the AAAI Conference on Artificial Intelligence 33, 3363–3370 (2019).
12. Chen, W., Xiong, W., Yan, X. & Wang, W. Variational Knowledge Graph Reasoning. Preprint at https://doi.org/10.48550/arXiv.1803.06581 (2018).
13. Zhang, Y., Dai, H., Kozareva, Z., Smola, A. & Song, L. Variational Reasoning for Question Answering With Knowledge Graph. Proceedings of the AAAI Conference on Artificial Intelligence 32 (2018).
14. Ji, S., Pan, S., Cambria, E., Marttinen, P. & Yu, P. S. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Transactions on Neural Networks and Learning Systems 33, 494–514 (2022).
15. Blei, D. M., Kucukelbir, A. & McAuliffe, J. D. Variational Inference: A Review for Statisticians. Journal of the American Statistical Association 112, 859–877 (2017).
Long Description
Company Name
Photrek
Request for Proposal Pool
RFP4: Tools for Knowledge Graphs And LLMs Integration
Summary
Photrek will develop an architectural design for managing uncertainty in knowledge graphs. We will approach the effort at a micro and a macro level. At the micro level, for each knowledge fact (a triplet containing subject, predicate, and object), we will embed a probability together with an upper and lower bound on that probability. At the macro level, we will demonstrate use of an uncertain knowledge graph to store temporal, dynamic information in conjunction with pre-trained, static knowledge graphs and large language models. Our use case demonstration will be in the domain of reputation, as applied to credit reports and/or cryptocurrency community contribution scores. Photrek will work closely with SingularityNET to develop a plan for how UKGs can integrate into the Hyperion platform.
Funding Amount
$40,000
Problem we aim to solve
Knowledge is uncertain. Even the most well-established scientific theories can be overturned by new knowledge, and even the most widely held beliefs can be false. Given the uncertainty of knowledge, deterministic knowledge graphs have a hopelessly flawed foundation. Given the complexity of representing the different varieties of uncertainty, a model of uncertainty must be embedded as a fundamental component of knowledge graphs (KGs) and of their use within large language models (LLMs). At the same time, the size and complexity of KGs and LLMs make it impossible to continuously update and retrain systems such as WikiData and ChatGPT. There is therefore a need for a dynamic, temporal Uncertain Knowledge Graph (UKG) that can complement existing static, pre-trained KGs and LLMs.
Our Project Milestones and Cost Breakdown
Expected Start: November 1, 2023
Expected Length: 6 milestones completed in 8 months
Milestone 0: Contract Signing & Management Reserve
- Duration: 0 months
- Total Budget: $1,750.00 USD in AGIX
- Objective: Funding put in reserve to address unforeseen requirements
- Deliverable: Contract Signature
Milestone 1: Brief SingularityNET KG & LLM team regarding Objectives
Milestone 2: Draft Architecture Design
Milestone 3: Develop Use Case Plan
Milestone 4: Develop UKG
Milestone 5: Use Case Demo
Milestone 6: Final Report with Next Steps
Our Team
Photrek will hire an expert machine learning developer with experience developing knowledge graphs and/or large language models. The individual will be mentored in the development of UKGs by Photrek’s experienced design and development team.