Long Description
Company Name
TogetherCrew & P2P Foundation
Request for Proposal Pool
Memory-augmented LLMs - RFP 3 (although we also contribute a unique KB to RFP 2)
Summary
We are excited to present our proposal to collaborate with SingularityNET on the Community Memory-augmented LLMs - RFP 3. Our mission at TogetherCrew is to empower humane collaboration through healthy communities, which aligns well with SingularityNET's aim to make the world more compassionate, just, and sustainable through a beneficial singularity.
We’re proposing to deliver HiveMind, an LLM-powered chatbot that provides answers to questions community members have, leveraging dispersed community knowledge. Our user research has shown that knowledge is often fragmented in communities and poorly documented, and significant time is lost trying to recall informal conversations and find the platform in which they took place. As such, HiveMind is designed to bridge various platforms (Discord, Discourse, GitHub, Google Drive, and Notion). As a result, the system can effectively answer loosely structured questions (e.g. “The other day, when we were talking about something related to the new partnership, what was the decision?”) based on content from community interaction platforms like Discord and Discourse as well as specific questions about content in official community documentation (e.g. “What steps do I need to take to get onboarded into this community?”) based on content from platforms like GitHub, Google Drive and Notion.
Additionally, we've struck a partnership with the P2P Foundation - a non-profit organization and global network dedicated to advocacy and research of commons-oriented peer-to-peer (P2P) dynamics in society - and through it will pilot HiveMind on their unique knowledge base of research and articles on P2P and the Commons, including 25,000 articles curated one by one over 15 years. This database includes many research articles that are no longer available online, as they were hosted by small scientific journals without the resources to maintain online platforms. This constitutes a special opportunity to make this highly relevant knowledge available to the Web3/DAO community.
Funding Amount
$72,000
Our Solution
User Experience
HiveMind is a Q&A bot with the capability to answer complex questions. For example:
“What were the suggested solutions for [PROBLEM] and which of the solutions gave good results?”
This would normally require the user to make multiple queries:
Query 1: find solutions
Query 2: what were results of solution 1
Query 3: what were results of solution 2
HiveMind is structured as a multi-agent system that can split the objective into separate queries, answer them individually and merge the results into a single final answer, providing a superior user experience.
We also know from our user research that documentation quickly gets outdated and is hard to maintain. There is a lot of knowledge that’s spread across conversations and thus has been historically hard to retrieve. For example, to find out “What did the marketing team do last week?” the only way was to ask them, taking their focus away from their work. With HiveMind, users can find the answer quickly, enabling communities to collaborate more effectively, and facilitating decision-making, coordination, and inclusivity (especially for part-time members).
HiveMind is envisioned to operate as both an API and a Discord chatbot, so that users can have their questions answered in the same place they’d normally ask them, thus reducing friction and facilitating user adoption.
Previous work, assets, and resources:
TogetherCrew started as a research project under RnDAO, yielding a
(funded by Aragon).
Subsequently, we started developing a solution (thanks to grants by Aave, Celo, and Near) to empower community builders and deliver community-powered growth, and we now have Neo4j and MongoDB databases to which data is automatically added by a Discord bot. The data is processed using organizational network analysis to produce a dashboard that provides:
-
Network graph of the community (based on interactions)
-
Decentralisation and fragmentation analysis (also other analyses of the fit of the community’s social structure based on our research)
-
Breakdown of members by engagement level (using two-way interactions and not just lone posts to filter out spammers), tracking of disengagement, and onboarding funnel effectiveness.
-
Mapping of when the community is most active to optimize the scheduling of events, announcements, etc.
See the current dashboard on our
and
.
The team is currently working on adding data sources and metrics for: Twitter (funded by the Web3 Foundation), Snapshot and CoinGecko (funded by Pocket Network), and Discourse (funded by Optimism).
If approved, our proposal for SingularityNET would allow us to leverage this existing infrastructure and research and leapfrog the development of the proposed solution, including numerous data sources at reduced cost thanks to the pre-existing data pipeline infrastructure.
Architecture
HiveMind is a system comprising multiple LLM-powered agents that collectively work on answering a wide range of questions. Inspired by the human brain, this multi-agent system has access to multiple forms of memory (episodic, semantic, and procedural) in order to ensure optimal performance. The system builds its episodic and semantic memory by exploring its information environment, which consists of vector stores holding data from various sources and at various levels of abstraction, such as raw Discord messages, summaries of Discord messages, or contents of Notion documents.
A step-by-step procedure of the system is listed below, accompanied by a graphical representation of the system, labeled the same as the listed steps.
1. The HiveMind user provides a question to the system.
2. The Subtask-Agent splits the question into subtasks, ordered by importance.
3. The first subtask in the list is passed to the Task-Execution-Agent.
4. The Task-Execution-Agent has access to multiple different vector stores. The information about what type of information can be found in each vector store is accessed by the Task-Execution-Agent through its procedural memory.
5. Through an enhanced ReAct framework with Criticism, the Task-Execution-Agent chooses to query one of the vector stores to obtain information that can help it complete its subtask. The query is executed as a similarity search between the embedded query and the contents of the vector store.
6. The reasoning, query and top-k similarity search results are stored as episodic memory. In addition, the most important aspects of the obtained results for completing the subtask are extracted from the top-k similarity search results and separately stored as semantic memory.
7. Accessing the episodic and semantic memories of previous data-gathering iterations, the Task-Execution-Agent can choose to complete the subtask and pass on its findings instead of undergoing the next data-gathering iteration.
8. The Subtask-Result pair from the Task-Execution-Agent is passed to the Final-Answer-Agent. This agent decides if it has enough information to answer the question from the user (end of procedure) or if it should pass the next subtask to the Task-Execution-Agent (continue at step 4).
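The control flow of the procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not the HiveMind implementation: the `Agents` hooks, the `Memory` class, the `max_iterations` cap, and the toy stand-ins at the bottom are all assumptions introduced here, and in a real system each hook would wrap an LLM call or a vector store query.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Memory:
    """Per-subtask memory of the Task-Execution-Agent (hypothetical structure)."""
    # Each episode: (reasoning, store name, query, top-k results).
    episodic: List[Tuple[str, str, str, List[str]]] = field(default_factory=list)
    # Key facts extracted from the results.
    semantic: List[str] = field(default_factory=list)


@dataclass
class Agents:
    """Hypothetical hooks; each would wrap an LLM or vector store call."""
    split: Callable[[str], List[str]]
    choose_query: Callable[[str, Memory], Tuple[str, str, str]]
    search: Callable[[str, str], List[str]]
    extract: Callable[[str, List[str]], List[str]]
    subtask_done: Callable[[str, Memory], bool]
    can_answer: Callable[[str, list], bool]
    compose: Callable[[str, list], str]


def answer(question: str, agents: Agents, max_iterations: int = 3) -> str:
    findings = []
    for subtask in agents.split(question):            # Subtask-Agent
        memory = Memory()
        for _ in range(max_iterations):               # data-gathering iterations
            reasoning, store, query = agents.choose_query(subtask, memory)
            results = agents.search(store, query)     # similarity search
            memory.episodic.append((reasoning, store, query, results))
            memory.semantic.extend(agents.extract(subtask, results))
            if agents.subtask_done(subtask, memory):  # stop gathering early
                break
        findings.append((subtask, list(memory.semantic)))
        if agents.can_answer(question, findings):     # Final-Answer-Agent
            break
    return agents.compose(question, findings)


# Toy stand-ins so the loop runs without an LLM or a real vector store.
STORES = {"discord": ["We agreed to sign the partnership on Friday."]}

toy = Agents(
    split=lambda q: [q],
    choose_query=lambda subtask, mem: ("chat logs look relevant", "discord", subtask),
    search=lambda store, query: STORES[store][:1],
    extract=lambda subtask, results: results,
    subtask_done=lambda subtask, mem: bool(mem.semantic),
    can_answer=lambda q, findings: True,
    compose=lambda q, findings: " ".join(f for _, facts in findings for f in facts),
)

final_answer = answer("What was the decision on the partnership?", toy)
```

Keeping the agents behind plain callables is one way to let the underlying LLM be swapped later without touching the loop itself.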
HiveMind is being prototyped with GPT-3.5 given its versatility, but the system could subsequently be iterated to use a different LLM or more task-specific AIs. We find it essential to focus first on user acceptance and adoption so that more significant efforts on the infrastructure side are not wasted.
Note, we also meet the generic RFP conditions (and will work to improve further upon them), namely:
-
Each RFP proposal must be published under an open-source license, which enables future teams to continue developing the codebase
-
The code should be clean and readable by other developers
-
The project should have proper functional and technical documentation
-
The code and documentation should be accessible through a public GitHub repository
Note on parallel grants:
We’re also submitting 2 parallel and synergistic grant proposals to the community and want to reassure you that no double-counting will take place. This grant focuses on the features for the HiveMind bot; of the other two, one focuses on the Reputation RFP and the other on a more developed Knowledge Graph.
If only one proposal is approved, it can still be delivered on its own. If multiple are approved, the benefits compound for the community by providing well-integrated (yet modular) functionalities, including less account management, easy UX, reduced costs, etc. For example, by approving our
proposal and this proposal (HiveMind), we’ll be able to include the data from the Voting Portal in answering questions, for example, “Has a proposal for a Q&A bot already been asked and what did the community think about it?”.
Our Project Milestones and Cost Breakdown
Note that there’s some overlap between milestones as different skills are involved at different stages.
Milestone 1: Kick-off and implementation of Discord and Discourse bot
-
Milestone Description: We’ll deploy a Discord bot on the SingularityNET server (or if needed, on our own server), develop a Discourse crawler bot (the SN forum is on Discourse), and start collecting the data for analysis. Importantly, the bot should let admins include and exclude specific channels, so that sensitive conversations are not analysed. Note: We’ll add more data sources in subsequent milestones, but the complete dataset is not needed to start working on the data analysis scripts.
-
Milestone deliverable: 1. Discourse bot for data extraction: Discourse crawler bot that can be pointed to open Discourse forums 2. Discord bot for data extraction (already built) 3. Data collected in the database for analysis (database already built)
-
Milestone-related budget: $4,000
-
Period: weeks 0 to 1
Milestone 2: Storing the data
-
Milestone Description: Adding metadata to the collected data, embedding both the original messages and high-level summaries using a free open-source transformer, and storing the embedded data together with the metadata in DeepLake (free and open source) vector stores. This will allow the system to filter the search based on data sources and/or time periods. Note the system will include the option for connecting authors to the data or not, to be decided by each community based on their specific privacy needs.
-
Milestone deliverable: 1. Discord and Discourse data stored with metadata in vector stores
-
Milestone-related budget: $5,000
-
Period: weeks 1 to 3
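To illustrate how stored metadata allows a search to be restricted by data source and time period, here is a minimal sketch. It is deliberately not DeepLake code: the `Record` class, the `search` function, and the bag-of-words `embed` function are assumptions made for illustration, and a real deployment would use a transformer embedding model and the vector store's own filtering.

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class Record:
    text: str
    source: str            # e.g. "discord" or "discourse"
    created_at: datetime


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding, standing in for a transformer model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(records: List[Record], query: str,
           sources: Optional[Set[str]] = None,
           since: Optional[datetime] = None,
           k: int = 3) -> List[Record]:
    # 1. Filter on metadata first, so excluded sources are never ranked.
    candidates = [
        r for r in records
        if (sources is None or r.source in sources)
        and (since is None or r.created_at >= since)
    ]
    # 2. Rank the remaining records by similarity to the embedded query.
    q = embed(query)
    ranked = sorted(candidates, key=lambda r: cosine(q, embed(r.text)), reverse=True)
    return ranked[:k]


records = [
    Record("partnership decision approved by vote", "discord", datetime(2023, 8, 28)),
    Record("partnership decision postponed", "discourse", datetime(2023, 8, 29)),
    Record("onboarding steps for new members", "discord", datetime(2023, 8, 1)),
]

hits = search(records, "partnership decision",
              sources={"discord"}, since=datetime(2023, 8, 20), k=1)
```

Here only the first record passes both the source and the time filter, so it is the single hit returned.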
Milestone 3: Developing the system that finds the answer
-
Milestone Description: Turn user questions into task(s) for LLM agents, define LLM agents that query the vector stores for solving the task(s), and turn task result(s) into the final answer. We’ll also enable the user to provide search parameters that indicate whether they want to query only a specific selection of platforms (for example: only Discord, or only Notion and Google Drive) and/or a selection of time periods (for example: only Discord messages from the last 7 days). After completing this work we’ll run user tests to validate answer quality and user satisfaction.
-
Milestone deliverable: 1. Scripts to query the data using input questions: break down complex questions into simple questions (sub-questions), retrieve answers for subquestions, and compose a final answer. 2. User test to validate: collect sample questions users asked on the server, manually run questions through the system and show answers to users, then collect feedback on quality/improvements, and refine.
-
Milestone-related budget: $8,000
-
Period: weeks 2 to 4
Milestone 4: Complete Data sources integration & Infrastructure
-
Milestone Description: Implement data pipelines for additional data sources that are regularly used, so that the system is not abandoned by users due to incompleteness. Each data pipeline is optional for users to add, and users can select which files/channels/pages are included and excluded. We’ll also test internally (leveraging the RnDAO Discord and TogetherCrew’s team) whether the new data sources are functioning properly.
-
Milestone deliverable: 1. GitHub data pipeline bot: a GitHub crawler bot for gathering the data and storing it in the database 2. Notion data pipeline bot: a Notion extension bot for gathering the data and storing it in the database 3. Google Drive pipeline bot: a Google Drive extension bot for gathering the data and storing it in the database
4. Wiki pipeline bot: a MediaWiki crawler bot for gathering the data and storing it in the database
5. DevOps: server and application in production.
-
Milestone-related budget: $16,000
-
Period: weeks 3 to 10
Milestone 5: Discord and Settings User Interface
-
Milestone Description: Developing a user interface to interact with the system: Discord chatbot where the user can ask their questions and receive answers, plus a settings page for configuring the system. Also usability testing with users.
-
Milestone deliverable: 1. Discord bot interface: a bot that reads questions when triggered either by a slash command or automatically when a message is sent in a designated channel, and then outputs the answer in Discord as a reply. 2. Settings page: webapp page where users can deploy the bot(s), select the channels and method for it to operate in Discord (if selected), filter the data sources the bot has access to, turn mentions of authors in answers on/off, and turn the bot on/off. 3. User testing: usability testing by at least 5 users and implementing critical fixes.
-
Milestone-related budget: $8,000
-
Period: weeks 4 to 12
Milestone 6: API
-
Milestone Description: Development of the API, API documentation, and testing.
-
Milestone deliverable: 1. API: functioning API that enables developers to easily integrate HiveMind Q&A functionality into their own applications 2. API documentation: documentation with a step-by-step guide 3. PoC implementation of the API in a sample use case and usability testing.
-
Milestone-related budget: $5,000
-
Period: weeks 12 to 14
Milestone 7: Security improvements to provide user trust and facilitate adoption
-
Milestone Description: Protecting the database behind a VPN, review of security protocols, and penetration testing
-
Milestone deliverable: 1. Database only accessible via VPN
2. Penetration testing
-
Milestone-related budget: $6,000
-
Period: weeks 16 to 18
Milestone 8: P2P Foundation
-
Milestone Description: HiveMind is piloted with the P2P Foundation knowledge base (includes $500 for infrastructure costs)
-
Milestone deliverable: 1. P2P Foundation data ingestion
2. A simple website created with search functionality via API to query the knowledge base.
-
Milestone-related budget: $2,000
-
Period: weeks 18 to 19
Milestone 9: SN marketplace integration
-
Milestone Description: HiveMind is made available via the SingularityNET Marketplace
-
Milestone deliverable: 1. Integration into SN marketplace, ensuring fully functioning operation.
-
Milestone-related budget: $15,000
-
Period: weeks 18 to 20
Milestone 10: Launch campaign
-
Milestone Description: Outreach campaign to market the bot, gather feedback, and inform next steps. (Via TogetherCrew channels and amplified by RnDAO, with optional amplification by Singularity Net).
-
Milestone deliverable: 1. Twitter campaign: starting at least 2 weeks before launch by evangelizing the problem, followed by a series of launch announcements, and finally evangelizing benefits (community experience, team performance, cost reduction, etc.) over the following 2 weeks. 2. Blog article announcing the launch
3. Online event to demo the product, with a recording.
-
Milestone-related budget: $6,000
-
Period: weeks 19 to 21
Risk and Mitigation
Accuracy: The system is designed to provide answers together with their sources. We have been experimenting for some time with parameters to reduce hallucinations and increase accuracy, and have already gone through a cycle of testing these before this grant. Admins can curate the data sources from which the bot gathers information, and answers also include a time component. Future iterations could enable admins to add custom messages signaling the likely reliability of different sources and also automate the provision of feedback about accuracy to the system, so as to create a continuous learning loop. However, those features are estimated to increase the budget further and are therefore planned as potential phase 2 improvements.
GDPR: As with any project managing individual data, principles of consent and GDPR regulations need to be taken into account. We’ve already consulted with a GDPR lawyer (via LexDAO) and will in this iteration provide users with the functionality to withdraw consent and delete their data should they desire. Note that the data is in principle public, as it has been shared in publicly accessible forums; however, we believe in the ability of individuals to choose and govern their data as a guiding principle.
Illness and capacity reduction: As a small(ish) team, losing a member, even temporarily, risks project delays. As mitigation, we’re part of a broader alliance of closely aligned projects (RnDAO) where we benefit from a community talent funnel and multiple other contributors (the MicroFlow team that’s part of RnDAO) who are already familiar with our codebase, as we have called on them for previous projects when needed.
Code Quality: Although no single part of the system is significantly complex, the aggregation of the multiple systems does generate some risks of bugs and the like. To mitigate we have already implemented code quality monitoring systems, automated testing, and QA practices with pair programming and structured usability testing.
Usability & User Needs: The creators of a project are prone to confirmation bias and thus risk being blind to UX issues or other customer pain points. To mitigate this, we have conducted rigorous user research seeking to understand community manager pain points (over 30 interviews) and have found that being overwhelmed and lacking time are key challenges that our bot can help alleviate. Moreover, we have incorporated usability testing in multiple stages of our roadmap (milestones 3, 5, and 6).
Data aggregation: When aggregating data from multiple data sources, a potential risk lies in the diversity of data formats and structures, which could hinder compatibility and processing efficiency. To address this, we have implemented several mitigations. Firstly, we are adopting a Unified Data Schema approach to ensure compatibility across disparate data sources, promoting streamlined processing. Additionally, data normalization techniques will be employed to standardize the data, rendering it consistent and computation-ready. To facilitate efficient data aggregation while minimizing the need for manual intervention, we have established an ETL (Extract, Transform, Load) pipeline. This pipeline will enable the seamless extraction and consolidation of data, enhancing both the accuracy and speed of aggregation processes.
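As a rough illustration of the unified schema idea, the sketch below maps raw payloads from two platforms onto one record type. The `UnifiedRecord` fields, the function names, and the raw field names used for the Discord and Discourse payloads are assumptions made for this example rather than the schema actually used by TogetherCrew.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class UnifiedRecord:
    """Hypothetical common shape for content from every data source."""
    source: str
    record_id: str
    author: Optional[str]   # None when a community opts out of author linking
    text: str
    created_at: datetime


def clean(text: str) -> str:
    """Normalize whitespace so downstream steps see consistent text."""
    return " ".join(text.split())


def from_discord(raw: dict, keep_author: bool) -> UnifiedRecord:
    # Field names are assumed for illustration.
    return UnifiedRecord(
        source="discord",
        record_id=str(raw["id"]),
        author=raw["author"]["username"] if keep_author else None,
        text=clean(raw["content"]),
        created_at=datetime.fromisoformat(raw["timestamp"]),
    )


def from_discourse(raw: dict, keep_author: bool) -> UnifiedRecord:
    # Field names are assumed for illustration.
    return UnifiedRecord(
        source="discourse",
        record_id=str(raw["id"]),
        author=raw["username"] if keep_author else None,
        text=clean(raw["raw"]),
        created_at=datetime.fromisoformat(raw["created_at"]),
    )


# The "transform" step of the ETL pipeline: one function per source.
TRANSFORMS: Dict[str, Callable[[dict, bool], UnifiedRecord]] = {
    "discord": from_discord,
    "discourse": from_discourse,
}


def normalize(source: str, raw: dict, keep_author: bool = True) -> UnifiedRecord:
    return TRANSFORMS[source](raw, keep_author)


record = normalize(
    "discord",
    {
        "id": 1,
        "content": "  hello   world ",
        "timestamp": "2023-08-31T12:00:00+00:00",
        "author": {"username": "ana"},
    },
    keep_author=False,
)
```

Adding a new platform then only requires one more transform function and one more entry in `TRANSFORMS`, while everything downstream keeps working on `UnifiedRecord`.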
Team Capacity: As we’re applying to multiple grants, our team will need the capacity to deliver on all of them should they all be approved. We address this challenge through a swarm-like organisation design and access to other teams within RnDAO (49+ consistently active contributors as of today, 31st of August). This organisation design consists of a system to delegate bounded tasks to community contributors as needed, self-onboarding documentation, transparency, and clear governance to enable rapid scaling of operations. We’ve been refining this system for over a year, building on over a decade of experience in Organisation Design.
Security & Privacy: Since the chatbot accesses data from multiple community platforms, sensitive conversations or information could be inadvertently exposed. To address this, we're taking a series of security measures into consideration:
- Data is encrypted at rest
- Traffic is encrypted
- Users have fine-grained control over which channels and platforms are included, so that no highly sensitive data ever reaches our database. Adding, for example, a new channel in Discord requires first connecting Discord, then giving permission to the bot in Discord for each of the channels, and finally confirming in our UI. These 3 required approvals greatly reduce the risk of a user accidentally adding a channel with highly sensitive data.
- Moreover, we're adding a milestone to review security protocols, protect the database behind a VPN, and contract a small penetration test. (Further security upgrades will be carried out in the future after securing traction.)
Project Management: As a project with multiple phases and components, there are risks of delay and unforeseen technical challenges. We've mitigated this by clearly scoping each phase and doing technical due diligence on all the integrations needed. Moreover, the team has previous experience in this sort of work (data pipelines, using vector stores, prompt engineering, etc.), and we've implemented best practices for software delivery (sprint planning, daily check-ins, retrospectives, code quality monitoring, testing and production environments, etc.) and refined them over more than a year of working together. That being said, unforeseen circumstances are always possible and the team undertakes the project at their own risk, with the possibility of not receiving funding should milestones not be completed.
Our Team
Team leads for the project:
-
Danielo (contact person) - Instigator at RnDAO and Co-lead at TogetherCrew. Previously Head of Governance at Aragon, with 8 years of experience in Organization Design consulting (clients include Google, BCG, Daimler, the UN, and multiple startups); founded two startups and served as visiting lecturer at Oxford University. Twitter:
LinkedIn:
-
Tjitse van der Molen (AI specialist) - Researcher at University of California and data scientist at TogetherCrew. Over 5 years experience doing research at the intersection of AI and neuroscience. Research focuses both on using AI to study the brain and finding ways to better use AI based on our understanding of the functioning of the brain. Work at TogetherCrew focuses on utilizing AI to support collaboration in online communities. LinkedIn:
-
Katerina - Co-lead at TogetherCrew. Ph.D. using social network analysis. Since 2016 she has co-instructed a graduate course on data analytics for HR at Northwestern University. She has also co-organized the Learning in Networks sessions at the International Conference of Social Network Analysis (2018 - 2020), and previously advised a people analytics company on social network metrics. Twitter:
LinkedIn:
Blog:
Github:
-
Ashish - Co-lead at TogetherCrew. Previously at Tata, working at the intersection of innovation and customers. 8 years in startup incubation and innovation. Launched and repositioned multiple brands. Twitter:
LinkedIn:
-
Cyrille Derché - Tech lead at TogetherCrew. BSc Computer Science. Ex-Accenture, 8 years as co-founder and CTO of a SaaS company helping medical device manufacturers deliver product data and documentation to healthcare professionals (handling of sensitive data). Builder of products, processes, and teams. LinkedIn:
Github:
(private)
Our team additionally includes one more data scientist (
), 1 UX designer (
), and 3 core developers (
,
,
) and if needed, we can call upon
.
Related Links
Code:
Research:
https://orcid.org/0000-0002-4275-632X