
Universal Research Center
Project Owner: Dr. Cédric Mesnage, Developer and Scientist
Our project pioneers a compassionate AI agent within the Minetest virtual world, integrating reinforcement learning (RL) and a transformer-based language model to enable autonomous thinking, ethical reasoning, and human-like interaction. The AI will observe, reflect, and engage safely with users, ensuring pro-social behavior and minimizing harmful actions. By expanding to multi-agent collaboration and human-AI chat, this project advances AI safety, ethics, and intelligent virtual companions for education, research, and interactive environments. It lays the groundwork for trustworthy AI systems that adapt dynamically to human needs, fostering safer and more meaningful AI-human interactions.
New AI service
tbc
tbc
tbc
In this milestone we will replace our existing ChatGPT-based layer with a locally hosted transformer-based LLM of approximately 2B parameters, small enough to run smoothly on a standard laptop. By doing this we eliminate reliance on external APIs and gain full control over how the model is adapted for child-friendly interactions, data privacy, and real-time performance. Our approach will integrate seamlessly with the existing “thinking as an action” paradigm, in which the AI’s internal dialogue is treated as part of its decision-making loop. We will refactor the codebase to route environment observations, memory states, and action prompts directly through the new local LLM. This involves restructuring the prompt templates to handle environment updates, questions, and introspective “thoughts”, while ensuring the agent’s reward mechanisms remain tightly coupled to the local model’s outputs. By the end of this milestone our Minetest-based system will function entirely offline, preserving user data on-device and allowing families or educational settings to deploy an advanced AI assistant without cloud dependencies. This milestone sets the stage for our long-term vision: an ethical, privacy-respecting AI that demonstrates creative problem-solving, consistent child-safe behavior, and rapid adaptability to new scenarios, all while providing a robust foundation for the subsequent compassion-oriented and multi-agent milestones.
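To make the integration concrete, here is a minimal sketch of how a local ~2B model could slot into the “thinking as an action” loop. It assumes a Hugging Face checkpoint; the model ID, prompt layout, and helper names are illustrative stand-ins, not the project’s actual code:

```python
# Sketch: local ~2B-parameter LLM driving the "thinking as an action" loop.
# MODEL_ID and the prompt format are assumptions for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/gemma-2-2b-it"  # any ~2B instruct model would do

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

PROMPT_TEMPLATE = (
    "Observation: {obs}\n"
    "Memory: {memory}\n"
    "You may think (prefix 'THINK:') or act (prefix 'ACT:').\n"
)

def step(obs: str, memory: list[str]) -> str:
    """One decision step: observation + recent memory in, thought or action out."""
    prompt = PROMPT_TEMPLATE.format(obs=obs, memory=" | ".join(memory[-5:]))
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True)
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True).strip()
    # "Thinking" is itself an action: thoughts feed back into memory,
    # while ACT: lines would be forwarded to the Minetest bridge.
    memory.append(reply)
    return reply
```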
Deliverable Description
1. Local LLM Integration Package: a comprehensive module replacing ChatGPT calls with our 2B-parameter transformer, complete with optimized configuration files, inference scripts, and a scalable approach to running on mid-range hardware.
2. Revised Prompt & Memory Infrastructure: updated code that streams game observations and internal “thinking” text into the local LLM, ensuring tight feedback loops between the agent’s environment, memory buffer, and reward signals.
3. Demonstration Environment: a fully functional Minetest build showcasing the local LLM’s capabilities, with logs of the agent’s introspective dialogue, response times, and in-world decision-making.
4. Technical Documentation & User Guide: step-by-step instructions on setup and customization, with best practices for child-safe prompts, recommended GPU/CPU specs, and guidance on minor parameter tuning if users need specialized behaviors.
5. Performance Benchmarks: measurable indicators (e.g. latency, memory footprint, task success rate) demonstrating that the local LLM operates at near-real-time speeds, preserving or surpassing the AI’s former performance with ChatGPT (a minimal harness sketch follows below).
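One way the latency and memory-footprint figures could be gathered; this harness is illustrative (`step_fn` stands in for the agent’s decision step above) and tracks Python-side allocations only, so GPU memory would need a separate probe:

```python
# Illustrative benchmark harness for the metrics the deliverable names.
import time
import tracemalloc

def benchmark(step_fn, observations, memory):
    latencies = []
    tracemalloc.start()
    for obs in observations:
        t0 = time.perf_counter()
        step_fn(obs, memory)
        latencies.append(time.perf_counter() - t0)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
        # tracemalloc sees only Python allocations, not GPU buffers.
        "peak_python_mem_mb": peak / 1e6,
    }
```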
$20,000 USD
1. Operational Independence: the system runs entirely on local hardware (no external API calls) while maintaining stable inference speeds and a streamlined user experience for family or classroom scenarios.
2. Robust & Privacy-Focused: all dialogue, rewards, and memory data remain on-device, substantially reducing privacy risks and offering users full ownership of their AI’s data pipeline.
3. Consistent Behavior & Creativity: qualitative tests show the new model can generate thoughtful, varied “inner dialogue” on par with ChatGPT-based performance, with no drop in adaptability or creativity.
4. Child-Friendly Interaction Quality: preliminary assessments confirm polite, safe, and encouraging dialogue. The AI’s suggestions remain appropriate and supportive, aligning with our goal of a pro-social, helpful companion.
5. Scalable for Future Milestones: the newly integrated local LLM seamlessly supports subsequent milestones focused on compassion-driven reinforcement learning and multi-agent/human–AI collaboration, proving its viability as the project’s long-term backbone.
In this milestone we will scale up to a 7B-parameter transformer model and rigorously train our AI agent on compassion-oriented reinforcement learning objectives. Building on the local setup from Milestone 1, we will integrate a multi-objective RL framework in which “compassion” or “non-harm” is a central reward factor, alongside curiosity, problem-solving, and user engagement. This approach ensures that the agent actively avoids harmful or inappropriate behaviors while proactively seeking to assist and nurture positive experiences, especially for children. Our methodology involves explicit pro-social reward shaping: we will craft reward signals that reinforce empathetic communication, supportive in-game actions (e.g. helping a human player build or learn), and conflict resolution in multi-agent scenarios. The training environment will include scenario scripts in which the agent faces moral choices, so it learns to respond with gentleness and cooperation rather than aggression or neglect. We will also fine-tune the agent’s language style to remain polite and age-appropriate. By the end of this milestone we expect the 7B-parameter model to demonstrate higher expressive capacity, improved safety compliance, and robust alignment with child-focused, compassionate goals.
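As a sketch of what such pro-social reward shaping could look like: the component names and weights below are assumptions about how “compassion” might be folded into a single scalar reward, not the project’s final shaping scheme:

```python
# Illustrative multi-objective reward; weights are placeholder assumptions.
from dataclasses import dataclass

@dataclass
class RewardWeights:
    task: float = 1.0
    curiosity: float = 0.3
    compassion: float = 1.5   # pro-social term deliberately weighted high
    harm_penalty: float = 3.0

def shaped_reward(task_r, curiosity_r, prosocial_r, harm_events,
                  w=RewardWeights()):
    """Combine objectives into one scalar; harmful actions are penalized
    more strongly than any single positive term can offset."""
    return (w.task * task_r
            + w.curiosity * curiosity_r
            + w.compassion * prosocial_r
            - w.harm_penalty * harm_events)
```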
1. Expanded Model Architecture & Training Scripts: a fully documented pipeline that accommodates the new 7B-parameter model, complete with hyperparameter tuning, advanced prompt templates, and a multi-objective RL reward system emphasizing “compassion.”
2. Scripted Ethical Scenarios & Data: a curated set of environment states, dialogue prompts, and interactive tasks specifically designed to assess and reinforce the agent’s pro-social choices and empathy.
3. Checkpoints & Fine-Tuning Logs: versioned checkpoints of the model’s progress, alongside detailed logs of reward evolution, performance metrics (e.g. a “Cooperation Score”, sketched after this list), and agent behaviors through multiple training runs.
4. Updated Minetest Integration: an enhanced in-game interface showcasing how the RL agent reacts to user inquiries, navigates moral quandaries, and encourages constructive play sessions.
5. Technical Documentation & Ethics Brief: a step-by-step guide on deploying the 7B model, plus an “Ethics Brief” that explains how reward shaping, safe prompts, and scenario design minimize the risk of harmful or offensive behavior.
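A toy sketch of how a “Cooperation Score” might be aggregated over logged episodes; the event schema (`assist`, `conflict`, `resolved`) is hypothetical and stands in for whatever the scenario scripts actually record:

```python
# Hypothetical "Cooperation Score" over an episode's logged events.
def cooperation_score(episode_log):
    helpful = sum(1 for e in episode_log if e["type"] == "assist")
    conflicts = sum(1 for e in episode_log if e["type"] == "conflict")
    resolved = sum(1 for e in episode_log
                   if e["type"] == "conflict" and e.get("resolved"))
    resolution_rate = resolved / conflicts if conflicts else 1.0
    # Score rises with helpful acts and with peacefully resolved conflicts.
    return helpful * resolution_rate
```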
$20,000 USD
1. Measurable Compassion Alignment: quantitative improvements in cooperation, helpfulness, and polite communication across test scenarios, with a significant drop in any rule-breaking or socially negative actions compared to baseline.
2. Stable Multi-Objective Convergence: the RL system consistently converges toward higher scores on both “task efficiency” and “pro-social metrics,” demonstrating that compassion does not compromise overall performance.
3. In-Game Behavior Validation: during interactive playtests, the agent handles scenarios such as sharing resources, comforting frustrated users, or peacefully resolving conflicts without external interventions or overrides.
4. Child-Friendliness & Trustworthiness: educators or parents evaluate the final agent’s interactions as safe, respectful, and beneficial for a child audience, confirming that it aligns with age-appropriate language and supportive play.
5. Transparent Reporting & Community Engagement: a final report detailing training runs, ethical compliance, reward outcomes, and user impressions is shared openly, fostering trust and providing a roadmap for community feedback on further refinements.
In this final milestone we will expand our Minetest-based environment to support multi-agent interactions and human–AI chat, transforming the single-agent paradigm into a collaborative ecosystem. Building on the compassionate RL framework of Milestone 2, we will enable multiple AI agents, and optionally human players, to engage in real-time dialogue, coordinate on complex tasks, and negotiate resources. Our system will incorporate partial observability (where each participant may see different portions of the environment) and a conflict-resolution module that guides agents toward peaceful outcomes instead of adversarial behavior. We will also implement a user-friendly chat interface that allows children, parents, or educators to converse directly with AI agents, ask questions, or give commands, all while seeing how the agents reason and respond. This two-way interaction is designed to be child-safe and empathetic, leveraging the pro-social reward signals introduced earlier. By blending multi-agent intelligence with human input we expect richer emergent behaviors, deeper collaboration, and a more engaging overall experience, paving the way for child-friendly AI companions capable of creative teamwork.
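A minimal sketch of the two mechanisms this milestone names, partial observability and inter-agent messaging; the visibility radius, agent IDs, and message-bus design are illustrative assumptions, not the final architecture:

```python
# Sketch: per-agent observation filtering plus a shared message channel.
from collections import defaultdict

VISIBILITY_RADIUS = 20  # blocks; each agent sees only nearby world state

def visible_state(world_objects, agent_pos):
    """Filter the shared world down to what one agent can observe."""
    return [o for o in world_objects
            if max(abs(o["pos"][i] - agent_pos[i]) for i in range(3))
               <= VISIBILITY_RADIUS]

class MessageBus:
    """Shared channel for agent thoughts/chat and human messages."""
    def __init__(self):
        self.inboxes = defaultdict(list)

    def send(self, sender, recipient, text):
        self.inboxes[recipient].append((sender, text))

    def drain(self, agent_id):
        msgs, self.inboxes[agent_id] = self.inboxes[agent_id], []
        return msgs
```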
1. Multi-Agent Coordination Toolkit: a newly integrated module in which two or more AI agents can share or trade information, allocate tasks, and communicate via textual “thoughts” or explicit chat messages.
2. Human–AI Chat Interface: a real-time chat overlay for human participants to interact with the AI agents directly in the Minetest world, issuing instructions, posing questions, or simply conversing (a routing sketch appears after this list).
3. Conflict Resolution & Negotiation Logic: additional reinforcement learning rules and scenario scripts that reward cooperative strategies and penalize aggressive or destructive actions, ensuring stable group dynamics in shared tasks.
4. Demonstration & Documentation: a working environment showcasing multiple agents (and at least one human user) collaborating on in-game goals (e.g. building a structure or gathering resources), with step-by-step documentation on how to replicate these multi-agent/human interactions.
5. Performance & Behavior Metrics: quantitative and qualitative reports measuring conversation quality, cooperation levels, conflict frequency, and user satisfaction, with a particular focus on child-friendly and non-harmful behaviors.
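How human chat lines might be routed to agents, reusing the `MessageBus` from the previous sketch; the “@name” addressing convention is an assumption about the interface, not a specification:

```python
# Illustrative routing of human chat input to agents via the MessageBus.
def handle_human_chat(bus, text, default_agent="agent_1"):
    if text.startswith("@"):                   # e.g. "@agent_2 build a wall"
        target, _, body = text[1:].partition(" ")
        bus.send("human", target, body)
    else:
        # Unaddressed lines go to a default agent.
        bus.send("human", default_agent, text)
```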
$10,000 USD
1. Smooth Multi-Agent Collaboration: two or more AI agents reliably coordinate tasks, share knowledge, and resolve conflicts without external intervention, demonstrating stable group behaviors under partial observability.
2. High-Quality Human–AI Dialogue: real-time chat sessions exhibit coherent, context-aware, and polite exchanges, with no harmful or inappropriate outputs; human testers report a positive and engaging experience.
3. Ethical & Child-Safe Conduct: the negotiation module effectively prevents or mitigates destructive actions, and all user-facing communication aligns with age-appropriate guidelines, reinforcing trust in the system’s safety.
4. Extensibility & Reusability: the delivered multi-agent/chat framework can be easily adapted or extended, supporting additional AI agents or more complex human roles (e.g. moderators or teachers), indicating a robust architecture for future development.
5. Demonstrable Impact: live demonstrations or pilot studies confirm that the multi-agent/human chat feature not only enriches gameplay but also highlights the system’s compassionate and collaborative potential in broader educational or social contexts.
Simon250
Mar 9, 2025 | 1:51 PM
Could you elaborate on the new AI service under AI services (New or Existing)?
shagofta1605
Mar 14, 2025 | 3:07 PM
Thank you for your comment. We plan to provide a new AI service. Could you please clarify what additional details you would like in the service description? For instance, are you looking for more technical specifications, implementation details, or potential use cases? Your guidance would be much appreciated.