Unsupervised Learning

Ramin Barati
Project Owner


Status

  • Overall Status

    ⏳ Contract Pending

  • Funding Transferred

    $0 USD

  • Max Funding Amount

    $15,000 USD

Funding Schedule

Milestone Release 1
$2,500 USD Pending TBD
Milestone Release 2
$5,000 USD Pending TBD
Milestone Release 3
$5,000 USD Pending TBD
Milestone Release 4
$2,500 USD Pending TBD

Project AI Services

No Service Available

Overview

Our team proposes the implementation of clustering heuristics in MeTTa, focusing initially on K-Means, Hierarchical Clustering, Spectral Clustering, and the Gaussian Mixture Model (GMM). Leveraging our experience in variational inference and EM algorithms, including past work on a generalized GMM model, we aim to align our approach with MeTTa’s design philosophy, collaborating closely with the Hyperon community to ensure an intuitive fit. By adhering to conventions from libraries like numpy, scikit-learn, and matplotlib, our project will create a foundational toolkit for unsupervised learning in MeTTa, promoting broader adoption and setting the stage for more advanced applications.

RFP Guidelines

Implement clustering heuristics in MeTTa

Complete & Awarded
  • Type SingularityNET RFP
  • Total RFP Funding $40,000 USD
  • Proposals 6
  • Awarded Projects 1
SingularityNET
Aug. 12, 2024

The goal is to implement clustering algorithms in MeTTa and demonstrate interesting functionality on simple but meaningful test problems. This serves as a working prototype providing guidance for development of scalable tooling providing similar functionality, suitable for serving as part of a Hyperon-based AGI system following the PRIMUS cognitive architecture.

Proposal Description

Project details

Our project aims to develop a comprehensive suite of clustering heuristics in the MeTTa language, an emerging framework within the OpenCog ecosystem. Clustering is foundational in unsupervised learning, with extensive applications in data analysis, pattern recognition, and model optimization. By introducing robust clustering capabilities in MeTTa, we seek to create a versatile, scalable toolkit that enables diverse applications and serves as a stepping stone for advanced machine learning techniques within the platform.

Phase 1: Core Clustering Algorithms

The project’s first phase will focus on implementing widely used clustering algorithms, including:

  1. K-Means Clustering: A fast and intuitive clustering method that iteratively refines cluster centers for data partitioning.
  2. Hierarchical Clustering: A flexible method of cluster analysis that builds a hierarchy of clusters.
  3. Spectral Clustering: Uses the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions.
  4. Gaussian Mixture Model (GMM): Based on probabilistic modeling, GMM provides soft clustering and can handle overlapping clusters effectively.
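As a rough illustration of the iterative refinement behind K-Means, the sketch below runs Lloyd's algorithm in plain Python. It is not the planned MeTTa implementation; the function names `k_means` and `squared_distance`, the random initialization, and the stopping rule are assumptions made for this example only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Illustrative Lloyd's algorithm (hypothetical helper, not a MeTTa API).

    points: list of tuples of numbers; k: number of clusters.
    Returns (centers, labels).
    """
    rng = random.Random(seed)
    # Assumption: initialize centers by sampling k distinct input points.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # Converged: centers no longer move.
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(k_means(data, k=2))
```

The result depends on the initial centers, so a practical implementation would typically add restarts or a more careful initialization.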

Once these core algorithms are established, we plan to evaluate the remaining time and budget to incorporate additional clustering methods, based on community feedback and project requirements.

Technical Approach and Methodology:

Our team will leverage its background in statistical and machine learning techniques, including variational inference and the Expectation-Maximization (EM) algorithm, to implement efficient, accurate clustering methods. Our prior experience includes developing a learning algorithm for a mixture of Gaussian Mixture Models (GMMs), which positions us well to handle complex clustering tasks and make MeTTa a reliable tool for unsupervised learning.
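To make the EM idea concrete, here is a minimal sketch of Expectation-Maximization for a one-dimensional Gaussian mixture in plain Python. It is only an illustration under stated assumptions: the names `gaussian_pdf` and `fit_gmm_1d`, the unit initial variances, the equal initial weights, and the fixed iteration count are all chosen for this example and do not describe our prior work or the eventual MeTTa code.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, initial_means, iterations=50, min_var=1e-6):
    """Illustrative EM for a 1-D Gaussian mixture (hypothetical helper).

    Returns (weights, means, variances, responsibilities). The
    responsibilities are the soft assignments from the last E-step.
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k      # Assumption: unit initial variances.
    weights = [1.0 / k] * k    # Assumption: equal initial mixing weights.
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            # A variance floor avoids a component collapsing onto one point.
            variances[j] = max(min_var, var)
    return weights, means, variances, resp


if __name__ == "__main__":
    sample = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
    print(fit_gmm_1d(sample, initial_means=[0.0, 6.0]))
```

This sketch works directly with probabilities; a more robust version would compute the E-step in log space to avoid underflow on widely separated data.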

Throughout this project, we’ll focus on integrating our solution seamlessly within MeTTa by adhering to its language conventions. We’ll also seek active collaboration with the Hyperon and OpenCog communities to ensure our implementations are well-aligned with the language’s design principles. For accessibility and ease of use, we’ll model our implementation structure after popular Python libraries such as numpy, scikit-learn, and matplotlib, utilizing these libraries where possible to maintain consistency and familiarity for end-users.
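To show what following scikit-learn conventions could look like, the sketch below mimics the familiar estimator shape (constructor parameters, `fit`, `fit_predict`, and a trailing-underscore `labels_` attribute) around a naive single-linkage agglomerative clustering in plain Python. The class name `SingleLinkageClustering` and its `metric` parameter are hypothetical and chosen for this example; the eventual MeTTa interface may look quite different.

```python
import math


class SingleLinkageClustering:
    """Hypothetical estimator mimicking the scikit-learn fit/fit_predict shape."""

    def __init__(self, n_clusters=2, metric=math.dist):
        if n_clusters < 1:
            raise ValueError("n_clusters must be at least 1")
        self.n_clusters = n_clusters
        # Any callable taking two points and returning a distance.
        self.metric = metric
        self.labels_ = None

    def fit(self, X):
        # Start with every point in its own cluster (lists of point indices).
        clusters = [[i] for i in range(len(X))]
        while len(clusters) > self.n_clusters:
            best = None
            for a in range(len(clusters)):
                for b in range(a + 1, len(clusters)):
                    # Single linkage: distance between the closest pair of members.
                    d = min(self.metric(X[i], X[j])
                            for i in clusters[a] for j in clusters[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
            _, a, b = best
            # Merge the two closest clusters.
            clusters[a].extend(clusters[b])
            del clusters[b]
        self.labels_ = [0] * len(X)
        for label, members in enumerate(clusters):
            for i in members:
                self.labels_[i] = label
        return self

    def fit_predict(self, X):
        return self.fit(X).labels_


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
    print(SingleLinkageClustering(n_clusters=3).fit_predict(points))
```

Passing the distance function as a parameter is one possible way to support custom metrics; the naive pairwise search here is only suitable for small inputs.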

Impact and Significance of the Project:

Clustering provides critical functionality in unsupervised learning, helping to uncover underlying patterns and structures within data, often without labeled examples. In MeTTa, these tools can help streamline tasks such as data summarization and domain decomposition, which are integral for a range of machine learning applications, including the training of models like mixtures of experts. Furthermore, implementing clustering heuristics will serve as a foundational contribution, enabling broader adoption and more sophisticated applications in the MeTTa community.

By providing these initial implementations, we anticipate that our project will act as a reference for further development of advanced machine learning algorithms in MeTTa. This project aligns well with MeTTa’s goals of facilitating general AI research and advancing the capabilities of unsupervised learning models.

Team Expertise and Background:

We are a team of AI-focused software engineers and graduates with 4-5 years of experience in enterprise software development and data science. Our technical background includes significant experience with statistical modeling, variational inference, and machine learning algorithms. We believe our skills uniquely position us to contribute effectively to MeTTa’s capabilities and to facilitate the adoption of clustering methods within its framework.

Expected Deliverables:

  1. Implementation of core clustering algorithms (K-Means, Hierarchical, Spectral, and GMM) in MeTTa.
  2. Documentation and examples for each clustering method, following conventions used in popular libraries like scikit-learn.
  3. Collaboration with the Hyperon community to ensure that our approach aligns with MeTTa’s design principles.
  4. Initial support for additional clustering methods, contingent on resources and timeline.

Conclusion:

This proposal represents an exciting opportunity to expand the functionality of MeTTa with essential unsupervised learning tools, positioning it as a robust choice for machine learning research and practical applications. With strong alignment to MeTTa’s community and goals, our work aims to deliver high-quality, reusable clustering modules that can benefit researchers and developers across diverse domains.

Open Source Licensing

MIT - Massachusetts Institute of Technology License

Proposal Video

Not Available Yet

Check back later during the Feedback & Selection period for the RFP that this proposal is applied to.

Group Expert Rating (Final)

Overall

5.0

  • Feasibility 4.0
  • Desirability 3.7
  • Usefulness 3.7

New reviews and ratings are disabled for Awarded Projects

Overall Community

3.3

from 3 reviews
  • Rating 5: 1 review
  • Rating 4: 0 reviews
  • Rating 3: 1 review
  • Rating 2: 1 review
  • Rating 1: 0 reviews

Feasibility

4

from 3 reviews

Viability

3.7

from 3 reviews

Desirability

3.7

from 3 reviews

Usefulness

0

from 3 reviews


3 ratings
  • Expert Review 1

    Overall

    2.0

    • Compliance with RFP requirements 2.0
    • Solution details and team expertise 2.0
    • Value for money 0.0
    The proposal lacks technical details.

    I have the following three comments about all the clustering proposals and, to be fair, I will mention them for all the proposals. At the end, you can see my comments specifically for this proposal. First, I was expecting to see more on the difficulties that one may face when a clustering algorithm is implemented in MeTTa (in other words, MeTTa-specific challenges) and on how the proposing team plans to handle them. I did not see that in any of the proposals. Second, I was expecting to see their plan for making sure the MeTTa clustering library will work robustly on diverse datasets. For example, they could have listed a few datasets that may cause problems for a clustering algorithm and mentioned how they plan to avoid those problems. Third, based on my experience with clustering algorithms, most computational gains come from vectorization. None of the proposals even mentions it, even though the RFP specifically mentions concurrent processing and the ability to work on large datasets. Proposal-specific comments: The proposal lacks details about the team’s plan. For example, they don’t mention what difficulties they anticipate when making sure that their solution integrates seamlessly with MeTTa. They don’t mention whether they will allow user-specified distance metrics. They don’t add any links to their claimed prior experience in clustering.

  • Expert Review 2

    Overall

    5.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 5.0
    • Value for money 0.0
    A thorough and solid proposal from a proven OpenCog contributor

    Ramin has contributed to OpenCog Classic in the past and is a solid researcher and developer. This is a relatively simple task which he is clearly very capable of doing, and he's fleshed it out competently in the proposal.

  • Expert Review 3

    Overall

    3.0

    • Compliance with RFP requirements 5.0
    • Solution details and team expertise 4.0
    • Value for money 0.0

    A solid and straightforward approach focusing on four important clustering algorithms. Well structured proposal focusing on learning MeTTa and communicating with the MeTTa community. Would have been good to discuss evaluation metrics used.

  • Total Milestones

    4

  • Total Budget

    $15,000 USD

  • Last Updated

    3 Feb 2025

Milestone 1 - Learning MeTTa and Design (2 Weeks)

Status
😐 Not Started
Description

In this initial phase, our team will focus on understanding MeTTa’s syntax, language constructs, and foundational paradigms to ensure a solid grasp of its features and capabilities. This learning phase will include:

  • Hands-on experimentation with MeTTa code examples and the available documentation.
  • Collaboration with the Hyperon and OpenCog communities to gain insights and seek guidance where necessary.
  • Creating an initial interface outline that specifies user interactions with the package, aligning with MeTTa’s syntax and conventions.

Deliverables

A well-prepared team with a thorough understanding of MeTTa, along with a preliminary package design and interface draft to inform the subsequent implementation phases.

Budget

$2,500 USD


Milestone 2 - Implementation and Documentation (1 Month)

Status
😐 Not Started
Description

This milestone will center on the development of the primary clustering algorithms and their integration into MeTTa, ensuring that they align with the language’s framework and design goals. Activities in this phase include:

  • Implementing K-Means, Hierarchical Clustering, Spectral Clustering, and the Gaussian Mixture Model (GMM) in MeTTa.
  • Setting up data ingestion pipelines to accommodate various input formats and data types, facilitating seamless data flow.
  • Creating demo examples for each algorithm and writing comprehensive documentation to guide users on setup, usage, and output interpretation.

Deliverables

A functional suite of clustering algorithms accompanied by documentation and demo examples to support users in effectively applying these techniques in MeTTa.

Budget

$5,000 USD


Milestone 3 - Visualization and Exporting (1 Month)

Status
😐 Not Started
Description

With the core algorithms in place, the focus of this milestone will shift to enhancing usability, interactivity, and data handling. Specific tasks include:

  • Developing visualization capabilities to enable users to interpret clustering results graphically.
  • Implementing concurrency where applicable to optimize performance, particularly for large datasets.
  • Enabling data export functionalities to ensure compatibility with external tools and frameworks.
  • Preparing a comprehensive technical report detailing the project’s approach, results, and challenges.

Deliverables

A robust, user-friendly clustering module with visualization and exporting options, supported by a technical report documenting the implementation.

Budget

$5,000 USD


Milestone 4 - Final Refinements (2 Weeks)

Status
😐 Not Started
Description

In this final phase, the focus will be on polishing the project, fostering community engagement, and preparing the project for future extensions. Activities include:

  • Reviewing and refining the codebase to adhere to best practices, ensuring maintainability and clarity.
  • Creating example-driven documentation, tutorials, and video guides to encourage adoption and usage within the MeTTa community.
  • Exploring and documenting potential future directions for the project, including additional algorithms and advanced clustering applications.

Deliverables

A finalized, polished toolkit with ample resources for user engagement and support, accompanied by recommendations for future work in clustering within MeTTa.

Budget

$2,500 USD

