SIBYL: The General-Purpose Forecaster

Kevin R.C.
Project Owner

Funding Awarded

$84,000 USD

Expert Review
0
Community
0 (0)

Status

  • Overall Status

    🛠️ In Progress

  • Funding Transferred

    $65,000 USD

  • Max Funding Amount

    $84,000 USD

Funding Schedule

Milestone Release 1
$11,000 USD Transfer Complete TBD
Milestone Release 2
$8,000 USD Transfer Complete TBD
Milestone Release 3
$10,000 USD Transfer Complete TBD
Milestone Release 4
$11,000 USD Transfer Complete TBD
Milestone Release 5
$5,000 USD Transfer Complete TBD
Milestone Release 6
$8,000 USD Transfer Complete 11 Apr 2024
Milestone Release 7
$12,000 USD Transfer Complete TBD
Milestone Release 8
$8,500 USD Pending 11 Apr 2024
Milestone Release 9
$10,500 USD Pending TBD

Video Updates

SIBYL: The General-Purpose Forecaster

9 February 2024

SIBYL: The General-Purpose Forecaster

30 January 2024

Project AI Services

No Service Available

Overview

SIBYL is an AutoML service and research tool that produces personalized forecasts on various time-series data, regardless of scientific or industry domain. All users need to do is input the raw data. Applications are boundless, including but not limited to financial/crypto market flows, sales demand/usage, energy/weather, sensors/satellites, computer/blockchain networks, and bioinformatics.

Proposal Description

AI services (New or Existing)

Company Name

Temporai

Service Details

SIBYL is an AutoML service and research tool that produces personalized forecasts on various time-series data, regardless of scientific or industry domain. All users need to do is input the raw data. Applications are boundless, including but not limited to financial/crypto market flows, sales demand/usage, energy/weather, sensors/satellites, computer/blockchain networks, and bioinformatics.

The team behind SIBYL includes the core developers of NeuralProphet, a popular time-series open-sourced library (and likely successor of Facebook’s Prophet) with over 1.35 million total downloads and over 2,750 GitHub stars as of this writing.

 
Although the end goal of this proposal is to deliver SIBYL v1.0 to the SNET marketplace so that it can be used by all, we expect to continue developing and improving the capabilities of the SIBYL service. Future versions may include but are not limited to the following features:
  • Advanced Sybils: incorporating the latest deep learning (e.g., N-BEATS and Wavelet) and/or neural-symbolic (e.g., OpenCog Hyperon and NARS) Sybils into the Delphi. Could this be a pathway toward AGI?
  • Scalable back-end: incorporating microservices architecture to execute each user’s forecast independently.
  • Front-end GUI: user can better interact with and customize the Sybils’ training and forecasts, whether the intermediary prophecies or the final oracle.
  • Transfer learning: user can select from a library of global pre-trained Sybils (like Hugging Face for NLP) and then fine-tune them with their own data to greatly reduce the training time.

The scope of work for these future versions depends on our funding runway, through a combination of SIBYL v1.0 revenue, subsequent Deep Funding round(s), and other external investments/grants.

Problem Description

Forecasting what lies ahead is integral to our day-to-day activities. For example, we use weather forecasts to help us pick what clothes to wear that day, or whether to bring an umbrella. Companies use their sales growth forecasts to manage their inventories better, or to invest in a new product line.

The forecasting practice, however, remains a highly elusive and often unreliable affair.

Historically, the forecasters were the mystics, shamans, prophets, and sybils giving out divine visions and omens using astrological signs and oracle bones. Today, they are the television pundits or social media influencers giving out their never-ending political/economic hot takes or what stocks/cryptos to invest in, leveraging their ratings or subscriber counts for their credibility. Even the experts in those fields can make forecasts that fall short when actually put to the test.

With the recent advances in computing power and big data capabilities, perhaps forecasting should be done with a more quantitative, data-driven approach. But even that is easier said than done reliably, for the following four reasons:

 
  1. Extrapolating historical and current events into the future is no easy feat, if it is even possible at all.

  2. Time-series data is as “real-world” as it gets, meaning it is often unsanitized, noisy, sparse, and shifty (i.e., prone to regime changes, non-stationary).
  3. The No Free Lunch theorem strongly applies to time-series modeling. On one hand, statistical models are easier to use and explain but make strong assumptions about the world to be predicted and cannot scale well to more complex or high-frequency data. On the other hand, machine and deep learning models are less biased and can learn deeper or more nonparametric features within the time-series data but also risk overfitting its inherent noise.

  4. There is a lack of standardized resources and tools for time-series forecasting practitioners, mainly due to reasons 1-3. Big FAANG tech companies can build out their own in-house forecasting teams for their respective business needs, but the rest of us likely cannot afford that type of luxury.

In short, forecasting is essential yet challenging to pull off, especially if we lack the resources or know-how to do so properly. So how do we overcome this paradox?

Solution Description

Our solution is not to reinvent the wheel by building another fancy Transformer or GPT-whatever model, and spend a cool billion dollars in doing so. Rather, it is to take various tried-and-true forecasting models out there, both statistical and ML-based, and have them complement each other to produce a more robust and generalizable forecast. This is called the “wisdom of crowds” approach. It is analogous to assembling a kind of forecasting Justice League.

SIBYL’s service will use a stacked generalization (or stacking for short) architecture in particular. Stacking is an ensemble method that first trains a diverse set of base models; their predictions are then aggregated by a meta-model, which produces the final composite output. Here is our tentative architecture design, which will be finalized after the M1 milestone:

 

This stacked generalization ("Delphi") first contains the base models ("Sybils"). They consist of, but are not limited to, the following time-series-friendly or compatible models:

  • Seasonal-Naïve
  • Auto-ARIMA
  • Holt-Winters (Exponential Smoothing)
  • LASSO
  • LightGBM
  • NeuralProphet
  • LSTM-based

The Sybils are a mix of statistical, machine/deep learning, and hybrid models. They cover time-series datasets of different sizes, frequencies, variables, and other temporal characteristics. The statistical models cover simpler datasets, while hybrid and deep learning models cover the more complex ones, including ones with exogenous variables (or covariates). The SIBYL service will automatically preprocess the time-series datasets for the models and perform model selection to preemptively filter out models that are clearly unsuitable for training on the given time series.
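The pre-filtering step described above could look roughly like the following minimal sketch. The proposal does not specify the selection rules, so the `SeriesProfile` structure, the minimum-length thresholds, and the covariate-support set are all hypothetical and purely illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SeriesProfile:
    """Basic facts about an input series (hypothetical structure)."""
    n_obs: int            # number of observations
    season_length: int    # e.g. 7 for daily data with weekly seasonality
    has_covariates: bool  # whether exogenous variables are supplied


# Hypothetical minimum history per Sybil, in multiples of the season length.
# These thresholds are illustrative assumptions, not SIBYL's actual rules.
MIN_CYCLES = {
    "Seasonal-Naive": 1,
    "Holt-Winters": 2,
    "Auto-ARIMA": 3,
    "LASSO": 5,
    "LightGBM": 10,
    "NeuralProphet": 10,
    "LSTM-based": 50,
}

# Sybils assumed, for this sketch only, to make use of covariates.
SUPPORTS_COVARIATES = {"LASSO", "LightGBM", "NeuralProphet", "LSTM-based"}


def eligible_sybils(profile: SeriesProfile) -> List[str]:
    """Return the Sybils that pass the length and covariate checks."""
    keep = []
    for name, cycles in MIN_CYCLES.items():
        if profile.n_obs < cycles * profile.season_length:
            continue  # too little history for this model
        if profile.has_covariates and name not in SUPPORTS_COVARIATES:
            continue  # model would ignore the supplied covariates
        keep.append(name)
    return keep


short_series = SeriesProfile(n_obs=30, season_length=7, has_covariates=False)
print(eligible_sybils(short_series))
```

With only 30 daily observations, the sketch keeps the three lightweight statistical Sybils and drops the data-hungry ones, which mirrors the idea of filtering out clearly unsuitable models before training.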

The meta-model ("Pythia") consists of one or a hybrid of the following:

  • Weighted Average (WA)
  • Linear Regression (LR)
  • Blending

In the current Delphi, Pythia is a rule-based algorithm that weighs the trained outputs ("prophecies") from Sybils based on their performance metrics on the Out-of-Sample (OOS) test set. It will then use these prophecies and their weights to create the final composite output ("oracle"). This oracle is the forecast the user will get after running the SIBYL service.
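As a rough illustration of this weighting scheme, here is a minimal sketch in plain Python. It assumes two toy Sybils (a seasonal-naive model and a historical-mean model), mean absolute error as the OOS metric, and inverse-error weights; the proposal only says that Pythia is rule-based and uses OOS performance, so the specific weighting rule and all function names here are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Forecaster = Callable[[List[float], int], List[float]]


def seasonal_naive(history: List[float], horizon: int, season: int = 4) -> List[float]:
    """Repeat the last observed seasonal cycle."""
    last_cycle = history[-season:]
    return [last_cycle[i % season] for i in range(horizon)]


def historical_mean(history: List[float], horizon: int) -> List[float]:
    """Forecast the mean of the history for every future step."""
    mean = sum(history) / len(history)
    return [mean] * horizon


def mae(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pythia_weights(errors: Dict[str, float]) -> Dict[str, float]:
    """Inverse-error weights, normalized to sum to 1 (an assumed rule)."""
    eps = 1e-9  # avoids division by zero for a perfect OOS fit
    inverse = {name: 1.0 / (err + eps) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def oracle(series: List[float], sybils: Dict[str, Forecaster],
           test_size: int, horizon: int) -> Tuple[List[float], Dict[str, float]]:
    """Score each Sybil on a held-out tail, then blend their forecasts."""
    train, test = series[:-test_size], series[-test_size:]
    errors = {name: mae(test, f(train, test_size)) for name, f in sybils.items()}
    weights = pythia_weights(errors)
    # Each Sybil's "prophecy" is produced from the full series.
    prophecies = {name: f(series, horizon) for name, f in sybils.items()}
    combined = [sum(weights[n] * prophecies[n][h] for n in sybils)
                for h in range(horizon)]
    return combined, weights


series = [10, 20, 30, 40, 12, 22, 32, 42, 11, 21, 31, 41]
sybils = {"seasonal_naive": seasonal_naive, "historical_mean": historical_mean}
forecast, weights = oracle(series, sybils, test_size=4, horizon=4)
print(weights)
print(forecast)
```

On this toy series the seasonal-naive Sybil has a much lower OOS error, so it receives most of the weight and the blended forecast stays close to the last seasonal cycle.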

This service will be implemented in Python 3. We will use the scikit-learn and statsmodels libraries for the statistical models and PyTorch Lightning for the deep learning/neural network ones. We will also consider utilizing existing marketplace services and build the remaining Sybils into the SIBYL service. Finally, we plan to deploy this service on an AWS cloud instance (or serverless) but may also consider using a more decentralized computing environment like NuNet.

Description of service related to this proposal

SIBYL has one service with two primary API functions for users to call:

  1. Train: the user inputs a training dataset, SIBYL trains on it with its stacking architecture (Delphi), and outputs the model parameters back to the user in JSON or another compatible file format.
  2. Predict: the user inputs the training dataset and model parameters to SIBYL. SIBYL then outputs the model forecast (oracle) back to the user.
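The two-call flow above can be sketched with local stand-in functions. This is not the actual SNET service interface: the `train` and `predict` signatures, the `season` argument, and the JSON fields are hypothetical, and a seasonal-naive model stands in for the full Delphi.

```python
import json
from typing import List


def train(series: List[float], season: int) -> str:
    """Stand-in for the Train call: fit and return parameters as JSON."""
    if len(series) < season:
        raise ValueError("need at least one full seasonal cycle")
    # A real service would return fitted Sybil and Pythia parameters here.
    params = {"model": "seasonal_naive", "season": season}
    return json.dumps(params)


def predict(series: List[float], params_json: str, horizon: int) -> List[float]:
    """Stand-in for the Predict call: data plus parameters in, forecast out."""
    params = json.loads(params_json)
    season = params["season"]
    last_cycle = series[-season:]
    return [last_cycle[i % season] for i in range(horizon)]


data = [3.0, 5.0, 4.0, 3.5, 5.5, 4.5]
params = train(data, season=3)               # step 1: Train
forecast = predict(data, params, horizon=4)  # step 2: Predict
print(params)
print(forecast)
```

Returning the parameters to the user, rather than storing them server-side, keeps the service stateless between the two calls, which is one plausible reason for the design described above.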
 

 

See the attached .pdf for a simple but illustrative example of how an AGIX crypto trader can use this service.

External Services

Here are the external services on (or will be on) SNET’s marketplace that may be utilized for this project:

  • NeuralProphet: Use this NeuralProphet service as the hybrid Sybil. See our separate onboarding NeuralProphet proposal in DF2 - Pool B [existing AI services].

  • : Potentially use these services as Facebook's Prophet and LSTM Sybils instead of having them built into the SIBYL service.

  • : Potentially utilize the service’s Dynamic Coupled Variational Autoencoder (C-VAE) for time-series data generation to augment the input data before training.

Milestone & Budget

Cost breakdown

The proposed budget for SIBYL is $84,000. Here is the overall cost breakdown:

  • Modeling and AI R&D ("R&D"): $36,000
  • Software Engineering and Integration ("SWE"): $15,000
  • Product Management ("PM"): $8,000
  • Company Operations, Accounting, Legal, Website, and Marketing ("OPS"): $4,000
  • Service Host and API Calls ("API"): $21,000

Service Host and API Calls meet the 25% reservation rule.
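The arithmetic behind this breakdown can be checked with a short script. The category amounts come from the list above, and the milestone amounts come from the funding schedule and milestone budgets on this page; the variable names are just for illustration.

```python
# Cost categories from the breakdown above (USD).
categories = {"R&D": 36_000, "SWE": 15_000, "PM": 8_000, "OPS": 4_000, "API": 21_000}

# Milestone budgets M1..M9 as listed on this page (USD).
milestones = [11_000, 8_000, 10_000, 11_000, 5_000, 8_000, 12_000, 8_500, 10_500]

total = sum(categories.values())
api_share = categories["API"] / total  # share reserved for hosting and API calls
transferred = sum(milestones[:7])      # milestones 1-7 are marked "Transfer Complete"

print(f"Category total:   ${total:,}")
print(f"Milestone total:  ${sum(milestones):,}")
print(f"API share:        {api_share:.0%}")
print(f"Transferred M1-7: ${transferred:,}")
```

Both totals come to $84,000, the API share is 25%, and the first seven milestones add up to the $65,000 shown as transferred.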

The expected total time to complete all milestones is around 5-7 months. Therefore, if this proposal gets approved in late Q1 2023, the completion time is in the first half of Q4 2023. Nevertheless, we follow an iterative process for model development so we can begin integrating our stack end-to-end at milestone M2 (v0.1) and, with feedback from alpha users, continually improve that stack until M* (v1.0).

Service License Info

Yes, eventually. It will be under the GNU General Public License (GPL) v3.0.

Revenue Sharing

We will adhere to the API Calls “user-friendly” template. If this service crosses the threshold of $1,000 in monthly revenue, then 10% of the additional revenue will be fed back into the SNET and Deep Funding wallets.
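A minimal sketch of this rule follows, assuming that "additional revenue" means the portion of monthly revenue above the $1,000 threshold. The function name and parameters are hypothetical.

```python
def revenue_share(monthly_revenue: float,
                  threshold: float = 1_000.0,
                  rate: float = 0.10) -> float:
    """Amount fed back to the SNET and Deep Funding wallets for one month.

    Assumes the 10% applies only to revenue above the threshold.
    """
    excess = max(0.0, monthly_revenue - threshold)
    return rate * excess


for revenue in (800.0, 1_000.0, 2_500.0):
    print(f"revenue ${revenue:,.0f} -> share ${revenue_share(revenue):,.2f}")
```

Under this reading, a month at or below $1,000 contributes nothing, while a $2,500 month contributes 10% of the $1,500 excess.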

Marketing & Competition

SIBYL’s marketing approach aims to appeal to the following two communities: 1) time-series open-source and 2) SNET/crypto ecosystems.

Regarding time-series open-source, the team behind SIBYL includes the core developers of NeuralProphet (NP). NP is a widely used time-series library with 1.25M+ total downloads and 4-5K average daily downloads. It aims to succeed Facebook’s Prophet, which has 16M+ total downloads and 40-50K average daily downloads, as the leading time-series library. If SIBYL can acquire just 1 of every 1,000 of Prophet's users, then that alone would amount to 16K+ total API calls, with 40-50 made daily.

Regarding the SNET/crypto ecosystems, SIBYL has invited collaborators at SingularityNET and two of its spin-offs, Cogito and SingularityDAO, to participate as alpha users. We will work on the following Proof-of-Concepts (PoCs) throughout the development of the SIBYL service to continuously validate product-market fit:

  1. Forecasting temperature, weather, and/or air quality data for certain climate/sustainability use cases. Data sources may include OpenAQ, PlanetWatch, GEDI, NOAA, and USGS EarthExplorer. With Matt I. from SingularityNET. See our separate soil health proposal in DF2 - Pool C [ideation].

  2. Forecasting demand or transaction volume of an established stablecoin (e.g., DAI) to spot certain market patterns. Data sources may include Yahoo! Finance, CryptoQuant, Bitquery, and crypto exchange APIs. With Nejc Z. from Cogito.
  3. Forecasting inflow and outflow of SDAO tokens to and from certain decentralized exchanges (DEXes) using a myriad of crypto or stock factors. With Rafe T. from SingularityDAO.

We are open to including a few more alpha users from the SNET community. If you have any other use cases that you think SIBYL can help with, please let us know!

Upon completing the SIBYL v1.0 implementation, we will expand our invitation to the greater SNET, Cardano Catalyst, and broader crypto communities to discover additional fascinating forecasting applications and co-develop using the SIBYL service. We will distribute any remaining API Calls funding to incentivize usage further.

Needed Resources

We may need to recruit an additional junior, part-time MLE and/or DevOps engineer to support our implementation efforts. Other than that, any community feedback, support, and usage of our work will be greatly appreciated!

Related Links

SIBYL slide deck:

NeuralProphet:

Kevin’s previous time-series talks and demos:

  • Time-Series Anomaly Detection with Kaggle NYC (2019) -
  • Tools for Exploring Air Quality Data with OpenAQ (2020) -
  • Photrek FCNT NeuralProphet Demo with Catalyst Swarm (2022) -

AI Services

Proposal Video

DF Spotlight Day - DFR4 - Kevin R. C. - SYBIL: The Deep Learning Forecaster

7 June 2024
  • Total Milestones

    9

  • Total Budget

    $84,000 USD

  • Last Updated

    8 Jun 2024

Milestone 1 - 1 (M1)

Status
😀 Completed
Description

Description: finalize Temporai company formation and SIBYL Delphi design/user requirements. Deliverables: establish company as LLC, finalize contracts with team, assemble legal, set up business and development tools (e.g., Slack, GitHub, AWS, Quickbooks), finalize stacking architecture or Delphi design, understand SNET’s API schema, gather several user requirements from the SNET ecosystem (e.g., PoCs mentioned in the proposal Marketing section), and write all of the above up as a mini-report.

Deliverables

Budget

$11,000 USD

Milestone 2 - 2 (M2)

Status
😀 Completed
Description

Description: SIBYL v0.1 with one baseline statistical Sybil. Deliverables: create a baseline statistical time-series base model or Sybil (e.g., Auto-ARIMA), deploy v0.1 to AWS (e.g., CodeDeploy), and test/evaluate forecasts or oracles with alpha users.

Deliverables

Budget

$8,000 USD

Link URL

Milestone 3 - 3 (M3)

Status
😀 Completed
Description

Description: SIBYL v0.2 with multiple statistical Sybils and baseline Pythia. Deliverables: create multiple statistical Sybils, ensemble them to form a foundational Delphi, use a baseline meta-learner Pythia (e.g., weighted-average (WA)), deploy v0.2 to AWS, resolve any deployment bugs from current and previous milestones, and test/evaluate oracles with alpha users.

Deliverables

Budget

$10,000 USD

Link URL

Milestone 4 - 4 (M4)

Status
😀 Completed
Description

Description: SIBYL v0.3 with more Delphi automation/optimization and alternative Pythia. Deliverables: further automate data preprocessing and model selection functions, use an alternative Pythia (e.g., linear regression (LR)), deploy v0.3 to AWS, optimize code to reduce latency, and test/evaluate oracles with alpha users.

Deliverables

Budget

$11,000 USD

Link URL

Milestone 5 - 5 (M5)

Status
😀 Completed
Description

Description: Deploy SIBYL v0.3 from AWS to SNET Platform. Deliverables: Integrate v0.3 app/endpoint as a SNET service, deploy service onto SNET marketplace platform, and develop basic front-end marketplace UI.

Deliverables

Budget

$5,000 USD

Milestone 6 - 6 (M6)

Status
😀 Completed
Description

Description: SIBYL v0.4 with hybrid Sybil using external NeuralProphet service. Deliverables: add onboarded NeuralProphet service as hybrid Sybil into the Delphi, enable multivariate and lagged-covariate features, deploy v0.4 to SNET platform end-to-end, optimize code to reduce latency, refine front-end marketplace UI, and test/evaluate oracles with alpha users.

Deliverables

Budget

$8,000 USD

Milestone 7 - 7 (M7)

Status
😀 Completed
Description

Description: SIBYL v0.5 with machine/deep learning Sybils. Deliverables: create and add at least one ML Sybil (e.g., LightGBM) and at least one DL Sybil (e.g., LSTM) (or alternatively at least two of one of them) into the Delphi, deploy v0.5 to SNET platform end-to-end, optimize code to reduce latency, and test/evaluate oracles with alpha users.

Deliverables

Budget

$12,000 USD

Link URL

Milestone 8 - 8 (M8)

Status
🧐 In Progress
Description

Description: SIBYL v1.0 with community reports. Deliverables: test/evaluate v0.5 forecasts with alpha users, make final tweaks and release it as v1.0 to SNET platform end-to-end, write a final report and/or whitepaper about the SIBYL architecture and PoCs, create Temporai website with SIBYL page if necessary, and focus more on marketing/promotion with the assistance of SNET and the community.

Deliverables

Budget

$8,500 USD

Link URL

Milestone 9 - Final (M*)

Status
😐 Not Started
Description

Description: disburse remaining API funds for production hosting, maintenance, and usage incentivization on-demand.

Deliverables

Budget

$10,500 USD

Link URL
