GlucoseDAO

Livia_Zaharia
Project Owner

  • Proposal for BGI Nexus 1
  • Funding Request $50,000 USD
  • Funding Pools Beneficial AI Solutions
  • Total 5 Milestones

Overview

GlucoseDAO is a decentralized organization that enables diabetics and other users of continuous glucose monitors (CGMs) to get accurate glucose predictions at least one hour in advance. For millions of people with diabetes, such predictions are crucial to planning everyday life (when to inject, eat, or exercise). They also help healthy people optimize their diet and exercise routines based on glucose patterns.

What we are developing:

  • An extension of the GlucoBench benchmark that will also measure human performance
  • An open-source ML model to predict glucose and related health outcomes
  • An ML service and app that everybody can use easily

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

17 million people worldwide regularly use CGMs, and this number is expected to double soon. For diabetics (like me), a CGM is a life-saving device that informs decisions on when to inject, what to eat, and when to exercise. Although some glucose ML models exist, most are not open and remain stuck inside companies and academia. As a result, all that ordinary CGM users can do today is watch whether their numbers are trending up or down. We want CGM users to benefit from the power of AI!

Our Team

The team is led by a diabetic and includes expert open-source and AI developers. Both the skills and the shared interests of the team are in line with our project. Perhaps most important of all, the project was started by someone who has skin in the game.

AI services (New or Existing)

Glucose prediction model

Type

New AI service

Purpose

To predict glucose variations 60 minutes into the future. It will require users to provide a minimum amount of data for fine-tuning.

AI inputs

It should accept data in CSV format, obtained via file upload or an API call.

AI outputs

Predicted time series with confidence estimates. In the future, we may also let users submit a planned action (e.g., eating ice cream or injecting a specific dose of insulin) to estimate how their glucose will react.
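As a rough illustration of this input/output contract, the sketch below reads CSV text and returns a 60-minute series in which every point carries a lower and upper bound. The column names (`timestamp`, `glucose_mg_dl`), the `ForecastPoint` type, and the `naive_forecast` function are all hypothetical, and the predictor is a deliberately naive linear extrapolation with an arbitrary widening band, standing in for the real model rather than describing it.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ForecastPoint:
    timestamp: datetime
    glucose_mg_dl: float  # point prediction
    lower_mg_dl: float    # lower edge of the uncertainty band
    upper_mg_dl: float    # upper edge of the uncertainty band


def read_cgm_csv(text):
    """Parse CSV text with hypothetical 'timestamp' and 'glucose_mg_dl' columns."""
    rows = csv.DictReader(io.StringIO(text))
    return [(datetime.fromisoformat(r["timestamp"]), float(r["glucose_mg_dl"]))
            for r in rows]


def naive_forecast(readings, horizon_min=60, step_min=5, band_per_step=2.0):
    """Placeholder predictor: extend the slope of the last two readings.

    The band widens by `band_per_step` mg/dL per step. It only illustrates
    the output shape and is not a calibrated confidence estimate.
    """
    (t_prev, g_prev), (t_last, g_last) = readings[-2], readings[-1]
    minutes = (t_last - t_prev).total_seconds() / 60
    slope = (g_last - g_prev) / minutes  # mg/dL per minute
    points = []
    for k in range(1, horizon_min // step_min + 1):
        mean = g_last + slope * step_min * k
        half_width = band_per_step * k
        points.append(ForecastPoint(t_last + timedelta(minutes=step_min * k),
                                    mean, mean - half_width, mean + half_width))
    return points


sample = """timestamp,glucose_mg_dl
2025-01-01T08:00:00,110
2025-01-01T08:05:00,114
2025-01-01T08:10:00,118
"""
forecast = naive_forecast(read_cgm_csv(sample))
for point in forecast[:3]:
    print(point)
```

A trained model would replace `naive_forecast` while keeping the same output shape, so apps that consume the service would not need to change.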

The core problem we are aiming to solve

Millions of CGM users face a gap: current devices only provide real-time readings and a rising/falling trend projection, leaving us unprepared for sudden fluctuations that affect our meals, exercise, and insulin management. This is something I have faced myself as a diabetic. Meanwhile, many advanced machine learning models remain locked in academic research and proprietary systems, inaccessible to everyday users.

We want to change this by providing an open-source ecosystem: an open-source, open-weights model, an improved GlucoBench benchmark, and an AI service that various health apps can integrate.

Our specific solution to this problem

We will develop the Glucose Forecaster AI service to provide one-hour-ahead glucose predictions and tailored lifestyle suggestions. Instead of simply displaying current values and trends, our solution offers actionable forecasts to help users better plan meals, exercise, and medication.

Before deploying our model, we face three main challenges. First, our machine learning model must outperform human predictions. Although 45 glucose prediction models exist, most are locked in academia or proprietary systems, and little research has examined how well people can forecast their own glucose levels. To address this, we are developing a glucose prediction game and will run a mini-study to collect human prediction data and pinpoint common errors, which will improve benchmarking.

Second, we must build a uniform, open-source dataset and preprocessing pipeline. While free CGM datasets are available (such as Awesome-CGM, created by our adviser), we still need an automated system that combines these with user-contributed data.

Third, we will test how taking a foundational model and fine-tuning it on personal data can improve prediction.
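For the first challenge, comparing human and model forecasts could come down to scoring both against the value actually observed an hour later. The sketch below uses mean absolute error and root mean squared error; the numbers are invented for illustration and are not results from our study.

```python
import math


def mae(predicted, actual):
    """Mean absolute error in mg/dL."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error in mg/dL; penalizes large misses more."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical glucose values (mg/dL) observed 60 minutes after each guess.
actual = [120.0, 150.0, 95.0, 180.0]
human_guess = [110.0, 165.0, 100.0, 160.0]  # made-up answers from the game
model_guess = [118.0, 155.0, 90.0, 172.0]   # made-up model output

for name, guess in [("human", human_guess), ("model", model_guess)]:
    print(f"{name}: MAE={mae(guess, actual):.1f} RMSE={rmse(guess, actual):.1f}")
```

Scoring both on the same held-out moments keeps the comparison fair; a real benchmark would likely add clinically motivated metrics as well.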

We’ve considered developing our own app, but tapping into the extensive user base of existing health apps can be an effective strategy to reach more people. Once these milestones are reached, we will fine-tune our Gluformer-based model (and explore additional options) and wrap it in an AI service to make advanced glucose forecasting accessible to everyone.

Project details

I want to enable diabetics and others with continuous glucose monitors (CGMs) to get accurate glucose predictions at least one hour in advance. I also want researchers and citizen scientists to use our public dataset and fine-tune the model with their own data.

Currently, about 17 million people worldwide use CGMs, but they lack long-term predictions. For diabetics this is a major problem as you need predictions to plan food, exercise, and insulin injections. Even healthy users of CGMs need predictive models for diet and exercise optimization. To provide such models, we need to pre-train them on CGM data first and then fine-tune them for individuals. Unfortunately, most data is siloed by companies and hospitals, with public datasets barely comparable to my 5 years of data. I am leading an initiative to create a public open-source anonymized dataset, pretrain a model, and provide a service for personalized fine-tuning that can be integrated into various apps.

I started the project as a personal endeavor, working on it in my free time; gradually, people joined me. At some point I got in touch with Renat Sergazinov, a core developer of GlucoBench and the Gluformer prediction model. Based on my feedback, some improvements to GlucoBench have been made. I also trained my own model based on the Gluformer architecture, and we discussed including our future citizen-science public CGM dataset in GlucoBench.

In the summer of 2024, I contributed to the LongevityGenie open-source project. I integrated my glucose prediction model into their AI assistant, and it was presented as a poster at the ARDD conference.

The Glucose project started with my ideas, but my lack of technical skills was an issue: my studies were in architecture, not biology or programming, even though, as a diabetic, glucose analysis is second nature to me. To improve, I have been taking online courses, attending biomedical conferences, and contributing to other open-source projects.

After ARDD, I left my job as a senior architect and focused on improving my skills by contributing to the Longevity Genie open-source project and volunteering for the IBIMA Institute at Rostock Medical University. This gave me some practical skills and advice. 

Later, I applied for decentralized science funding (GG21 funding round) and got a microgrant of 2K USD. It was the starting point for having some leeway to develop the project.

The microgrant allowed me to attend the Zelar event, a pop-up city hosted in Berlin, where I met people versed in the fields of longevity, medicine, and AI. 

A cohort of 7 people agreed to acquire glucose sensors and track their glucose and food intake. The study is not yet done, but we hope to find new data.

Our endeavor also attracted interest from the longevity community, and the HEALES organization invited me to the Eurosymposium on Healthy Aging to co-teach a workshop. There, I also gave a presentation on the correlation between glucose values and longevity reported in existing papers.

Existing efforts vary. There are many apps related to glucose monitoring, but they differ in usage and method:

  • Academic glucose prediction models (mostly non-usable). Most have no published weights or public validation data, with rare exceptions (like GlucoBench). Others lack easy-to-set-up environments, open weights, or open tests, which makes them largely useless for most potential users.

 

  • Usable open-source apps devoted to adjacent but slightly different use cases. These are either prediction models valid only for specific subpopulations under specific conditions, or apps like the Xdrip app that were developed for a particular need: connecting any sensor with any phone. Such apps are very popular, robust, and easy to use but do not have universal glucose prediction models.

 

  • Apps of the CGM manufacturers. Despite having a huge amount of data, CGM producers provide only basic stats like time in range, changes, and (de)acceleration. Moreover, Dexcom, Libre, and other manufacturers lock users in by country and vendor, creating huge problems. Most of these apps are user-unfriendly and have 1-3 star ratings, but users still have to use them because of vendor lock-in.

 

  • Generic health commercial apps with CGM support. There are commercial apps that aggregate various types of data, including CGM data. However, due to their lack of specialization, they do not provide any analysis beyond self-evident things like “time in range” and (de)acceleration of glucose.

 

  • CGM data collection apps. These apps collect CGM data but lack prediction capabilities. Users provide their data free of charge, and the company that receives it sells it on to pharma companies to cover its costs. Users get some benefits that the default apps do not offer, such as exporting reports for their doctors or a secondary, more accessible place to store their data, but at the end of the day they receive no payment in return.


In comparison with all the above categories, we:

  • make everything open (source code, weights, and anonymized dataset)

  • integrate the models into GlucoBench (public benchmark)

  • provide a user-friendly interface that non-technical people can use

  • deliver individual results: predictions refer to one person and a specific value rather than a cohort-level, general recommendation

Regarding global positioning: in the academic space, what is striking is that academia is focused on publishing rather than on solving end users' problems or on giving citizens and companies something to build on (public datasets and open-weight models). This is clearly stated in the latest literature review, in the GlucoBench paper: “We emphasize that, among the 45 methods identified in Table 1, a staggering 38 works do not offer publicly available implementations.” (https://doi.org/10.48550/arXiv.2410.05780). I want to develop something that people will really use and that researchers can improve upon.

 

Our project aims to be open source, at least for the foundational model. If fine-tuning or individual training involves significant cost, we will consider additional payments to cover computational resources, but we will generally try to avoid making the user pay.

For example, the models we have trained so far can be accessed here, and the datasets, which are publicly accessible, are here.
We also have a Hugging Face UI where I tried the first snapshot of the project; this is still a work in progress.

There are several technical challenges that I expect to address, or at least improve upon, during the project:

1. Noisy and inconsistent data

The first challenge is that CGM data is not always consistent. Many covariates (food, sports, etc.) are often missing because it is hard for users to track everything constantly. Different devices have biases and tend to drift out of calibration (especially after a sensor switch or reset). Such problems are typically addressed by pre-training: a good CGM foundational model can be robust to missing values and even operate without covariates, purely on glucose values.
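Before any training, irregular readings usually have to be placed on a fixed time grid. The sketch below is one possible approach, not our actual pipeline: it interpolates linearly across short gaps and marks long gaps as missing (`None`) so that a model sees them as missing rather than as invented values. The function name, the 5-minute step, and the 15-minute gap limit are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def regularize(readings, step_min=5, max_gap_min=15):
    """Resample (timestamp, mg/dL) pairs onto a fixed grid.

    Assumes at least two readings with distinct timestamps. Grid points
    inside a gap longer than `max_gap_min` become None.
    """
    readings = sorted(readings)
    step = timedelta(minutes=step_min)
    max_gap = timedelta(minutes=max_gap_min)
    grid = []
    i = 0
    t = readings[0][0]
    while t <= readings[-1][0]:
        # Move to the pair of readings that brackets grid time t.
        while readings[i + 1][0] < t:
            i += 1
        (t0, g0), (t1, g1) = readings[i], readings[i + 1]
        if t in (t0, t1) or (t1 - t0) <= max_gap:
            frac = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
            value = g0 + frac * (g1 - g0)
        else:
            value = None  # long gap: leave as missing
        grid.append((t, value))
        t += step
    return grid


raw = [
    (datetime(2025, 1, 1, 8, 0), 100.0),
    (datetime(2025, 1, 1, 8, 5), 110.0),
    (datetime(2025, 1, 1, 8, 15), 130.0),  # the 08:10 reading was dropped
    (datetime(2025, 1, 1, 9, 0), 90.0),    # 45-minute gap, e.g. a sensor change
]
grid = regularize(raw)
for t, value in grid:
    print(t.strftime("%H:%M"), value)
```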

 

2. Model Customization: Generic vs. Individual

Studies on Gluformer models show that pre-training can help, but it’s still debated whether representations learned from large datasets are better than training directly on individual data. Real-world performance often differs from controlled clinical studies, adding to this uncertainty.

We take an open-minded approach: if foundational models don’t perform well enough, we’ll shift our focus to improving the experience for training fully personalized models.

 

3. User Accessibility
From a user's perspective, it can be frustrating to have access to a commercial product that lacks key features like glucose prediction. Users who are not skilled in programming face significant barriers to using current prediction tools, which require running scripts, training models, and performing inferences.
Our goal is to simplify this process by providing a user-friendly web interface. Users will be able to create an account, upload their data files, and receive predictions—all with basic browser skills, without needing to write any code.

 

4. Data Standardization and Integration

Different manufacturers use different data formats, which complicates data centralization and standardization. The same glucose value can be formatted in multiple ways, which creates challenges for centralized training and analysis. We aim to address this issue by creating standardized conversion methods that can handle different data inputs from various sensors.
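A conversion layer of this kind could be driven by a small per-format description, as sketched below. The two layouts (`vendor_a`, `vendor_b`), their column names, delimiters, and date formats are invented for illustration and do not describe any real manufacturer's export. The unit conversion uses the molar mass of glucose (about 180.16 g/mol), so 1 mmol/L is roughly 18.016 mg/dL.

```python
import csv
import io
from datetime import datetime

MGDL_PER_MMOL = 18.016  # from the molar mass of glucose, about 180.16 g/mol

# Hypothetical export layouts; real vendor files will differ.
FORMATS = {
    "vendor_a": {"time": "Timestamp", "value": "Glucose (mg/dL)", "delimiter": ",",
                 "time_format": "%Y-%m-%d %H:%M", "unit": "mg/dL"},
    "vendor_b": {"time": "Zeit", "value": "Glukose (mmol/L)", "delimiter": ";",
                 "time_format": "%d.%m.%Y %H:%M", "unit": "mmol/L"},
}


def standardize(text, vendor):
    """Convert one export to rows of (ISO 8601 timestamp, glucose in mg/dL)."""
    spec = FORMATS[vendor]
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=spec["delimiter"]):
        when = datetime.strptime(record[spec["time"]], spec["time_format"])
        # Accept a decimal comma, which some locales use instead of a point.
        value = float(record[spec["value"]].replace(",", "."))
        if spec["unit"] == "mmol/L":
            value *= MGDL_PER_MMOL
        rows.append((when.isoformat(), round(value, 1)))
    return rows


export_a = "Timestamp,Glucose (mg/dL)\n2025-01-01 08:00,99\n"
export_b = "Zeit;Glukose (mmol/L)\n01.01.2025 08:00;5,5\n"
rows_a = standardize(export_a, "vendor_a")
rows_b = standardize(export_b, "vendor_b")
print(rows_a)
print(rows_b)
```

Keeping the format knowledge in data rather than in code means that supporting a new sensor would mostly be a matter of adding one dictionary entry.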

 

5. GDPR Compliance and Data Security

Since we are dealing with personal health data, ensuring compliance with data privacy regulations like GDPR is a critical challenge. We are reviewing decentralized approaches like SOLID PODs, but we must also consider how to align them with the simplicity goals outlined in Challenge 3.
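One small building block for a shared dataset is pseudonymization, sketched below under assumptions of our own: the user identifier is replaced by a keyed hash and timestamps are stored as minutes since the first reading. The function and field names are hypothetical. Pseudonymized data still counts as personal data under GDPR, so this is only one ingredient and not a compliance solution.

```python
import hashlib
import hmac
from datetime import datetime


def pseudonymize(user_id, readings, secret_key):
    """Swap the user ID for a keyed hash and make timestamps relative.

    `secret_key` (bytes) must stay private; without it the token cannot be
    recomputed from a known e-mail address or account name.
    """
    token = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    start = min(t for t, _ in readings)
    # Minutes since the first reading hide the calendar date of the recording.
    relative = [(int((t - start).total_seconds() // 60), g) for t, g in readings]
    return {"subject": token, "readings": relative}


readings = [(datetime(2025, 1, 1, 8, 0), 110.0), (datetime(2025, 1, 1, 8, 5), 114.0)]
record = pseudonymize("user@example.org", readings, secret_key=b"replace-with-a-real-secret")
print(record)
```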

The project stands out due to its open-source, user-centered approach, emphasizing accessibility, community collaboration, and transparency. Unlike other projects that are either overly technical or proprietary, it balances research and utility, offering actionable glucose predictions while allowing users to contribute to a public dataset. The open-source nature provides a valuable platform for both the diabetic community and researchers, fostering a collaborative ecosystem that bridges gaps left by existing solutions.

Needed resources

-members for community management

-members for promoting and social media

-members for front end coding

-members for analysing data

-GPU resources for inference/training

Existing resources

We can use the Systems Biology of Aging Group server to host the web UI; however, it does not have GPUs for training.

Open Source Licensing

Apache License

We use the Apache License for our code (published at https://github.com/GlucoseDAO) and datasets (currently hosted on Hugging Face). The foundational model, training code, and benchmarks will be open. At the same time, fine-tuning for specific users will not be open by default, and users will have agency over their personal data.

Additional videos

https://www.youtube.com/watch?app=desktop&v=uerUE6OOgPI

Recording of the talk given at Zelar, Berlin

Was there any event, initiative or publication that motivated you to register/submit this proposal?

A personal referral

Describe the particulars.

I received an email from the Longevity community informing me about this funding.


  • Total Milestones

    5

  • Total Budget

    $50,000 USD

  • Last Updated

    24 Feb 2025

Milestone 1 - Starting glucose prediction as a service

Description

Wrapping a toy glucose prediction model as a Singularity.NET service. Making the model interact with Singularity.NET will help us better understand the abilities and limitations of the platform and how health apps might use it. We will use our fork of the Gluformer model, a state-of-the-art model that is not yet good enough for our final goal. Note that while this is the first milestone, other stages of development are proceeding in parallel.

Deliverables

The model we have trained so far, accessible via the Singularity.NET service. This will give us working inputs and outputs for this first step.

Budget

$2,000 USD

Success Criterion

We have a model in the catalog and code that integrates it into apps.

Milestone 2 - Finishing user's interaction tools

Description

Knowledge of human performance and typical pitfalls is essential for improving our model. The model must be better than humans in most common use cases to be usable, but to measure human performance we have to provide a game-like tool for making predictions. We made a basic prototype (https://github.com/GlucoseDAO/sugar-sugar), but we need time to make it usable. Another tool we started and need to finalize is our fork of just-chat (https://github.com/GlucoseDAO/just-chat), which we are tuning for user interactions. It is a chat agent that answers questions about glucose values with additional data from the literature. We need it to engage users and learn their concerns before training the model. It will allow us to recruit beta testers faster and learn which aspects of glucose prediction to focus on.

Deliverables

There are two deliverables. The first is the glucose prediction game, which will establish a baseline for what counts as good enough prediction. The second is the GlucoseDAO chat. This tool will let users chat to find out details about the project and, thanks to indexed papers, learn about the latest research on glucose.

Budget

$6,000 USD

Success Criterion

Usage of the two deliverables. Beyond simply functioning, if the tools are widely shared and used, more people will learn what this project is about and what they can do to improve their lifestyle.

Milestone 3 - Data-processing pipelines

Description

Only a few open datasets are uniformly integrated by the GlucoBench repository. The current preprocessing is adequate for benchmarking but not convenient for training and fine-tuning models. We also have to decide on mechanisms for how users can contribute their data (both technically and organizationally/legally), as it is the first question people ask when we tell them about the project.

Deliverables

Data-processing pipelines that integrate user-provided data according to the input device (different CGMs use different CSV formats), plus pipelines that adapt the biggest open datasets. In other words, a standardization of data: pipelines for all available datasets.

Budget

$8,000 USD

Success Criterion

The pipelines run stably and are compatible with the supported data formats.

Milestone 4 - Micro-study of human-performance measurement

Description

We need around 100 volunteers to estimate the performance of the prediction model (the scientific advisory board can adjust the details). This work will happen in parallel with the third milestone. It may take time because of regulations and the slow nature of the recruitment process.

Deliverables

The study itself, plus a check that everything delivered in the milestones so far works.

Budget

$9,000 USD

Success Criterion

Recruiting the required number of participants, and a functioning system across all milestones so far.

Milestone 5 - Fine-tuning of the models

Description

Our core idea is that a foundational model can be fine-tuned to a specific person to predict their values accurately. With additional data and the first beta testers, we will get the initial version of the service with an improved model. We expect to need more iterations to improve both the model and the user experience, but that will require an additional funding round.

Deliverables

We will have the initial version of the service with stable glucose prediction. It will use per-user fine-tuning, making it an upgrade over the very early iteration from the beginning of the project.

Budget

$25,000 USD

Success Criterion

Stability and accuracy. There will certainly be more steps to solve, so making the service accessible (stable in access and in data I/O) and accurate will be a good success criterion.
