Sentiment Analysis – Celeste AI

Ana Paula Pereira
Project Owner

  • Project for Round 3
  • Funding Awarded: $65,000 USD
  • Funding Pools: New projects
  • Milestones: 2 / 5 Completed

Status

  • Overall Status

    🛠️ In Progress

  • Funding Transferred

    $30,450 USD

  • Max Funding Amount

    $65,000 USD

Funding Schedule

  • Milestone Release 1: $15,650 USD, Transfer Complete, 11 Apr 2024
  • Milestone Release 2: $14,800 USD, Transfer Complete, 20 Sep 2024
  • Milestone Release 3: $6,500 USD, Pending, TBD
  • Milestone Release 4: $11,800 USD, Pending, TBD
  • Milestone Release 5: $16,250 USD, Pending, TBD
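
The schedule above can be cross-checked against the status figures with a little arithmetic. The sketch below is only an illustration: the amounts and statuses are copied from the schedule, while the `RELEASES` list and the `totals_by_status` helper are hypothetical names introduced here.

```python
# Amounts (USD) and statuses copied from the funding schedule above.
# RELEASES and totals_by_status are illustrative names, not part of the platform.
RELEASES = [
    (1, 15_650, "complete"),
    (2, 14_800, "complete"),
    (3, 6_500, "pending"),
    (4, 11_800, "pending"),
    (5, 16_250, "pending"),
]


def totals_by_status(releases):
    """Sum the release amounts for each status."""
    totals = {}
    for _number, amount, status in releases:
        totals[status] = totals.get(status, 0) + amount
    return totals


totals = totals_by_status(RELEASES)
transferred = totals["complete"]   # matches "Funding Transferred"
pending = totals["pending"]
max_funding = transferred + pending  # matches "Max Funding Amount"
print(transferred, pending, max_funding)
```

The two completed releases add up to the $30,450 USD reported as transferred, and all five releases add up to the $65,000 USD maximum.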

Status Reports

Sep. 7, 2024

Status
🙂 Pretty good
Summary

  • Published a dataset with sentiments.
  • Developed a basic model trained with a limited dataset.
  • Developed the API that exposes the functionality.
  • Finished all the automations (CI/CD) required to publish a model and make it available via the API whenever we change the dataset version, implement model improvements, or change the API.

May. 5, 2024

Status
🙂 Pretty good
Summary

We have completed the application for categorization. We are now moving on to create the API that will serve the model in production and to train the model itself. We are still building the MLOps workflow, but we have a clear objective that we can present in the next session. We are working with MS to resolve an issue we found when deploying our testing application, and we expect to make more progress soon.

Video Updates

Sentiment Analysis – Celeste AI

9 February 2024

Sentiment Analysis – Celeste AI

30 January 2024

Project AI Services

No Service Available

Overview

Celeste AI, based in São Paulo, Brazil, is a pioneering startup specializing in advanced transcription services via speech-to-text recognition. The company aims to narrow the language gap in AI model development for Latin-descendant languages, particularly in the Latin American region. Since its beta launch in July, Celeste AI has onboarded over 1,000 users and achieved initial sales success. They are currently part of the Technology Park - São José dos Campos (Nexus) accelerator program, backed by Brazil's Ministry of Science and Technology. Their mission is to develop sentiment analysis tools tailored to Portuguese, addressing the underserved AI market in non-English languages. Celeste AI seeks funding to create a Sentiment Analysis model, targeting emotional and intent analysis in Portuguese audio and video content, promoting linguistic inclusivity and decentralization in AI.

Their marketing strategy includes targeting businesses, developers, and professionals in HR, sales, market research, and media in Portuguese-speaking countries through various platforms and activities. The project's milestones and cost breakdown involve the development and refinement of the Sentiment Analysis model in three phases. Celeste AI acknowledges potential risks such as bias in training data and overfitting and provides mitigation strategies. They also plan to contribute voluntary revenue back to the SingularityNET platform and have future plans to open source their APIs to support AI technology expansion in multiple languages.

Proposal Description

AI services (New or Existing)

Company Name

Celeste AI

Service Details

Celeste AI is a groundbreaking startup that offers advanced transcription services based on speech-to-text recognition. Solutions in our pipeline include AI-powered tools and audio analytics, transforming voice files into data-rich sources.

Celeste seeks to bridge the language gap in AI model development for countries with Latin-descended languages, most of which are in the Latin America region.

Since our beta launch on July 1, we've onboarded over 1,000 users and have successfully completed our first sales. We are being accelerated by the prestigious Technology Park - São José dos Campos (Nexus), an arm of Brazil's Ministry of Science and Technology.

We believe that AI solutions are driving a wealth transfer in the world, with Latin-descended countries lagging behind in the AI field because English is the dominant language for training models.

With our models, we seek to automatically transcribe audio and video files, provide detailed analytics, and uncover trends and patterns in spoken content. Celeste is seeking grant funding to support the development and training of one of these models, which will bridge the language gap for AI-driven transcription and analytics in Portuguese.

We believe the development of our models is strongly aligned with SingularityNET's purpose of driving the development of benevolent, democratic and inclusive Artificial General Intelligence.

Problem Description

The global AI landscape has seen unprecedented growth over the last decade. However, this growth has been disproportionately oriented towards dominant languages like English. Latin-descended languages, including Portuguese, have largely been underserved in terms of AI solutions tailored to them. As a result, businesses, researchers, and developers who rely on accurate sentiment analysis for Portuguese content often find themselves lacking the tools they need. This gap in the market not only hinders the potential of AI to serve diverse linguistic groups but also curtails the holistic growth and decentralization of AI technologies.

Solution Description

Sentiment Analysis model

Our Sentiment Analysis model is tailored for Portuguese, ensuring accuracy and cultural relevance. It can empower developers and aid researchers in designing projects specifically for the Portuguese-speaking population, thus addressing a significant market void. This tool is intended to promote linguistic inclusivity, drive holistic growth, and decentralize the AI landscape.

We envision our sentiment analysis API being used by businesses and developers to improve sales and human resources processes and to analyze user feedback, market research, and social media, among other needs that will emerge as the technology evolves.

Sentiment Analysis API Proposal:

  • Develop an AI model for sentiment analysis in audio/video files (not in text). The model analyzes the tone, amplitude, noise and interruptions of the speech, leveraging Automatic Speech Recognition (ASR) capabilities to provide a multi-label categorization of the speech.

  • Achieve an accuracy rate (F1-Score) above 75% in identifying speech categories.

  • Deploy models as an API in the SingularityNET ecosystem.
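
To make the 75% F1-Score target concrete, here is a minimal sketch of how an F1-Score can be computed for multi-label output. The proposal does not say whether the target refers to a macro or micro average, so both are shown. The label names and sample clips are invented for illustration and are not project data.

```python
# Minimal multi-label F1 sketch. Each clip has a set of true labels and a set
# of predicted labels. All labels and samples below are made up for illustration.

def label_counts(y_true, y_pred, label):
    """Count true positives, false positives and false negatives for one label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred, labels):
    """Average of the per-label F1 scores (every label weighs the same)."""
    return sum(f1(*label_counts(y_true, y_pred, lb)) for lb in labels) / len(labels)


def micro_f1(y_true, y_pred, labels):
    """F1 over the pooled counts of all labels (frequent labels weigh more)."""
    counts = [label_counts(y_true, y_pred, lb) for lb in labels]
    tp, fp, fn = (sum(column) for column in zip(*counts))
    return f1(tp, fp, fn)


labels = ["positive", "negative", "neutral", "anger"]
y_true = [{"positive"}, {"negative", "anger"}, {"neutral"}, {"negative"}]
y_pred = [{"positive"}, {"negative"}, {"neutral", "anger"}, {"negative"}]

print(macro_f1(y_true, y_pred, labels))
print(micro_f1(y_true, y_pred, labels))
```

In this toy data the "anger" label is missed once and predicted wrongly once, which pulls the macro average down more than the micro average. That difference is why it helps to state which average a target refers to.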

Our goal is to dive deep into different layers:

  • Polarity-based Analysis: Understand if the sentiment is positive, negative, or neutral.
  • Emotion Detection: Explore specific emotions like joy, anger, sadness, happiness, optimism, and more.
  • Aspect-Based Analysis: Determine sentiments related to particular aspects or features of a product or service.
  • Intent Detection: Gauge user intentions such as purchase intent or churn signals.
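
One way to picture the four layers above is as a single result record per analyzed audio or video file. The sketch below is purely hypothetical: the `SentimentResult` class and its field names are assumptions made for illustration and do not describe the actual Celeste AI API.

```python
from dataclasses import asdict, dataclass, field

# Hypothetical result record; class and field names are assumptions,
# not the real API schema.
VALID_POLARITIES = {"positive", "negative", "neutral"}


@dataclass
class SentimentResult:
    polarity: str                                  # polarity-based analysis
    emotions: dict = field(default_factory=dict)   # emotion name -> score in [0, 1]
    aspects: dict = field(default_factory=dict)    # aspect name -> polarity
    intents: list = field(default_factory=list)    # e.g. "purchase", "churn"

    def __post_init__(self):
        # Reject polarity values outside the three listed in the proposal.
        if self.polarity not in VALID_POLARITIES:
            raise ValueError(f"unknown polarity: {self.polarity!r}")


result = SentimentResult(
    polarity="negative",
    emotions={"anger": 0.8, "sadness": 0.1},
    aspects={"delivery": "negative", "price": "neutral"},
    intents=["churn"],
)
print(asdict(result))
```

Keeping the layers as separate fields lets a consumer use only the polarity at first and adopt the emotion, aspect and intent layers as later milestones add them.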

Milestone & Budget

Milestone 1:

Goal: Early development of the sentiment analysis tool and initial categorization.

Deliverables:

1. Functional API interfaces equipped with test data.

2. Curated dataset for initial testing.

Budget Breakdown:

- API development, model development, data collection & processing: $7,000

- Computational costs for model training: $3,000

- Provisioning of machines: $2,650

- Marketing Campaigns (3 months): $3,000

Estimated total hours: ~400 hours 

Total Milestone 1 Cost: $15,650

Milestone 2:

Goal: Onboarding of model capable of binary categorization on the SNET marketplace.

Deliverables: Deployment of the model on the SNET marketplace.

Estimated total hours: ~200 hours 

Total Milestone 2 Cost: $6,500 (10% of the proposal)

Milestone 3:

Goal: Model enrichment, incorporation of additional sentiments, and dataset expansion.

Deliverables:

1. Enlarged dataset with more category-specific data.

2. Retrained model accommodating additional sentiments.

3. Integration of the Boredom and Anger sentiments into the model.

4. Analysis and introduction of the next set of emotions.

Budget Breakdown:

- Data enrichment & augmentation: $3,000

- Model retraining with new sentiments: $4,000

- Computational costs for model refinement: $3,000

- Provisioning of machines: $1,800

- Marketing Campaigns (4 months): $3,000

Estimated total hours: ~500

Total Milestone 3 Cost: $14,800

Milestone 4:

Goal: Model finalization, ensuring comprehensive coverage of sentiments.

Deliverables:

1. Model training and integration for the remaining 2 or 3 sentiments.

2. A detailed performance report outlining the model's efficiency across all sentiments.

Budget Breakdown:

- Model training for final sentiments: $4,500

- Computational costs for final model training: $2,500

- Provisioning of machines: $1,800

- Marketing Campaigns (3 months): $3,000

Estimated total hours: ~300

Total Milestone 4 Cost: $11,800

Milestone 5: Hosting/API calls

Goal: Support costs related to cloud hosting and user trials.

Estimated cost: $16,250 (25% of the proposal)

Total for All Milestones: $65,000

*It may be necessary to hire a developer and a linguist consultant in order to achieve these milestones.
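
The budget figures above can be verified with a short consistency check. In the sketch below the amounts are copied from the milestone breakdowns, while the `MILESTONES` table and `check_budget` helper are hypothetical names used only for this illustration. Milestones 2 and 5 have no line items in the proposal, so only their totals are checked.

```python
# Amounts in USD, copied from the milestone breakdowns above.
# MILESTONES and check_budget are illustrative names only.
MILESTONES = {
    1: {"items": [7_000, 3_000, 2_650, 3_000], "total": 15_650},
    2: {"items": [], "total": 6_500},
    3: {"items": [3_000, 4_000, 3_000, 1_800, 3_000], "total": 14_800},
    4: {"items": [4_500, 2_500, 1_800, 3_000], "total": 11_800},
    5: {"items": [], "total": 16_250},
}


def check_budget(milestones, grand_total):
    """Verify line items and the grand total; return each milestone's share in percent."""
    for number, milestone in milestones.items():
        items = milestone["items"]
        if items and sum(items) != milestone["total"]:
            raise ValueError(f"milestone {number}: line items do not match total")
    if sum(m["total"] for m in milestones.values()) != grand_total:
        raise ValueError("milestone totals do not match the grand total")
    return {n: round(100 * m["total"] / grand_total, 1) for n, m in milestones.items()}


shares = check_budget(MILESTONES, 65_000)
print(shares)
```

The check confirms the stated shares: Milestone 2 is 10% and Milestone 5 is 25% of the $65,000 proposal.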

Revenue Sharing

We will onboard the service on the platform. If the service's revenue crosses the threshold of $1,000 per month, 5% of the additional revenue will be fed back into the SNET/DeepFunding wallets. This condition will remain valid for 5 years after first onboarding the service and will be applicable to this service or any subsequent iteration of this service on the platform.
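
As a worked example of the rule above, the sketch below computes the monthly amount fed back to the wallets. It assumes that "additional revenue" means only the portion of monthly revenue above the $1,000 threshold. The function name and the sample revenue figures are hypothetical.

```python
from decimal import Decimal

THRESHOLD = Decimal("1000")   # monthly revenue threshold in USD
SHARE_RATE = Decimal("0.05")  # 5% of the revenue above the threshold


def monthly_revenue_share(revenue):
    """Return the amount owed for one month.

    Assumption: only revenue above the threshold counts as "additional revenue".
    """
    additional = max(Decimal("0"), Decimal(revenue) - THRESHOLD)
    return additional * SHARE_RATE


# Hypothetical months: below, at, and above the threshold.
for revenue in ("800", "1000", "1500"):
    print(revenue, monthly_revenue_share(revenue))
```

Under this reading, a month with $1,500 in revenue yields $25 for the wallets (5% of the $500 above the threshold), and months at or below $1,000 yield nothing.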

Marketing & Competition

Target Audience:

  • Business: Enterprises seeking customer feedback analysis.

  • Developers: Portuguese-speaking developers focused on AI and sentiment analysis.

  • Business Professionals: Individuals in HR, sales, market research, and media working in Portuguese-speaking countries.

Platforms: Google Ads, LinkedIn Ads, Meta Ads, Twitter & GitHub. 

Activities: Blog posts, community webinars, presence on social media, workshops and trials with universities and tech communities. Social media marketing campaigns to generate leads and sign up trial users.

Related Links

www.celeste-ai.com

Long Description

Company Name

Celeste AI

São Paulo, Brazil

Funding Amount

$65,000

Risk and Mitigation

1. Risk: Bias in the Training Data

If the data used to train the sentiment analysis model is biased, the model's predictions could be skewed and not representative of diverse real-world situations.

Mitigation:

  - Use a diverse and comprehensive dataset that captures various sentiments across different contexts and demographics.

  - Employ techniques like data augmentation to create synthetic training examples and diversify the training set.

  - Continuously review and audit model predictions and refine the training dataset accordingly.

2. Risk: Overfitting

The model might perform exceptionally well on the training data but fail to generalize on unseen data.

Mitigation:

  - Utilize techniques like cross-validation to make sure the model's performance is consistent across different subsets of the data.

  - Employ regularization techniques.

  - Use dropout in neural network architectures.
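
The cross-validation idea in the mitigation list can be sketched as a plain k-fold split over sample indices. This is an illustration only, written with the standard library. The proposal does not specify the splitting scheme, fold count or tooling, and `k_fold_indices` is a hypothetical helper.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k folds; return a list of (train, validation) pairs.

    Every sample lands in exactly one validation fold, so the model is always
    scored on data it did not see during training for that fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle reproducibly
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
for train, validation in splits:
    print(len(train), "training samples,", len(validation), "validation samples")
```

A large gap between training and validation scores across the folds is the usual sign of overfitting. For audio data it may also be worth keeping all clips from one speaker in the same fold, which this simple sketch does not do.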

3. Risk: Misinterpretation of Complex Sentiments

Sentiment is multifaceted and can be subtle; AI models may not be able to capture sarcasm, irony, or mixed emotions.

Mitigation:

  - Curate a dataset specifically targeting complex sentiments for training.

  - Use ensemble models or hybrid models that combine rule-based and machine learning methods.

  - Continuously gather feedback from real-world deployments and refine the model.

4. Risk: High Computational Costs

Training deep learning models for sentiment analysis can be computationally expensive.

Mitigation:

  - Explore transfer learning, where a pre-trained model is fine-tuned for the specific sentiment analysis task, reducing the training time and resources.

  - Utilize cloud-based services that offer scalable computational resources.

5. Risk: Data Privacy Concerns

Using real-world data might breach privacy regulations if not handled properly.

Mitigation:

  - Ensure all training data is anonymized and stripped of personally identifiable information.

  - Use synthetic or simulated data where possible.

  - Comply with local data protection regulations, such as Brazil's LGPD.
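
As a small illustration of stripping personally identifiable information from transcript text, the sketch below masks e-mail addresses and CPF numbers (the Brazilian taxpayer ID, written as 000.000.000-00) with regular expressions. It is an assumption-laden example, not the project's anonymization pipeline: real anonymization would also need to handle names, phone numbers, addresses and the audio itself.

```python
import re

# Illustrative patterns only; they cover common formats, not every valid one.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
CPF_PATTERN = re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b")


def redact(text):
    """Replace e-mail addresses and formatted CPF numbers with placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    text = CPF_PATTERN.sub("[CPF]", text)
    return text


# Made-up sample sentence, not real data.
sample = "Contato: ana@example.com, CPF 123.456.789-09"
print(redact(sample))
```

Pattern-based masking like this is best treated as a first pass, since anything that does not match a pattern is left untouched.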

Open Source

Celeste is not open source yet, but we intend to open source our APIs as we develop our own AI models, supporting the expansion of AI technology in the Portuguese, Spanish, Italian, French and Romanian languages.

Our Team

Marcos Lima: co-founder, +15 years as a software engineer and scrum master, working with solutions architecture. Postgraduate studies in Big Data and Complex Data Mining at Universidade Estadual de Campinas.

Eder Rosa: co-founder, +15 years as a software engineer working with web and mobile solutions architecture. Computer Science degree. Node.js, React, React Native, TypeScript, AWS.

Artur Rosa: co-founder, +7 years as a full-stack developer working with web systems, REST APIs, and databases. Information Systems degree. Python, AngularJS, TypeScript, React, Node.js.

Ana Paula Pereira: co-founder, +13 years as a financial journalist and in corporate communications. Degrees in both journalism and economics. +8 years of experience in team coordination and project management.

Vinicius Abreu: full-stack developer, +3 years working with C#, Delphi, React, Node.js, REST, MySQL, PostgreSQL. Software engineering degree. Tech Resident in Data Science and Python at PUC-Campinas.

Thalita Peres: +12 years in business development, serving as sales manager for publishing companies.

Beatriz Gimenez: +10 years working with performance marketing, digital content, and marketing strategies.

Paula Rocha: +10 years in corporate communications, crisis management, media training, and PR strategies.

Isabella Velleda: +2 years working in startup operations. Provides comms and operations support. Master of Business and Administration at Universidade de São Paulo.

 

 

Q&A

Learn more about our proposal based on SNET's community concerns:

Data Privacy - Legal compliance is one of our top priorities, especially in Brazil, where consumer data privacy regulations have been implemented over the past years. We have a team of three lawyers with expertise in data privacy, LGPD, taxation, and blockchain technology assisting our operations. We believe our counsel team is able to provide adequate support for continuous compliance as AI regulations evolve. Additionally, all information will be used exclusively to set up the parameters for training the models. The training will be done in a restricted environment with data that is already anonymized (by the platform by default). Our platform was created with privacy at its core, so the chance of leaking any sensitive information is very low.

Potential Bias - Our platform is itself a data source, since users can review their transcription content. This means our learning process includes human reinforcement as a core component, which improves model training. We also plan to use journalistic data sources, such as data, audio, text and videos, as a complementary resource of pre-classified data. We have two journalists on the team to provide technical support with the process. Further, the model can be optimized through cross-validation with text-based sentiment analysis from transcriptions, which usually offers a broader level of accuracy on sentiment analysis.

Cultural Relevance - For this sentiment analysis model, we would work with a diverse and segmented database, including different "variants" of Portuguese, covering regional and cultural expressions and nuances. We believe community feedback is also crucial to keep the model culturally relevant and accurate, so we'd like to incorporate feedback from our users and the SNET community. We went live in beta just 8 weeks ago (with over 2k users so far), so we want to foster community feedback not only for refining our models, but for marketing engagement as well.

  • Total Milestones

    5

  • Total Budget

    $65,000 USD

  • Last Updated

    6 Oct 2024

Milestone 1

Status
😀 Completed
Description

Early development of the sentiment analysis tool and initial categorization.

Budget

$15,650 USD

Milestone 2

Status
😀 Completed
Description

Deliver the completed training flow and roll out the first version of the API for sentiment analysis.

Budget

$14,800 USD

Milestone 3

Status
🧐 In Progress
Description

Deploy the API into the SNET marketplace.

Budget

$6,500 USD

Milestone 4

Status
😐 Not Started
Description

Deliver the dataset and the fine-tuned API that can catch emotions like frustration and enthusiasm.

Budget

$11,800 USD

Milestone 5 - Hosting/API calls

Status
😐 Not Started
Description

Support costs related to cloud hosting and user trials.

Budget

$16,250 USD
