Universal Sentence Parser

Sierk Rosema
Project Owner

Funding Awarded

$22,000 USD

Expert Review: 0
Community: 0 (0)

Status

  • Overall Status

    🛠️ In Progress

  • Funding Transferred

    $5,000 USD

  • Max Funding Amount

    $22,000 USD

Funding Schedule

  • Milestone Release 1: $5,000 USD, Transfer Complete, date TBD
  • Milestone Release 2: $2,000 USD, Pending, date TBD
  • Milestone Release 3: $6,000 USD, Pending, date TBD
  • Milestone Release 4: $2,500 USD, Pending, date TBD
  • Milestone Release 5: $5,500 USD, Pending, date TBD
  • Milestone Release 6: $1,000 USD, Pending, date TBD

Status Reports

Mar. 13, 2023

Status
🧐 Fair, but could have been better
Summary

The team made some progress toward the first version of the algorithm.

Oct. 17, 2022

Status
🧐 Fair, but could have been better
Summary

Getting started.

Aug. 20, 2022

Status
🧐 Fair, but could have been better
Summary

We have discussed financial issues and I am now ready to sign the contract to get started.

Jan. 23, 2022

Status
🤔 We encountered some issues
Summary

I have been busy with other things, so progress on the project is slow.

Video Updates

Sierk Rosema – Kickoff presentation

15 March 2023

Project AI Services

No Service Available

Overview

I want to create a service that splits a text into sentences, based on a given dictionary, quickly and robustly. It can be applied to both natural and artificial languages.

Proposal Description

AI services (New or Existing)

Company Name

Lizuca Programming Solutions

Problem Description

One of the first steps in automatic text understanding is splitting the text into separate sentences. Many software libraries already do this, but they rarely offer many configuration options and are not optimised for large texts. They also often assume that a particular natural language is used, such as English or German.

Solution Description

I will develop a service that takes as input a set of letters (the alphabet), some language rules and a dictionary containing all words of that language. This gives the user a lot of flexibility and lets the service be applied to any language, natural or artificial. I will write the core of the program in C so that it can process large texts quickly.

Project Benefits for SNET AI platform

Being able to split a large text into sentences quickly is a first step in automatic language understanding and is useful in fields such as summarising, translating and classifying. Offering this service on SNET will help the platform grow and inspire new ideas and applications.

Competitive Landscape

There are many libraries that can split text into sentences. However, they are rarely written generally enough to deal with any language, and they are often not optimised for speed and size.

Marketing & Competition

I will create a website to document the project and post information about the service on different channels such as GitHub and Stack Exchange.

Long Description

To be a truly universal sentence splitter, the program should work independently of language or even alphabet. Therefore, the first step in using the service will be to define the language-specific settings by answering the following questions.

  • Which alphabet is used?
  • Which characters end a sentence?
  • Do sentences start with a capital letter?
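As an illustration only, the three answers could be captured in a small configuration object. The sketch below is written in Python for brevity (the proposal plans a C core), and every name in it (LanguageRules, is_valid_word, may_start_sentence, may_end_sentence, ENGLISH, BINARY) is hypothetical and not part of the proposed service.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageRules:
    """Hypothetical container for the three language-specific answers."""
    alphabet: frozenset        # which characters may appear in a word
    terminators: frozenset     # which characters end a sentence
    capitalized_start: bool    # do sentences start with a capital letter?

    def is_valid_word(self, word: str) -> bool:
        # A word is valid if it is non-empty and uses only alphabet characters.
        return len(word) > 0 and all(ch in self.alphabet for ch in word)

    def may_start_sentence(self, word: str) -> bool:
        # If capitals are not required, any word may start a sentence.
        return (not self.capitalized_start) or word[:1].isupper()

    def may_end_sentence(self, token: str) -> bool:
        # A token may end a sentence if its last character is a terminator.
        return token[-1:] in self.terminators


# Assumed example settings for a natural language ...
ENGLISH = LanguageRules(
    alphabet=frozenset(string.ascii_letters + "'"),
    terminators=frozenset(".!?"),
    capitalized_start=True,
)

# ... and for a made-up artificial language over the characters 0 and 1.
BINARY = LanguageRules(
    alphabet=frozenset("01"),
    terminators=frozenset(";"),
    capitalized_start=False,
)
```

With these assumed settings, ENGLISH.may_start_sentence("the") is False, while BINARY.may_start_sentence("0110") is True because the artificial language has no capital letters.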

Next, a dictionary must be defined. The service will offer standard dictionaries for common languages such as English, but the user is free to define a subset of such a dictionary or even a completely new dictionary. A few remarks:

  1. The dictionary may contain words that end with a character that is used to end a sentence. Consider, for example, abbreviations in the English language such as ‘Mr.’ or ‘Mt.’.
  2. The dictionary may contain idioms such as ‘a piece of cake’ or ‘once in a blue moon’. In other words, the dictionary may contain entries that comprise multiple words, where each individual word may or may not be an entry.
  3. Besides the dictionary defining the words of the language, a second dictionary may be defined that contains a list of names (starting with a capital letter).
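To make the three remarks concrete, here is a minimal Python sketch of a dictionary lookup that accepts abbreviations, multi-word idioms and names. It is an assumption about how such a lookup might work, not the project's design: the sets WORDS, IDIOMS and NAMES, the function entries_at, and the choice to compare ordinary words case-insensitively but names exactly are all hypothetical.

```python
# Hypothetical toy dictionaries; a real dictionary would be far larger.
WORDS = {"a", "piece", "of", "cake", "it", "was", "mr."}   # "mr." ends with a terminator
IDIOMS = {("a", "piece", "of", "cake")}                     # multi-word entries
NAMES = {"Smith"}                                           # must start with a capital

MAX_IDIOM_LEN = max(len(idiom) for idiom in IDIOMS)


def entries_at(tokens, i):
    """Return the lengths (in tokens) of all dictionary entries starting at tokens[i]."""
    lengths = []
    token = tokens[i]
    # Single-token entry: an ordinary word (case-insensitive) or a name (exact match).
    if token.lower() in WORDS or token in NAMES:
        lengths.append(1)
    # Multi-token entries: try every idiom length that still fits in the text.
    longest = min(MAX_IDIOM_LEN, len(tokens) - i)
    for n in range(2, longest + 1):
        candidate = tuple(t.lower() for t in tokens[i:i + n])
        if candidate in IDIOMS:
            lengths.append(n)
    return lengths


tokens = "It was a piece of cake".split()
print(entries_at(tokens, 2))             # [1, 4]: the word 'a' and the idiom
print(entries_at(["Mr.", "Smith"], 0))   # [1]: the abbreviation 'Mr.'
print(entries_at(["Mr.", "Smith"], 1))   # [1]: the name 'Smith'
```

Returning every matching length, rather than only the longest, keeps both readings available when an idiom and its individual words are all entries.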

Once the properties of the language have been defined as described above, the program will use algorithms based on graph theory to determine all the ways in which a given text can be split into sentences.

Note that there may be more than one way to correctly split a text into sentences. This program will output ALL the possible solutions.
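The proposal does not specify which graph algorithm will be used. One possible reading, sketched below in Python, treats every token boundary as a node and every candidate sentence as an edge; each path from the first node to the last is then one valid split. The dictionary contents, the terminator set, the capital-letter rule and all function names (is_entry, can_end_sentence, build_graph, all_splits) are assumptions made for illustration.

```python
# Hypothetical toy dictionary: "no." is listed as an abbreviation (for "number").
DICTIONARY = {"he", "said", "no", "no.", "five", "is", "enough"}
TERMINATORS = {".", "!", "?"}


def is_entry(token):
    """True if the token, compared case-insensitively, is a dictionary entry."""
    return token.lower() in DICTIONARY


def can_end_sentence(token):
    """True if the token ends with a terminator and is a word or an abbreviation."""
    if token[-1] not in TERMINATORS:
        return False
    return token[:-1].lower() in DICTIONARY or is_entry(token)


def build_graph(tokens):
    """Nodes are token boundaries 0..n; an edge i -> j means tokens[i:j] is a sentence."""
    n = len(tokens)
    edges = {i: [] for i in range(n + 1)}
    for i in range(n):
        if not tokens[i][0].isupper():      # assumed rule: sentences start with a capital
            continue
        for j in range(i + 1, n + 1):
            last = tokens[j - 1]
            if can_end_sentence(last):
                edges[i].append(j)
            if not is_entry(last):          # not usable as an interior token: stop extending
                break
    return edges


def all_splits(text):
    """Enumerate every path from node 0 to node n, i.e. every valid split."""
    tokens = text.split()
    n = len(tokens)
    edges = build_graph(tokens)
    results = []

    def walk(node, path):
        if node == n:
            results.append(list(path))
            return
        for nxt in edges[node]:
            path.append(" ".join(tokens[node:nxt]))
            walk(nxt, path)
            path.pop()

    walk(0, [])
    return results


for split in all_splits("He said no. Five is enough."):
    print(split)
# ['He said no.', 'Five is enough.']
# ['He said no. Five is enough.']
```

In this toy example the text has two valid splits because "no." can be read either as the word "no" followed by a full stop or as the abbreviation. The number of paths can grow exponentially with the length of the text, so a production implementation that returns all solutions would likely need a compact representation of the result.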

  • Total Milestones

    6

  • Total Budget

    $22,000 USD

  • Last Updated

    11 Mar 2024

Milestone 1 - Project start

Status
😀 Completed
Description

Signed Contract

Deliverables

Budget

$5,000 USD

Link URL

Milestone 2 - Basic algorithm

Status
🧐 In Progress
Description

Source code

Deliverables

Budget

$2,000 USD

Link URL

Milestone 3 - Refined algorithm

Status
😀 Completed
Description

Source code

Deliverables

Budget

$6,000 USD

Link URL

Milestone 4 - Build web service

Status
😐 Not Started
Description

Source code

Deliverables

Budget

$2,500 USD

Link URL

Milestone 5 - Hosting/API calls

Status
😐 Not Started
Description

Running web service

Deliverables

Budget

$5,500 USD

Link URL

Milestone 6 - Deployment on SNet

Status
😐 Not Started
Description

Service in SNet marketplace

Deliverables

Budget

$1,000 USD

Link URL

Reviews & Rating

Summary

Overall Community: 0 (from 0 reviews)
  • 5: 0
  • 4: 0
  • 3: 0
  • 2: 0
  • 1: 0

Feasibility: 0 (from 0 reviews)

Viability: 0 (from 0 reviews)

Desirability: 0 (from 0 reviews)

Usefulness: 0 (from 0 reviews)