Transformers Architecture via PH interpolators

Project Owner: kounchev

Expert Rating

n/a
  • Proposal for BGI Nexus 1
  • Funding Request: $50,000 USD
  • Funding Pool: Beneficial AI Solutions
  • Total Milestones: 4

Overview

This project focuses on the core algorithm of LLMs. We reconsider the Transformer part of the LLM architecture by replacing the standard FFNs (feed-forward networks) with a different type of multivariate interpolator called the PHI (polyharmonic interpolator). This would greatly increase the flexibility of the Transformer architecture, the speed and capacity of the training and embedding phases, and the efficiency of inference. PHIs leverage the well-established multivariate theory of interpolation by solutions, or piecewise solutions, of elliptic equations. A very big advantage of the PHI paradigm is the interpretability of its parameters.
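
For concreteness, the standard polyharmonic setting that the overview refers to can be written as a short LaTeX sketch; the interfaces Γ_j and boundary data f_j follow the proposal's own description, though the authors' exact construction may differ:

    \[
      \Delta^{p} u(x) = 0, \qquad x \in \Omega \setminus \bigcup_{j=1}^{J} \Gamma_j,
      \qquad u\big|_{\Gamma_j} = f_j, \quad j = 1, \dots, J.
    \]
    % u is polyharmonic of order p away from the interface surfaces \Gamma_j;
    % the boundary data f_j (expanded, e.g., in spherical harmonics when
    % \Gamma_j is a hypersphere) carry the learnable parameters.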

Proposal Description

How Our Project Will Contribute To The Growth Of The Decentralized AI Platform

The BGI mission is to empower anyone to obtain resources for developing responsible AI and AGI, and our research project directly serves this purpose.

Our Team

  1. Ognyan Kounchev
  2. Georgi Simeonov 

AI services (New or Existing)

Extraordinarily enhanced LLM architecture

Type

New AI service

Purpose

We provide an enormous improvement of LLMs in terms of the speed and computational efficiency of the Learning/Training phase, as well as the interpretability of the parameters.

AI inputs

Input: LLM with standard Transformer architecture.

AI outputs

LLM with innovative PHI-Transformer architecture.

Company Name (if applicable)

Institute of Mathematics and Informatics, Bulgarian Academy of Sciences

The core problem we are aiming to solve

The core problem we are treating is the unsatisfactory state of present-day LLMs, which consume enormous computational power and time.

Our specific solution to this problem

The crux of our approach is a reconsideration of the Transformer architecture in present-day LLMs. We replace the usual FFNs in the Transformers with new multidimensional interpolators that have been studied in interpolation theory over the last decades: the PHIs (polyharmonic interpolators), whose theory has been developed in the academic research of the present proposers. These interpolators are solutions (or piecewise solutions) of elliptic differential equations. The learnable parameters of a PHI are the boundary values of the solutions of the elliptic equations on certain interface surfaces (in particular, surfaces with spherical symmetry, or with translation symmetry such as hyperplanes, in Euclidean space); the interface surfaces and their number are learnable parameters as well. The big advantage of the PHI paradigm over standard MLPs (multi-layer perceptrons) is the interpretability of its parameters. The universality of approximation by PHIs has been justified theoretically. Moreover, the interpolation properties of solutions of elliptic equations guarantee "stability of the parameters" of the PHI: we obtain a good rate of approximation to a smooth data function, which depends on the number of interface surfaces, the order of the elliptic equations, and the rate of approximation of data on a hypersphere by spherical harmonics, or of data on a hyperplane by exponential functions.
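
The proposal contains no code, so the following is only a minimal PyTorch sketch of a PHI-style drop-in replacement for the FFN sublayer. It uses the classical polyharmonic spline kernel phi(r) = r^3, with learnable centers standing in for the interface surfaces and learnable coefficients standing in for the boundary values; the class name PolyharmonicLayer and all sizes are our assumptions, not the authors' design.

    import torch
    import torch.nn as nn

    class PolyharmonicLayer(nn.Module):
        """Hypothetical PHI-style sublayer: a polyharmonic-spline expansion
        with learnable centers (stand-ins for the interface surfaces) and
        learnable coefficients (stand-ins for the boundary values)."""

        def __init__(self, d_model: int, n_centers: int = 64):
            super().__init__()
            self.centers = nn.Parameter(torch.randn(n_centers, d_model))
            self.coeffs = nn.Parameter(0.02 * torch.randn(n_centers, d_model))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq, d_model)
            c = self.centers.unsqueeze(0).expand(x.size(0), -1, -1)
            r = torch.cdist(x, c)          # (batch, seq, n_centers)
            return r.pow(3) @ self.coeffs  # polyharmonic kernel r**3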

Existing resources

  1. HPC: the supercomputers Avitohol and Hemus
  2. Office space

Open Source Licensing

LGPL (GNU Lesser General Public License)

Online event

Proposal Video

Placeholder for Spotlight Day pitch presentations. Videos will be added by the DF team when available.

  • Total Milestones

    4

  • Total Budget

    $50,000 USD

  • Last Updated

    20 Feb 2025

Milestone 1 - Design and setup of the PHI structure

Description

We construct the PHI interpolators: first for the case of spherically symmetric interfaces and boundaries; second for the case of translation-invariant hyperplane interfaces and boundaries. We describe the basis-function models: spherical harmonics and exponential functions.

Deliverables

Algorithms for constructing the basis of functions on the interface surfaces in the case of hyperspheres, together with the parameters of the PHI in the spherically symmetric case; an analogous construction of the basis functions on hyperplanes and the related PHI parameters.

Budget

$14,000 USD

Success Criterion

Successful tests for special interfaces and data.
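
A hypothetical sketch of one Milestone 1 deliverable in the spherically symmetric case: evaluating a spherical-harmonic basis on an interface sphere. We use SciPy's ordinary 3-D spherical harmonics as a stand-in for the proposal's hypersphere basis; the function name spherical_basis is ours.

    import numpy as np
    from scipy.special import sph_harm

    def spherical_basis(points: np.ndarray, degree: int) -> np.ndarray:
        """Evaluate all Y_n^m, n <= degree, at unit vectors `points` (N, 3)."""
        x, y, z = points.T
        theta = np.arctan2(y, x) % (2 * np.pi)   # azimuth in [0, 2*pi)
        phi = np.arccos(np.clip(z, -1.0, 1.0))   # polar angle in [0, pi]
        cols = [sph_harm(m, n, theta, phi)
                for n in range(degree + 1) for m in range(-n, n + 1)]
        return np.stack(cols, axis=1)            # (N, (degree + 1)**2)

    # Boundary values on one interface sphere would then be fitted by
    # solving B c = f in the least-squares sense, e.g.:
    # c, *_ = np.linalg.lstsq(spherical_basis(samples, 8), f, rcond=None)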

Milestone 2 - Comparison between FFN and PHI

Description

We will provide research demonstrating the advantage of the PHI paradigm over the usual NN paradigm, and will carry out numerous experiments to justify it.

Deliverables

Results of experiments demonstrating the advantage of PHI over standard NNs for interpolation purposes.

Budget

$14,000 USD

Success Criterion

The experiments must show that, compared with the usual NNs, PHI provides a major improvement in speed, computational cost, and interpretability of the parameters.
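
A hypothetical sketch of one Milestone 2 experiment: fit the same smooth target with a small MLP and with a polyharmonic (thin-plate spline) interpolant, then compare held-out error. The target function, sample sizes, and network width here are our own choices, not the authors' protocol.

    import numpy as np
    from scipy.interpolate import RBFInterpolator
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    target = lambda p: np.sin(3 * p[:, 0]) * np.cos(2 * p[:, 1])

    X_train = rng.uniform(-1, 1, (400, 2))
    X_test = rng.uniform(-1, 1, (1000, 2))
    y_train, y_test = target(X_train), target(X_test)

    # 'thin_plate_spline' is the classical polyharmonic kernel r**2 log r.
    phi = RBFInterpolator(X_train, y_train, kernel='thin_plate_spline')
    mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                       random_state=0).fit(X_train, y_train)

    rmse = lambda y, p: float(np.sqrt(np.mean((y - p) ** 2)))
    print('PHI (thin-plate) RMSE:', rmse(y_test, phi(X_test)))
    print('MLP RMSE:             ', rmse(y_test, mlp.predict(X_test)))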

Milestone 3 - Integration into a Transformer Architecture

Description

Integration of the PHI machinery into the Transformer architecture. We provide a smooth mechanism for the complete architecture.

Deliverables

PHI-Transformer architecture. Results of experiments on synthetic and real data.

Budget

$14,000 USD

Success Criterion

Delivery of a fully functioning PHI-Transformer architecture, showcasing an essential advantage of the PHI-Transformer over the usual Transformer in tests on synthetic and real data.
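
For Milestone 3, a hypothetical sketch of the integration step: a Transformer block whose FFN sublayer is swapped for the PolyharmonicLayer sketched above. The block layout (post-norm, single attention sublayer) and all sizes are our assumptions.

    import torch.nn as nn

    class PHITransformerBlock(nn.Module):
        def __init__(self, d_model: int = 256, n_heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.phi = PolyharmonicLayer(d_model)  # replaces the usual FFN
            self.ln1 = nn.LayerNorm(d_model)
            self.ln2 = nn.LayerNorm(d_model)

        def forward(self, x):
            a, _ = self.attn(x, x, x, need_weights=False)
            x = self.ln1(x + a)               # attention sublayer + residual
            return self.ln2(x + self.phi(x))  # PHI sublayer + residual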

Milestone 4 - PHI-Transformer Learning/Training and Inference

Description

We use the PHI-Transformer architecture for Learning/Training on real data to show its advantage over the usual Transformer architecture, and we showcase some results for Inference.

Deliverables

Results of experiments with Learning/Training on real data. Experiments with Inference on real data.

Budget

$8,000 USD

Success Criterion

An essential advantage of the PHI-Transformer over the usual Transformer architecture in terms of speed and computational cost.
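
For Milestone 4, a hypothetical sketch of the kind of speed measurement the success criterion calls for: timing one optimizer step of the PHI block from the previous sketch against PyTorch's standard encoder layer, here on random tensors where real data would go.

    import time
    import torch

    def step_time(block, x, iters=50):
        """Average seconds per forward/backward/update step."""
        opt = torch.optim.Adam(block.parameters())
        t0 = time.perf_counter()
        for _ in range(iters):
            opt.zero_grad()
            block(x).pow(2).mean().backward()  # dummy loss
            opt.step()
        return (time.perf_counter() - t0) / iters

    x = torch.randn(8, 128, 256)  # (batch, seq, d_model)
    print('PHI block s/step:     ', step_time(PHITransformerBlock(), x))
    print('standard block s/step:',
          step_time(torch.nn.TransformerEncoderLayer(256, 4, batch_first=True), x))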
