Introducing Giotto

Our portable model and AI operating system.

Transitioning to Next-Generation Intelligent Models

Giotto transitions from monolithic "black box" models to a distributed AI model architecture where intelligence is orchestrated through a coordinated system of small models.

Model Architecture

Giotto is a portable, configurable model and AI operating system with advanced reasoning capabilities, combining open and proprietary weights, datasets, and tools to deliver high performance, adaptability, robustness, and multi-agency support.

Reasoning Modalities

Test-time training

Test-time training extends a model's capabilities by adapting the model on the fly to the specific query context.

We re-train the model in real time, improving accuracy and reducing hallucinations in the final response.
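As a rough illustration of the idea, and not a description of Giotto's actual system, the sketch below adapts a copy of a tiny linear model to the examples that arrive with a query before answering. The names `LinearModel`, `adapt_at_test_time`, `base_model` and `task_examples`, the learning rate and the step count are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Toy stand-in for a pretrained model: y = w * x + b."""
    w: float
    b: float

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def mse(model, examples):
    """Mean squared error of `model` on (x, y) pairs."""
    return sum((model.predict(x) - y) ** 2 for x, y in examples) / len(examples)


def adapt_at_test_time(base, examples, steps=500, lr=0.05):
    """Return a copy of `base` fine-tuned on the query's own examples.

    The base weights are left untouched, so every query starts
    from the same pretrained state.
    """
    model = LinearModel(base.w, base.b)
    n = len(examples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (model.predict(x) - y) * x for x, y in examples) / n
        grad_b = sum(2 * (model.predict(x) - y) for x, y in examples) / n
        model.w -= lr * grad_w
        model.b -= lr * grad_b
    return model


# Assumed "pretrained" weights and a query whose examples follow y = 2x + 1.
base_model = LinearModel(w=1.0, b=0.0)
task_examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

adapted = adapt_at_test_time(base_model, task_examples)
print(mse(base_model, task_examples), mse(adapted, task_examples))
```

The adapted copy fits the query's examples far better than the base model does, while the base weights stay available for the next query.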

Decoding

Decoding is the process by which a language model generates responses token by token, shaping both quality and diversity of outputs.

At Giotto, we move away from the standard paradigm of always picking the most probable next token. Instead, we explore token paths as a branching structure, dynamically expanding the most informative directions based on uncertainty.
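One simple way to picture uncertainty-driven branching is sketched below, assuming that uncertainty is measured by the entropy of the next-token distribution. This is an illustrative toy, not Giotto's decoder: `TOY_MODEL`, `next_token_probs`, `branch_decode`, the entropy threshold and the branch width are all hypothetical.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens)
# to a next-token probability distribution.
TOY_MODEL = {
    (): {"the": 0.9, "a": 0.1},
    ("the",): {"cat": 0.5, "dog": 0.5},
    ("the", "cat"): {"<eos>": 1.0},
    ("the", "dog"): {"<eos>": 1.0},
}


def next_token_probs(prefix):
    """Look up the toy distribution; unknown prefixes simply end."""
    return TOY_MODEL.get(prefix, {"<eos>": 1.0})


def entropy(probs):
    """Shannon entropy in bits of a token distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def branch_decode(max_len=5, entropy_threshold=0.8, branch_width=2):
    """Follow the top token when the model is confident,
    and branch into several tokens when it is uncertain."""
    finished = []
    frontier = [((), 0.0)]  # (prefix, cumulative log-probability)
    while frontier:
        prefix, logp = frontier.pop()
        if len(prefix) >= max_len or (prefix and prefix[-1] == "<eos>"):
            finished.append((prefix, logp))
            continue
        probs = next_token_probs(prefix)
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        # High entropy means an informative decision point: expand more paths.
        width = branch_width if entropy(probs) > entropy_threshold else 1
        for token, p in ranked[:width]:
            frontier.append((prefix + (token,), logp + math.log(p)))
    return sorted(finished, key=lambda c: c[1], reverse=True)


candidates = branch_decode()
for tokens, logp in candidates:
    print(" ".join(tokens), round(logp, 3))
```

In this toy run the first step is confident, so only "the" is followed, while the second step is a coin flip between "cat" and "dog", so both paths are expanded and returned as candidates.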

Scoring

Once multiple candidates are generated during decoding, scoring determines which output is most reliable without relying on external supervision.

Our proprietary scoring system relies on sophisticated ranking methods, based on the intrinsic markers that characterise output quality.
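The markers used by the proprietary scoring system are not described here. As a hedged sketch of what supervision-free ranking can look like, the example below combines two commonly used intrinsic signals, chosen purely as assumptions: the average token log-probability of a candidate and how often the same answer recurs across candidates. The function `score_candidates` and the sample data are hypothetical.

```python
import math
from collections import Counter


def score_candidates(candidates):
    """Rank answers using only signals from the candidates themselves.

    `candidates` is a list of (answer, token_logprobs) pairs.
    Returns a list of (answer, score), best first.
    """
    votes = Counter(answer for answer, _ in candidates)
    total = len(candidates)
    best = {}
    for answer, logprobs in candidates:
        # Length-normalised confidence of this particular generation.
        avg_logprob = sum(logprobs) / len(logprobs)
        # Self-consistency: share of candidates that reached the same answer.
        agreement = votes[answer] / total
        score = avg_logprob + math.log(agreement)
        best[answer] = max(score, best.get(answer, float("-inf")))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Made-up candidates for illustration only.
sample = [
    ("42", [-0.1, -0.2]),
    ("42", [-0.3, -0.1]),
    ("41", [-0.05, -0.05]),
]
ranked = score_candidates(sample)
print(ranked[0][0])
```

Here the answer "41" has the single most confident generation, but "42" ranks first because two independent candidates agree on it.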

Our Work on the ARC-AGI Benchmark

The Abstraction and Reasoning Corpus (ARC) is a benchmark designed to measure progress toward Artificial General Intelligence (AGI). Created by François Chollet in 2019, ARC evaluates a system's ability to acquire new skills and key traits of general intelligence. Unlike typical AI benchmarks that test specific skills, ARC challenges AI to reason and abstract in ways that come naturally to humans but are exceptionally difficult for machines.

In 2025, we achieved unprecedented results on the ARC benchmark leveraging our proprietary approach and technology.

Ranking 2nd out of ~1.5k teams in the 2025 competition
200M-parameter model
$0.20 average inference cost per task

We evaluate Giotto across a set of widely used reasoning and knowledge benchmarks. To provide a fair comparison, we report results against major models that can run on a single GPU.

Across these benchmarks, Giotto achieves leading performance among single-GPU models, combining strong mathematical reasoning, scientific understanding, and broad academic knowledge. These results position Giotto as the smartest model available for single-GPU deployment, delivering frontier-level capabilities without requiring multi-GPU infrastructure.

AIME24

A benchmark based on the 2024 American Invitational Mathematics Examination, measuring advanced mathematical reasoning on olympiad-style problems across algebra, geometry, combinatorics, and number theory.

AIME25

A benchmark based on the 2025 American Invitational Mathematics Examination, evaluating a model’s ability to solve challenging multi-step math problems with exact integer answers.

AIME26

A benchmark based on the 2026 American Invitational Mathematics Examination, testing frontier models on difficult high-school competition math problems requiring structured symbolic reasoning.

GPQA Diamond

A graduate-level, “Google-proof” multiple-choice benchmark of expert-written questions in biology, physics, and chemistry, designed to test deep scientific reasoning rather than simple retrieval.

MATH-500

A 500-problem subset of the MATH dataset, covering competition-level mathematics across domains such as algebra, geometry, number theory, probability, and precalculus.

Humanity's Last Exam (HLE)

A frontier-level multimodal academic benchmark with expert-vetted questions across mathematics, science, humanities, and other disciplines, designed to assess broad expert-level reasoning.

Gemma 4 : Gemma 4 31B

NVIDIA Nemotron 3 : NVIDIA Nemotron 3 Nano 30B-A3B

Ministral 3 : Ministral 3 14B

GPT-OSS-120B : GPT OSS 120B, medium reasoning

DeepSeek R1 32B : DeepSeek-R1-Distill-Qwen-32B

 

Sources: official model cards reported by providers, or other sources from https://huggingface.co or artificialanalysis.ai.

Get in Touch with Giotto.ai

Have a question for us? Please fill out the form below, and our team will get back to you promptly.
