Yon: a new research language compiling to native code via MLIR and LLVM

anthem87 · June 5, 2026, 8:00am

Hi all. I’d like to present Yon, a research programming language I built over the past weeks. Its type system is grounded in topos theory (intuitionistic core, elements of HoTT, algebraic effects), and the whole point of the project was to take that kind of semantics all the way down to native code instead of stopping at an interpreter.

The pipeline: an OCaml frontend type-checks .yon source and emits a custom out-of-tree MLIR dialect (“topos”), whose passes lower through the standard dialects to the LLVM dialect and out to a native executable. One driver command runs the chain, and you can stop at any intermediate stage to inspect what a construct becomes along the way. Built and tested against LLVM/MLIR 18, on Linux x86-64 and macOS Apple Silicon.

One unusual choice worth mentioning: the runtime heap is content-addressed. Every value is canonicalized to a point of the Leech lattice (via the Conway group Co0), so identical content always gets the same handle. Equality is one machine comparison regardless of value size, deduplication is global and automatic, and there is no GC, since cells are immutable and content-addressed.
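
To make the handle idea concrete, here is a minimal hash-consing sketch in C++. It is not Yon's runtime: a `std::map` stands in for the Leech-lattice canonicalization, and `Interner`, `Handle`, and `Value` are hypothetical names chosen for illustration. It only shows the property described above, that identical content maps to one handle, so later equality checks compare two integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical names for illustration; not part of Yon's runtime.
using Handle = std::uint32_t;
using Value = std::vector<std::uint64_t>;

class Interner {
public:
    // Return the handle for `value`, allocating a new cell only if this
    // exact content has not been seen before.
    Handle intern(const Value &value) {
        auto it = index_.find(value);
        if (it != index_.end()) return it->second;  // deduplicated
        Handle h = static_cast<Handle>(cells_.size());
        cells_.push_back(value);                    // cells are never mutated
        index_.emplace(value, h);
        return h;
    }

    // Look up the content behind a handle.
    const Value &get(Handle h) const {
        assert(h < cells_.size());
        return cells_[h];
    }

    std::size_t cell_count() const { return cells_.size(); }

private:
    std::map<Value, Handle> index_;  // content -> handle
    std::vector<Value> cells_;       // handle -> content
};
```

In this sketch the lookup inside `intern` still has to inspect the content once; the saving comes afterwards, when every comparison between two interned values is a comparison of two `Handle` integers.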

Full disclosure: I’m a developer, not a mathematician or a compiler veteran, and I used AI as a research and coding aid, with a critical eye. What I stand behind is the artifact: a regression suite of 112 examples plus a multi-process scenario suite, with exit codes identical on both platforms. In the book on the site, every snippet was compiled and run before being written down. Known limits are documented in the repo rather than hidden.

Site and book: https://yon-lang.org
Repo: yon-language/yon on GitHub (dialect under mlir/)

It’s still very much a work in progress. Feedback and critique are welcome.

programmerjake · June 5, 2026, 5:41pm

you will probably want to add GC since you can easily have enough different datastructures to fill memory, e.g. counting from 0 to 2^40 and putting that inside enough of a wrapper that it isn’t optimized-out.
something equivalent to:

#include <cstdint>
#include <vector>

/// move T to the heap and return the pointer shared by all `T`s with the same value
template<typename T>
const T *intern(T &&value);

void use_all_memory() {
    // assuming you have less than ~48TiB of ram, this will eventually run out unless you have GC
    for(uint64_t i = 0; i < (1ULL << 40); i++) {
        auto *ignore = intern(std::vector<uint64_t>{i, i, i});
    }
}

anthem87 · June 5, 2026, 9:05pm

Hi @programmerjake,

Thanks for weighing in. You are 100% correct: if a single execution thread generates 2^40 distinct canonical contents, no deduplication strategy on Earth can save you, and the heap chain will eventually exhaust its allocation pools.

However, the way Yon handles this under the hood is tied to its core philosophy: I prefer a hard, predictable specification over a soft, non-deterministic runtime degradation (like a GC pause).

Here is how Yon deals with the scenario you described today, and how I am extending it to support ephemeral computation without losing the categorical properties:

1. The Space Isolation Model (MMU-enforced)

In Yon, a program doesn’t live in a single global heap. It splits into isolated “Spaces”: separate OS processes whose isolation is enforced by the hardware MMU, somewhat like the Postgres process model. Each Space operates under a fixed capacity limit: a chain of up to 256 heaps of 196,560 slots each, roughly 50M distinct contents per process. (196,560 is the kissing number of the Leech lattice; each heap in the chain is one shell.)

If a specific Space runs a rogue loop like your use_all_memory() generating 2^40 distinct coordinates on Λ₂₄, it doesn’t destabilize the system or trigger a 10-second GC stop-the-world pause. It hits its own process boundary: slot exhaustion returns a structured failure, and raw payload memory is bounded by RAM, enforced by the OS at process granularity. The damage stays confined to that worker, and its memory is reclaimed wholesale via the standard MMU page tables; an orchestration API to catch the failure and restart the subsystem is part of the roadmap below.
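
For reference, the per-Space limit works out to 256 × 196,560 = 50,319,360 slots. The sketch below shows one way a bounded chain could report exhaustion as a value instead of aborting. It is only an illustration under stated assumptions: `HeapChain` and `Slot` are hypothetical names, `std::nullopt` stands in for the structured failure, and deduplication is left out to keep the focus on the limit.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Figures from the post: up to 256 heaps of 196,560 slots each.
constexpr std::uint64_t kSlotsPerHeap = 196560;
constexpr std::uint64_t kMaxHeaps = 256;
static_assert(kSlotsPerHeap * kMaxHeaps == 50319360,
              "roughly 50M distinct contents per Space");

// Hypothetical slot address: which heap in the chain, which slot in it.
struct Slot {
    std::uint64_t heap;
    std::uint64_t index;
};

class HeapChain {
public:
    explicit HeapChain(std::uint64_t slots_per_heap = kSlotsPerHeap,
                       std::uint64_t max_heaps = kMaxHeaps)
        : slots_per_heap_(slots_per_heap), max_heaps_(max_heaps) {
        assert(slots_per_heap_ > 0 && max_heaps_ > 0);
    }

    // Claim the next free slot; std::nullopt signals slot exhaustion.
    std::optional<Slot> claim() {
        if (used_ == slots_per_heap_ * max_heaps_) return std::nullopt;
        Slot s{used_ / slots_per_heap_, used_ % slots_per_heap_};
        ++used_;
        return s;
    }

    std::uint64_t used() const { return used_; }

private:
    std::uint64_t slots_per_heap_;
    std::uint64_t max_heaps_;
    std::uint64_t used_ = 0;
};
```

A caller checks the returned optional and decides what to do with a full Space, which is the "hard, predictable" behaviour described above, as opposed to a pause or an abort somewhere inside the allocator.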

2. Real-World Target Workloads & Benchmarks

The content-addressed heap on Λ₂₄/Co₀ is optimized for state-exploration, graph algorithms, and data pipelines where structural sharing is massive.

I just benched an immutable state-tracking scenario (“Magazzino” test) checking and inserting 100,000 deep states in a generative loop. Thanks to global deduplication and the O(1) handle lookup, the execution runs flat at 185 ns/state (~5.4 million states/sec on a single core), with memory remaining stable because re-visited or symmetric states collapse instantly into the same machine handles. On complex data like Merkle Trees (3 trees, 16,384 leaves each, ~98,000 nodes total), building identical trees twice results in zero new allocations.
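
The Merkle tree figure can be reproduced in miniature with a hash-consed node store. This is a sketch, not the benchmark code: `NodeStore` and `build_tree` are hypothetical names, child handles take the place of cryptographic hashes, and all leaves are assumed distinct. Under those assumptions each tree has 2 × 16,384 − 1 = 32,767 nodes, so three trees give 98,301, consistent with the ~98,000 quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

using Handle = std::uint32_t;

// Hypothetical hash-consed node store: a node is either a leaf carrying a
// value (tag 0) or an inner node carrying two child handles (tag 1).
class NodeStore {
public:
    Handle leaf(std::uint64_t value) { return intern(Key{0, value, 0}); }
    Handle inner(Handle l, Handle r) { return intern(Key{1, l, r}); }
    std::size_t node_count() const { return index_.size(); }

private:
    using Key = std::tuple<int, std::uint64_t, std::uint64_t>;

    // Insert the key if it is new; either way return its handle.
    Handle intern(const Key &key) {
        Handle next = static_cast<Handle>(index_.size());
        return index_.emplace(key, next).first->second;
    }

    std::map<Key, Handle> index_;
};

// Build a balanced tree over `leaves` consecutive values starting at
// `first`. `leaves` must be a power of two.
Handle build_tree(NodeStore &store, std::uint64_t first, std::uint64_t leaves) {
    assert(leaves > 0 && (leaves & (leaves - 1)) == 0);
    std::vector<Handle> level;
    for (std::uint64_t i = 0; i < leaves; ++i)
        level.push_back(store.leaf(first + i));
    while (level.size() > 1) {
        std::vector<Handle> next;
        for (std::size_t i = 0; i < level.size(); i += 2)
            next.push_back(store.inner(level[i], level[i + 1]));
        level = std::move(next);
    }
    return level.front();
}
```

Building a tree a second time over the same leaves returns the same root handle and leaves `node_count()` unchanged, which is the "zero new allocations" effect in this simplified setting.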

3. Immediate Roadmap: Ephemeral Sub-Runtimes via `spawn` / `promote`

Your stress-test highlights the exact need for sandboxed, unpredictable pipelines. Adding a naive free() or temporary keyword at the function level would violate the language’s mathematical contracts and extensionality (since multiple parts of a program may share identical structures via their handles).

Instead, I am treating sub-runtimes as isolated blocks with a single exit point. I am currently implementing a spawn { ... } block that hooks directly into the frontend reducer and MLIR lowering pipeline:

Isolation: a spawn block instantiates an ephemeral sub-Space with an un-indexed linear allocation arena.
The promote operation: when the calculation finishes, only the final return value is intercepted. Its canonical coordinate is calculated, and it is promoted up to the parent Space’s heap chain via a global put_chain verification.
O(1) hardware drop: immediately after promotion, the entire memory of the ephemeral sub-Space is released in one step via munmap (constant in the number of objects), with no element-by-element tracing.
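
The three steps above can be sketched in C++ as a rough analogy. Everything here is an assumption made for illustration: `ParentHeap`, `Arena`, and `spawn_and_promote` are hypothetical names, a `std::map` stands in for put_chain, and the arena is an ordinary container whose destructor frees cell by cell, whereas the design described above would unmap the whole region at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>

using Handle = std::uint32_t;
using Value = std::vector<std::uint64_t>;

// Stand-in for the parent Space: content-addressed and long-lived.
class ParentHeap {
public:
    // Stand-in for put_chain: identical content always yields one handle.
    Handle put(const Value &v) {
        auto it = index_.find(v);
        if (it != index_.end()) return it->second;
        Handle h = static_cast<Handle>(cells_.size());
        cells_.push_back(v);
        index_.emplace(v, h);
        return h;
    }
    const Value &get(Handle h) const {
        assert(h < cells_.size());
        return cells_[h];
    }
    std::size_t size() const { return cells_.size(); }

private:
    std::map<Value, Handle> index_;
    std::vector<Value> cells_;
};

// Stand-in for the ephemeral sub-Space: un-indexed, append-only storage.
class Arena {
public:
    const Value &alloc(Value v) {
        cells_.push_back(std::move(v));
        return cells_.back();  // deque keeps earlier references valid
    }
    std::size_t live() const { return cells_.size(); }

private:
    std::deque<Value> cells_;
};

// Run `body` in a fresh arena and promote only its result to the parent.
template <typename Body>
Handle spawn_and_promote(ParentHeap &parent, Body body) {
    Arena arena;                        // isolation: private scratch storage
    const Value &result = body(arena);  // intermediates stay in the arena
    return parent.put(result);          // promote: copy into the parent heap
}                                       // drop: the arena dies here as a whole
```

The point of the shape is that intermediates never enter the content-addressed index, so they cost no slots in the parent, and the promoted value is copied out before the arena goes away.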

A sketch of the upcoming syntax (illustrative, may still change). Your loop, confined and streamed:

fun analyze(seed: number): number {
  be result holds spawn {
    heavy_search(seed)        // 2^40 intermediates live and die in here
  } follow.map(score).collect()
  return result               // only the promoted value survives
}

And explicit Space lifecycle management:

be worker holds Space.spawn(Heavy)
be partial holds worker.call(batch)
be _ holds Space.drop(worker)   // process exits, OS reclaims its whole heap
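
At the OS level, the lifecycle above resembles the classic fork, compute, report, exit pattern. The following POSIX sketch is an analogy only, not how Yon implements Spaces: `call_in_worker` is a hypothetical helper, the result is limited to one 64-bit integer sent over a pipe, and a missing result or non-zero exit status is reported as `std::nullopt`.

```cpp
#include <cstdint>
#include <optional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run `work(input)` in a child process and return its result, or
// std::nullopt if the worker could not be started or did not finish
// cleanly. Hypothetical helper, POSIX only.
template <typename Work>
std::optional<std::uint64_t> call_in_worker(Work work, std::uint64_t input) {
    int fds[2];
    if (pipe(fds) != 0) return std::nullopt;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::nullopt;
    }

    if (pid == 0) {                       // child: the isolated worker
        close(fds[0]);
        std::uint64_t out = work(input);  // whatever it allocates dies with it
        ssize_t n = write(fds[1], &out, sizeof out);
        _exit(n == static_cast<ssize_t>(sizeof out) ? 0 : 1);
    }

    close(fds[1]);                        // parent: read the single result
    std::uint64_t result = 0;
    ssize_t n = read(fds[0], &result, sizeof result);
    close(fds[0]);

    int status = 0;                       // reap the worker process
    if (waitpid(pid, &status, 0) < 0) return std::nullopt;
    bool ok = n == static_cast<ssize_t>(sizeof result) &&
              WIFEXITED(status) && WEXITSTATUS(status) == 0;
    if (!ok) return std::nullopt;
    return result;
}
```

Once the child exits, the kernel tears down its whole address space, so nothing the worker allocated needs to be traced or freed individually by the parent.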

The core primitive for spawn/promote is simple enough in the MLIR infrastructure that I expect to land it in the repo very soon (possibly by tonight or tomorrow).

Thanks for pushing the boundaries of the language; this kind of architectural stress-testing is precisely why I open-sourced Yon.

anthem87 · June 5, 2026, 9:26pm

However, the scope keyword is there from previous layers of implementation. It parses today and lowers to an isolated MLIR region. What you will also find are traces of an earlier arena-based memory model attached to it; that runtime never fully materialized, so I excised the dead lowering paths and kept the formal construct.

I opted to ship 1.0 with the hermetic-isolation half working and verified. The spawn/promote work above is exactly that second half landing on foundations that already exist.
