SPy: a vision for its impact on the Python community


I first encountered SPy a while ago and my immediate reaction was: “interesting research project, but probably not useful for me.” I maintain several scientific Python packages — FluidSim, FluidFFT, pyfftw, and others in the FluidDyn ecosystem — and my day-to-day concerns are numerical performance, packaging, and making it possible for scientists to write high-performance code in Python. SPy seemed like someone else’s problem.

I’ve changed my mind. The more I understand SPy’s design, the more I think it could matter — not just for researchers or language implementers, but for a large part of the Python community. This post is an attempt to articulate why. It is not an announcement, and it is not a roadmap: SPy is still early-stage software with solid foundations but a long road ahead. What follows is a vision of what a mature SPy could make possible.


The two-language wall

Python’s greatest success story is also its deepest structural problem.

The language became dominant in scientific computing, data science, and machine learning not because Python is fast, but because it became a wonderful orchestration layer on top of fast C, C++, and Fortran code. NumPy, SciPy, PyTorch, and thousands of other packages are, at their core, C extension modules with a Python face. Python provides the developer experience; the other language provides the performance.

This works. But it comes at a steep cost.

Writing and maintaining C extensions requires mastering the CPython C API — a complex interface that exposes CPython’s internal implementation details. This has deep consequences. Alternative implementations like PyPy and GraalPy, which could offer significant speed improvements for pure Python code, struggle to gain adoption precisely because so much of the ecosystem depends on CPython-specific extension mechanisms — they must emulate CPython internals to run these extensions, which often negates their performance advantages. CPython’s own JIT compiler (part of the Faster CPython project) faces a similar obstacle: meaningful internal optimizations are blocked by assumptions that the C API currently makes public — for example, exposed reference count semantics prevent reference count optimizations, and exposed runtime struct details for types like list prevent changes to their internal representation.

There is active work to address this. A draft PEP proposes the Python Native Interface (Py-NI): a modern C API and universal ABI for CPython extensions. Extensions built against Py-NI’s universal ABI would run across CPython versions and across alternative implementations without recompilation. Tools like Cython, PyO3, and nanobind would build on top of Py-NI. This is an important step forward — but Py-NI, on its own, does not solve the deeper problem: there is currently no good tool for implementing something like NumPy without writing C directly. Cython is excellent for wrapping native libraries and writing numerical kernels, but it is not designed to let you implement a full array library in Python. That gap is where SPy becomes relevant.

The two-language problem has not gone unnoticed at the language level either. Julia was designed from the ground up to eliminate it — you write performance-critical code in Julia itself, not in C. More recently, Mojo has taken an ambitious, Python-inspired approach to the same goal. Cython is an older and more pragmatic attempt: it extends Python syntax with C types and gives you a compiled extension module. All of these are serious efforts. But Julia and Mojo require leaving the Python ecosystem, learning a new language and runtime, and rebuilding tooling from scratch. Cython, while staying within the Python world, has significant limitations: it feels like C, offers no dynamism, provides a poor developer experience (no interpreter), and can only produce Python extensions — it cannot generate standalone programs independent of a Python runtime.

SPy takes a different path. Rather than replacing Python, it extends it — giving Python code a route to native performance while remaining part of the Python ecosystem. A mature SPy would not ask the community to abandon what it has built. It would let the community keep building in Python.

A word of clarification before going further: the vision described in this post is emphatically not about rewriting the whole scientific Python ecosystem in SPy. It is entirely reasonable — and healthy — for a large open-source ecosystem to be built on multiple languages. C, C++, Fortran, and Rust all have legitimate roles to play.


What a mature SPy could enable

1. Python-native scientific packages: the NumSPy vision

The most radical implication of SPy, and the one I find most compelling, is this: it would become possible to write packages like NumPy in Python — not C with a Python wrapper, but genuine Python (with type annotations), compiled by SPy to native code.

Call it NumSPy for clarity: a reimplementation of the Python Array API standard written in Python + SPy, with Python bindings generated automatically. Such a library would have properties that the current NumPy cannot have.
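To make the idea concrete, here is a sketch of the style such a library could be written in: plain, fully annotated Python. The `Array` class and the function names are hypothetical, invented for illustration — this is not SPy or Array API code, and it runs today under ordinary CPython. The point is that a compiler like SPy could turn exactly this style of code into native machine code, with no C layer underneath.

```python
# Hypothetical sketch: Array-API-style operations written in plain,
# fully annotated Python -- the style a "NumSPy" could be written in.
# The Array class and function names are illustrative, not real SPy code.

class Array:
    def __init__(self, data: list[float], shape: tuple[int, ...]) -> None:
        self.data = data
        self.shape = shape

def add(a: Array, b: Array) -> Array:
    """Element-wise addition: the kind of kernel NumPy implements in C."""
    assert a.shape == b.shape
    out = [x + y for x, y in zip(a.data, b.data)]
    return Array(out, a.shape)

def sum_all(a: Array) -> float:
    """Full reduction over the array."""
    total = 0.0
    for x in a.data:
        total += x
    return total

x = Array([1.0, 2.0, 3.0], (3,))
y = Array([10.0, 20.0, 30.0], (3,))
print(sum_all(add(x, y)))  # 66.0
```

Readable by any Python programmer, yet structured so that every type is known at compile time — which is what makes ahead-of-time native compilation plausible.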

One language, not two. Today, contributing to NumPy’s internals requires understanding both Python and C, plus the CPython C API. This is a significant barrier. A NumSPy written in Python + SPy would be readable and modifiable by anyone who knows Python. Lowering this barrier matters: it affects who can contribute, how fast bugs get fixed, and how easily students can learn from the source.

Not tightly coupled to the CPython C API. In the current situation, extensions must be compiled specifically for each CPython version. Until Py-NI exists, Python extensions written with SPy will still need to interface with the CPython C API — but SPy code would not be structured around CPython internals the way handwritten C extensions are. Once Py-NI’s universal ABI becomes available, NumSPy could target it directly, running unchanged on PyPy, GraalPy, or any future Python implementation that supports it. This is not a minor point: one of the main reasons PyPy has never displaced CPython despite its superior performance for pure Python code is that the ecosystem’s C extensions don’t run well on it. SPy and Py-NI together could change that calculus.

Reuse by compilation tools. Projects like Pythran work by ahead-of-time compiling annotated Python code to C++. Currently Pythran has to maintain its own backend for array operations, separate from NumPy. If NumSPy existed, both the runtime Python path and the Pythran compilation path could share the same implementation. One codebase, multiple use cases. The same applies to Numba, and more generally to any tool that needs a NumPy-compatible layer: today each has to maintain its own implementation; with NumSPy, they could all converge on a single shared one.

Whole-program optimization. There is a subtler advantage that is easy to overlook. When SPy compiles a program that uses NumSPy, the C compiler sees the entire codebase end-to-end — both the application logic and the array library. This opens the door to powerful optimizations like Profile-Guided Optimization (PGO) and Link-Time Optimization (LTO), which can yield substantial performance gains by reasoning across module boundaries. Even the best Python JIT combined with Py-NI would struggle to achieve this, because the boundary between interpreted Python and native extensions remains opaque to the optimizer. A fully SPy-compiled program dissolves that boundary.

Enabling CPython to evolve. CPython’s JIT and other internal improvements are currently constrained by the need to preserve C API compatibility. The ecosystem has to move. If the most widely used packages migrated towards implementations compatible with Py-NI, CPython’s developers would gain more freedom to restructure internals. SPy could be part of what allows CPython itself to become significantly faster over the next decade.

A similar logic applies to any Python project whose core is implemented directly in C or C++ using the CPython C API — Matplotlib is a prominent example. Projects like scikit-learn, h5py, mpi4py, and pyfftw already use Cython, which is a reasonable approach and will be able to adopt Py-NI when it becomes available. But for the subset of the ecosystem that still relies on handwritten C tightly coupled to CPython internals, SPy offers a path toward a cleaner, more portable implementation. This is not a project for next year. It is a direction that becomes possible — and perhaps eventually inevitable — once SPy matures.

2. A better Cython: writing extensions and numerical kernels

Cython has been the workhorse for writing Python extensions for two decades. It is used for two main tasks: wrapping native libraries (exposing C or Fortran APIs to Python), and writing efficient numerical kernels (the kind of inner loops found in scikit-learn, scikit-image, or similar packages). It is powerful, but awkward — a hybrid language that is neither fully Python nor fully C, with its own syntax, its own quirks, and a learning curve that catches even experienced developers. Critically, it has no interpreter: you write Cython, you compile, you wait. And it can only produce Python extensions — it cannot generate a standalone program independent of a Python runtime. Debugging is another pain point: when things go wrong at the C level, you reach for gdb, which is a long way from the Python developer experience.

SPy is a proper language, not a hybrid syntax. It has both an interpreter and a solid compiler stack. This means you can develop and test SPy code interactively, in the same way you work with Python, and compile it when you need performance. The developer experience is qualitatively better. SPy also comes with spdb, a debugger that works at the SPy level rather than dropping you into C — a significant quality-of-life improvement over the Cython debugging story. SPy can target Python extensions, but also native binaries and WebAssembly — it is not constrained to a single output format. For the numerical kernel use case in particular, SPy would give library authors the same performance as Cython with far less friction and a more familiar language.

Porting Cython code to SPy is not a zero-cost migration — .pyx files become .spy files, and the semantics differ. But for new code, or for projects willing to invest in the transition, the result would be a more maintainable and more capable tool.

3. Native entry points with fast startup

There is a category of Python programs that suffer from a specific problem: startup time. Every time you invoke a Python script — a CLI tool, a build step, a developer utility — Python has to initialize its runtime before doing anything. For short-lived commands this overhead is noticeable.
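The overhead is easy to measure for yourself. This small script (ordinary Python, nothing SPy-specific) times how long a do-nothing interpreter invocation takes; on typical machines it is tens of milliseconds, which is negligible for a long computation but dominant for a short-lived command:

```python
# Measure CPython startup overhead: time a subprocess that does nothing.
# The child process must initialize the full runtime before executing
# even an empty program.
import subprocess
import sys
import time

t0 = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)
startup = time.perf_counter() - t0
print(f"interpreter startup: {startup * 1000:.1f} ms")
```

A compiled native binary pays none of this per-invocation cost, which is exactly the gap SPy could close for CLI tools.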

Mercurial (hg) is a good example: a capable version control system written in Python, which sometimes feels slower to invoke than Git for this reason. It is worth noting that modern Mercurial, with its Rust extensions, is actually faster than Git for many long-running operations — but achieving this required implementing performance-critical parts in Rust. SPy could offer a similar path without leaving Python: a command-line tool written in SPy could be compiled to a native binary with startup time comparable to a Go or Rust program. For operations that fit within SPy’s pure subset, no Python interpreter is needed at all. For more complex cases, the native entry point could selectively initialize a Python runtime only when needed.

A concrete early demonstration of this potential: an SPy demo targeting AWS Lambda has already shown very positive results, combining fast cold-start times with the conciseness of Python code.

The broader picture is illustrated by the trajectory of tools like uv. Created by Astral — recently acquired by OpenAI — uv is a Python package manager written in Rust that the community has embraced precisely for its speed and its ability to bootstrap and manage Python installations without depending on Python itself. A tool like PDM reimagined in SPy could share many of these properties — self-contained, fast to start, independent of a pre-existing Python installation — but written in static Python rather than Rust, making it accessible to a much larger fraction of the Python community.

This has implications for developer tooling, system utilities, and any Python-authored software where startup latency matters. It also means that SPy programs could be distributed in the same way as Go or Rust binaries: compiled once, shipped as a static executable, deployed to any Linux distribution, to AWS Lambda, to the browser via WebAssembly, or to microcontrollers. The Python ecosystem’s logic and libraries, freed from the requirement of a Python runtime on the target machine.

4. Large applications: prototyping speed with production robustness

Python’s dynamic nature makes it one of the best languages for rapid prototyping. But that same dynamism — the lack of enforced static types, the runtime mutability — makes large codebases harder to maintain and optimize. These two facts are in tension, and the Python community has been navigating the tradeoff for years — through type checkers like mypy and pyright, and through gradual typing (PEP 484 and successors).

SPy occupies an interesting position here. Because SPy is a variant of Python rather than a different language, the gap between exploratory code and production-quality SPy code is smaller than the gap between Python and, say, Rust or C++. You can prototype in Python, progressively add type annotations, and let SPy enforce them at compile time. The result is code that is both fast and robust — without a rewrite in another language.
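A minimal illustration of that workflow, in ordinary Python. Under CPython the annotations below are mere documentation; the idea is that a compiler like SPy could instead enforce them ahead of time and emit native code for the annotated version — same language, progressively hardened:

```python
# Step 1: exploratory prototype -- no annotations, fully dynamic.
def mean(values):
    return sum(values) / len(values)

# Step 2: the same function, progressively annotated. CPython ignores
# the annotations at runtime; a compiler like SPy could enforce them
# at compile time and generate native code for this function.
def mean_typed(values: list[float]) -> float:
    return sum(values) / len(values)

print(mean([1.0, 2.0, 3.0]))        # 2.0
print(mean_typed([1.0, 2.0, 3.0]))  # 2.0
```

The migration cost is adding annotations, not rewriting in another language.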

For large scientific or engineering applications where the same codebase needs to be both approachable and performant, this is a compelling model.

5. A better RPython: implementing interpreters and runtimes

This use case is more specialized but worth noting. RPython — the restricted Python subset used to write the PyPy interpreter — demonstrated something important: a statically compilable Python variant is a remarkably good language for writing language runtimes and interpreters. The expressiveness of Python makes it easier to write and reason about complex runtime logic; the static compilation makes the result fast. And RPython went further: by writing an interpreter in RPython and annotating the hot paths, you could get a JIT compiler nearly for free, through a technique called meta-tracing. This is a remarkable property — it means that the effort of building a fast, JIT-compiled language runtime is dramatically reduced.
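To give a feel for what such interpreters look like, here is a toy stack-machine interpreter written in the restricted, annotated style that RPython pioneered. This is illustrative only — not actual RPython or SPy code — but in RPython, annotating the hot path of precisely this kind of dispatch loop is what lets meta-tracing derive a JIT from the interpreter:

```python
# A toy stack-machine interpreter in a restricted, statically typed
# style (illustrative only, not RPython or SPy code). In RPython,
# the dispatch loop below is the hot path that meta-tracing turns
# into a JIT compiler.
PUSH, ADD, MUL = 0, 1, 2

def interpret(code: list[int]) -> int:
    stack: list[int] = []
    pc = 0
    while pc < len(code):          # the dispatch loop
        op = code[pc]
        if op == PUSH:
            pc += 1
            stack.append(code[pc])
        elif op == ADD:
            b = stack.pop()
            stack.append(stack.pop() + b)
        elif op == MUL:
            b = stack.pop()
            stack.append(stack.pop() * b)
        pc += 1
    return stack.pop()

# (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
print(interpret(program))  # 20
```

Nothing here is exotic Python, yet every type is statically knowable — which is exactly the property that made RPython, and could make SPy, a good substrate for writing runtimes.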

Python turns out to be a genuinely nice language for this kind of work, once you remove the parts that resist static analysis. RPython proved the concept, but it was always a by-product of the PyPy project rather than a designed tool, and it shows in the roughness of its interfaces and constraints.

The name SPy is not accidental: “S” is for “static” but it is also the successor of “R”. SPy’s design is informed by the RPython experience, and one of its stated goals is to eventually use SPy to implement its own interpreter. More broadly, SPy could be a better RPython: a platform for writing runtimes, virtual machines, and compilers in Python, with a cleaner design and a proper developer experience. This matters beyond SPy itself: future experimental Python implementations, domain-specific language runtimes, and bytecode compilers could be written in something that genuinely looks and feels like Python.


What still needs to happen

None of this is available today. SPy is early-stage software — the foundations are solid, the design is coherent, and it already compiles non-trivial programs to native code and WebAssembly. But significant work remains before any of the use cases above are practical: a richer standard library, stable interop with CPython extension modules, a mature foreign function interface, and packaging tooling. The SPy roadmap gives a concrete sense of what is being worked on — better stdlib support, array primitives, cImport for calling into C, and SPy/Python interoperability are all on the path.

NumSPy is a natural milestone to aim for once these foundations are in place. It is worth noting that implementing the Python Array API standard in SPy is probably more tractable than it might first appear: the standard has a comprehensive test suite, and LLM-assisted development could help significantly. This is not necessarily a project requiring years of work or a large engineering team — it is a concrete, well-specified target that could serve as a proof of concept for everything this post describes.

The vision sketched here also assumes that the Python community — package maintainers, CPython core developers, tool authors — would engage with SPy and find it worth adopting. That is not guaranteed. New technology does not succeed just by being technically sound.

One encouraging aspect of SPy is that contributing to it is accessible to good Python developers. SPy is written mostly in Python and SPy itself, with a small amount of modern, clean C. Some parts require deep technical knowledge and careful design thinking, but others are genuinely approachable for anyone comfortable with Python. SPy is approaching an interesting inflection point: it can already be tried for real, useful applications; a growing share of its own development is happening in SPy, building out the standard library; and the technical and human environment is becoming much more welcoming. The challenge for the coming months is to advance on two fronts simultaneously — building the technical foundations (language, interpreter, compiler, packaging) and growing a large, motivated, diverse, international community around the project.

What I do believe is that the direction is right. Python’s two-language wall is a real problem, and the solutions on offer each come with significant constraints: Julia and Mojo ask you to leave Python; Cython keeps you in Python but limits what you can express and where you can deploy; raw C extensions give you full power but at the cost of complexity, fragility, and exclusion. SPy’s bet is that you can have static compilation without leaving Python, and that this is worth doing carefully rather than quickly.

For the Python community, that bet is worth watching closely.

Get involved

If this post has piqued your interest, here are some good next steps:


Pierre Augier is a researcher in fluid mechanics, co-author of the draft PEP about Python Native Interface, and the maintainer of the FluidDyn project, a collection of Python packages for computational fluid dynamics research.