was: Universal Python extensions: performance, compatibility, sustainability, and less CO₂
Pierre Augier
PyConFR, 2nd November 2025




Maintain software for research in fluid mechanics 
Practical research on how people can
CPython is still VERY slow
Is it an issue?
YES (long term).
Strongly limits what can be written in Python and how it can be written
Can it be fixed?
Partly, but deep changes are needed, ecosystem-wide
A depressing situation
A potential realistic solution (Python Native Interface)
A lot of work needed (ecosystem-wide investment) but seems doable
| Python | the language |
|---|---|
| CPython | the reference implementation (written in C & Python) |
| PyPy | an alternative implementation (written in Python) |
| GraalPy | an alternative implementation (written in Java) |
| Extensions | libraries (written in C, Rust, …) usable as a module |
| Python C API | Application Programming Interface: a set of C functions to interact with the interpreter |
| ABI | Application Binary Interface |
| HPy | project proposing an alternative C API |
| Cython | a language (superset of Python) and a compiler |
relatively small improvements…
but still very slow compared to …
other dynamic languages
(JavaScript, Julia, PHP, Matlab, …)
alternative Python interpreters oriented towards performance (PyPy and GraalPy)
$ $(uv python find 3.11) bench_loops_sum.py
3.11.2 (main, Sep 14 2024, 03:00:30) [GCC 12.2.0]
JIT Compiler: unsupported
Number of long_comp per second: 56.83 ± 0.92
$ $(uv python find 3.14) bench_loops_sum.py
3.14.0 (main, Oct 10 2025, 12:47:49) [Clang 20.1.4 ]
JIT Compiler: disabled
Number of long_comp per second: 72.63 ± 2.53
$ PYTHON_JIT=1 $(uv python find 3.14) bench_loops_sum.py
3.14.0 (main, Oct 10 2025, 12:47:49) [Clang 20.1.4 ]
JIT Compiler: enabled ✨
Number of long_comp per second: 60.55 ± 2.23
$ $(uv python find pypy) bench_loops_sum.py
3.11.11 (0253c85bf5f8, Feb 26 2025, 10:42:42)
[PyPy 7.3.19 with GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
JIT Compiler: enabled ✨
Number of long_comp per second: 919.85 ± 170.55
$ $(uv python find graalpy) bench_loops_sum.py
3.11.7 (Wed Apr 02 19:57:13 UTC 2025)
[Graal, Oracle GraalVM, Java 24.0.1 (amd64)]
JIT Compiler: enabled ✨
Number of long_comp per second: 1017.03 ± 21.86
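The script `bench_loops_sum.py` itself is not shown here. A minimal sketch of a benchmark that prints output in this shape, assuming `long_comp` is a pure-Python summing loop (the workload, the function names, and the timing scheme below are all hypothetical), might look like:

```python
import statistics
import sys
import time


def long_comp(n=20_000):
    """Hypothetical workload: sum integers with a plain Python loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def jit_status():
    """Best-effort JIT report; the checks are interpreter-specific."""
    if sys.implementation.name in ("pypy", "graalpy"):
        return "enabled"  # assumption: JIT left at its default (on)
    jit = getattr(sys, "_jit", None)  # private module, CPython 3.14+
    if jit is None:
        return "unsupported"
    return "enabled" if jit.is_enabled() else "disabled"


def calls_per_second(func, duration=0.05):
    """Count how many calls of func fit in roughly `duration` seconds."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        func()
        count += 1
    return count / (time.perf_counter() - start)


def main(repeats=5):
    rates = [calls_per_second(long_comp) for _ in range(repeats)]
    print(sys.version)
    print("JIT Compiler:", jit_status())
    print(
        "Number of long_comp per second: "
        f"{statistics.mean(rates):.2f} ± {statistics.stdev(rates):.2f}"
    )
    return rates


if __name__ == "__main__":
    main()
```

The short measurement windows keep the sketch quick to run; a real benchmark would likely add warm-up rounds so that JIT-based interpreters reach steady state before timing.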
Speedup versus CPython 3.11
| | without JIT | with JIT |
|---|---|---|
| CPy 3.9 | 0.74 | |
| CPy 3.11 | 1.00 | |
| CPy 3.14 | 1.28 | 1.07 |
| PyPy | | 16.2 |
| GraalPy | | 17.9 |
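The speedups are simply each measured rate divided by the CPython 3.11 rate. A small sketch of that arithmetic, using the mean rates from the benchmark output (no CPython 3.9 run is shown there, so that row is left out):

```python
# Mean rates (long_comp per second) copied from the benchmark output.
rates = {
    "CPy 3.11": 56.83,
    "CPy 3.14": 72.63,
    "CPy 3.14 (JIT)": 60.55,
    "PyPy": 919.85,
    "GraalPy": 1017.03,
}

baseline = rates["CPy 3.11"]
speedups = {name: rate / baseline for name, rate in rates.items()}

for name, speedup in speedups.items():
    print(f"{name:<15} {speedup:6.2f}")
```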
Extensions using the Python C API
Performance oriented Python implementations based on JIT
Warning
Incompatible strategies
Interacts with the Python interpreter from C code
A cause of the great Python success
Used nearly everywhere
Several historical issues
C API Working Group: several improvements
Focus for this presentation
Incompatibility with performance oriented Python implementations
Python implementations using JIT…
Focus on full Python implementations
Nothing on Numba (Python-NumPy method-based JIT compiler based on LLVM)
Bad interaction with extensions
| Framework | language | |
|---|---|---|
| PyPy | RPython | Python 2.7 |
| GraalPy | Truffle/GraalVM | Java |
traces of micro-ops
Copy & Patch method to produce the machine code
not mature yet
fundamentally more limited than meta JITs
Several Python functions written in C
Performance strongly depends on the interpreter implementation.
Garbage collection: moving GC versus ref counting
Internal representation of objects
For example, [1, 2, 3] represented in PyPy by an array of native integers.
…
PyPerformance typically x4 faster
In many cases, typically x20
Zero-cost abstraction + small objects
Strong incompatibility with the ecosystem based on native extensions
Warning
Strong and wrong hypotheses about the implementations
No opaque PyObject: direct access to the structure
Reference counting (Py_INCREF, Py_DECREF)
PyObject* incompatible with moving GC
Functions defined in extensions not typed!
Useless boxing/unboxing
For arr[i]
native int -> Py int -> C int -> C float -> Py float -> native float
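One hop of this chain can be observed from plain Python. The sketch below uses the standard `array` module as a stand-in for a native container (an illustrative choice, not the extension discussed in the talk): values are stored unboxed, and every `arr[i]` has to produce a full Python object.

```python
import sys
from array import array

arr = array("d", [0.5, 1.5, 2.5])  # contiguous native C doubles

# Each element is stored unboxed.
print("bytes per stored element:", arr.itemsize)

# Reading arr[i] hands back a Python float object (boxing).
x = arr[1]
print("type of arr[1]:", type(x).__name__)
# Object size is implementation-specific; -1 is returned where unsupported.
print("size of a boxed float:", sys.getsizeof(x, -1))

# Each iteration reads a boxed float, adds, and boxes the result again.
total = 0.0
for i in range(len(arr)):
    total += arr[i]
print("sum:", total)
```

A JIT that already holds native values would ideally skip these conversions, but an untyped C function only accepts and returns Python objects.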
By PyPy and GraalPy devs
A new C API
context argument
handles instead of PyObject*
HPy_Dup/HPy_Close instead of Py_INCREF/Py_DECREF
Note
Compatible with moving garbage collection
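Why handles fit a moving GC can be sketched with a toy model. This is an illustration only, not HPy's implementation: `ToyContext` and its methods are hypothetical names, and a Python dict plays the role of the interpreter-owned handle table.

```python
class ToyContext:
    """Toy handle table; NOT how HPy is actually implemented."""

    def __init__(self):
        self._table = {}  # handle (int) -> object
        self._next = 1

    def new_handle(self, obj):
        handle = self._next
        self._next += 1
        self._table[handle] = obj
        return handle

    def dup(self, handle):
        """Analogue of HPy_Dup: a second, independent handle to the same object."""
        return self.new_handle(self._table[handle])

    def close(self, handle):
        """Analogue of HPy_Close: release exactly one handle."""
        del self._table[handle]

    def deref(self, handle):
        return self._table[handle]

    def open_handles(self):
        return len(self._table)


ctx = ToyContext()
h1 = ctx.new_handle([1, 2, 3])
h2 = ctx.dup(h1)  # distinct handle, same object
ctx.close(h1)     # h2 stays valid
print(ctx.deref(h2), ctx.open_handles())
```

The extension only ever sees opaque handles, never an object address, so the interpreter stays free to relocate the object and update its table. Counting open handles also makes leaks easy to detect.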
standard ABI (example cp311-cp311)
universal ABI
Stalled :-(
Limited API: a subset of the C API
Stable ABI
Evolution for cp315-abi3 (Python 3.15)
2 incompatible build modes for CPython
(Free-threading and GIL-enabled)
Extensions compatible with both modes (PyObject fully opaque)
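Which binary extensions an interpreter will load, and which build mode it runs in, can be inspected from Python. A small sketch using only the standard library (the printed values depend on the interpreter and platform):

```python
import importlib.machinery
import sys
import sysconfig

# Filename suffixes this interpreter accepts for extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES
print("accepted extension suffixes:", suffixes)

# Suffix used for version-specific builds of extensions.
print("EXT_SUFFIX:", sysconfig.get_config_var("EXT_SUFFIX"))

# Py_GIL_DISABLED is 1 on free-threaded CPython builds, 0 or None otherwise.
free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
print("free-threaded build:", free_threaded)
print("implementation:", sys.implementation.name)
```

On CPython for Linux the list typically contains both a version-specific suffix and `.abi3.so`, which is what allows one Stable ABI binary to be loaded by several CPython versions.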
Petr’s idea (CPython core dev, C API WG)
fully new, clean and complete CPython API
“native” (different languages, in particular C and Rust)
not focused on human usability (for tools like Cython)
C API still supported
Same as old, plus
strongly inspired by HPy (context argument, handles, …)
universal ABI
driven by CPython needs
(in particular C API WG and Faster CPython)
CPython, PyPy and GraalPy devs together
official (PEPs)
2025
PyPy & GraalPy 3.12
2026
2027
2028
2029
Python very badly funded!
Especially for such long-term ecosystem-wide projects
Companies and public sector
Reasonable investments & not so expensive!
Lack of mechanisms to allow one to support Py & co
(think research projects, CNRS, CEA, universities, …)
Organization problem, community problem
Current Python C API inhibits perf improvements
Good and fast alternative Python implementations
A technical solution: Python Native Interface
A long and ecosystem-wide project
CPython is technically ready
Good compatibility extensions & fast Python interpreters
A lot of positive effects in a few years
Time, €, CO₂; more Python and better Python, …
Funding, support and organization
CPython, PyPy, Cython, NumPy, …