IRx is a Python library that lowers ARXLang ASTx nodes to LLVM IR using
llvmlite. It provides a visitor-based codegen pipeline and a small builder API
that can translate ASTs to LLVM IR text or produce runnable executables
via clang.
Status: early but functional. Arithmetic, variables, functions, returns, structured control flow with canonical loop lowering, and a few system-level expressions (e.g.
PrintExpr) are supported.
- ASTx → LLVM IR via multiple-dispatch visitors (plum).
- Back end: IR construction and object emission with llvmlite.
- Native build: links with clang to produce an executable.
- Optional runtime features: native capabilities are feature-gated per compilation unit instead of being linked into every binary.
- PIE-friendly objects: emits PIC-compatible objects by default to work with modern PIE-default linkers.
- Supported nodes (subset; exact ASTx class names):
  - Literals: LiteralInt16, LiteralInt32, LiteralString
  - Variables: Variable, VariableDeclaration, InlineVariableDeclaration
  - Ops: UnaryOp (++, --), BinaryOp (+ - * / < >) with documented scalar numeric promotion and cast rules
  - Flow: IfStmt, WhileStmt, ForCountLoopStmt, ForRangeLoopStmt, BreakStmt, ContinueStmt
  - Functions: FunctionPrototype, Function, FunctionReturn, FunctionCall
  - System: system.PrintExpr (string printing)
- Built-ins: putchar, putchard (emitted as IR); puts declaration when needed.
- Optional native runtimes: libc externs are routed through the runtime feature layer, feature-backed externs can request libm, and Arrow is now available as an optional native runtime feature.
- Python 3.10 – 3.13.
- A recent LLVM/Clang toolchain available on PATH.
- A working C standard library (e.g., system libc) for linking calls like puts.
- Python deps: llvmlite, pytest, etc. (see pyproject.toml / requirements.txt).
  - Note: llvmlite has specific Python/LLVM compatibility windows; see its docs.
git clone https://github.com/arxlang/irx.git
cd irx
conda env create --file conda/dev.yaml
conda activate irx
poetry install
You can also install it from PyPI: pip install pyirx.
More details: https://irx.arxlang.org/installation/
import astx
from irx.builder import Builder
builder = Builder()
module = builder.module()
# int main() { return 0; }
proto = astx.FunctionPrototype("main", astx.Arguments(), astx.Int32())
body = astx.Block()
body.append(astx.FunctionReturn(astx.LiteralInt32(0)))
module.block.append(astx.Function(prototype=proto, body=body))
ir_text = builder.translate(module)
print(ir_text)  # LLVM IR text (str)
translate returns a str with LLVM IR. It does not produce an object file
or binary; use it for inspection, tests, or feeding another tool.
import astx
from irx.builder import Builder
from irx.system import PrintExpr
builder = Builder()
module = builder.module()
# int main() { print("Hello, IRx!"); return 0; }
main_proto = astx.FunctionPrototype("main", astx.Arguments(), astx.Int32())
body = astx.Block()
body.append(PrintExpr(astx.LiteralString("Hello, IRx!")))
body.append(astx.FunctionReturn(astx.LiteralInt32(0)))
module.block.append(astx.Function(prototype=main_proto, body=body))
builder.build(module, "hello") # emits object + links with clang
result = builder.run() # executes ./hello → CommandResult
print(result.stdout)  # "Hello, IRx!"
- Builder (public API)
  - translate(ast) -> str: generate LLVM IR text.
  - build(ast, output_path): emit object via llvmlite and link with clang.
  - run(): execute the produced binary; returns a CommandResult with .stdout, .stderr, .returncode, and .success.
- Visitor (codegen)
  - Uses @dispatch to visit each ASTx node type.
  - Maintains a value stack (result_stack) and symbol table (named_values).
  - Emits LLVM IR with llvmlite.ir.IRBuilder.
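The visitor pattern described above can be illustrated with a small standalone sketch. It is not IRx's Visitor: it uses functools.singledispatchmethod from the standard library as a stand-in for plum's @dispatch, the node classes LiteralInt and Add are hypothetical, and the emitted lines are pseudo-IR strings rather than llvmlite instructions. Only the result_stack idea is taken from the text.

```python
from dataclasses import dataclass
from functools import singledispatchmethod


# Hypothetical stand-ins for ASTx node classes (not the real astx types).
@dataclass
class LiteralInt:
    value: int


@dataclass
class Add:
    lhs: object
    rhs: object


class ToyVisitor:
    """Dispatches on node type and passes results through a value stack."""

    def __init__(self):
        self.result_stack = []  # values produced by visited nodes
        self.lines = []         # emitted pseudo-IR instructions
        self._counter = 0

    @singledispatchmethod
    def visit(self, node):
        # Fallback: reached when no handler is registered for the node type.
        raise NotImplementedError(f"no visitor for {type(node).__name__}")

    @visit.register(LiteralInt)
    def _visit_literal(self, node):
        self.result_stack.append(str(node.value))

    @visit.register(Add)
    def _visit_add(self, node):
        # Visit operands first; each pushes exactly one value.
        self.visit(node.lhs)
        self.visit(node.rhs)
        rhs = self.result_stack.pop()
        lhs = self.result_stack.pop()
        self._counter += 1
        name = f"%t{self._counter}"
        self.lines.append(f"{name} = add i32 {lhs}, {rhs}")
        self.result_stack.append(name)


visitor = ToyVisitor()
visitor.visit(Add(Add(LiteralInt(1), LiteralInt(2)), LiteralInt(3)))
print("\n".join(visitor.lines))
```

The fallback method shows why a handler typed against the wrong class fails at dispatch time: a node whose type has no registered handler never reaches the intended method.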
Loop lowering now follows one canonical control-flow shape per loop form:
- WhileStmt: while.cond -> while.body -> while.exit
- ForCountLoopStmt: for.count.cond -> for.count.body -> for.count.update -> for.count.exit
- ForRangeLoopStmt: for.range.cond -> for.range.body -> for.range.step -> for.range.exit
Semantic invariants:
- break exits the nearest enclosing loop
- continue targets the canonical re-entry block for that loop form
- for-count initializer symbols are loop-scoped and visible only to the loop condition, body, and update
- for-range induction variables are loop-scoped, body-visible, mutable inside the body, and not visible after the loop
- for-range start, end, and step are observed before the first iteration; body mutation of the induction variable feeds the step block
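A minimal sketch of how the block shapes and the break/continue rules fit together. The block names are copied from the list above. The continue targets are an assumption about what "canonical re-entry block" means for each form (condition for while, update for for-count, step for for-range), and LoopStack is a hypothetical helper, not an IRx class.

```python
# Block orders come from the text; continue targets are assumed.
LOOP_SHAPES = {
    "WhileStmt": {
        "blocks": ["while.cond", "while.body", "while.exit"],
        "continue_target": "while.cond",
    },
    "ForCountLoopStmt": {
        "blocks": [
            "for.count.cond",
            "for.count.body",
            "for.count.update",
            "for.count.exit",
        ],
        "continue_target": "for.count.update",
    },
    "ForRangeLoopStmt": {
        "blocks": [
            "for.range.cond",
            "for.range.body",
            "for.range.step",
            "for.range.exit",
        ],
        "continue_target": "for.range.step",
    },
}


class LoopStack:
    """Tracks enclosing loops so break/continue bind to the nearest one."""

    def __init__(self):
        self._forms = []

    def enter(self, loop_form):
        if loop_form not in LOOP_SHAPES:
            raise KeyError(f"unknown loop form: {loop_form}")
        self._forms.append(loop_form)

    def leave(self):
        self._forms.pop()

    def resolve(self, kind):
        """Return the branch target for 'break' or 'continue'."""
        if not self._forms:
            raise ValueError(f"{kind} outside of a loop")
        shape = LOOP_SHAPES[self._forms[-1]]
        if kind == "break":
            return shape["blocks"][-1]  # the exit block is always last
        if kind == "continue":
            return shape["continue_target"]
        raise ValueError(f"unknown jump kind: {kind}")


loops = LoopStack()
loops.enter("ForRangeLoopStmt")
loops.enter("WhileStmt")
print(loops.resolve("break"))  # innermost loop wins
```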
PrintExpr is an astx.Expr holding a LiteralString. Its lowering:
- Create a global constant for the string (with \0).
- GEP to an i8* pointer.
- Declare (or reuse) i32 @puts(i8*).
- Call puts.
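The four steps above can be pictured as IR text. The sketch below builds such text with plain string formatting. It is hand-written for illustration and is not the output of IRx or llvmlite: the global name .str, the value names %ptr and %call, and the helper functions are all assumptions, and the i8* spelling follows the typed-pointer notation used in the text.

```python
def c_string_constant(text):
    """Return (byte length including NUL, LLVM-escaped literal body)."""
    data = text.encode("utf-8") + b"\x00"
    # Printable ASCII is kept as-is, except the double quote and backslash;
    # every other byte is written as a backslash plus two hex digits.
    escaped = "".join(
        chr(b) if 32 <= b < 127 and b not in (0x22, 0x5C) else f"\\{b:02X}"
        for b in data
    )
    return len(data), escaped


def print_ir(text, name=".str"):
    """Render a main() that passes a global string constant to puts."""
    length, body = c_string_constant(text)
    array = f"[{length} x i8]"
    return "\n".join(
        [
            # Step 1: global constant for the string, NUL-terminated.
            f'@{name} = private unnamed_addr constant {array} c"{body}"',
            # Step 3: declaration of puts.
            "declare i32 @puts(i8*)",
            "define i32 @main() {",
            "entry:",
            # Step 2: GEP from the array to an i8* pointer.
            f"  %ptr = getelementptr inbounds {array}, {array}* @{name}, i32 0, i32 0",
            # Step 4: call puts.
            "  %call = call i32 @puts(i8* %ptr)",
            "  ret i32 0",
            "}",
        ]
    )


print(print_ir("Hello, IRx!"))
```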
IRx now has a generic runtime-feature system for native integrations that do not belong as handwritten LLVM container logic.
- Features are registered by name, such as libc, libm, and arrow.
- Features can declare external symbols, native C sources, objects, or static libraries.
- The linker only compiles and links artifacts for features that are active in the current compilation unit.
- This is intentionally separate from any future Arx import/module layer.
Public extern declarations integrate with the same layer:
- plain externs emit an LLVM declaration and rely on the system linker
- externs with runtime_feature / runtime_features activate the named runtime features for that compilation unit
- known feature-owned symbols are declared through the runtime registry instead of a separate ad hoc native path
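The gating behaviour described above can be sketched in a few lines. This is a toy model under stated assumptions, not IRx's registry: RuntimeFeature, FeatureRegistry, CompilationUnit, the link_flags field, and the file name arrow_runtime.c are all hypothetical. It only shows the idea that an extern activates features and that link inputs are collected from active features alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeFeature:
    name: str
    symbols: tuple = ()     # external symbols the feature declares
    c_sources: tuple = ()   # native C sources compiled on demand
    link_flags: tuple = ()  # hypothetical extra linker arguments


class FeatureRegistry:
    def __init__(self):
        self._features = {}

    def register(self, feature):
        if feature.name in self._features:
            raise ValueError(f"duplicate runtime feature: {feature.name}")
        self._features[feature.name] = feature

    def get(self, name):
        if name not in self._features:
            raise KeyError(f"unknown runtime feature: {name}")
        return self._features[name]


class CompilationUnit:
    def __init__(self, registry):
        self.registry = registry
        self.active = []  # activation order, without duplicates

    def declare_extern(self, symbol, runtime_features=()):
        """A plain extern activates nothing; a feature-backed one does."""
        for name in runtime_features:
            self.registry.get(name)  # fail early on unknown names
            if name not in self.active:
                self.active.append(name)

    def link_inputs(self):
        """Collect sources and flags from active features only."""
        sources, flags = [], []
        for name in self.active:
            feature = self.registry.get(name)
            sources.extend(feature.c_sources)
            flags.extend(feature.link_flags)
        return sources, flags


registry = FeatureRegistry()
registry.register(RuntimeFeature("libm", symbols=("sqrt",), link_flags=("-lm",)))
registry.register(RuntimeFeature("arrow", c_sources=("arrow_runtime.c",)))

unit = CompilationUnit(registry)
unit.declare_extern("puts")                              # plain extern
unit.declare_extern("sqrt", runtime_features=("libm",))  # feature-backed
print(unit.link_inputs())
```

Because the unit never requests arrow, its C source does not appear among the link inputs, which mirrors the per-compilation-unit gating described in the list.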
Arrow uses this path as its first substantial consumer:
- native runtime implemented in C under src/irx/builder/runtime/arrow/
- opaque irx_arrow_* handles for schemas, builders, and arrays
- Arrow C Data import/export boundary with explicit copy and move/adopt import modes
- supported primitive storage types: int8, int16, int32, int64, uint8, uint16, uint32, uint64, float32, float64, and bool
- explicit Arrow-side nullability inspection plus a readonly value-buffer bridge into the generic irx_buffer_view substrate for fixed-width numeric arrays
- Python nanoarrow installed by default for interop and tests
- arx-nanoarrow-sources installed by default for native runtime builds
The Arrow layer remains intentionally low-level: handles, lifecycle, inspection, C Data interop, and a conservative buffer/view bridge. IRx still does not encode dataframe semantics, query/table APIs, or direct Arrow containers in LLVM IR.
IRx now treats scalar numerics as a stable substrate instead of an ad hoc "simple promotion" layer:
- one canonical promotion table for signed integers, unsigned integers, and floats
- one canonical implicit-promotion vs explicit-cast policy
- comparisons always resolve to Boolean / LLVM i1
The full contract lives in docs/semantic-contract.md.
IRx now treats callable semantics as a stable semantic contract instead of reconstructing function meaning during lowering:
- every declared or defined callable is normalized into one canonical semantic signature before codegen
- parameter order is semantic and preserved exactly as declared
- IRx-defined functions default to calling convention irx_default
- explicit extern/native declarations default to calling convention c
- lowering preserves the semantic calling-convention classification even when LLVM emission is currently the same
- calls are validated semantically before lowering: callee resolution, arity, narrow extern varargs policy, and canonical implicit argument conversions all happen in one path
- returns are validated semantically before lowering: return expr is only for non-void functions, bare return is only for void functions, and non-void fallthrough is rejected
- void calls may be used as statements, but not as values in assignments, returns, operators, or other expressions
- main is now explicit and deterministic: it must be Int32 main(), it may not be variadic or extern, and it must return along every path
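Two of the rules above, the default calling convention and the shape of main, are simple enough to sketch. The SemanticSignature class, its field names, and the error messages below are hypothetical; only the rules themselves (irx_default vs c, and Int32 main() that is neither variadic nor extern) come from the text. The return-on-every-path check needs control-flow analysis and is not modelled here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SemanticSignature:
    name: str
    params: tuple           # parameter type names, in declared order
    return_type: str
    is_extern: bool = False
    is_variadic: bool = False
    calling_convention: str = ""  # empty means "not set explicitly"


def normalize(sig):
    """Fill in the default calling convention when none was given."""
    if sig.calling_convention:
        return sig
    default = "c" if sig.is_extern else "irx_default"
    return replace(sig, calling_convention=default)


def check_main(sig):
    """Return a list of violations of the main() contract (empty if valid)."""
    errors = []
    if sig.return_type != "Int32":
        errors.append("main must return Int32")
    if sig.params:
        errors.append("main must take no parameters")
    if sig.is_variadic:
        errors.append("main may not be variadic")
    if sig.is_extern:
        errors.append("main may not be extern")
    return errors


good = normalize(SemanticSignature("main", (), "Int32"))
print(good.calling_convention, check_main(good))
```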
The current ASTx surface remains intentionally small. When present, IRx reads
the following FunctionPrototype attributes during semantic predeclaration:
is_extern, calling_convention, is_variadic, symbol_name,
runtime_feature, and runtime_features.
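Reading attributes "when present" can be sketched with hasattr and getattr. The prototype below is a SimpleNamespace stand-in rather than a real astx.FunctionPrototype, both helper functions are hypothetical, and merging the singular and plural feature attributes into one list is an assumption about how the two might be combined.

```python
from types import SimpleNamespace

# Attribute names are taken from the text above.
OPTIONAL_ATTRS = (
    "is_extern",
    "calling_convention",
    "is_variadic",
    "symbol_name",
    "runtime_feature",
    "runtime_features",
)


def read_prototype_attrs(proto):
    """Collect the optional attributes, skipping any that are absent."""
    return {
        attr: getattr(proto, attr)
        for attr in OPTIONAL_ATTRS
        if hasattr(proto, attr)
    }


def required_features(attrs):
    """Merge runtime_feature and runtime_features into one ordered list."""
    names = []
    single = attrs.get("runtime_feature")
    if single:
        names.append(single)
    for name in attrs.get("runtime_features") or ():
        if name not in names:
            names.append(name)
    return names


# Stand-in object; a real prototype would come from astx.
proto = SimpleNamespace(name="sqrt", is_extern=True, runtime_feature="libm")
attrs = read_prototype_attrs(proto)
print(attrs, required_features(attrs))
```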
IRx now exposes one explicit public FFI contract that Arx can target for native scientific libraries:
- explicit extern declarations are the public entrypoint
- PointerType and OpaqueHandleType provide stable public pointer/handle types
- ABI-safe structs are validated semantically before lowering
- symbol-name overrides and runtime-feature dependencies are part of the canonical semantic signature
- plain externs and feature-backed externs share one lowering path and one link/runtime story
Minimal examples:
puts = astx.FunctionPrototype(
"puts",
args=astx.Arguments(astx.Argument("message", astx.UTF8String())),
return_type=astx.Int32(),
)
puts.is_extern = True
puts.calling_convention = "c"
puts.symbol_name = "puts"
sqrt = astx.FunctionPrototype(
"sqrt",
args=astx.Arguments(astx.Argument("value", astx.Float64())),
return_type=astx.Float64(),
)
sqrt.is_extern = True
sqrt.calling_convention = "c"
sqrt.symbol_name = "sqrt"
sqrt.runtime_feature = "libm"
open_handle = astx.FunctionPrototype(
"open_handle",
args=astx.Arguments(),
return_type=astx.OpaqueHandleType("demo_handle"),
)
open_handle.is_extern = True
open_handle.calling_convention = "c"
open_handle.symbol_name = "open_handle"
See docs/semantic-contract.md for the exact admissible FFI type subset and
symbol-resolution rules.
pytest -vv
Example style (simplified):
import astx
from irx.builder import Builder

def test_binary_op_basic():
    builder = Builder()
    module = builder.module()
    # int a = 1; int b = 2;
    decl_a = astx.VariableDeclaration("a", astx.Int32(), astx.LiteralInt32(1))
    decl_b = astx.VariableDeclaration("b", astx.Int32(), astx.LiteralInt32(2))
    a, b = astx.Variable("a"), astx.Variable("b")
    expr = astx.LiteralInt32(1) + b - a * b / a
    proto = astx.FunctionPrototype("main", astx.Arguments(), astx.Int32())
    block = astx.Block()
    block.append(decl_a)
    block.append(decl_b)
    block.append(astx.FunctionReturn(expr))
    module.block.append(astx.Function(proto, block))
    ir_text = builder.translate(module)
    # The generated IR should contain an add instruction.
    assert "add" in ir_text
- Ensure Xcode Command Line Tools are installed: xcode-select --install.
- Verify clang --version works.
- If needed:
  export SDKROOT="$(xcrun --sdk macosx --show-sdk-path)"
- CI note: macOS jobs currently run on Python 3.12 only.
- IRx now requires main to be Int32 main() with a deterministic return on every control-flow path. void main, variadic main, extern main, and non-returning non-void main bodies are rejected before lowering.
- A visitor is missing @dispatch or is typed against a different class than the one instantiated. Ensure signatures match the exact runtime class (e.g., visit(self, node: PrintExpr)).
- Install a recent LLVM/Clang. On Linux, use distro packages.
- On macOS, install Xcode CLT.
- On Windows, ensure LLVM's bin directory is on PATH.
- This usually means your linker is enforcing PIE while the object was compiled with non-PIE relocations.
- Current IRx defaults to PIC-compatible object emission, which should work with PIE-default linkers.
- If you are using an older ARX/IRX stack, update first.
- If you must link externally as a workaround, use:
  clang -no-pie file.o -o program
- Linux & macOS: supported and used in CI.
- Windows: expected to work with a proper LLVM/Clang setup; consider it experimental. builder.run() will execute hello.exe.
- More ASTx coverage (booleans, arrays, structs, varargs/options).
- Richer stdlib bindings (I/O, math).
- Optimization toggles/passes.
- Alternative backends and/or JIT runner.
- Better diagnostics and source locations in IR.
- Expand optional Apache Arrow runtime support: streams, variable-width primitives, and higher-level interop handles.
Please see the contributing guide. Add tests for new features and keep visitors isolated (avoid special-casing derived nodes inside generic visitors).
- LLVM and llvmlite for the IR infrastructure.
- ASTx / ARXLang for the front-end AST.
- Contributors and users experimenting with IRx.
License: BSD-3-Clause. See LICENSE.