-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Mypyc Intermediate Representation (IR)
Mypyc first type checks the code to be compiled using mypy, which produces a mypy AST. The mypy AST isn't very well suited for compilation, so one of the first things mypyc does is to convert it into an intermediate representation (IR) better suited for doing transforms, data flow analyses, code generation, etc.
Each function gets translated into one or more basic blocks. Each basic block has some ops (operations). Ops are simple, designed to be easy to compile into C or transform to other low-level representations.
For example, consider this function:
def f(x: int) -> int:
return x + 1
This will be translated to a single basic block which we can pretty-print like this:
def f(x):
x, r0 :: int
L0:
r0 = CPyTagged_Add(x, 2)
return r0
The basic block has the label L0
. There are two values in the basic blocks, x
and r0
(x
is a register, and r0
is a temporary holding the result of an op).
There are two ops in the basic block -- a call to a primitive function CPyTagged_Add
that adds two integers, and a return op.
Note that integers are represented using a tagged pointer. If the integer value is small, the value is shifted left by one and represented directly. If the value is large, the lower bit is set and the integer is represented by a pointer to a Python int
object. This is why the second argument to CPyTagged_Add
is 2 instead of 1.
Each op can take some values as operands, and it produces a value. There are three categories of value in IR:
- Registers are mutable values, and these are used to represent locals and function arguments. They are also used for some temporaries.
- The result of an op is also a value. These are always initialized exactly once and cannot be mutated. This resembles the SSA form, but we don't use Phi functions.
- Integer literals are also values.
Most function do some branching and have multiple basic blocks. Here is a simple function that has an if statement:
def g(b: bool) -> int:
if b:
return 1
return 2
The generated IR looks like this:
def g(b):
b :: bool
L0:
if b goto L1 else goto L2 :: bool
L1:
return 2
L2:
return 4
The IR has three basic blocks, labeled L0
, L1
and L2
.
Each IR op only performs a single, simple operation. More complex operations are represented as multiple ops. For example, consider this function:
def f(x: int, y: int) -> int:
return x + y + 1
The expression (represented as a tree in the mypy AST) is turned into a sequence of ops:
def f(x, y):
x, y, r0, r1 :: int
L0:
r0 = CPyTagged_Add(x, y)
r1 = CPyTagged_Add(r0, 2)
return r1
The mypyc IR is defined in mypyc.ir
. It covers several key concepts
that are essential to understand by all mypyc contributors:
-
mypyc.ir.ops.Op
is an Abstract Base Class for all IR ops. These are low-level and generally map to simple fragments of C each. Mypy expressions are translated to linear sequences of these ops. -
mypyc.ir.ops.BasicBlock
is a container of a sequence of ops with a branch/goto/return at the end, and no branch/goto/return ops in the middle. Each function is compiled to a bunch of basic blocks. -
mypyc.ir.rtypes.RType
and its subclasses are the types used for everything in the IR. These are lower-level and simpler than mypy or PEP 484 types. For example, there are no general-purpose generic types types here. EachList[X]
type (for anyX
) is represented by a singlelist
type, for example. -
Primitive types are special RTypes of which mypyc has some special understanding, and there are typically some specialized ops. Examples include
int
(referred to asint_rprimitive
in the code) andlist
(list_rprimitive
). Python types for which there is no specific RType type will be represented by the catch-allobject_rprimitive
type. -
Instances of compiled classes are generally represented using the
RInstance
type. Classes are compiled to C extension classes and contain vtables for fast method calls and fast attribute access. -
IR representations of functions and classes live in
mypyc.ir.func_ir
andmypyc.ir.class_ir
, respectively.
Look at the docstrings and comments in mypyc.ir
for additional
information. See the test cases in
mypyc/test-data/irbuild-basic.test
for more examples of what the IR looks
like in a pretty-printed form.