Tracing just-in-time compilation

Tracing just-in-time compilation is a technique used by virtual machines to optimize the execution of a program at runtime. This is done by recording a linear sequence of frequently executed operations, compiling them to native machine code and executing them. This is opposed to traditional just-in-time (JIT) compilers that work on a per-method basis.

Overview

Just-in-time compilation is a technique to increase execution speed of programs by compiling parts of a program to machine code at runtime. One way to categorize different JIT compilers is by their compilation scope. Whereas method-based JIT compilers translate one method at a time to machine code, tracing JITs use frequently executed loops as their unit of compilation. Tracing JITs are based on the assumptions that programs spend most of their time in some loops of the program ("hot loops") and subsequent loop iterations often take similar paths. Virtual machines that have a tracing JIT are often mixed-mode execution environments, meaning that they have either an interpreter or a method compiler in addition to the tracing JIT.

Technical details

A tracing JIT compiler goes through various phases at runtime. First, profiling information for loops is collected. After a hot loop has been identified, a special tracing phase is entered, which records all executed operations of that loop. This sequence of operations is called a trace. The trace is then optimized and compiled to machine code (trace). When this loop is executed again, the compiled trace is called instead of the program counterpart.

These steps are explained in detail in the following:

Profiling phase

The goal of profiling is to identify hot loops. This is often done by counting the number of iterations for every loop. After the count of a loop exceeds a certain threshold, the loop is considered to be hot, and tracing phase is entered.

Tracing phase

In the tracing phase the execution of the loop proceeds normally, but in addition every executed operation is recorded into a trace. The recorded operations are typically stored in trace tree, often in an intermediate representation (IR). Tracing follows function calls, which leads to them being inlined into the trace. Tracing continues until the loop reaches its end and jumps back to the start.

Since the trace is recorded by following one concrete execution path of the loop, later executions of that trace can diverge from that path. To identify the places where that can happen, special guard instructions are inserted into the trace. One example for such a place are if statements. The guard is a quick check to determine whether the original condition is still true. If a guard fails, the execution of the trace is aborted.

Since tracing is done during execution, the trace can be made to contain runtime information (e.g. type information). This information can later be used in the optimization phase to increase code efficiency.

Optimization and code-generation phase

Traces are easy to optimize, since they represent only one execution path, which means that no control flow exists and needs no handling. Typical optimizations include constant-subexpression elimination, dead code elimination, register allocation, invariant-code motion, constant folding, and escape analysis.^[1]

After the optimization, the trace is turned into machine code. Similarly to optimization, this is easy due to the linear nature of traces.

Execution

After the trace has been compiled to machine code, it can be executed in subsequent iterations of the loop. Trace execution continues until a guard fails.

History

Whereas the idea of JITs reaches back to the 1960s, tracing JITs have become used more often only recently. The first mention of an idea that is similar to today's idea of tracing JITs was in 1970.^[2] It was observed that compiled code could be derived from an interpreter at run-time by simply storing the actions performed during interpretation.

The first implementation of tracing is Dynamo, "a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor".^[3] To do this, the native instruction stream is interpreted until a "hot" instruction sequence is found. For this sequence an optimized version is generated, cached and executed.

Dynamo was later extended to DynamoRIO. One DynamoRIO-based project was a framework for interpreter construction that combines tracing and partial evaluation. It was used to "dynamically remove interpreter overhead from language implementations".^[4]

In 2006, HotpathVM, the first tracing JIT compiler for a high-level language^{[citation needed]} was developed.^[5] This VM was capable of dynamically identifying frequently executed bytecode instructions, which are traced and then compiled to machine code using static single assignment (SSA) construction. The motivation for HotpathVM was to have an efficient JVM for resource constrained mobile devices.

Another example of a tracing JIT is TraceMonkey, one of Mozilla’s JavaScript implementations for Firefox (2009).^[6] TraceMonkey compiles frequently executed loop traces in the dynamic language JavaScript at run-time and specializes the generated code for the actual dynamic types occurring on each path.

Another project that utilizes tracing JITs is PyPy. It enables the use of tracing JITs for language implementations that were written with PyPy's translation toolchain, thus improving the performance of any program that is executed using that interpreter. This is possible by tracing the interpreter itself, instead of the program that is executed by the interpreter.^[7]

Tracing JITs have also been explored by Microsoft in the SPUR project for their Common Intermediate Language (CIL). SPUR is a generic tracer for CIL, which can also be used to trace through a JavaScript implementation.^[8]

Example of a trace

Consider the following Python program that computes a sum of squares of successive whole numbers until that sum exceeds 100000:

def square(x):
    return x * x

i = 0
y = 0
while True:
    y += square(i)
    if y > 100000:
        break
    i = i + 1

A trace for this program could look something like this:

 loopstart(i1, y1)
 i2 = int_mul(i1, i1)		# i*i
 y2 = int_add(y1, i2)		# y += i*i
 b1 = int_gt(y2, 100000)
 guard_false(b1)
 i3 = int_add(i1, 1)		# i = i+1
 jump(i3, y2)

Note how the function call to square is inlined into the trace and how the if statement is turned into a guard_false.

References

^ Bolz, Carl Friedrich; Cuni, Antonio; FijaBkowski, Maciej; Leuschel, Michael; Pedroni, Samuele; Rigo, Armin (January 2011). "Allocation Removal by Partial Evaluation in a Tracing JIT" (PDF). Proceedings of the 20th ACM SIGPLAN workshop on Partial evaluation and program manipulation. PEPM '11. pp. 43–52. doi:10.1145/1929501.1929508. S2CID 15871223. Retrieved 2020-12-13.
^ Mitchell, James G. (June 29, 1970). The Design and Construction of Flexible and Efficient Interactive Programming Systems (PhD). Carnegie Mellon University. ISBN 978-0-8240-4414-5. LCCN 79050563. OCLC 633313022. S2CID 36249021. Docket AAI7104538. Retrieved 2020-12-13.
^ Bala, Vasanth; Duesterwald, Evelyn; Banerjia, Sanjeev (May 2000). "Dynamo: A Transparent Dynamic Optimization System" (PDF). Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation. PLDI '00. pp. 1–12. doi:10.1145/349299.349303. ISBN 978-1-58113-199-4. S2CID 53223267. Retrieved 2020-12-13.
^ Sullivan, Gregory T.; Bruening, Derek L.; Baron, Iris; Garnett, Timothy; Amarasinghe, Saman (June 2003). "Dynamic Native Optimization of Interpreters" (PDF). Proceedings of the 2003 workshop on Interpreters, virtual machines and emulators. IVME '03. pp. 50–57. CiteSeerX 10.1.1.14.9819. doi:10.1145/858570.858576. ISBN 978-1-58113-655-5. S2CID 509405. Retrieved 2020-12-13.
^ Gal, Andreas; Probst, Christian W.; Franz, Michael (June 2006). "HotpathVM: An Effective JIT Compiler for Resource-constrained Devices" (PDF). Proceedings of the 2nd international conference on Virtual execution environments. VEE '06. pp. 144–153. doi:10.1145/1134760.1134780. ISBN 978-1-59593-332-4. S2CID 17846788. QID 56580114. Retrieved 2020-12-13.
^ Gal, Andreas; Orendorff, Jason; Ruderman, Jesse; Smith, Edwin W.; Reitmaier, Rick; Bebenita, Michael; Chang, Mason; Franz, Michael; Eich, Brendan; Shaver, Mike; Anderson, David; Mandelin, David; Haghighat, Mohammad R.; Kaplan, Blake; Hoare, Graydon; Zbarsky, Boris (June 2009). "Trace-based Just-in-Time Type Specialization for Dynamic Languages" (PDF). Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation. PLDI '09. pp. 465–478. doi:10.1145/1542476.1542528. ISBN 978-1-60558-392-1. S2CID 207172806. Retrieved 2020-12-13.
^ Bolz, Carl Friedrich; Cuni, Antonio; Fijalkowski, Maciej; Rigo, Armin (July 2009). "Tracing the Meta-Level: PyPy's Tracing JIT Compiler" (PDF). Proceedings of the 4th workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems. ICOOOLPS '09. pp. 18–25. doi:10.1145/1565824.1565827. ISBN 978-1-60558-541-3. S2CID 7478596. Retrieved 2020-12-13.
^ Bebenita, Michael; Brandner, Florian; Fahndrich, Manuel; Logozzo, Francesco; Schulte, Wolfram; Tillmann, Nikolai; Venter, Herman (October 2010). "SPUR: A Trace-Based JIT Compiler for CIL" (PDF). Proceedings of the ACM international conference on Object oriented programming systems languages and applications. OOPSLA '10. pp. 708–725. doi:10.1145/1869459.1869517. ISBN 978-1-4503-0203-6. S2CID 3395746. Retrieved 2020-12-13.

External links

Official website of LuaJIT

[BolzCuni2011-1] Bolz, Carl Friedrich; Cuni, Antonio; FijaBkowski, Maciej; Leuschel, Michael; Pedroni, Samuele; Rigo, Armin (January 2011). "Allocation Removal by Partial Evaluation in a Tracing JIT" (PDF). Proceedings of the 20th ACM SIGPLAN workshop on Partial evaluation and program manipulation. PEPM '11. pp. 43–52. doi:10.1145/1929501.1929508. S2CID 15871223. Retrieved 2020-12-13.

[Mitchell1970-2] Mitchell, James G. (June 29, 1970). The Design and Construction of Flexible and Efficient Interactive Programming Systems (PhD). Carnegie Mellon University. ISBN 978-0-8240-4414-5. LCCN 79050563. OCLC 633313022. S2CID 36249021. Docket AAI7104538. Retrieved 2020-12-13.

[BalaDuesterwald2000-3] Bala, Vasanth; Duesterwald, Evelyn; Banerjia, Sanjeev (May 2000). "Dynamo: A Transparent Dynamic Optimization System" (PDF). Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation. PLDI '00. pp. 1–12. doi:10.1145/349299.349303. ISBN 978-1-58113-199-4. S2CID 53223267. Retrieved 2020-12-13.

[SullivanBruening2003-4] Sullivan, Gregory T.; Bruening, Derek L.; Baron, Iris; Garnett, Timothy; Amarasinghe, Saman (June 2003). "Dynamic Native Optimization of Interpreters" (PDF). Proceedings of the 2003 workshop on Interpreters, virtual machines and emulators. IVME '03. pp. 50–57. CiteSeerX 10.1.1.14.9819. doi:10.1145/858570.858576. ISBN 978-1-58113-655-5. S2CID 509405. Retrieved 2020-12-13.

[GalProbst2006-5] Gal, Andreas; Probst, Christian W.; Franz, Michael (June 2006). "HotpathVM: An Effective JIT Compiler for Resource-constrained Devices" (PDF). Proceedings of the 2nd international conference on Virtual execution environments. VEE '06. pp. 144–153. doi:10.1145/1134760.1134780. ISBN 978-1-59593-332-4. S2CID 17846788. QID 56580114. Retrieved 2020-12-13.

[GalOrendorff2009-6] Gal, Andreas; Orendorff, Jason; Ruderman, Jesse; Smith, Edwin W.; Reitmaier, Rick; Bebenita, Michael; Chang, Mason; Franz, Michael; Eich, Brendan; Shaver, Mike; Anderson, David; Mandelin, David; Haghighat, Mohammad R.; Kaplan, Blake; Hoare, Graydon; Zbarsky, Boris (June 2009). "Trace-based Just-in-Time Type Specialization for Dynamic Languages" (PDF). Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation. PLDI '09. pp. 465–478. doi:10.1145/1542476.1542528. ISBN 978-1-60558-392-1. S2CID 207172806. Retrieved 2020-12-13.

[BolzCuni2009-7] Bolz, Carl Friedrich; Cuni, Antonio; Fijalkowski, Maciej; Rigo, Armin (July 2009). "Tracing the Meta-Level: PyPy's Tracing JIT Compiler" (PDF). Proceedings of the 4th workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems. ICOOOLPS '09. pp. 18–25. doi:10.1145/1565824.1565827. ISBN 978-1-60558-541-3. S2CID 7478596. Retrieved 2020-12-13.

[BebenitaBrandner2010-8] Bebenita, Michael; Brandner, Florian; Fahndrich, Manuel; Logozzo, Francesco; Schulte, Wolfram; Tillmann, Nikolai; Venter, Herman (October 2010). "SPUR: A Trace-Based JIT Compiler for CIL" (PDF). Proceedings of the ACM international conference on Object oriented programming systems languages and applications. OOPSLA '10. pp. 708–725. doi:10.1145/1869459.1869517. ISBN 978-1-4503-0203-6. S2CID 3395746. Retrieved 2020-12-13.