One Compiler. Christian Wimmer, VM Research Group, Oracle Labs. Copyright 2016, Oracle and/or its affiliates. All rights reserved.

Transcript
One Compiler
Christian Wimmer, VM Research Group, Oracle Labs

Safe Harbor Statement
The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.

Typical Stack of Java HotSpot VM Running Nashorn
(Diagram of a thread stack.)
- Frames shown: VM startup code, startup Java code, JNI native code, startup JavaScript code, hot Java code, hot JavaScript code, deoptimized JavaScript code, VM runtime code
- Stack frame layout: bytecode interpreter, client compiler, server compiler, native code
- Source of code: Java code (bytecode from .java file), JavaScript (dynamically generated bytecode), Java HotSpot VM, JDK native code
How do you find all the GC root pointers?

Duplication: Everything Implemented Three Times
Aspect | Bytecode interpreter | Compiled bytecode | Native code (C/C++)
Stack frame layout | Close to JVM spec | Spill slots | Unspecified
Stack frame size | Variable | Fixed per method | Unknown
Root pointers for GC | Bytecode liveness (expensive to compute) | Pointer map from compiler | Explicit handles (error prone)
Exception handling | Interpret metadata | Compiled in (mostly) | Explicit checks (error prone)
Porting to new architecture | Write assembly code | Write client compiler and server compiler backends | Write gcc backend
Debugging | Java debugger | Java debugger | gdb

Truffle System Structure
- AST interpreter for every language: JavaScript, R, Ruby, LLVM, "Your language should be here!"
- Common API separates language implementation, optimization system, and tools (debugger)
- Layers: Tools, Truffle, Graal (language-agnostic dynamic compiler)
- Graal VM: integrate with Java applications
- Substrate VM: low-footprint VM, also suitable for embedding

Graal and Truffle Tutorials
https://wiki.openjdk.java.net/display/graal/publications+and+presentations

Speculate and Optimize
(Diagram: AST interpreter with uninitialized nodes -> node specialization for profiling feedback -> AST interpreter with specialized nodes -> compilation using partial evaluation -> compiled code.)
Node transitions, legend: U = Uninitialized, I = Integer, S = String, D = Double, G = Generic

and Transfer to Interpreter and Reoptimize!
(Diagram: transfer back to AST interpreter -> node specialization to update profiling feedback -> recompilation using partial evaluation.)

Performance: Graal VM
(Chart: speedup, higher is better. Series: Graal, Best Specialized Competition. Categories: Java, Scala, Ruby, R, Native, JavaScript.)
Performance relative to: HotSpot/Server, HotSpot/Server running JRuby, GNU R, LLVM AOT compiled, V8

Possible Stack of Java HotSpot VM Running Truffle
(Diagram of a thread stack.)
- Frames shown: VM startup code, startup Java code, JNI native code, startup JavaScript code, hot Java code, hot JavaScript code, deoptimized JavaScript code, VM runtime code
- Stack frame layout: bytecode interpreter, Graal compiler, native code
- Source of code: Java code (bytecode from .java file), Java HotSpot VM, JDK native code
Our default configuration of Truffle still uses the Client and Server compilers for Java code.

The Substrate VM is an embeddable VM for, and written in, a subset of Java, optimized to execute Truffle languages, ahead-of-time compiled using Graal, and integrating with native development tools.
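The node transitions on the "Speculate and Optimize" slides (Uninitialized to Integer to Generic) can be illustrated with a small self-rewriting node. This is a minimal sketch in plain Java and does not use the Truffle API: the names SpecializationSketch, AddNode, State, and generic are hypothetical, chosen only for illustration, and integer overflow is ignored.

```java
public class SpecializationSketch {
    // Hypothetical node states, following the slide legend (U, I, G).
    enum State { UNINITIALIZED, INTEGER, GENERIC }

    /** A hypothetical "add" node that specializes itself on the operand types it observes. */
    static final class AddNode {
        State state = State.UNINITIALIZED;

        Object execute(Object left, Object right) {
            switch (state) {
                case INTEGER:
                    if (left instanceof Integer && right instanceof Integer) {
                        // Fast path: the speculation "both operands are integers" holds.
                        return (Integer) left + (Integer) right;
                    }
                    // Speculation failed: rewrite to the generic node and retry there.
                    state = State.GENERIC;
                    return generic(left, right);
                case GENERIC:
                    return generic(left, right);
                default:
                    // UNINITIALIZED: pick a specialization from the first observed operands.
                    boolean ints = left instanceof Integer && right instanceof Integer;
                    state = ints ? State.INTEGER : State.GENERIC;
                    return execute(left, right);
            }
        }

        // Slow path that handles every operand combination this sketch supports.
        private static Object generic(Object left, Object right) {
            if (left instanceof Integer && right instanceof Integer) {
                return (Integer) left + (Integer) right;
            }
            if (left instanceof Number && right instanceof Number) {
                return ((Number) left).doubleValue() + ((Number) right).doubleValue();
            }
            return String.valueOf(left) + right;
        }
    }

    public static void main(String[] args) {
        AddNode node = new AddNode();
        System.out.println(node.execute(1, 2) + " " + node.state);
        System.out.println(node.execute(1.5, 2) + " " + node.state);
    }
}
```

In this sketch the state only moves forward (UNINITIALIZED, then INTEGER, then GENERIC), so the number of rewrites per node is bounded. A compiler that speculates on the INTEGER state would need to transfer back to the interpreter at the point where the sketch switches to GENERIC.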
Typical Stack of Substrate VM Running Truffle
(Diagram of a thread stack.)
- Frames shown: VM startup code, startup Java code, SystemJava code, startup JavaScript code, hot Java code, hot JavaScript code, deoptimized JavaScript code, VM runtime code
- Stack frame layout: Graal compiler
- Source of code: Java code (bytecode from .java file)
- Substrate VM runtime is written in Java
- Same compiler for ahead-of-time compiled Java code and dynamically compiled Truffle AST
- Transfer to the AST interpreter (deoptimization) targets Graal compiled code with extra deoptimization entry points

Substrate VM: Execution Model
(Diagram: Truffle language, JDK, and Substrate VM -> points-to analysis -> ahead-of-time compilation -> ELF / MachO binary containing machine code, initial heap, and DWARF info.)
- Input: all Java classes from the Truffle language (or any application), JDK, and Substrate VM
- After analysis: reachable methods, fields, and classes
- Result: application running without dependency on JDK and without Java class loading

Substrate VM Building Blocks
- Reduced runtime system, all written in Java
  - Stack walking, exception handling, garbage collector, deoptimization
  - Graal for ahead-of-time compilation and dynamic compilation
- Points-to analysis
  - Closed-world assumption: no dynamic class loading, no reflection
  - Using Graal for bytecode parsing
  - Fixed-point iteration: propagate type states through methods
- SystemJava for integration with C code
  - Machine-word sized value, represented as Java interface, but unboxed by compiler
  - Import of C functions and C structs to Java
- Substitutions for JDK methods that use unsupported features
  - JNI code replaced with SystemJava code that directly calls the C library

Key Features of Graal
- Designed for speculative optimizations and deoptimization
  - Metadata for deoptimization is propagated through all optimization phases
- Designed for exact garbage collection
  - Read/write barriers, pointer maps for garbage collector
- Aggressive high-level optimizations
  - Example: partial escape analysis
- Modular architecture
  - Configurable compiler phases
  - Compiler-VM separation: snippets, provider interfaces
- Written in Java to lower the entry barrier
  - Graal compiling and optimizing itself is also a good optimization opportunity

Deoptimization

Deoptimization
- Transfer from optimized machine code back to unoptimized code
- Enables speculative optimizations
- Optimized code does not need to deal with corner cases
  - No control flow merges from slow-path code back into the fast path
  - More potential for optimizations
- Optimized code does not need to check assumptions
  - Instead, it gets invalidated externally when an assumption is no longer valid
- Speculative optimizations are essential for optimizing dynamic languages
  - Speculate on JavaScript type stability
  - Speculate that Ruby operators for primitive types are not changed by the program
  - Polymorphic inline caches for function calls, property accesses, ...

Deoptimization on HotSpot VM
Mapping from optimized to bytecode interpreter frames
(Diagram: the method to deoptimize is f1 with f2 and f3 inlined, one frame on the physical stack. The target methods are f1() at bci 42, f2(), and f3(), shown as Java bytecode frames with fixed layout that match the physical stack.)

Deoptimization on Substrate VM
Mapping from optimized to unoptimized stack frames
(Diagram: the method to deoptimize is f1 with f2 and f3 inlined, one frame on the physical stack. The target methods are f1() at bci 42, f2() at bci 7, and f3() at bci 11; the Java bytecode frames are matched to frames on the physical stack.)

Deoptimization on Substrate VM
- Source and target are Graal compiled frames
  - Both have metadata that describes the layout with respect to the JVM specification
  - Stack frame location of all used local variables and expression stack elements
  - Source and target describe the same bytecode index (bci), i.e., a matching state
- Source is a fully optimized Graal frame
  - Method inlining: multiple target frames for one source frame
  - Escape analysis: virtual objects that are re-allocated during deoptimization
  - Global value numbering: elimination of duplicate computations
- Targets are Graal frames with limited optimizations
  - No method inlining: multiple target frames restored when source frame has inlined methods
  - No escape analysis: all objects are re-allocated during deoptimization
  - Limited value numbering: only values in Java frame state can be live across a deoptimization entry point

Example: Graal IR for Deoptimization
Java source code:
    public class BasicDeoptTest {
        static int field;

        static int proxyNeeded(int x) {
            field = x * 2;
            return x * 2;
        }
    }
Graal IR for optimized compilation: (graph)
Graal IR for compilation with deoptimization entry points: (graph)
- Two explicit DeoptEntry points
- No elimination of second multiplication

SystemJava

SystemJava
(Diagram: preexisting C code, legacy C code integration, new SystemJava code, legacy Java code integration, preexisting Java code, call Java from C.)
- Legacy C code integration
  - Need a convenient way to access preexisting C functions and structures
  - Example: libc, database
- Legacy Java code integration
  - Leverage preexisting Java libraries
  - Patch violations of our reduced Java rules
  - Example: JDK class library
- Call Java from C code
  - Entry points into our Java code

SystemJava vs. JNI
- Java Native Interface (JNI)
  - Write custom C code to integrate existing C code with Java
  - C code knows about Java types
  - Java objects passed to C code using handles
- SystemJava
  - Write custom Java code to integrate existing C code with Java
  - Java code knows about C types
  - No need to pass Java objects to C code

Word type for low-level memory access
- Requirements
  - Support raw memory access and pointer arithmetic
  - No extension of the Java programming language
  - Pointer type modeled as a class to prevent mixing with, e.g., long
  - Transparent bit width (32 bit or 64 bit) in code using it
- Base interface Word
  - Looks like an object to the Java IDE, but is a primitive value at run time
  - Graal does the transformation
- Subclasses for type safety
  - Pointer: C equivalent void*
  - Unsigned: C equivalent size_t
  - Signed: C equivalent ssize_t

    public static Unsigned strlen(CharPointer str) {
        Unsigned n = Word.zero();
        while (str.read(n) != 0) {
            n = n.add(1);
        }
        return n;
    }

Java Annotations to Import C
Java side (the annotations and parts of the declarations are missing from the transcript):
    static native int clock_gettime(int clock_id, timespec [...]
    static native int [...]
    interface timespec extends PointerBase { long [...] long tv_nsec(); }
    interface CIntPointer extends PointerBase { int read(); void write(int value); }
C side (#include <time.h>, link with -lrt):
    int clock_gettime(clockid_t clock_id, struct timespec *tp);
    #define CLOCK_MONOTONIC 1
    struct timespec { time_t tv_sec; syscall_slong_t tv_nsec; };
    int* pint;
    int** ppint;
Implementation of System.nanoTime() using SystemJava:
    static long nanoTime() {
        timespec tp = StackValue.get(SizeOf.get(timespec.class));
        clock_gettime(CLOCK_MONOTONIC(), tp);
        return tp.tv_sec() * 1_000_000_000L + tp.tv_nsec();
    }

Points-To Analysis

Graal as a Static Analysis Framework
- Graal and the hosting Java VM provide
  - Class loading (parse the class file)
  - Access the bytecodes of a method
  - Access to the Java type hierarchy, type checks
  - Build a high-level IR graph in SSA form
  - Linking / method resolution of method calls
- Static points-to analysis and compilation use the same intermediate representation
  - Simplifies applying the analysis results for optimizations
- Goals of points-to analysis
  - Identify all methods reachable from a root method
  - Identify the types assigned to each field
  - Identify all instantiated types
- Fixed point iteration of type flows: types are propagated from sources (allocations) to usages

Example Type Flow Graph
    Object f;

    void foo() {
        allocate();
        bar();
    }

    Object allocate() {
        f = new Point();
    }

    int bar() {
        return f.hashCode();
    }
(Type flow graph: in allocate, new Point -> [Point] -> putfield f. Field f has type state [Point]. In bar, getfield f -> [Point] -> obj of vcall hashCode -> this of Point.hashCode.)
Analysis is context insensitive: one type state per field

Example Type Flow Graph
Same code, with an additional assignment to the field:
    f = "abc";
(Type flow graph: new Point contributes [Point] and the string constant contributes [String] to field f, whose type state becomes [Point, String]. In bar, getfield f -> [Point, String] -> obj of vcall hashCode -> this of Point.hashCode and this of String.hashCode.)
Analysis is context insensitive: one type state per field

Results

Microbenchmark for Startup and Peak Performance (1)
    function benchmark(n) {
      var obj = {i: 0, result: 0};
      while (obj.i <= n) {
        obj.result = obj.result + obj.i;
        obj.i = obj.i + 1;
      }
      return obj.result;
    }
- Function benchmark is invoked in a loop by harness (0 to iterations)
- n is fixed for all iterations

JavaScript VM | Version | Command Line Flags
Google V8 | Version [...] | [none]
Mozilla Spidermonkey | Version JavaScript-C45.0a1 | [none]
Nashorn | JDK 8 update 60, build 1.8.0_60-b27 | -J-Xmx256M
Truffle on HotSpot VM | graal-js changeset a fd1e from Nov 30, 2015; graal-enterprise changeset f47fff503e49 from Nov 30, 2015 | -J-Xmx256M
Truffle on Substrate VM | substratevm changeset 45c61d192d43 from Dec 1, 2015; graal-enterprise changeset d8ee392c83e3 from Nov 21, [...] | [none]

Microbenchmark for Startup and Peak Performance (2)
(Charts: execution time in seconds over iterations, with markers for background compilation and for background compilation finished, and memory footprint in MByte. Systems compared: Google V8, Mozilla Spidermonkey, Nashorn JDK 8u60, Truffle on HotSpot VM, Truffle on Substrate VM.)

Summary
- Substrate VM uses a One Compiler approach
  - For ahead-of-time compilation and dynamic compilation
  - For all levels: Java, SystemJava, JavaScript, all other Truffle languages
  - For deoptimization entry points
  - For static points-to analysis
- Graal is flexible enough to support all these use cases
  - Snippets for compiler-VM separation
  - Configuration of phases

Acknowledgements
Oracle: Danilo Ansaloni, Stefan Anzinger, Cosmin Basca, Daniele Bonetta, Matthias Brantner, Petr Chalupa, Jürgen Christ, Laurent Daynès, Gilles Duboscq, Martin Entlicher, Bastian Hossbach, Christian Humer, Mick Jordan, Vojin Jovanovic, Peter Kessler, David Leopoldseder, Kevin Menard, Jakub Podlešák, Aleksandar Prokopec, Tom Rodriguez, Roland Schatz, Chris Seaton, Doug Simon, Štěpán Šindelář, Zbyněk Šlajchrt, Lukas Stadler, Codrut Stancu, Jan Štola, Jaroslav Tulach, Michael Van De Vanter, Adam Welc, Christian Wimmer, Christian Wirth, Paul Wögerer, Mario Wolczko, Andreas Wöß, Thomas Würthinger
Oracle Interns: Brian Belleville, Miguel Garcia, Shams Imam, Alexey Karyakin, Stephen Kell, Andreas Kunft, Volker Lanting, Gero Leinemann, Julian Lettner, Joe Nash, David Piorkowski, Gregor Richards, Robert Seilbeck, Rifat Shariyar
Alumni: Erik Eckstein, Michael Haupt, Christos Kotselidis, Hyunjin Lee, David Leibs, Chris Thalinger, Till Westmann
JKU Linz: Prof. Hanspeter Mössenböck, Benoit Daloze, Josef Eisl, Thomas Feichtinger, Matthias Grimmer, Christian Häubl, Josef Haider, Christian Huber, Stefan Marr, Manuel Rigger, Stefan Rumzucker, Bernhard Urban
University of Edinburgh: Christophe Dubach, Juan José Fumero Alfonso, Ranjeet Singh, Toomas Remmelg
LaBRI: Floréal Morandat
University of California, Irvine: Prof. Michael Franz, Gulfem Savrun Yeniceri, Wei Zhang
Purdue University: Prof. Jan Vitek, Tomas Kalibera, Petr Maj, Lei Zhao
T. U. Dortmund: Prof. Peter Marwedel, Helena Kotthaus, Ingo Korb
University of California, Davis: Prof. Duncan Temple Lang, Nicholas Ulle
University of Lugano, Switzerland: Prof. Walter Binder, Sun Haiyang, Yudi Zheng
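The fixed-point iteration of type flows described in the points-to analysis slides can be sketched with a worklist. This is an illustrative sketch only and not the Substrate VM implementation: the names TypeFlowSketch, Flow, propagate, and receiverTypesOfHashCodeCall are hypothetical, types are represented by their names, and the graph is the field example from the type flow slides (one type state for field f, fed by new Point and by a String constant).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TypeFlowSketch {
    /** A hypothetical type-flow node: a type state (set of type names) plus edges to its uses. */
    static final class Flow {
        final String name;
        final Set<String> types = new TreeSet<>();
        final List<Flow> uses = new ArrayList<>();

        Flow(String name, String... initialTypes) {
            this.name = name;
            types.addAll(Arrays.asList(initialTypes));
        }

        void addUse(Flow use) {
            uses.add(use);
        }
    }

    /** Propagates type states from sources to uses until no type state grows (fixed point). */
    static void propagate(Collection<Flow> sources) {
        Deque<Flow> worklist = new ArrayDeque<>(sources);
        while (!worklist.isEmpty()) {
            Flow flow = worklist.poll();
            for (Flow use : flow.uses) {
                // addAll returns true only if the use's type state changed,
                // so a flow is revisited only when it learned a new type.
                if (use.types.addAll(flow.types)) {
                    worklist.add(use);
                }
            }
        }
    }

    /** Builds the flows for field f and returns the receiver type state of f.hashCode(). */
    static Set<String> receiverTypesOfHashCodeCall() {
        Flow newPoint = new Flow("new Point", "Point");
        Flow stringConstant = new Flow("\"abc\"", "String");
        Flow fieldF = new Flow("field f");          // context insensitive: one state per field
        Flow getfieldF = new Flow("getfield f");
        Flow receiver = new Flow("vcall hashCode receiver");

        newPoint.addUse(fieldF);
        stringConstant.addUse(fieldF);
        fieldF.addUse(getfieldF);
        getfieldF.addUse(receiver);

        propagate(Arrays.asList(newPoint, stringConstant));
        return receiver.types;
    }

    public static void main(String[] args) {
        System.out.println(receiverTypesOfHashCodeCall()); // prints [Point, String]
    }
}
```

The loop terminates because type states only grow and the set of types is finite. With the receiver type state [Point, String], an analysis in this style would treat both Point.hashCode and String.hashCode as reachable call targets.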