Stage Polymorphism Based on Types for a Typeless Language: MATLAB in LMS
MATLAB is a high-level programming language that is widely used in many domains in engineering and science. Consequently, there has been extensive research in automatically optimizing, interpreting, translating, and compiling MATLAB. However, the lack of type and shape information, the dynamic aspects of the language, and the heavy overloading of functions pose significant challenges in compiling MATLAB for efficient execution. As a result, no existing compiler supports the complete specification, even for a reduced subset of the language.
In this work, we take a novel approach: we stage an evaluator for a subset of the MATLAB language using the Lightweight Modular Staging (LMS) framework to produce a compiler that generates low-level C code. We introduce a stage-polymorphic data structure, which we refer to as a metacontainer, to represent MATLAB tensors together with their type and shape information, i.e., their metadata, and use it to apply high-level analyses and transformations and to subsequently lower computations.
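The idea of a stage-polymorphic container can be illustrated with a minimal Python sketch. It is not the LMS or MGen implementation: the names `Staged`, `MetaContainer`, and `gen_scale` are hypothetical, and the only assumption carried over from the abstract is that metadata such as a dimension may be either known at generation time or deferred to the generated C code.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Staged:
    """A value known only when the generated program runs; holds a C expression."""
    expr: str


# A dimension is either known now (int) or deferred to generated code (Staged).
Dim = Union[int, Staged]


def to_c(d: Dim) -> str:
    """Render a dimension as a C expression."""
    return str(d) if isinstance(d, int) else d.expr


def mul(a: Dim, b: Dim) -> Dim:
    """Multiply two dimensions, folding when both are known at generation time."""
    if isinstance(a, int) and isinstance(b, int):
        return a * b
    return Staged(f"({to_c(a)} * {to_c(b)})")


@dataclass(frozen=True)
class MetaContainer:
    """Hypothetical tensor descriptor: element type, shape, and C array name."""
    dtype: str   # C element type, e.g. "double"
    rows: Dim
    cols: Dim
    data: str    # name of the C array holding the elements

    def numel(self) -> Dim:
        """Number of elements, folded to an int when the shape is fully known."""
        return mul(self.rows, self.cols)


def gen_scale(x: MetaContainer, out: str, factor: str) -> str:
    """Emit a C loop computing out[i] = factor * x[i] over all elements."""
    n = to_c(x.numel())
    return (f"for (int i = 0; i < {n}; i++) "
            f"{out}[i] = {factor} * {x.data}[i];")


static = MetaContainer("double", 3, 4, "a")
dynamic = MetaContainer("double", Staged("a_rows"), 4, "a")
print(gen_scale(static, "b", "2.0"))
print(gen_scale(dynamic, "b", "2.0"))
```

The same generator function serves both cases: with a fully known shape the loop bound folds to a literal, while an unknown dimension leaves a symbolic expression in the emitted code.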
We use metacontainers to “inject” code generation into a high-level intermediate representation (IR) of a MATLAB program, creating constructs that represent shape and type propagation in addition to the existing MATLAB computations. The injected code, coupled with the initial program control flow, is separated from the numeric computations, leading to a simplified IR used for further analysis. Instead of tensor-specific analysis, this allows us to apply generic reaching-definition analysis and constant propagation on integer computations to infer types and dimensions, respectively. Finally, the result of the analysis is applied to specialize each metacontainer with the inferred information, without needing complicated rewrites.
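To illustrate how shape inference can reduce to ordinary integer analysis, the following sketch runs constant propagation over a tiny integer-only shape IR. The IR format, the operation names, and the function `propagate` are assumptions made for this example. It handles only straight-line code, whereas the approach described above also accounts for control flow through reaching-definition analysis.

```python
from typing import Dict, List, Optional, Tuple

# One instruction of a hypothetical integer-only shape IR: (target, op, operands).
Instr = Tuple[str, str, Tuple]

UNKNOWN = None  # lattice value for "not a compile-time constant"


def propagate(program: List[Instr]) -> Dict[str, Optional[int]]:
    """Constant propagation over straight-line integer shape code."""
    env: Dict[str, Optional[int]] = {}
    for target, op, args in program:
        if op == "const":
            env[target] = args[0]
        elif op == "input":  # value supplied only at run time
            env[target] = UNKNOWN
        elif op == "copy":
            env[target] = env[args[0]]
        elif op == "add":
            a, b = env[args[0]], env[args[1]]
            env[target] = UNKNOWN if a is None or b is None else a + b
        else:
            raise ValueError(f"unsupported op: {op}")
    return env


# Shape code that could accompany the MATLAB statements
#   A = zeros(3, 4); B = zeros(4, n); C = A * B; D = [C; C];
shape_ir: List[Instr] = [
    ("A_rows", "const", (3,)), ("A_cols", "const", (4,)),
    ("B_rows", "const", (4,)), ("n", "input", ()), ("B_cols", "copy", ("n",)),
    ("C_rows", "copy", ("A_rows",)), ("C_cols", "copy", ("B_cols",)),
    ("D_rows", "add", ("C_rows", "C_rows")), ("D_cols", "copy", ("C_cols",)),
]

dims = propagate(shape_ir)
print(dims["D_rows"], dims["D_cols"])  # prints: 6 None
```

Here the row count of `D` is resolved to a constant, while its column count stays unknown because it depends on the run-time value `n`; a descriptor for `D` could then be specialized with the known dimension only.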
Once type and shape information is inferred, metacontainers are also used as the primary abstraction for lowering the computation, including type and ISA specialization, through a stage-polymorphic array representation for each tensor. Our prototype MATLAB compiler, MGen, produces static C code that supports MATLAB type-specific numerical computations on all 12 primitive types, including floating-point computation and saturation arithmetic. It handles all dynamic aspects of a subset of the language as well as overloaded computations, and generates correct code with explicit vectorization for Intel SIMD architectures.
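As a reminder of what saturation arithmetic means for MATLAB integer types, the short sketch below clamps results to the representable range instead of wrapping around. The helper names `saturate`, `sat_add`, and `sat_sub` are hypothetical, and only four of the integer types are listed. It models the semantics only and says nothing about how MGen emits the corresponding C code.

```python
# Representable ranges of a few MATLAB integer types.
RANGES = {
    "int8": (-128, 127),
    "uint8": (0, 255),
    "int16": (-32768, 32767),
    "uint16": (0, 65535),
}


def saturate(value: int, dtype: str) -> int:
    """Clamp an exact integer result to the range of the given type."""
    lo, hi = RANGES[dtype]
    return max(lo, min(hi, value))


def sat_add(a: int, b: int, dtype: str) -> int:
    """Saturating addition: compute exactly, then clamp."""
    return saturate(a + b, dtype)


def sat_sub(a: int, b: int, dtype: str) -> int:
    """Saturating subtraction: compute exactly, then clamp."""
    return saturate(a - b, dtype)


print(sat_add(100, 100, "int8"))  # 127, as for int8(100) + int8(100) in MATLAB
print(sat_sub(5, 10, "uint8"))    # 0, as for uint8(5) - uint8(10) in MATLAB
```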
Tiark Rompf and Martin Odersky. Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs. Proc. of the 9th International Conference on Generative Programming and Component Engineering (GPCE), 2010.
Alen Stojanov, Georg Ofenbeck, Tiark Rompf, and Markus Pueschel. Abstracting Vector Architectures in Library Generators: Case Study Convolution Filters. Proc. of the ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (ARRAY), 2014.
Wed 17 Jul