This is the 14th article of a series (table of contents) about compiler development with LLVM using OCaml. We intend to develop a compiler for a subset of OCaml large enough to allow our compiler to compile itself.
In this article, we improve the compilation of
As unit values are fully determined by their type alone, there is no need to physically store or return them, or to pass them as arguments.
In the current photon compiler, we use a 1-bit integer type to represent unit values. To avoid needlessly storing unit values many times, we use a single .unit constant for this purpose. Unit arguments are also ignored, and when unit is the return type of a function, it is replaced by
Besides relying on a dubious unit-typed constant, our code has a serious bug regarding unit arguments: they are ignored entirely, i.e. not even evaluated. Here is an example of the surprising behaviour:
    # extern puts : string -> unit = "puts";;
    extern puts : string -> unit = "puts"
    # let f (u : unit) : unit = puts "f called";;
    val f : unit -> unit = <fun>
    # f (puts "Hello");;
    f called
    - : unit = ()
Contrary to what one might expect, no “Hello” message is printed.
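For comparison, here is a minimal sketch of the same session in standard OCaml, where the argument is evaluated before the call. The say function and the out buffer are hypothetical stand-ins for puts, used only so that the output can be inspected; they are not part of the photon compiler.

```ocaml
(* [out] collects messages; [say] is a hypothetical stand-in for puts. *)
let out = Buffer.create 32

let say (s : string) : unit =
  Buffer.add_string out s;
  Buffer.add_char out '\n'

(* Same shape as the function [f] of the session above. *)
let f (_u : unit) : unit = say "f called"

(* The unit argument is evaluated first, then the body of [f] runs. *)
let () = f (say "Hello")

let () = print_string (Buffer.contents out)
```

Here both messages are recorded, "Hello" first, which is the behaviour our compiler should reproduce.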
Rather than using a 1-bit integer type for unit, we will use the
Function return types no longer need any special treatment, as they will already be converted to
When unit values are needed, we will use undefined values through LLVM's undef. Such values cannot be used meaningfully, but we never use them anyway. If we ever do, an error will quickly remind us, which is better than blindly mishandling 1-bit values.
As unit values are now direct LLVM values rather than pointers to the .unit global constant, we modify Llvm_utils.is_function to accept non-pointer arguments. We also add an is_pointer function to distinguish references from direct values. The variable lookup procedure is now as follows:
    (* Look up variable [id]. *)
    let lookup_var comp id =
      try Hashtbl.find comp.args id
      with Not_found ->
        let g = Hashtbl.find comp.globals id in
        if is_function_pointer g then g
        else if is_global_constant g then global_initializer g
        else if is_pointer g then build_load g id comp.builder
        else g
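The dispatch order of this lookup can be illustrated without LLVM. The following is only a sketch under stated assumptions: the value type and its constructors are hypothetical stand-ins for LLVM values, and the record type merely mimics the two tables used above.

```ocaml
(* Hypothetical stand-in for LLVM values; not the photon compiler's types. *)
type value =
  | Function of string     (* function pointer: returned as is *)
  | Constant of int        (* global constant: replaced by its initializer *)
  | Reference of int ref   (* pointer to storage: must be loaded *)
  | Direct of int          (* direct value: returned as is *)

type comp = {
  args : (string, value) Hashtbl.t;
  globals : (string, value) Hashtbl.t;
}

(* Arguments take precedence; globals are dispatched on their kind. *)
let lookup_var comp id =
  try Hashtbl.find comp.args id
  with Not_found ->
    match Hashtbl.find comp.globals id with
    | Function _ as g -> g
    | Constant n -> Direct n
    | Reference r -> Direct !r
    | Direct _ as g -> g

let comp = { args = Hashtbl.create 8; globals = Hashtbl.create 8 }

let () =
  Hashtbl.add comp.globals "g" (Function "g");
  Hashtbl.add comp.globals "c" (Constant 7);
  Hashtbl.add comp.globals "x" (Reference (ref 42));
  Hashtbl.add comp.globals "u" (Direct 0);
  (* An argument named "x" hides the global of the same name. *)
  Hashtbl.add comp.args "x" (Direct 1)
```

The last case of the match corresponds to the new final else branch: a direct value, such as a unit placeholder, is returned untouched instead of being loaded.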
Unit arguments are still filtered out of function calls, but now only after being compiled. We compile all arguments (whether unit-typed or not) from left to right. Although one should not rely on this evaluation order, it is the most intuitive one for code that relies on it anyway.
    let rec compile comp ast =
      match ast.node with
      ...
      | Call(f, args) ->
        (* Compile arguments, discarding unit-typed ones once compiled *)
        let handle_arg args arg =
          let arg' = compile comp arg in
          if type_of_ast arg = Unit then args else arg' :: args in
        let args = List.rev (List.fold_left handle_arg [] args) in
        (* Compile and call function with compiled and filtered arguments *)
        let f' = compile comp f in
        build_call f' (Array.of_list args) "" comp.builder
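The compile-then-filter step can be tried in isolation. The sketch below assumes a hypothetical two-constructor AST and a stand-in compile function that merely records what it would emit; none of these names come from the photon compiler.

```ocaml
(* Hypothetical miniature AST; the real photon AST is richer. *)
type ty = Unit | Int

type ast =
  | Lit of int        (* integer literal, of type Int *)
  | Print of string   (* side-effecting expression, of type Unit *)

let type_of_ast = function Lit _ -> Int | Print _ -> Unit

(* Records the "emitted" code, most recent first. *)
let emitted : string list ref = ref []

(* Stand-in for [compile]: emits code and returns a value name. *)
let compile arg =
  match arg with
  | Lit n ->
    emitted := Printf.sprintf "const %d" n :: !emitted;
    string_of_int n
  | Print s ->
    emitted := Printf.sprintf "call puts %s" s :: !emitted;
    "undef"

(* Compile every argument left to right, then drop the unit-typed ones. *)
let compile_args args =
  let handle_arg acc arg =
    let arg' = compile arg in
    if type_of_ast arg = Unit then acc else arg' :: acc in
  List.rev (List.fold_left handle_arg [] args)

let kept = compile_args [Lit 1; Print "Hello"; Lit 2]
let order = List.rev !emitted
```

In this sketch kept holds only the two integer arguments, while order still contains the call to puts between them: the unit argument is compiled for its side effect and only then dropped from the call.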
Our example session now runs fine: