This is the 14th article of a series (table of contents) about compiler development with LLVM using OCaml. We intend to develop a compiler for a subset of OCaml large enough to allow our compiler to compile itself.
In this article, we improve the compilation of unit values.
We already discussed the Unit type when we considered external declarations. This type can only represent a single value, also named unit and denoted () in OCaml.
As a unit value is fully determined by its type, there is no need to physically store or return unit values, or to pass them as arguments.
In the current photon compiler, we use a 1-bit integer type to represent unit values. To avoid needlessly storing unit values many times, we use a single .unit constant for this purpose. Unit arguments are also ignored, and when unit is the return type of a function, it is replaced by void.
Besides relying on a dubious unit-typed constant, our code has a serious bug regarding unit arguments: they are ignored entirely, i.e. not even evaluated. Here is an example of the surprising behaviour:
    # extern puts : string -> unit = "puts";;
    extern puts : string -> unit = "puts"
    # let f (u : unit) : unit = puts "f called";;
    val f : unit -> unit = <fun>
    # f (puts "Hello");;
    f called
    - : unit = ()
Contrary to what one would expect, no “Hello” message is printed.
Rather than using a 1-bit integer type for unit, we will use void_type.
Function return types no longer need any special treatment, since llvm_type_of already converts unit to void_type.
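To make the type mapping concrete, here is a minimal sketch that needs no LLVM bindings. It is not the actual llvm_type_of of the photon compiler: the type constructors, the name llvm_type_name, the choice of i32 for integers and the textual rendering of LLVM types are all assumptions made for illustration.

```ocaml
(* Toy model of source types; the constructor names are assumptions. *)
type ty =
  | Unit
  | Bool
  | Int
  | String
  | Fun of ty list * ty

(* Textual stand-in for an LLVM type (hypothetical helper). *)
let rec llvm_type_name = function
  | Unit -> "void"
  | Bool -> "i1"
  | Int -> "i32" (* assumed integer width *)
  | String -> "i8*"
  | Fun (params, ret) ->
      (* Unit parameters carry no information, so they are dropped from
         the signature; a unit return type simply maps to void. *)
      let kept = List.filter (fun p -> p <> Unit) params in
      let params' = String.concat ", " (List.map llvm_type_name kept) in
      Printf.sprintf "%s (%s)" (llvm_type_name ret) params'
```

With this sketch, a function of type unit -> unit is rendered as "void ()", and the unit return type needs no dedicated case in the function branch.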
When a unit value is needed, we will use an undefined value, obtained through LLVM's undef. Such values must of course never actually be used, and we do not intend to use them. Should we do so by mistake, an error will quickly remind us, which is better than silently mishandling 1-bit values.
As unit values are now direct LLVM values rather than pointers to the .unit global constant, we modify Llvm_utils.is_function to accept non-pointer arguments. We also add an is_pointer function to discriminate between references and direct values. The new variable lookup procedure is as follows:
    (* Look up variable [id]. *)
    let lookup_var comp id =
      try Hashtbl.find comp.args id
      with Not_found ->
        let g = Hashtbl.find comp.globals id in
        if is_function_pointer g then g
        else if is_global_constant g then global_initializer g
        else if is_pointer g then build_load g id comp.builder
        else g
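The four cases of this lookup can be tried in isolation with a toy model that replaces LLVM values with a variant type. This is only a sketch under stated assumptions: the value constructors and the reduced comp record are hypothetical and stand in for what the real predicates (is_function_pointer, is_global_constant, is_pointer) would detect on actual LLVM values.

```ocaml
(* Toy stand-in for LLVM values; all constructor names are hypothetical. *)
type value =
  | Function_pointer of string (* pointer to a function: returned as is *)
  | Global_constant of value   (* constant global, with its initializer *)
  | Pointer of value ref       (* reference that must be loaded *)
  | Direct of string           (* direct value, such as undef for unit *)

(* Reduced compiler state: only the two tables used by the lookup. *)
type comp = {
  args : (string, value) Hashtbl.t;
  globals : (string, value) Hashtbl.t;
}

(* Look up variable [id]: arguments first, then globals. *)
let lookup_var comp id =
  try Hashtbl.find comp.args id
  with Not_found ->
    match Hashtbl.find comp.globals id with
    | Function_pointer _ as f -> f
    | Global_constant init -> init
    | Pointer cell -> !cell
    | Direct _ as v -> v
```

The last case is the new one: a direct value, such as the undefined value standing for unit, is returned untouched instead of being loaded through a pointer.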
Unit arguments are still filtered out of function calls, but only after being compiled. We compile all arguments (whether unit-typed or not) from left to right. Although one should not rely on this evaluation order, it is the most intuitive one for those who rely on it anyway.
    let rec compile comp ast =
      match ast.node with
      ...
      | Call(f, args) ->
          (* Compile arguments, discarding unit-typed ones once compiled *)
          let handle_arg args arg =
            let arg' = compile comp arg in
            if type_of_ast arg = Unit then args else arg' :: args in
          let args = List.rev (List.fold_left handle_arg [] args) in
          (* Compile and call function with compiled and filtered arguments *)
          let f' = compile comp f in
          build_call f' (Array.of_list args) "" comp.builder
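The effect of compiling first and filtering afterwards can be checked with a small stand-alone sketch. Everything in it is an assumption made for illustration: the toy AST, the log used to record side effects, and a compile function that simply evaluates its argument instead of emitting LLVM instructions.

```ocaml
(* Toy types and AST; every name here is hypothetical. *)
type ty = Unit | String

type ast =
  | Str of string   (* string literal *)
  | Print of string (* unit-typed expression with a side effect *)

let type_of_ast = function
  | Str _ -> String
  | Print _ -> Unit

(* Side effects are recorded in a log (most recent first). *)
let log : string list ref = ref []

(* "Compiling" an argument here simply evaluates it. *)
let compile = function
  | Str s -> s
  | Print msg -> log := msg :: !log; "undef"

(* Compile every argument from left to right, then drop unit-typed ones. *)
let compile_args args =
  let handle_arg acc arg =
    let arg' = compile arg in
    if type_of_ast arg = Unit then acc else arg' :: acc
  in
  List.rev (List.fold_left handle_arg [] args)
```

In this sketch, List.fold_left visits the arguments from left to right, so the side effects of unit-typed arguments happen in source order even though their values never reach the call. The final List.rev restores the order of the remaining arguments, since the fold accumulates them in reverse.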
Our example session now runs fine:
    # extern puts : string -> unit = "puts";;
    extern puts : string -> unit = "puts"
    # let f (u : unit) : unit = puts "f called";;
    val f : unit -> unit = <fun>
    # f (puts "Hello");;
    Hello
    f called
    - : unit = ()
The code accompanying this article is available in archive photon-tut-14.tar.xz or through the git repository:
    git clone http://git.legiasoft.com/photon.git
    cd photon
    git checkout 14-improved_unit_handling
In the next installment, we will cover batch processing.