Photon Compiler Development: Improved Unit Handling

This is the 14th article of a series (table of contents) about compiler development with LLVM using OCaml. We intend to develop a compiler for a subset of OCaml large enough to allow our compiler to compile itself.

In this article, we improve the compilation of unit values.

What Is Unit?

We already discussed the Unit type when we considered external declarations. This type can only represent a single value, also named unit and denoted () in OCaml.

As unit values are unequivocally determined by their type alone, there is no need to physically store or return them, or to pass them as arguments.

Current Implementation

In the current photon compiler, we use a 1-bit integer type to represent unit values. To avoid needlessly storing the unit value many times, we use a single .unit constant for this purpose. Unit arguments are also ignored, and when unit is the return type of a function, it is replaced by void.

Besides relying on a dubious unit-typed constant, our code has a serious bug regarding unit arguments: they are ignored entirely, i.e. not even evaluated. Here is an example of the surprising behaviour:

# extern puts : string -> unit = "puts";;
extern puts : string -> unit = "puts"
# let f (u : unit) : unit = puts "f called";;
val f : unit -> unit = <fun>
# f (puts "Hello");;
f called
- : unit = ()

Contrary to what one might expect, no “Hello” message is printed.
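For comparison, here is a minimal sketch of the behaviour one would expect, written in plain OCaml. It assumes a stand-in puts that appends to a buffer (the name out is ours, not part of photon) so that the order of the side effects can be inspected. The argument is named _u rather than u only to avoid an unused-variable warning.

```ocaml
(* Collect output in a buffer instead of printing it. *)
let out = Buffer.create 16

(* Stand-in for the external [puts]: appends the string and a newline. *)
let puts (s : string) : unit =
  Buffer.add_string out s;
  Buffer.add_char out '\n'

(* Same shape as [f] in the session above: takes unit, returns unit. *)
let f (_u : unit) : unit = puts "f called"

(* The unit-typed argument is evaluated before [f] is entered. *)
let () = f (puts "Hello")
```

After this runs, the buffer holds "Hello" followed by "f called", each on its own line: the argument's side effect happens even though its value carries no information.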

Improvement

Rather than using a 1-bit integer type for unit, we will use void_type.

Function return types no longer need special treatment, as llvm_type_of already converts them to void_type.

When unit values are needed, we will use undefined values through LLVM's undef. Such values cannot meaningfully be used, but we never use them. Should we do so by mistake, an error will quickly remind us, which is better than silently mishandling 1-bit values.

As unit values are now direct LLVM values rather than pointers to the .unit global constant, we modify Llvm_utils.is_function to accept non-pointer arguments. We also add an is_pointer function to discriminate between references and direct values. The new variable lookup procedure is as follows:

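(* Look up variable [id]: function arguments shadow globals. *)
let lookup_var comp id =
   try Hashtbl.find comp.args id
   with Not_found ->
      let g = Hashtbl.find comp.globals id in
      (* Functions are used as is *)
      if is_function_pointer g then g
      (* Constant globals are replaced by their initializer *)
      else if is_global_constant g then global_initializer g
      (* Other pointers are references and must be loaded *)
      else if is_pointer g then build_load g id comp.builder
      (* Direct values, such as unit's undef, need no load *)
      else g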
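To make the four cases easier to see in isolation, here is a small self-contained model of the same lookup order. It does not use the LLVM bindings: the value type, its constructors and the comp record below are hypothetical stand-ins invented for this sketch, and loading through a pointer is modelled by dereferencing an OCaml ref.

```ocaml
(* Hypothetical stand-in for LLVM values; not the Llvm bindings. *)
type value =
  | Function_pointer of string   (* a function: used as is *)
  | Global_constant of value     (* a constant global: use its initializer *)
  | Pointer of value ref         (* a reference: must be loaded *)
  | Direct of string             (* a direct value, e.g. unit's undef *)

(* Hypothetical compiler state: function arguments and globals. *)
type comp = {
  args : (string, value) Hashtbl.t;
  globals : (string, value) Hashtbl.t;
}

(* Same order as above: arguments first, then globals by kind. *)
let lookup_var comp id =
  try Hashtbl.find comp.args id
  with Not_found ->
    match Hashtbl.find comp.globals id with
    | Function_pointer _ as g -> g
    | Global_constant init -> init
    | Pointer cell -> !cell
    | Direct _ as g -> g

let comp = { args = Hashtbl.create 8; globals = Hashtbl.create 8 }

let () =
  Hashtbl.add comp.globals "u" (Direct "undef");
  Hashtbl.add comp.globals "x" (Pointer (ref (Direct "42")));
  Hashtbl.add comp.globals "y" (Pointer (ref (Direct "7")));
  Hashtbl.add comp.args "x" (Direct "arg")
```

In this model, looking up "u" returns the direct value unchanged, looking up "y" returns the loaded contents, and looking up "x" returns the argument, which shadows the global of the same name.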

Unit arguments are still filtered out of function calls, but now only after being compiled. We compile all arguments (whether unit-typed or not) from left to right. Although one should not rely on this evaluation order, it is the most intuitive one for those who rely on it anyway.

let rec compile comp ast =
   match ast.node with
     ...
   | Call(f, args) ->
        (* Compile arguments, discarding unit-typed ones once compiled *)
        let handle_arg args arg =
           let arg' = compile comp arg in
           if type_of_ast arg = Unit then args
           else arg' :: args in
        let args = List.rev (List.fold_left handle_arg [] args) in
        (* Compile and call function with compiled and filtered arguments *)
        let f' = compile comp f in
        build_call f' (Array.of_list args) "" comp.builder
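The compile-then-filter step can be tried out without LLVM. The following sketch uses a hypothetical toy AST and a stand-in compile that returns strings and records side effects in a buffer. Only the handle_arg fold mirrors the code above; every other name is an assumption made for illustration.

```ocaml
(* Hypothetical toy types: just enough to tell unit from non-unit. *)
type ty = Unit | Int

type ast =
  | Const of int      (* an integer literal *)
  | Print of string   (* a unit-typed expression with a side effect *)

let type_of_ast = function
  | Const _ -> Int
  | Print _ -> Unit

(* Trace of side effects, in the order they are "emitted". *)
let trace = Buffer.create 16

(* Stand-in for [compile]: returns a printable value, records effects. *)
let compile = function
  | Const n -> string_of_int n
  | Print s ->
      Buffer.add_string trace (s ^ ";");
      "undef"

(* Compile every argument left to right, keep only the non-unit ones. *)
let compile_args args =
  let handle_arg acc arg =
    let arg' = compile arg in
    if type_of_ast arg = Unit then acc else arg' :: acc in
  List.rev (List.fold_left handle_arg [] args)

let kept = compile_args [Const 1; Print "a"; Const 2; Print "b"]
```

Here kept contains only the two integer arguments, in their original order, while trace shows that both unit-typed arguments were still compiled, "a" before "b". The fold builds the list in reverse, hence the final List.rev.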

Our example session now runs fine:

# extern puts : string -> unit = "puts";;
extern puts : string -> unit = "puts"
# let f (u : unit) : unit = puts "f called";;
val f : unit -> unit = <fun>
# f (puts "Hello");;
Hello
f called
- : unit = ()

Source Code

The code accompanying this article is available in archive photon-tut-14.tar.xz or through the git repository:

git clone http://git.legiasoft.com/photon.git
cd photon
git checkout 14-improved_unit_handling

In the next installment, we will cover batch processing.

blog/2012/01/24-improved_unit_handling.txt · Last modified: 2012/01/24 22:05 by csoldani