Conditional compilation with "if" and "use" in Elixir

# TL;DR

If an optional module requires you to write use Module in your code, you will need to check the availability of that module before the defmodule and use Module calls:

# The condition here does not have to be Code.ensure_loaded?(OptionalModule).
# It can be Application.compile_env(:my_app, :enable_optional_feature)
# or any other condition that can be evaluated at compile-time
if Code.ensure_loaded?(OptionalModule) do
  defmodule MyLibrary do
    use OptionalModule, option: value
    def my_function do
      OptionalModule.function()
    end
  end
else
  defmodule MyLibrary do
    def my_function do
      raise RuntimeError, """
      `OptionalModule` does not exist. To use this feature,
      please add `:optional_module` to the dependency list.
      """
    end
  end
end

and the following code won't work, although it looks simpler and seemingly correct.

defmodule MyLibrary do
  if Code.ensure_loaded?(OptionalModule) do
    use OptionalModule, option: value
    def my_function do
      OptionalModule.function()
    end
  else
    def my_function do
      raise RuntimeError, """
      `OptionalModule` does not exist. To use this feature,
      please add `:optional_module` to the dependency list.
      """
    end
  end
end

# Environment for This Post

At the time of writing, I'm using Elixir v1.14.1. The OTP version and operating system are not relevant to the things discussed below.

This behaviour is unlikely to change in the future. However, if you are using a future version of Elixir, please always verify the correctness/validity of the code and the conclusions in this post.

# A careless mistake?

A few hours ago I received an issue report related to evision v0.1.14 on ElixirForum, and the issue can be reproduced in one line:

$ iex
iex> Mix.install([{:evision, "== 0.1.14"}])
...
==> evision
Compiling 179 files (.ex)

== Compilation error in file lib/smartcell/ml_traindata.ex ==
** (CompileError) lib/smartcell/ml_traindata.ex:3: module Kino.JS is not loaded and could not be found. This may be happening because the module you are trying to load directly or indirectly depends on the current module
    (elixir 1.14.0) expanding macro: Kernel.use/2
    lib/smartcell/ml_traindata.ex:3: Evision.SmartCell.ML.TrainData (module)
    (elixir 1.14.0) expanding macro: Kernel.if/2
    lib/smartcell/ml_traindata.ex:2: Evision.SmartCell.ML.TrainData (module)
could not compile dependency :evision, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile evision", update it with "mix deps.update evision" or clean it with "mix deps.clean evision"

How did I fail to catch this, and how did this bug hide under seemingly correct code? Is it simply a careless mistake, or is there something deeper to it? Let me unroll this story about if and use for you.

# Use if at Compile-time

Using if at compile-time is pretty common in Elixir. It is mainly used to define functions only under certain conditions, for example:

defmodule A do
  if 1 < 2 do
    def print, do: IO.puts("Yes, 1 < 2")
  else
    def print, do: IO.puts("Oh...")
  end
end

A.print

And if we run it, the expected behaviour is that the program prints Yes, 1 < 2 and exits.

$ mix run --no-mix-exs example.exs
Yes, 1 < 2

Also, we can verify that the Elixir compiler removes the false branch of the if-statement (when the condition can be determined at compile-time):

defmodule A do
  if 1 < 2 do
    def should_exist, do: IO.puts("ok")
  else
    def should_not_exist, do: IO.puts("what")
  end
end

A.should_exist
A.should_not_exist
$ mix run --no-mix-exs false-branch.exs
ok
** (UndefinedFunctionError) function A.should_not_exist/0 is undefined or private. Did you mean:

      * should_exist/0

    A.should_not_exist()
    (elixir 1.14.0) lib/code.ex:1245: Code.require_file/2
    (mix 1.14.0) lib/mix/tasks/run.ex:144: Mix.Tasks.Run.run/5

Lastly, we can verify that if we use the return value of Code.ensure_loaded?/1 as the condition, the same thing happens as in the examples above -- only the code in the branch selected by the condition will be compiled:

# mix new ensure_loaded_example
# cd ensure_loaded_example

defmodule A do
  if Code.ensure_loaded?(B) do
    def has_b?, do: true
    def should_not_exists, do: IO.puts("what")
  else
    def has_b?, do: false
    def should_exists, do: IO.puts("ok")
  end
end

Let's try it in the IEx session, iex -S mix

$ iex -S mix
iex> A.has_b?
false
iex> defmodule B do
...> end
iex> A.has_b?
false
iex> A.should_exists
ok
:ok
iex> A.should_not_exists
** (UndefinedFunctionError) function A.should_not_exists/0 is undefined or private. Did you mean:

      * should_exists/0

    (task1b 0.1.0) A.should_not_exists()
    iex:5: (file)

# Apply What We Have Learnt So Far

Now, a quick background on the module that caused the compilation error: we'd like to write a module that implements Kino.SmartCell, and we'd like this to be an optional feature.

Therefore, we should check whether Kino.SmartCell is loaded. If yes, we follow kino's SmartCell tutorial, define our own functions and get the job done; otherwise, no functions will be defined in this module.

With the above idea in mind, we have the following code:

defmodule Evision.SmartCell.ML.TrainData do
  if Code.ensure_loaded?(Kino.SmartCell) do
    use Kino.JS, assets_path: "lib/assets"
    # ...
  end
end

At first glance, this code looks good to me because

  1. Kino.SmartCell is built on top of Kino.JS. Therefore, if Kino.SmartCell is loaded, then Kino.JS must be loaded too.
  2. we use use after we have ensured that Kino.JS is loaded.

Of course, :kino is indeed listed in deps in evision's mix.exs file.

# What Went Wrong?

Well then, let's face the inevitable question -- what went wrong?

Apparently, the Elixir compiler tried to expand the use-statement on line 3, although we expected the compiler to avoid evaluating this false branch entirely. This happens because the Elixir compiler has to check that all the code, including the code in the false branch, is syntactically correct. For example, the following code will not pass the syntax check:

defmodule Foo do
  if false do
    Bar.(
  end
end
$ mix run --no-mix-exs foo.exs
** (SyntaxError) foo.exs:9:3: unexpected reserved word: end. The "(" at line 8 is missing terminator ")"
    (elixir 1.14.0) lib/code.ex:1245: Code.require_file/2
    (mix 1.14.0) lib/mix/tasks/run.ex:144: Mix.Tasks.Run.run/5
    (mix 1.14.0) lib/mix/tasks/run.ex:84: Mix.Tasks.Run.run/1
    (mix 1.14.0) lib/mix/task.ex:421: anonymous fn/3 in Mix.Task.run_task/4
    (mix 1.14.0) lib/mix/cli.ex:84: Mix.CLI.run_task/2

During this process, macros will be expanded. You might already know that use is a macro. So yes, that's where the problem emerges.

Based on the information given in the Getting Started manual on elixir-lang.org, the following code

defmodule A do
  use Kino.JS, assets: "lib/assets"
end

will be compiled into

defmodule A do
  require Kino.JS
  Kino.JS.__using__(assets: "lib/assets")
end

Since __using__ is a macro, we have to call require to bring in all the macros defined in that module.
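We can peek at this expansion ourselves with Macro.expand_once/2. Here is a minimal sketch; the Demo module, its __using__ macro, and the a: 1 option are all made up for illustration:

```elixir
# A toy module with a __using__ macro, so we can see what `use` expands to.
defmodule Demo do
  defmacro __using__(opts) do
    quote do
      def demo_opts, do: unquote(opts)
    end
  end
end

# Expand `use Demo, a: 1` exactly once and print the resulting code.
expanded = Macro.expand_once(quote(do: use(Demo, a: 1)), __ENV__)
IO.puts(Macro.to_string(expanded))
```

Running this prints the require/__using__ pair described above.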

You see, if the application :kino is optional and the user didn't list :kino in their deps, then require will surely fail because it cannot find the Kino.JS module, let alone the macros defined in it.
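We can observe this failure from a running program by compiling a string at runtime with Code.eval_string/1 (a sketch; DefinitelyMissing is a made-up module name):

```elixir
# Requiring a module that cannot be found raises at compile time.
# Compiling the snippet at runtime lets us rescue the error and inspect it.
result =
  try do
    Code.eval_string("require DefinitelyMissing")
    :no_error
  rescue
    error -> {:raised, error.__struct__}
  end

IO.inspect(result)
```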

And require will fail even if the use-statement (or the require-statement, after the expansion) is technically inside the false branch of an if-statement, and even when the condition is absolutely false:

defmodule A do
  if false do
    use NOT_EXISTS
  end
end
$ mix run --no-mix-exs absolutely-false.exs
** (CompileError) absolutely-false.exs:3: module NOT_EXISTS is not loaded and could not be found
    (elixir 1.14.0) expanding macro: Kernel.use/1
    example.exs:3: A (module)
    (elixir 1.14.0) expanding macro: Kernel.if/2
    example.exs:2: A (module)

# Any Solution to This?

The simplest way is, of course, listing :kino as a required dependency, but I found another way to solve this issue while keeping :kino as an optional dependency.

We just saw that the following code would cause a compilation error,

defmodule A do
  if false do
    use NOT_EXISTS
  end
end

but if we slightly re-arrange these three macros (yes, defmodule, if and use are all macros), we can achieve what we want:

if false do
  defmodule A do
    use NOT_EXISTS
  end
end

As for evision, we can change the code from

defmodule Evision.SmartCell.ML.TrainData do
  if Code.ensure_loaded?(Kino.SmartCell) do
    use Kino.JS, assets_path: "lib/assets"
    # ...
  end
end

to the following

if Code.ensure_loaded?(Kino.SmartCell) do
  defmodule Evision.SmartCell.ML.TrainData do
    use Kino.JS, assets_path: "lib/assets"
    # ...
  end
end

And... problem solved! However, this post will not end here without knowing the reasons -- why did this happen in the first place? Why does it work after re-arranging if, defmodule and use?

# The Reasons and Explanations

After digging into Elixir's source code for a few hours, I found some clues that might be related to this. Since we placed the if-statement first and it solved the problem, I was guessing that it might have something to do with the Elixir compiler and how the compiler behaves when traversing the AST (abstract syntax tree).

  1. The first clue is about macro expansion, like when macros are expanded by the compiler.
  2. The second one is the optimize_boolean/1 function in build_if/2 in kernel.ex. Maybe the if-statement here didn't get optimised, which could cause the compilation error.
  3. The last clue I suspected is allows_fast_compilation/1 for defmodule, in elixir_compiler.erl. Because the name of this function is kinda sus, and based on the name, it seems to allow the compiler to skip/defer the evaluation of defmodule.
allows_fast_compilation({defmodule, _, [_, [{do, _}]]}) ->
  true;

Then I decided to ask @josevalim about this question; he replied kindly, and his answers solved this puzzle. I'll now connect all the dots along with his answers and explain this mystery below.

A big thanks to José!

In Elixir's source code (v1.14.1), defmodule is defined in kernel.ex,

defmacro defmodule(alias, do_block)

use is in kernel.ex too

defmacro use(module, opts \\ [])

and, of course, as spoiled above, if is also a macro

defmacro if(condition, clauses)

And macros are always expanded before the branches (e.g., an if-statement has two branches, true and false) are evaluated. That is to say, when the compiler sees the following code

defmodule A do
  if false do
    use NOT_EXISTS
  end
end

It will expand the macros first:

defmodule A do
  if false do
    require NOT_EXISTS
    NOT_EXISTS.__using__([])
  end
end

But it will fail when evaluating require NOT_EXISTS because the module does not exist. We can see this in the stacktrace.

When the compiler sees an if-statement, it goes inside and expands the macros (if any). That's why Kernel.use/1 sits above Kernel.if/2 in the stacktrace:

$ mix run --no-mix-exs absolutely-false.exs
** (CompileError) absolutely-false.exs:3: module NOT_EXISTS is not loaded and could not be found
    (elixir 1.14.0) expanding macro: Kernel.use/1
    example.exs:3: A (module)
    (elixir 1.14.0) expanding macro: Kernel.if/2
    example.exs:2: A (module)

And then I asked why putting if before defmodule will not cause the same compilation error. And José answered:

Because there are a few things that delay macro expansion, such as defining new modules or defining functions. Because you can do this:

module = Foo
defmodule module do
  ...
end

Therefore, to define the module, you must execute the line that defines it. So only after you execute the line, the macros are expanded. defmodule A do ... end compiles to something like this:

:elixir_module.compile(A, ast_of_the_module_body)

that AST will only be expanded if the module is in fact defined.
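This point can be demonstrated with a runnable sketch (MyDyn is a made-up module name): the module name is a runtime value, so the line containing defmodule must actually execute before its body is expanded.

```elixir
# The module name here is a variable, only known when this line executes.
name = MyDyn

defmodule name do
  def hi, do: :hi
end

IO.inspect(MyDyn.hi())
```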

Therefore, the following code

defmodule A do
  if false do
    use NOT_EXISTS
  end
end

is first compiled to the following quoted code in kernel.ex

quote do
  unquote(with_alias)
  :elixir_module.compile(unquote(expanded), unquote(escaped), unquote(module_vars), __ENV__)
end

And this emitted code will be executed because the defmodule in this example is unconditionally written in the source code instead of existing in the false branch of an if-statement. When :elixir_module.compile/4 is invoked, the if-statement will be evaluated, and the use macro will be expanded during the evaluation process of if.

Again, we can verify this by looking at the stacktrace:

$ mix run --no-mix-exs absolutely-false.exs
** (CompileError) absolutely-false.exs:3: module NOT_EXISTS is not loaded and could not be found
    (elixir 1.14.0) expanding macro: Kernel.use/1
    example.exs:3: A (module)
    (elixir 1.14.0) expanding macro: Kernel.if/2
    example.exs:2: A (module)

Searching for "is not loaded and could not be found" in Elixir's source code brings us to format_error/1 in elixir_aliases.erl.

format_error({unloaded_module, Module}) ->
  io_lib:format("module ~ts is not loaded and could not be found", [inspect(Module)]);

Since unloaded_module is an atom, we can search for it and trace back to where it was produced. It only appears in the same file, in ensure_loaded/3.

ensure_loaded(Meta, Module, E) ->
  case code:ensure_loaded(Module) of
    {module, Module} ->
      ok;

    _ ->
      case wait_for_module(Module) of
        found ->
          ok;

        Wait ->
          Kind = case lists:member(Module, ?key(E, context_modules)) of
            true ->
              case ?key(E, module) of
                Module -> circular_module;
                _ -> scheduled_module
              end;
            false when Wait == deadlock ->
              deadlock_module;
            false ->
              unloaded_module
          end,

          elixir_errors:form_error(Meta, E, ?MODULE, {Kind, Module})
      end
  end.

This function first checks if the module is already loaded. If not, it invokes wait_for_module/1 to wait for the module, in case it exists and is currently being compiled.

wait_for_module(Module) ->
  case erlang:get(elixir_compiler_info) of
    undefined -> not_found;
    _ -> 'Elixir.Kernel.ErrorHandler':ensure_compiled(Module, module, hard)
  end.

Yet obviously the module NOT_EXISTS does not exist, so the lookup fails, Kind is set to unloaded_module, and that atom is subsequently passed to elixir_errors:form_error/4.

form_error(Meta, #{file := File}, Module, Desc) ->
  compile_error(Meta, File, Module:format_error(Desc));
form_error(Meta, File, Module, Desc) ->
  compile_error(Meta, File, Module:format_error(Desc)).

and become the compilation error message.

** (CompileError) absolutely-false.exs:3: module NOT_EXISTS is not loaded and could not be found

To dig deeper, we can search for :ensure_loaded in the code base, and the call in function expand/3 in elixir_expand.erl is the one we are interested in.

expand({require, Meta, [Ref, Opts]}, S, E) ->
  assert_no_match_or_guard_scope(Meta, "require", S, E),

  {ERef, SR, ER} = expand_without_aliases_report(Ref, S, E),
  {EOpts, ST, ET}  = expand_opts(Meta, require, [as, warn], no_alias_opts(Opts), SR, ER),

  if
    is_atom(ERef) ->
      elixir_aliases:ensure_loaded(Meta, ERef, ET),
      {ERef, ST, expand_require(Meta, ERef, EOpts, ET)};
    true ->
      form_error(Meta, E, ?MODULE, {expected_compile_time_module, require, Ref})
  end;

And this is exactly where the compilation error happens.

defmodule A do
  if false do
    require NOT_EXISTS       # <= elixir_aliases:ensure_loaded(Meta, ERef, ET),
    NOT_EXISTS.__using__([])
  end
end

And we can continue by searching for elixir_expand:expand in the source code, where we find the function expand_quoted/7 in elixir_dispatch.erl.

expand_quoted(Meta, Receiver, Name, Arity, Quoted, S, E) ->
  Next = elixir_module:next_counter(?key(E, module)),

  try
    ToExpand = elixir_quote:linify_with_context_counter(Meta, {Receiver, Next}, Quoted),
    elixir_expand:expand(ToExpand, S, E)
  catch
    Kind:Reason:Stacktrace ->
      MFA  = {Receiver, elixir_utils:macro_name(Name), Arity+1},
      Info = [{Receiver, Name, Arity, [{file, "expanding macro"}]}, caller(?line(Meta), E)],
      erlang:raise(Kind, Reason, prune_stacktrace(Stacktrace, MFA, Info, error))
  end.

Note that the Info on line 248 will become

    ...
    (elixir 1.14.0) expanding macro: Kernel.use/1
    ...
    (elixir 1.14.0) expanding macro: Kernel.if/2
    ...

in the stacktrace. One more step and we will find that expand_quoted/7 is called in two dispatch functions: dispatch_import/6 and dispatch_require/7 in the same file.

dispatch_require(Meta, Receiver, Name, Args, S, E, Callback) when is_atom(Receiver) ->
  Arity = length(Args),

  case elixir_rewrite:inline(Receiver, Name, Arity) of
    {AR, AN} ->
      Callback(AR, AN, Args);
    false ->
      case expand_require(Meta, Receiver, {Name, Arity}, Args, S, E) of
        {ok, Receiver, Quoted} -> expand_quoted(Meta, Receiver, Name, Arity, Quoted, S, E);
        error -> Callback(Receiver, Name, Args)
      end
  end;

dispatch_require(_Meta, Receiver, Name, Args, _S, _E, Callback) ->
  Callback(Receiver, Name, Args).

As for elixir_dispatch:dispatch_require/7, it is invoked in two places. The first one is when processing remote calls in elixir_expand:expand/3. A remote call invokes a function in a module other than the current one. For example,

defmodule A do
  def foo, do: :ok

  def bar do
    # this is a local call
    foo()
  end
end

defmodule B do
  def baz do
    # this is a remote call
    A.foo()
  end
end

The second call to elixir_dispatch:dispatch_require/7 can be found in elixir_module:expand_callback/6.

expand_callback(Line, M, F, Args, Acc, Fun) ->
  E = elixir_env:reset_vars(Acc),
  S = elixir_env:env_to_ex(E),
  Meta = [{line, Line}, {required, true}],

  {EE, _S, ET} =
    elixir_dispatch:dispatch_require(Meta, M, F, Args, S, E, fun(AM, AF, AA) ->
      Fun(AM, AF, AA),
      {ok, S, E}
    end),

  if
    is_atom(EE) ->
      ET;
    true ->
      try
        {_Value, _Binding, EF} = elixir:eval_forms(EE, [], ET),
        EF
      catch
        Kind:Reason:Stacktrace ->
          Info = {M, F, length(Args), location(Line, E)},
          erlang:raise(Kind, Reason, prune_stacktrace(Info, Stacktrace))
      end
  end.

elixir_module:expand_callback/6 is used in two places: the first one is in Protocol.derive/5, and as the module name suggests, it is related to deriving a protocol for a module.

We can go off on a tangent here and explore what will happen if we want to derive a protocol optionally. You can skip this and jump to the second call to elixir_module:expand_callback/6.

Let's say we have already defined a protocol Derivable as the following:

defprotocol Derivable do
  def ok(arg)
end

defimpl Derivable, for: Any do
  defmacro __deriving__(module, struct, options) do
    quote do
      defimpl Derivable, for: unquote(module) do
        def ok(arg) do
          {:ok, arg, unquote(Macro.escape(struct)), unquote(options)}
        end
      end
    end
  end

  def ok(arg) do
    {:ok, arg}
  end
end

Then there are three ways to derive a protocol for a module, and we can start with the simplest one:

  • Deriving a protocol using the @derive tag.
defmodule A do
  @derive {Derivable, option: :some_value}
  defstruct a: 0, b: 0
end
  • The second way is using the defimpl/3 macro in kernel.ex.
defmodule A do
  defimpl Derivable do
    def ok(_arg), do: :my_args
  end
end
  • The last one is requiring Protocol and calling Protocol.derive/3 directly.
defmodule A do
  defstruct a: 0, b: 0

  require Protocol
  Protocol.derive(Derivable, A, option: :some_value)
end

What happens if we'd like to derive a protocol that lives in an optional module?

defmodule A do
  if Code.ensure_loaded?(NOT_EXISTS) do
    @derive {NOT_EXISTS, option: :some_value}
  end
  defstruct a: 0, b: 0
end

That's the first one, and the second one is shown below

defmodule A do
  if Code.ensure_loaded?(NOT_EXISTS) do
    defimpl NOT_EXISTS, for: A do
    end
  end
  defstruct a: 0, b: 0
end

And the last one

defmodule A do
  defstruct a: 0, b: 0
  
  if Code.ensure_loaded?(NOT_EXISTS) do
    require Protocol
    Protocol.derive(NOT_EXISTS, A, option: :some_value)
  end
end

All three examples that optionally derive a protocol from an optional module will compile without any issues and behave as expected. Of course, we'd like to ask the question -- why can we use if inside the defmodule to optionally derive a protocol?

Let's start with the first way. The @derive tag is a module attribute, and @derive tags in the same module are collected accumulatively in a bag -- simply put, they are stored in a list. The defstruct/1 macro then retrieves them via Kernel.Utils.defstruct/3 and rewrites them into a Protocol.__derive__/3 call. (This also explains why all @derive tags must be set before defstruct/1.)

defmacro defstruct(fields) do
  quote bind_quoted: [fields: fields, bootstrapped?: bootstrapped?(Enum)] do
    {struct, derive, kv, body} = Kernel.Utils.defstruct(__MODULE__, fields, bootstrapped?)

    case derive do
      [] -> :ok
      _ -> Protocol.__derive__(derive, __MODULE__, __ENV__)
    end

    def __struct__(), do: @__struct__
    def __struct__(unquote(kv)), do: unquote(body)

    Kernel.Utils.announce_struct(__MODULE__)
    struct
  end
end
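The accumulative collection described above can be sketched with a plain accumulated module attribute. This is the same mechanism @derive relies on; the Acc module and the :tag attribute are made-up names:

```elixir
defmodule Acc do
  # Register :tag as an accumulated attribute: every `@tag value` is
  # prepended to a list instead of overwriting the previous value.
  Module.register_attribute(__MODULE__, :tag, accumulate: true)

  @tag :first
  @tag :second

  def tags, do: @tag
end

IO.inspect(Acc.tags())
```

The most recently set value comes first in the accumulated list.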

Since Protocol.__derive__/3 is a function instead of a macro, it will not be evaluated/expanded when processing the branches of an if-statement. Therefore, the following code will compile and run as expected.

defmodule A do
  if Code.ensure_loaded?(NOT_EXISTS) do
    @derive {NOT_EXISTS, option: :some_value}
  end
  defstruct a: 0, b: 0
end

As for the second way, which uses the defimpl/3 macro in kernel.ex: although defimpl/3 is a macro, it simply rewrites the code into a call to the Protocol.__impl__/4 function.

defmacro defimpl(name, opts, do_block \\ []) do
  Protocol.__impl__(name, opts, do_block, __CALLER__)
end

So, similar to the first way, the call to the Protocol.__impl__/4 function will not be evaluated when processing the branches of an if-statement.

The same reasoning goes for the last one: although Protocol.derive/3 is technically a macro, the Protocol module itself always exists, and the macro merely expands into a call to the Protocol.__derive__/3 function, which will not be evaluated when processing the branches of an if-statement.

Let's get back on track. The second call to elixir_module:expand_callback/6 can be found in eval_callbacks/5 in elixir_module.erl.

eval_callbacks(Line, DataBag, Name, Args, E) ->
  Callbacks = bag_lookup_element(DataBag, {accumulate, Name}, 2),
  lists:foldl(fun({M, F}, Acc) ->
    expand_callback(Line, M, F, Args, Acc, fun(AM, AF, AA) -> apply(AM, AF, AA) end)
  end, E, Callbacks).

eval_callbacks/5 is called from two different functions in the same file (actually one function, depending on how you see it). The first place it appears is inside the compile/5 function (line 161); the second is in the eval_form/6 function (line 378).

However, eval_form/6 is only called once from the compile/5 function. The obvious difference is that the third argument passed by eval_form/6 is before_compile whereas compile/5 passes after_compile. So I guess you could say that eval_callbacks/5 is called in one function, compile/5.

compile(Line, Module, Block, Vars, E) ->
  File = ?key(E, file),
  check_module_availability(Line, File, Module),
  ModuleAsCharlist = validate_module_name(Line, File, Module),

  CompilerModules = compiler_modules(),
  {Tables, Ref} = build(Line, File, Module),
  {DataSet, DataBag} = Tables,

  try
    put_compiler_modules([Module | CompilerModules]),
    {Result, NE} = eval_form(Line, Module, DataBag, Block, Vars, E),
    CheckerInfo = checker_info(),
    ...

Since what we have is a compilation error (an error thrown before the module is compiled), we can confirm, based on the value of the third argument, that the error is raised when eval_form/6 calls eval_callbacks/5 with before_compile.

And we are very close to connecting all the dots: eval_form/6 is called from elixir_module:compile/5, and elixir_module:compile/5 is called in elixir_module:compile/4, which is exactly the code that defmodule rewrites to!

So, if eval_callbacks/5 in eval_form/6 is successfully executed, and there are no other errors, the module compiles successfully. (We will meet the optimize_boolean/1 function in build_if/2 from clue 2 again in a moment.)

# Connecting the Dots

Let's first reproduce the whole process with the code that will cause a compilation error.

defmodule A do
  if false do
    use NOT_EXISTS
  end
end

And we start from the shell command mix run --no-mix-exs absolutely-false.exs. The entry point for mix run [argv...] is Mix.Tasks.Run.run/1, and it will parse command line arguments and call run/5.

def run(args) do
  {opts, head} =
    OptionParser.parse_head!(
      args,
      aliases: [r: :require, p: :parallel, e: :eval, c: :config],
      strict: [
        parallel: :boolean,
        require: :keep,
        eval: :keep,
        config: :keep,
        mix_exs: :boolean,
        halt: :boolean,
        compile: :boolean,
        deps_check: :boolean,
        start: :boolean,
        archives_check: :boolean,
        elixir_version_check: :boolean,
        parallel_require: :keep,
        preload_modules: :boolean
      ]
    )

  run(args, opts, head, &Code.eval_string/1, &Code.require_file/1)
  unless Keyword.get(opts, :halt, true), do: System.no_halt(true)
  Mix.Task.reenable("run")
  :ok
end

In this example, {opts, head} will be

{[mix_exs: false], ["absolutely-false.exs"]}

In run/5, we can ignore the other checks and focus on the call to the callback function file_evaluator.(file). And since file_evaluator is just &Code.require_file/1, we can jump to the function require_file/1 in the code.ex file.

def require_file(file, relative_to \\ nil) when is_binary(file) do
  {charlist, file} = find_file!(file, relative_to)

  case :elixir_code_server.call({:acquire, file}) do
    :required ->
      nil

    :proceed ->
      loaded =
        Module.ParallelChecker.verify(fn ->
          :elixir_compiler.string(charlist, file, fn _, _ -> :ok end)
        end)

      :elixir_code_server.cast({:required, file})
      loaded
  end
end

find_file!/2 expands the relative path to the absolute path of the file and reads the whole file into a charlist. Then :elixir_code_server.call/1 checks whether the file is already required (compiled); if yes, we do nothing; if not, :elixir_compiler.string/3 is called to compile the code.

string(Contents, File, Callback) ->
  Forms = elixir:'string_to_quoted!'(Contents, 1, 1, File, elixir_config:get(parser_options)),
  quoted(Forms, File, Callback).

elixir:'string_to_quoted!'/5 will pass the file content to the tokenizer and convert tokens to their quoted form.

'string_to_quoted!'(String, StartLine, StartColumn, File, Opts) ->
  case string_to_tokens(String, StartLine, StartColumn, File, Opts) of
    {ok, Tokens} ->
      case tokens_to_quoted(Tokens, File, Opts) of
        {ok, Forms} ->
          Forms;
        {error, {Meta, Error, Token}} ->
          elixir_errors:parse_error(Meta, File, Error, Token, {String, StartLine, StartColumn})
      end;
    {error, {Meta, Error, Token}} ->
      elixir_errors:parse_error(Meta, File, Error, Token, {String, StartLine, StartColumn})
  end.

The following AST will be emitted for this code:

{:defmodule, [line: 1],
 [
   {:__aliases__, [line: 1], [:A]},
   [
     do: {:if, [line: 2],
      [
        false,
        [do: {:use, [line: 3], [{:__aliases__, [line: 3], [:NOT_EXISTS]}]}]
      ]}
   ]
 ]}
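We can reproduce this AST ourselves with Code.string_to_quoted!/1 (a quick sketch; metadata such as line numbers may differ slightly between Elixir versions):

```elixir
# Parse the problematic snippet into its quoted form, as the compiler does.
code = """
defmodule A do
  if false do
    use NOT_EXISTS
  end
end
"""

ast = Code.string_to_quoted!(code)
IO.inspect(ast)
```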

Next, elixir_compiler:quoted/3 will process the quoted form -- evaluate or compile the code with the local lexical environment in eval_or_compile/3.

quoted(Forms, File, Callback) ->
  Previous = get(elixir_module_binaries),

  try
    put(elixir_module_binaries, []),
    Env = (elixir_env:new())#{line := 1, file := File, tracers := elixir_config:get(tracers)},

    elixir_lexical:run(
      Env,
      fun (LexicalEnv) -> eval_or_compile(Forms, [], LexicalEnv) end,
      fun (#{lexical_tracker := Pid}) -> Callback(File, Pid) end
    ),

    lists:reverse(get(elixir_module_binaries))
  after
    put(elixir_module_binaries, Previous)
  end.

The implementation of eval_or_compile/3 is:

eval_or_compile(Forms, Args, E) ->
  case (?key(E, module) == nil) andalso allows_fast_compilation(Forms) andalso
        (not elixir_config:is_bootstrap()) of
    true  -> fast_compile(Forms, E);
    false -> compile(Forms, Args, E)
  end.

allows_fast_compilation/1 checks if this file always defines a module; if yes, we can skip some steps in compile/3 and always define the module.

allows_fast_compilation({'__block__', _, Exprs}) ->
  lists:all(fun allows_fast_compilation/1, Exprs);
allows_fast_compilation({defmodule, _, [_, [{do, _}]]}) ->
  true;
allows_fast_compilation(_) ->
  false.
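We can sketch the same pattern check in Elixir terms: the AST of a file whose top level is a single defmodule matches the shape of the middle clause above.

```elixir
# The quoted form of a plain `defmodule ... do ... end`.
ast = Code.string_to_quoted!("defmodule A do\n  :ok\nend")

# Elixir equivalent of the Erlang pattern {defmodule, _, [_, [{do, _}]]}.
IO.inspect(match?({:defmodule, _, [_, [do: _]]}, ast))
```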

And the AST shown above indeed matches the middle clause; therefore, we can take the fast_compile/2 path.

fast_compile({defmodule, Meta, [Mod, [{do, TailBlock}]]}, NoLineE) ->
  E = NoLineE#{line := ?line(Meta)},

  Block = {'__block__', Meta, [
    {'=', Meta, [{result, Meta, ?MODULE}, TailBlock]},
    {{'.', Meta, [elixir_utils, noop]}, Meta, []},
    {result, Meta, ?MODULE}
  ]},

  Expanded = case Mod of
    {'__aliases__', _, _} ->
      case elixir_aliases:expand_or_concat(Mod, E) of
        Receiver when is_atom(Receiver) -> Receiver;
        _ -> 'Elixir.Macro':expand(Mod, E)
      end;

    _ ->
      'Elixir.Macro':expand(Mod, E)
  end,

  ContextModules = [Expanded | ?key(E, context_modules)],
  elixir_module:compile(Expanded, Block, [], E#{context_modules := ContextModules}).

After the pattern matching, we get

Meta = [line: 1]
Mod = {:__aliases__, [line: 1], [:A]}
TailBlock = 
  {:if, [line: 2],
    [
      false,
      [do: {:use, [line: 3], [{:__aliases__, [line: 3], [:NOT_EXISTS]}]}]
    ]}

Now we can jump right into elixir_module:compile/4, which calls elixir_module:compile/5, and we can expect the compilation error right after elixir_module:eval_form/6 is called in elixir_module:compile/5. You know the rest of the story.

What about the code that can achieve our goal?

if false do
  defmodule A do
    use NOT_EXISTS
  end
end

The AST of the above code is

{:if, [line: 1],
 [
   false,
   [
     do: {:defmodule, [line: 2],
      [
        {:__aliases__, [line: 2], [:A]},
        [do: {:use, [line: 3], [{:__aliases__, [line: 3], [:NOT_EXISTS]}]}]
      ]}
   ]
 ]}

And now the difference is that we cannot take the fast_compile/2 path because the file may not define a module at all. Therefore, we take the compile/3 route in eval_or_compile/3.

compile(Quoted, ArgsList, E) ->
  {Expanded, SE, EE} = elixir_expand:expand(Quoted, elixir_env:env_to_ex(E), E),
  elixir_env:check_unused_vars(SE, EE),

  {Module, Fun, Purgeable} =
    elixir_erl_compiler:spawn(fun() -> spawned_compile(Expanded, E) end),

  Args = list_to_tuple(ArgsList),
  {dispatch(Module, Fun, Args, Purgeable), EE}.

And the variable Expanded will be something like this

{:case, [line: 1, optimize_boolean: true],
 [
   false,
   [
     do: [
       {:->, [generated: true, line: 1], [[false], nil]},
       {:->, [generated: true, line: 1],
        [
          [true],
          {:__block__, [line: 2],
           [
             A,
             {{:., [line: 2], [:elixir_module, :compile]}, [line: 2],
              [
                ...
              ]}
           ]}
        ]}
     ]
   ]
 ]}

As we can see from this intermediate output, optimize_boolean/1 has already been called and gives us the tuple {:case, [line: 1, optimize_boolean: true], [false, [do: ...]]}.

In spawned_compile/2, the quoted Elixir expressions and variables are translated into forms that Erlang can compile into a .beam binary.

spawned_compile(ExExprs, #{line := Line, file := File} = E) ->
  {Vars, S} = elixir_erl_var:from_env(E),
  {ErlExprs, _} = elixir_erl_pass:translate(ExExprs, erl_anno:new(Line), S),

  Module = retrieve_compiler_module(),
  Fun  = code_fun(?key(E, module)),
  Forms = code_mod(Fun, ErlExprs, Line, File, Module, Vars),

  {Module, Binary} = elixir_erl_compiler:noenv_forms(Forms, File, [nowarn_nomatch, no_bool_opt, no_ssa_opt]),
  code:load_binary(Module, "", Binary),
  {Module, Fun, is_purgeable(Module, Binary)}.

In function is_purgeable/2 we test if the beam binary has any labeled locals,

is_purgeable(Module, Binary) ->
  beam_lib:chunks(Binary, [labeled_locals]) == {ok, {Module, [{labeled_locals, []}]}}.

If there are none, the module is purgeable. Either way, dispatch/4 (called at the end of the compile/3 route) evaluates the compiled code and returns the result as the compiled output.

dispatch(Module, Fun, Args, Purgeable) ->
  Res = Module:Fun(Args),
  code:delete(Module),
  Purgeable andalso code:purge(Module),
  return_compiler_module(Module, Purgeable),
  Res.

In this example, Res will be nil because the condition of the if expression is false: since the original code has no else branch, the false clause of the generated case returns the default value, nil. The true clause, which would define module A, is never executed, and the compiled helper module can be purged.

Therefore, if we were using Code.ensure_loaded?/1 as the condition, it would also be evaluated to either true or false at compile-time. If it is false, the do branch (the one containing the defmodule and use) is never executed and its compiled code is purged, so we won't encounter any compilation errors for the following code.

if Code.ensure_loaded?(NOT_EXISTS) do
  defmodule A do
    use NOT_EXISTS
  end
end

will be evaluated to

if false do
  defmodule A do
    use NOT_EXISTS
  end
else
  nil
end

# actually, it will be evaluated to something like this
case false do
  false ->
    nil
  true ->
    defmodule A do
      use NOT_EXISTS
    end
end

and since the condition is false, the branch containing the defmodule is never executed and its compiled code is purged, which leaves us only nil

nil
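We can observe this compile-time evaluation with a standalone script. The module name below is deliberately one that does not exist; because the branch that would use it is never executed, no "module ... is not loaded" error is raised:

```elixir
# The condition evaluates to false at compile time, so the branch that
# would `use` the missing module is discarded, and only the else branch runs.
if Code.ensure_loaded?(ThisModuleDoesNotExist) do
  defmodule A do
    use ThisModuleDoesNotExist
  end
else
  IO.puts("the branch with the missing module was skipped")
end
```

Running this with `elixir` prints the message from the else branch, and module A is never defined.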

Precompilation support in elixir_make

There is a growing number of Elixir libraries that, for various reasons, come with functions implemented in foreign languages like C/C++, Rust, or Zig. Compiling this foreign code isn't a big issue on a powerful machine, but on less powerful devices (limited by size, thermals, power consumption, etc.) such as a Raspberry Pi, it can take a very long time.

Also, for a Livebook (.livemd file) that uses Mix.install to install NIF library dependencies, such as:

Mix.install([
  {:nif_lib_a, "~> 0.1"}
])

the nif_lib_a will be compiled and cached based on the whole deps configuration. That means if we later add another library, nif_lib_a will have to be recompiled even if the newly added one is implemented in pure Elixir, unless nif_lib_a explicitly uses a global location to cache itself.

Mix.install([
  # nif_lib_a will be compiled again
  # even if the newly added library is
  # implemented in pure elixir
  {:nif_lib_a, "~> 0.1"},
  {:pure_elixir_lib_b, "~> 0.1"}
])

There are some other reasons one might want to use precompiled artefacts besides the above:

  • the compilation toolchain may not be available (e.g., running Livebook on some embedded devices, or in a Nerves environment, where a C compiler is not shipped);
  • a working C/C++, Rust, or Zig compiler is no longer a strict requirement;
  • compilation time is saved.

Although it's possible to completely leave the task of reusing compiled artefacts to each NIF library, there should be a way to make it at least slightly easier.

Therefore I'm exploring adding a behaviour to elixir_make to support using precompiled artefacts. And here is the Mix.Tasks.ElixirMake.Precompile behaviour (elixir-lang/elixir_make#55).

It's a behaviour because this allows elixir_make to use different precompilers for different NIF libraries, to suit their needs. It also allows NIF library developers to swap in their preferred precompiler module. And lastly, as you might've guessed, anyone can write their own precompiler module if no existing precompiler quite fits their compilation pipeline.

If you'd like to write a precompiler module yourself, there are 7 required and 2 optional callbacks. It's recommended to implement them in the following order:

  • all_supported_targets/0 and current_target/0.
  @typedoc """
  Target triplets
  """
  @type target :: String.t()

  @doc """
  This callback should return a list of triplets ("arch-os-abi") for all supported targets.
  """
  @callback all_supported_targets() :: [target]

  @doc """
  This callback should return the target triplet for current node.
  """
  @callback current_target() :: {:ok, target} | {:error, String.t()}

As their names suggest, the precompiler should return the identifiers of all supported targets and the identifier of the current target. Usually, an identifier is an arch-os-abi triplet, but that is not a strict requirement because these identifiers are only used within the same precompiler module.

Note that it is possible for the precompiler module to pick up other environment variables like TARGET_ARCH and use them to override the value of the current target.
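As a sketch (not any real precompiler's implementation), current_target/0 could be derived from the VM's reported system architecture, with a hypothetical TARGET_ARCH environment variable overriding the architecture component:

```elixir
defmodule MyPrecompiler.Target do
  @moduledoc false

  # :erlang.system_info(:system_architecture) returns a charlist such as
  # 'x86_64-pc-linux-gnu' or 'aarch64-apple-darwin21.6.0'. We take the first
  # component as the arch and allow TARGET_ARCH to override it.
  def current_target do
    case String.split(to_string(:erlang.system_info(:system_architecture)), "-") do
      [arch | rest] when rest != [] ->
        arch = System.get_env("TARGET_ARCH") || arch
        {:ok, Enum.join([arch | rest], "-")}

      _ ->
        {:error, "unable to parse the system architecture"}
    end
  end
end
```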

  • build_native/1.
  @doc """
  This callback will be invoked when the user executes the `mix compile`
  (or `mix compile.elixir_make`) command.
  The precompiler should then compile the NIF library "natively". Note that
  it is possible for the precompiler module to pick up other environment variables
  like `TARGET_ARCH=aarch64` and adjust compile arguments correspondingly.
  """
  @callback build_native(OptionParser.argv()) :: :ok | {:ok, []} | no_return

This callback corresponds to the mix compile command. The precompiler should then compile the NIF library for the current target.

After implementing this one, you can try to compile the NIF library natively with the mix compile command.

  • precompile/2.
  @typedoc """
  A map that contains detailed info of a precompiled artefact.
  - `:path`, path to the archived build artefact.
  - `:checksum_algo`, name of the checksum algorithm.
  - `:checksum`, the checksum of the archived build artefact using `:checksum_algo`.
  """
  @type precompiled_artefact_detail :: %{
    :path => String.t(),
    :checksum => String.t(),
    :checksum_algo => atom
  }

  @typedoc """
  A tuple that indicates the target and the corresponding precompiled artefact detail info.
  `{target, precompiled_artefact_detail}`.
  """
  @type precompiled_artefact :: {target, precompiled_artefact_detail}

  @doc """
  This callback should precompile the library to the given target(s).
  Returns a list of `{target, archived_artefacts}` if successfully compiled.
  """
  @callback precompile(OptionParser.argv(), [target]) :: {:ok, [precompiled_artefact]} | no_return

Two arguments are passed to this callback. The first one is the list of command line arguments; for example, it will be [arg1, arg2] if one executes mix elixir_make.precompile arg1 arg2.

The second argument is a list of target identifiers. The precompiler should compile the NIF library for all of these targets. Note that this list is always a subset of (but not necessarily equal to) the result returned by all_supported_targets/0. This is designed to avoid a foreseeable breaking change for precompiler modules if we later add a mechanism to filter out some targets.

After implementing this one, you can try the mix elixir_make.precompile command to compile for all targets.
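The shape of precompile/2 might look like the following sketch, where build_for/2 and archive/1 are hypothetical stand-ins for the real per-target build and packaging steps:

```elixir
defmodule MyPrecompiler.Sketch do
  @moduledoc false

  # precompile/2 sketch: build and archive every requested target, returning
  # {:ok, [{target, precompiled_artefact_detail}]}.
  def precompile(args, targets) do
    artefacts =
      for target <- targets do
        :ok = build_for(target, args)
        path = archive(target)
        checksum = Base.encode16(:crypto.hash(:sha256, File.read!(path)), case: :lower)
        {target, %{path: path, checksum: checksum, checksum_algo: :sha256}}
      end

    {:ok, artefacts}
  end

  # Pretend to cross-compile the NIF library for the given target.
  defp build_for(_target, _args), do: :ok

  # Pretend to archive the build products into a tarball.
  defp archive(target) do
    path = Path.join(System.tmp_dir!(), "nif-#{target}.tar.gz")
    File.write!(path, "placeholder archive for #{target}")
    path
  end
end
```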

  • post_precompile/1.
  @doc """
  This optional callback will be invoked when all precompilation tasks are done,
  i.e., it will only be called at the end of the `mix elixir_make.precompile`
  command.
  Post actions to run after all precompilation tasks are done. For example,
  actions can be archiving all precompiled artefacts and uploading the archive
  file to an object storage server.
  """
  @callback post_precompile(context :: term()) :: :ok

This is an optional callback where the precompiler module can do something after precompile/2. The argument passed to this callback is the precompiler module context.
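As an illustration, post_precompile/1 could bundle all artefact paths recorded in the context into a single compressed tarball with :erl_tar. The context shape here (a map with :artefact_paths and :bundle) is made up; the real shape is whatever your precompiler_context/1 returns:

```elixir
defmodule MyPrecompiler.Post do
  @moduledoc false

  # Bundle every artefact recorded in the (hypothetical) context into one
  # compressed tar file, e.g. for uploading to an object storage server.
  def post_precompile(%{artefact_paths: paths, bundle: bundle}) do
    entries =
      for path <- paths do
        {String.to_charlist(Path.basename(path)), String.to_charlist(path)}
      end

    :ok = :erl_tar.create(String.to_charlist(bundle), entries, [:compressed])
    :ok
  end
end
```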

  • precompiler_context/1.
  @doc """
  This optional callback is designed to store the precompiler's state or context.
  The returned value will be used in the `download_or_reuse_nif_file/1` and
  `post_precompile/1` callback.
  """
  @callback precompiler_context(OptionParser.argv()) :: term()

This is an optional callback that returns a custom variable which is (perhaps) based on the list of the command line arguments passed to it. The returned variable may contain anything it needs for the post_precompile/1 and download_or_reuse_nif_file/1.

For example, it could hold a simple HTTP username and password passed via command line arguments: mix elixir_make.precompile --auth USERNAME:PASSWD or mix elixir_make.fetch --all --user hello --pass world.
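A minimal precompiler_context/1 for that example could parse the credentials out of argv with OptionParser (the option names follow the hypothetical commands above):

```elixir
defmodule MyPrecompiler.Context do
  @moduledoc false

  # Accepts either `--auth USER:PASS` or separate `--user`/`--pass` flags.
  def precompiler_context(argv) do
    {opts, _args, _invalid} =
      OptionParser.parse(argv,
        strict: [auth: :string, user: :string, pass: :string, all: :boolean]
      )

    case Keyword.fetch(opts, :auth) do
      {:ok, auth} ->
        [user, pass] = String.split(auth, ":", parts: 2)
        %{user: user, pass: pass}

      :error ->
        %{user: opts[:user], pass: opts[:pass]}
    end
  end
end
```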

  • available_nif_urls/0 and current_target_nif_url/0.
  @doc """
  This callback will be invoked when the user executes the following commands:
  - `mix elixir_make.fetch --all`
  - `mix elixir_make.fetch --all --print`
  The precompiler module should return all available URLs to precompiled artefacts
  of the NIF library.
  """
  @callback available_nif_urls() :: [String.t()]

  @doc """
  This callback will be invoked when the user executes the following commands:
  - `mix elixir_make.fetch --only-local`
  - `mix elixir_make.fetch --only-local --print`
  The precompiler module should return the URL to a precompiled artefact of
  the NIF library for current target (the "native" host).
  """
  @callback current_target_nif_url() :: String.t()

The first one should return a list of URLs to the precompiled artefacts of all available targets, and the second one should return a single URL for the current target.
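For instance, if artefacts are uploaded to a fixed release location, both callbacks could be backed by a helper like this (the base URL and naming scheme below are made up; a real precompiler would derive them from the package version and repository):

```elixir
defmodule MyPrecompiler.Urls do
  @moduledoc false

  # Hypothetical release location.
  @base_url "https://example.com/releases/v0.1.0"

  # URL for a single target's precompiled artefact.
  def nif_url(target), do: "#{@base_url}/nif-#{target}.tar.gz"

  # URLs for every supported target.
  def available_nif_urls(targets), do: Enum.map(targets, &nif_url/1)
end
```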

  • download_or_reuse_nif_file/1.
  @doc """
  This callback will be invoked when the NIF library is trying to load functions
  from its shared library.
  The precompiler should download or reuse nif file for current target.
  ## Parameters
    - `context`: Precompiler context returned by the `precompiler_context` callback.
  """
  @callback download_or_reuse_nif_file(context :: term()) :: :ok | {:error, String.t()} | no_return

The precompiler module should either download the precompiler artefacts or reuse local caches for the current target. The argument passed to this callback is the precompiler context.
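Before loading a downloaded artefact, the precompiler would typically verify it against the recorded checksum. Here is a sketch using the precompiled_artefact_detail map defined earlier (lowercase hex encoding is an assumption):

```elixir
defmodule MyPrecompiler.Checksum do
  @moduledoc false

  # Compare the file's hash against the checksum recorded in a
  # precompiled_artefact_detail map.
  def verify(path, %{checksum: expected, checksum_algo: algo}) do
    with {:ok, data} <- File.read(path) do
      actual = Base.encode16(:crypto.hash(algo, data), case: :lower)
      if actual == expected, do: :ok, else: {:error, "checksum mismatch for #{path}"}
    end
  end
end
```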

Below is the full Mix.Tasks.ElixirMake.Precompile behaviour (as of 04 Aug 2022). A complete demo precompiler is available at cocoa-xu/cc_precompiler, along with an example project that uses it, cocoa-xu/cc_precompiler_example.

defmodule Mix.Tasks.ElixirMake.Precompile do
  use Mix.Task

  @typedoc """
  Target triplets
  """
  @type target :: String.t()

  @doc """
  This callback should return a list of triplets ("arch-os-abi") for all supported targets.
  """
  @callback all_supported_targets() :: [target]

  @doc """
  This callback should return the target triplet for current node.
  """
  @callback current_target() :: {:ok, target} | {:error, String.t()}

  @doc """
  This callback will be invoked when the user executes the `mix compile`
  (or `mix compile.elixir_make`) command.
  The precompiler should then compile the NIF library "natively". Note that
  it is possible for the precompiler module to pick up other environment variables
  like `TARGET_ARCH=aarch64` and adjust compile arguments correspondingly.
  """
  @callback build_native(OptionParser.argv()) :: :ok | {:ok, []} | no_return

  @typedoc """
  A map that contains detailed info of a precompiled artefact.
  - `:path`, path to the archived build artefact.
  - `:checksum_algo`, name of the checksum algorithm.
  - `:checksum`, the checksum of the archived build artefact using `:checksum_algo`.
  """
  @type precompiled_artefact_detail :: %{
    :path => String.t(),
    :checksum => String.t(),
    :checksum_algo => atom
  }

  @typedoc """
  A tuple that indicates the target and the corresponding precompiled artefact detail info.
  `{target, precompiled_artefact_detail}`.
  """
  @type precompiled_artefact :: {target, precompiled_artefact_detail}

  @doc """
  This callback should precompile the library to the given target(s).
  Returns a list of `{target, archived_artefacts}` if successfully compiled.
  """
  @callback precompile(OptionParser.argv(), [target]) :: {:ok, [precompiled_artefact]} | no_return

  @doc """
  This callback will be invoked when the NIF library is trying to load functions
  from its shared library.
  The precompiler should download or reuse nif file for current target.
  ## Parameters
    - `context`: Precompiler context returned by the `precompiler_context` callback.
  """
  @callback download_or_reuse_nif_file(context :: term()) :: :ok | {:error, String.t()} | no_return

  @doc """
  This callback will be invoked when the user executes the following commands:
  - `mix elixir_make.fetch --all`
  - `mix elixir_make.fetch --all --print`
  The precompiler module should return all available URLs to precompiled artefacts
  of the NIF library.
  """
  @callback available_nif_urls() :: [String.t()]

  @doc """
  This callback will be invoked when the user executes the following commands:
  - `mix elixir_make.fetch --only-local`
  - `mix elixir_make.fetch --only-local --print`
  The precompiler module should return the URL to a precompiled artefact of
  the NIF library for current target (the "native" host).
  """
  @callback current_target_nif_url() :: String.t()

  @doc """
  This optional callback is designed to store the precompiler's state or context.
  The returned value will be used in the `download_or_reuse_nif_file/1` and
  `post_precompile/1` callback.
  """
  @callback precompiler_context(OptionParser.argv()) :: term()

  @doc """
  This optional callback will be invoked when all precompilation tasks are done,
  i.e., it will only be called at the end of the `mix elixir_make.precompile`
  command.
  Post actions to run after all precompilation tasks are done. For example,
  actions can be archiving all precompiled artefacts and uploading the archive
  file to an object storage server.
  """
  @callback post_precompile(context :: term()) :: :ok

  @optional_callbacks precompiler_context: 1, post_precompile: 1
end

Raspberry Pi Bluetooth Gamepad Setup

Setting up a Bluetooth gamepad on a Raspberry Pi may not be as straightforward as one might expect. This post documents how I set up a Bluetooth gamepad on a Raspberry Pi, and I hope it helps if you run into any difficulties doing so.

Specifically, the gamepad I'm using is an Xbox controller. If you are using a gamepad from another brand, there is a high chance that this post still works for you, but as I only have an Xbox controller, your mileage may vary.

1. Change Bluetooth Settings

The first thing to look at is /etc/bluetooth/main.conf. In my setup, I changed the following option values. The first five were in the General section and the last one was in the Policy section of this .conf file.

[General]
Class = 0x000100
ControllerMode = dual
FastConnectable = true
Privacy = device
JustWorksRepairing = always

[Policy]
AutoEnable=true

Then we can restart the bluetooth service or restart the pi.

sudo systemctl restart bluetooth
# or restart the pi
# sudo reboot

2. (Optional) Test

To test if everything works, we can use bluetoothctl to manually pair and connect to the joystick.

$ bluetoothctl
[bluetooth]# scan on
Discovery started
[NEW] Device AA:BB:CC:DD:EE:FF Controller Name
[bluetooth]# pair AA:BB:CC:DD:EE:FF
[bluetooth]# trust AA:BB:CC:DD:EE:FF
[bluetooth]# connect AA:BB:CC:DD:EE:FF

If there are still issues connecting the gamepad, you can try disabling the Bluetooth ERTM feature.

cat <<EOF | sudo tee /etc/modprobe.d/bluetooth.conf
options bluetooth disable_ertm=1
EOF
sudo reboot

3. (Optional) Auto-connect Script

We can use an auto-connect script that tests whether the input device exists, and tries to connect to the gamepad if the specified device is not present.

#!/bin/sh
JS="$1"
ADDRESS="$2"
usage() {
	S="$(basename "$0")"
	echo "usage:  ${S} [input_name] [address]"
	echo "        ${S} js0 AA:BB:CC:DD:EE:FF"
	exit 1
}
if [ -z "${JS}" ] || [ -z "${ADDRESS}" ]; then
	usage
fi

if [ -e "/dev/input/${JS}" ]; then
	echo "${JS} is connected"
else
	echo "try to connect ${ADDRESS}"
	echo "connect ${ADDRESS}" | bluetoothctl
fi

A cron task can be set up to execute this script every minute. The following line auto-connects to a gamepad at AA:BB:CC:DD:EE:FF, assuming only one gamepad is connected to the Raspberry Pi (thus testing /dev/input/js0).

* * * * * /usr/bin/autoconnect_joystick js0 AA:BB:CC:DD:EE:FF

Debug Erlang NIF library on macOS and Linux

Debug Erlang NIF library on macOS with Xcode

(Almost) Universal Steps

First of all, set CFLAGS so that the NIF library is compiled with debug info. (Of course, your compile script needs to pick up the CFLAGS environment variable and append/prepend it to its other cflags.)

# Compile NIF library with debug info (macOS and Linux)
export CFLAGS="-g3 -fno-omit-frame-pointer"

# Enable address sanitizer on Linux
#   We can use address sanitizer on macOS too
#   But it's easier to use it in Xcode; I'll get to that bit later
#   Also, note that we cannot use static libasan, i.e., "-static-libasan"
#   For some reason, it won't compile if we specify that compile option
export CFLAGS="${CFLAGS} -fsanitize=address"

Next, get necessary info about the Erlang VM on your machine and set environment variables required by erlexec.

# Get full commands from mix
# It should be something like
#   "erl -pa ..."
export COMMANDS="$(ELIXIR_CLI_DRY_RUN=1 mix)"

# Only keep the arguments
# " -pa ..."
export CMD_ARGS="${COMMANDS:3}"

# Find the erl script
# The result could be something like
#   - If you/your package manager put erl in /usr/local
#     "/usr/local/bin/erl"
#   - If you installed erlang by asdf
#     "${HOME}/.asdf/shims/erl"
export ERL="$(which erl)"

# Either way, let's get the parent dir of the parent dir of the erl binary
#   - "/usr/local"
#   - "${HOME}/.asdf"
export ERL_BASE="$(dirname $(dirname ${ERL}))"

# Find the erlexec binary
#   erl is just a shell script that sets up the environment vars
#     and then starts erlexec
export ELREXEC="$(find "${ERL_BASE}" -name erlexec)"

# Set three required environment variables
#   1. BINDIR. 
#      The directory that erlexec resides in
export BINDIR="$(dirname ${ELREXEC})"

#   2. ROOTDIR. 
#      This one is a little bit tricky as there are some difference 
#        between the asdf version and others.
#      But it's just the directory where the file start.boot resides in.
#      Note that we might find two start.boot files in ${ERL_BASE},
#        one is used for release while not the other. 
#      We need the other one.      
export START_BOOT="$(find $ERL_BASE -name start.boot | grep -v release)"
export ROOTDIR="$(dirname $(dirname ${START_BOOT}))"

#   3. EMU
#      Should just be beam
export EMU=beam

Now you might need to make some minor changes to the Makefile for debugging. Let's take a simple Makefile as an example.

PRIV_DIR = $(MIX_APP_PATH)/priv
NIF_SO = $(PRIV_DIR)/nif.so

C_SRC = $(shell pwd)/c_src
LIB_SRC = $(shell pwd)/lib
CFLAGS += -I$(ERTS_INCLUDE_DIR)
CPPFLAGS += $(CFLAGS) -std=c++14 -Wall -Wextra -pedantic
LDFLAGS += -shared

UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Darwin)
	LDFLAGS += -undefined dynamic_lookup -flat_namespace -undefined suppress
endif

.DEFAULT_GOAL := build

build: clean $(NIF_SO)

clean:
	@ if [ -z "${NOT_SKIP_COMPILE}" ]; then \
		rm -f $(NIF_SO) ; \
	fi

$(NIF_SO):
	@ mkdir -p $(PRIV_DIR)
	@ if [ -z "${NOT_SKIP_COMPILE}" ]; then \
		$(CC) $(CPPFLAGS) $(LDFLAGS) $(C_SRC)/nif.cpp -o $(NIF_SO) ; \
	fi

Note that we use the environment variable NOT_SKIP_COMPILE to indicate whether to recompile the shared library. If NOT_SKIP_COMPILE is empty or does not exist in env, then the shared library will be recompiled each time we call mix test (or other commands).

macOS

Debug with lldb in terminal

On macOS, we can use either lldb or Xcode (with lldb, of course). To test the NIF library with lldb, we need

# This is basically the full version of the "mix test" command
lldb -- ${ELREXEC} ${CMD_ARGS} test
(lldb) run
(lldb) c

Debug with Xcode in GUI

IMHO, this approach should make things much easier.

First, open Xcode and select "Debug executable..." in "Debug" from the menubar. Then choose erlexec in Finder. You can get the path to erlexec by

echo ${ELREXEC}

After choosing erlexec, we'll need to set a few things in Xcode

Scheme - Run Debug

Add one entry in Arguments Passed On Launch, copy and paste everything in ${CMD_ARGS} to the new entry.

Add a second entry and write test. This is basically the full version of the mix test command.

And we need to set these environment variables, CFLAGS, BINDIR, ROOTDIR and EMU. You should have similar values as shown in the screenshot below.

Fill in launch arguments and set environment variables

To enable address sanitizer and/or enable other debug tools, you can click the Diagnostics tab and enable the ones you need.

Set working directory to the root directory of the mix project.

Also, we need to set the working directory in the Options tab to the root directory of the mix project.

With everything set correctly in the scheme panel, you can close it and click run or command+R to debug the NIF library in Xcode.

And because the NIF library is compiled with debuginfo, Xcode can jump to the line that triggers a crash. Moreover, you can also set breakpoints and rerun the test.

Various debug information is available

Furthermore, if you compiled Erlang locally and didn't delete the Erlang source code, then stepping into Erlang functions will also bring you to the corresponding line in the Erlang source code! (Note that this feature is available whether we debug the NIF library in the GUI or the terminal, on macOS or Linux. But to me, it's much easier to navigate and inspect variable values in Xcode.)

Step into Erlang functions and view variable values in Xcode.

Linux

To run the address sanitizer on the NIF library, we need two steps. First, compile the shared library with the CFLAGS we just exported to the env. Second, inject libasan.so into erlexec via LD_PRELOAD. Sounds hard, but it is really simple.

# First, compile the shared library
#   Ignore the error message from libasan
bash -c "$ELREXEC $CMD_ARGS test" || true

# Secondly, get the path to libasan.so and set LD_PRELOAD
#   so that it will be loaded before erlexec
export LIBASAN_SO="$(gcc -print-file-name=libasan.so)"
bash -c "LD_PRELOAD=${LIBASAN_SO} NOT_SKIP_COMPILE=skip $ELREXEC $CMD_ARGS test"

If everything goes smoothly, libasan will print diagnostic messages after the process exits, for example

LD_PRELOAD=$(gcc -print-file-name=libasan.so) NOT_SKIP_COMPILE=skip /home/cocoa/.asdf/installs/erlang/24.2/erts-12.2/bin/erlexec -pa /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/eex/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/elixir/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/ex_unit/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/iex/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/logger/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/mix/ebin -elixir ansi_enabled true -noshell -s elixir start_cli -extra /home/cocoa/.asdf/installs/elixir/1.13.2/bin/mix test

=================================================================
==1663086==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x7f2f67180bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x55b4f9b53498 in my_malloc /home/cocoa/.asdf/plugins/erlang/kerl-home/builds/asdf_24.2/otp_src_24.2/erts/etc/common/inet_gethost.c:2656

SUMMARY: AddressSanitizer: 15 byte(s) leaked in 1 allocation(s).
make: Nothing to be done for 'build'.
Compiling 1 file (.ex)
.........

Finished in 0.1 seconds (0.00s async, 0.1s sync)
9 tests, 0 failures

Randomized with seed 139197

=================================================================
==1663020==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 344064 byte(s) in 42 object(s) allocated from:
    #0 0x7f125d2b9bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x559e2393f3a1 in sys_thread_init_signal_stack sys/unix/sys_signal_stack.c:278

Direct leak of 8192 byte(s) in 1 object(s) allocated from:
    #0 0x7f125d2b9bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x559e2393f3ef in sys_thread_init_signal_stack sys/unix/sys_signal_stack.c:278
    #2 0x559e2393f3ef in sys_init_signal_stack sys/unix/sys_signal_stack.c:292

SUMMARY: AddressSanitizer: 352256 byte(s) leaked in 43 allocation(s).

Use rustler with nerves + cross-compile

TL;DR: adding ~21 lines of code can remove the need for pre-compiled NIF files if you want to cross-compile an Elixir library that relies on rustler to your nerves build.

Rustler is a Safe Rust bridge for creating Erlang NIF functions, and Nerves lets you Craft and deploy bulletproof embedded software in Elixir. Now, let's put them together, in a better and easier way.

There is an html5ever_elixir repo on the rusterlium GitHub organisation. It is a great project that allows you to use html5ever in Elixir. It also comes with some pre-compiled NIF files covering the most commonly seen operating systems, CPU architectures and ABIs.

If you want to deploy any Elixir library that relies on rustler to your Nerves project, you'll have to somehow obtain a pre-compiled NIF file that corresponds to your Nerves build target. For example, say you want to deploy to a Raspberry Pi 4; then you'll need to pre-compile your library for the target named aarch64-unknown-linux-gnu. You can do that by

  • pre-compiling the NIF file on your own machine
  • downloading the pre-compiled NIF file from the library's provider, like html5ever_elixir

Either way, you have to do something manually. How do we avoid those tedious things?

Well, it's not hard to smooth out the process, as Rust can cross-compile for you as long as you have (1) the target added on your host machine, and (2) the cross-compile toolchain.

The first requirement is super easy to satisfy:

# use aarch64-unknown-linux-gnu here
# you can see the full list by
# rustc --print target-list
rustup target add aarch64-unknown-linux-gnu

As for the second requirement, Nerves will automatically download the pre-built toolchain.

mix nerves.new nerves_project
cd nerves_project
export MIX_TARGET=rpi4
mix deps.get
# the root directory of this toolchain will be
# ~/.nerves/artifacts/nerves_toolchain_\${NERVES_TARGET_TRIPLE}-\${HOST_OS}_\${HOST_ARCH}-\${TOOLCHAIN_VER}/

${NERVES_TARGET_TRIPLE} here will be aarch64_nerves_linux_gnu. ${HOST_OS} is either linux or darwin, depending on which OS you're using. The same goes for ${HOST_ARCH}, which depends on your host machine's CPU arch. Let's denote the root directory of the toolchain as ${TOOLCHAIN_ROOT} (we will refer to this path later).

To cross-compile on the host, a few lines of code need to be added to the library's mix.exs file.

Firstly, we can see that ${NERVES_TARGET_TRIPLE} is different from the target triple used in rustc, which is aarch64-unknown-linux-gnu. Therefore we need a compile-time map for the translation

  @nerves_rust_target_triple_mapping %{
    "armv6-nerves-linux-gnueabihf": "arm-unknown-linux-gnueabihf",
    "armv7-nerves-linux-gnueabihf": "armv7-unknown-linux-gnueabihf",
    "aarch64-nerves-linux-gnu": "aarch64-unknown-linux-gnu",
    "x86_64-nerves-linux-musl": "x86_64-unknown-linux-musl"
  }

Then we need to check if this library is being compiled to a Nerves build. Luckily, Nerves exports a number of environment variables, and here we will try to get the value of NERVES_SDK_SYSROOT. If we get a string, then this will be a Nerves build; otherwise, we don't need to do anything special.

def project do
  if is_binary(System.get_env("NERVES_SDK_SYSROOT")) do
    components = System.get_env("CC")
      |> tap(&System.put_env("RUSTFLAGS", "-C linker=#{&1}"))
      |> Path.basename()
      |> String.split("-")

    target_triple =
      components
      |> Enum.slice(0, Enum.count(components) - 1)
      |> Enum.join("-")

    mapping = Map.get(@nerves_rust_target_triple_mapping, String.to_atom(target_triple))
    if is_binary(mapping) do
      System.put_env("RUSTLER_TARGET", mapping)
    end
  end
  
  # ... no more changes below ...
  [
      app: ... ,
      version: ...,
      ...
  ]
end

In the code above, we set a new environment variable RUSTFLAGS to -C linker=${CC}. The value of ${CC} is also automatically set by Nerves. So here we explicitly tell rustc which linker to use, as we don't want to use the linker on the host machine. ${CC} in this example is

${TOOLCHAIN_ROOT}/${NERVES_TARGET_TRIPLE}/${NERVES_TARGET_TRIPLE}-gcc

The next step of the code in project/0 extracts the ${NERVES_TARGET_TRIPLE} and maps it to the corresponding rustc target. The mapped value is stored in a new environment variable, RUSTLER_TARGET.
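The extraction step can be illustrated on its own: drop the trailing compiler name from the CC basename to recover the Nerves triple (the path below is a made-up example of ${CC}):

```elixir
# Hypothetical ${CC} value set by Nerves for a Raspberry Pi 4 target.
cc = "/toolchain/aarch64-nerves-linux-gnu/aarch64-nerves-linux-gnu-gcc"

components = cc |> Path.basename() |> String.split("-")

# Drop the trailing "gcc" component and re-join the rest.
target_triple =
  components
  |> Enum.slice(0, length(components) - 1)
  |> Enum.join("-")

IO.puts(target_triple)
```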

Lastly, add the target option to the use clause in the corresponding .ex file to tell rustler which target we are going to build.

defmodule Example.Nif do
  @moduledoc false

  # defp make_target_flag(args, target) when is_binary(target)
  # so it is fine if we get `:nil`
  use Rustler, otp_app: :example, crate: "example_nif", target: System.get_env("RUSTLER_TARGET")

  def your_nif_function, do: :erlang.nif_error(:not_loaded)
end

Here you go! No more pre-compiled NIF files are needed for different Nerves builds (once PR #423 is approved and merged)! An example project can be found here, along with its CI build logs for almost all Nerves targets.

Numerical Elixir Benchmark: CIFAR10 with 3-Layer DenseNN

TLDR:

  1. Use C libraries (via NIF) for matrix computation when performance is the top priority. Otherwise, it is about $10^3$ times slower in terms of matrix computation.
  2. OTP 25 introduces JIT on ARM64 and it shows 3-4% performance improvement (matrix computation).
  3. Almost linear speedup can be achieved when a large computation task can be divided into independent smaller ones.
  4. Apple M1 Max performs much better than its x86_64 competitors (Intel Core i9 8950HK and AMD Ryzen 9 3900XT).

Benchmark code here: https://github.com/cocoa-xu/CIFAR-10-livebook

Numerical Elixir

I started to use Elixir/Erlang about 2 months ago, and I learned the existence of Numerical Elixir (Nx) from my supervisor, Lito.

Basically, Nx is to Elixir what NumPy is to Python. It implements a number of numerical operations, especially on multi-dimensional arrays. It's worth noting that Nx comes with built-in automatic differentiation, which means that we don't have to write the corresponding derivative functions for back-propagation when training a neural network.

I explored Nx and wrote some benchmarks to evaluate its performance on different hardware (Raspberry Pi 4, x86_64 laptops and desktops, ARM64 laptops) and under different conditions (calls to external C libraries allowed vs. a pure Elixir implementation). And here I finally have some numbers!

P.S. The goal of this benchmark is only to evaluate the matrix computation performance, instead of getting a decent (or even acceptable) CIFAR-10 prediction accuracy.

Benchmark Settings

Hardware

  • Raspberry Pi 4, 8 GB of RAM. Ubuntu 20.04 aarch64.
  • x86_64 laptop. Intel 8th Gen Core i9 8950HK, 6 Cores 12 Threads, MacBook Pro (15-inch, 2018), 32 GB RAM. macOS Big Sur 11.1 x86_64.
  • x86_64 desktop. AMD Ryzen 9 3900XT, 12 Cores 24 Threads, Desktop PC, 64 GB RAM, NVIDIA RTX 3090. Ubuntu 20.04 x86_64.
  • ARM64 laptop. M1 Max, 10 Cores (8 Performance + 2 Efficiency) 10 Threads, MacBook Pro (14-inch, 2021), 64 GB RAM. macOS Monterey 12.0.1 aarch64.

Software

Dataset

CIFAR-10 binary version.

Method

  • 3-layer DenseNN.
    1. Input layer. Dense layer, size {nil, 1024, 64} + {nil, 64}, activation sigmoid.
    2. Hidden layer. Dense layer, size {nil, 64, 32} + {nil, 32}, activation sigmoid.
    3. Output layer. Dense layer, size {nil, 32, 10} + {nil, 10}, activation softmax.
  • Number of epochs: 5.
  • Batch size.
    • 300 when using Nx.BinaryBackend, single-thread
    • 250 * n_jobs when using Nx.BinaryBackend, multi-thread. n_jobs will be the number of available logical cores.
    • 300 when using Torchx.Backend.
  • Binary.
Benchmark.run(
  backend: Nx.BinaryBackend,
  batch_size: 300,
  n_jobs: 1
)
  • Binary MT.
Benchmark.run(
  backend: Nx.BinaryBackend,
  batch_size: 250 * System.schedulers_online(),
  n_jobs: System.schedulers_online()
)
  • Torch CPU/GPU.
Benchmark.run(backend: Torchx.Backend, batch_size: 300)

Benchmark Results

Numbers are in seconds.

I'll fill in the empty cells when the rest of the benchmarks are done.

| Hardware | Backend   | OTP | Load Dataset | To Batched Input | Mean Epoch Time |
|----------|-----------|-----|--------------|------------------|-----------------|
| Pi 4     | Binary    | 24  |              |                  |                 |
| Pi 4     | Binary MT | 24  |              |                  |                 |
| Pi 4     | Binary    | 25  | 194.427      | 11.917           | 27336.010       |
| Pi 4     | Binary MT | 25  | 207.923      | 11.855           | 18210.347       |
| Pi 4     | Torch CPU | 24  | 15.334       | 4.880            | 17.170          |
| Pi 4     | Torch CPU | 25  | 16.372       | 4.442            | 16.207          |
| 8950HK   | Binary    | 24  | 17.994       | 3.036            | 4460.758        |
| 8950HK   | Binary MT | 24  | 17.826       | 2.934            | 1471.090        |
| 8950HK   | Torch CPU | 24  | 2.141        | 0.778            | 0.841           |
| 3900XT   | Binary    | 24  | 6.058        | 2.391            | 3670.930        |
| 3900XT   | Binary MT | 24  | 6.034        | 2.536            | 786.443         |
| 3900XT   | Torch CPU | 24  | 1.653        | 0.617            | 0.770           |
| 3900XT   | Torch GPU | 24  | 1.630        | 0.652            | 0.564           |
| M1 Max   | Binary    | 24  | 11.090       | 2.135            | 3003.321        |
| M1 Max   | Binary MT | 24  | 10.925       | 2.154            | 453.536         |
| M1 Max   | Binary    | 25  | 9.458        | 1.548            | 3257.853        |
| M1 Max   | Binary MT | 25  | 9.949        | 1.527            | 436.385         |
| M1 Max   | Torch CPU | 24  | 1.702        | 1.900            | 0.803           |
| M1 Max   | Torch CPU | 25  | 1.599        | 0.745            | 0.773           |

ROS 2 learning note, make_service

Just a quick note from when I was working through this ROS 2 tutorial, Creating custom ROS 2 msg and srv files. A C++ template saves typing out all the type names, and it works with lambdas as well.

#include "rclcpp/rclcpp.hpp"
#include "tutorial_interfaces/srv/add_three_ints.hpp"

#include <memory>
#include <tuple>

// S defaults to the matching service handle type, so callers only need
// to spell out the service interface type T; F (the callback type) is
// deduced, which lets us pass a lambda directly.
template <typename T, typename F, typename S = typename rclcpp::Service<T>::SharedPtr>
auto make_service(const char *node_name, const char *service_name, F &&func)
  -> std::tuple<std::shared_ptr<rclcpp::Node>, S>
{
  std::shared_ptr<rclcpp::Node> node = rclcpp::Node::make_shared(node_name);
  // Perfect-forward the callback into create_service.
  S service = node->create_service<T>(service_name, std::forward<F>(func));
  return std::make_tuple(node, service);
}

int main(int argc, char **argv) {
  rclcpp::init(argc, argv);
  using Service = tutorial_interfaces::srv::AddThreeInts;
  auto [node, service] = make_service<Service>(
    "add_three_ints_server", "add_three_ints",
    [](const std::shared_ptr<Service::Request> request, std::shared_ptr<Service::Response> response) {
      response->sum = request->a + request->b + request->c;
      RCLCPP_INFO(rclcpp::get_logger("rclcpp"), "Incoming request\na: %ld" " b: %ld" " c: %ld", 
        request->a, request->b, request->c);
      RCLCPP_INFO(rclcpp::get_logger("rclcpp"), "sending back response: [%ld]", (long int)response->sum);
  });
  RCLCPP_INFO(rclcpp::get_logger("rclcpp"), "Ready to add three ints");
  rclcpp::spin(node);
  rclcpp::shutdown();
}