
# TL;DR
If an optional module requires you to write `use Module` in your code, you need to check the availability of that module before both `defmodule` and `use Module`. Putting the check inside `defmodule`, directly around `use Module`, won't work, although it looks simpler and seems correct.
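A minimal sketch of both arrangements, assuming a made-up optional module `HypotheticalOptionalDep` that is not installed. The two sources are kept as strings and compiled with `Code.compile_string/1`, so the failing variant can be rescued instead of aborting the script; the module names `GuardOutside` and `GuardInside` are hypothetical too.

```elixir
# `HypotheticalOptionalDep` is a made-up module name that does not exist,
# standing in for an optional dependency that the user did not install.

# Works: the availability check wraps `defmodule`.
guard_outside = """
if Code.ensure_loaded?(HypotheticalOptionalDep) do
  defmodule GuardOutside do
    use HypotheticalOptionalDep
  end
end
"""

# Fails: the availability check sits inside `defmodule`, around `use`.
guard_inside = """
defmodule GuardInside do
  if Code.ensure_loaded?(HypotheticalOptionalDep) do
    use HypotheticalOptionalDep
  end
end
"""

# Compile a source string and turn a compilation error into a tagged tuple.
compile = fn source ->
  try do
    {:ok, Code.compile_string(source)}
  rescue
    error in CompileError -> {:error, error}
  end
end

outside_result = compile.(guard_outside)
inside_result = compile.(guard_inside)

IO.inspect(outside_result, label: "guard outside defmodule")
IO.inspect(inside_result, label: "guard inside defmodule")
```

The first variant compiles without defining any module, while the second one raises a `CompileError` even though the condition is false.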
# Environment for This Post
At the time of writing, I'm using Elixir v1.14.1. The OTP version and the operating system do not matter for the things discussed below.
This behaviour is unlikely to change in the future. However, if you are using a later version of Elixir, please always verify the code and the conclusions in this post yourself.
# A careless mistake?
A few hours ago I received an issue report related to `evision` v0.1.14 on elixirforum, and the issue can be reproduced in one line:
And how did I fail to catch this, or how did this bug hide under the seemingly correct code? Is this simply a careless mistake or is there something deeper inside it? Let me unroll this story about `if` and `use` for you.
# Use `if` at Compile-time
Using `if` at compile-time is pretty common in Elixir. It is mainly used to define functions only under certain conditions, for example
And if we run it, the expected behaviour is that the program prints `Yes, 1 < 2` and exits.
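A small sketch of such a compile-time `if`, assuming a hypothetical module name `CompileTimeIf`. The `if` sits directly in the module body, so it runs while the module is being compiled, and the function inside it is defined only because the condition holds.

```elixir
# `CompileTimeIf` is a hypothetical module name used only for illustration.
defmodule CompileTimeIf do
  # This `if` runs while the module body is being compiled, not when a
  # function is called.
  if 1 < 2 do
    IO.puts("Yes, 1 < 2")

    # The function is defined only because the condition is true.
    def defined_under_condition, do: :ok
  end
end

IO.inspect(CompileTimeIf.defined_under_condition())
```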
Also, we can verify that the Elixir compiler will remove the `false` branch of the `if`-statement (if the condition can be determined at compile-time)
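One way to observe this from the outside is to define a function in each branch and list what the compiled module exports. This is only a sketch; `BranchCheck` and both function names are hypothetical.

```elixir
# `BranchCheck` is a hypothetical module name used only for illustration.
defmodule BranchCheck do
  if 1 > 2 do
    # This branch is never taken, so this function is never defined.
    def from_do_block, do: :do_block
  else
    def from_else_block, do: :else_block
  end
end

# List the public functions that made it into the compiled module.
exported = BranchCheck.__info__(:functions)
IO.inspect(exported)
```

Only the function from the branch that was taken shows up in the export list.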
Lastly, we can verify that using the return value of `Code.ensure_loaded?/1` as the condition does the same thing as the examples above: only the code under the `true` branch will be compiled:
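As a sketch, the check below is run once against `String`, which always ships with Elixir, and once against `HypotheticalMissing`, a made-up module name assumed not to exist. `EnsureLoadedCheck` is a hypothetical name as well.

```elixir
# `HypotheticalMissing` is a made-up module name that is assumed not to exist.
defmodule EnsureLoadedCheck do
  if Code.ensure_loaded?(String) do
    def string_available?, do: true
  end

  if Code.ensure_loaded?(HypotheticalMissing) do
    def missing_available?, do: true
  end
end

# Only the function guarded by the successful check is exported.
IO.inspect(EnsureLoadedCheck.__info__(:functions))
```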
Let's try it in an IEx session, `iex -S mix`
# Apply What We Have Learnt So Far
Now, a quick background on the module that caused the compilation error: we'd like to write a module that implements `Kino.SmartCell`, and we'd like this to be an optional feature.
Therefore, we should check if `Kino.SmartCell` is loaded. If yes, we follow `kino`'s SmartCell tutorial, define our own functions and get the job done; otherwise, no functions will be defined in this module.
With the above idea in mind, we have the following code:
At first glance, this code looks good to me because
- `Kino.SmartCell` is built on top of `Kino.JS`. Therefore, if `Kino.SmartCell` is loaded, then `Kino.JS` must be loaded too.
- we use `use` after we have ensured that `Kino.JS` is loaded.
Of course, `:kino` is indeed listed in `deps` in `evision`'s `mix.exs` file.
# What Went Wrong?
Well then, let's face the inevitable question -- what went wrong?
Apparently, the Elixir compiler tried to evaluate the `use`-statement on line 3, although we expected the compiler to skip this `false` branch entirely. However, the Elixir compiler has to check that all the code, including the code in the `false` branch, is syntactically correct; e.g., the following code will not pass the syntax check:
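A minimal sketch of such a check, using a deliberately malformed body (a stray `)`) chosen for this illustration. `Code.string_to_quoted/1` only tokenizes and parses, so the error cannot come from evaluating anything.

```elixir
# The body of this `if false` is deliberately malformed: a stray `)`.
source = """
if false do
  )
end
"""

# `Code.string_to_quoted/1` only tokenizes and parses; nothing is evaluated.
result = Code.string_to_quoted(source)
IO.inspect(result)
```

The result is an `{:error, ...}` tuple, even though the malformed code sits in a branch that could never run.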
During this process, macros will be expanded. You might already know that `use` is a macro. So yes, that's where the problem emerges.
Based on the information given in the Getting Started manual on elixir-lang.org, the following code
will be compiled into
Since `__using__` is a macro, we have to call `require` to bring in all the macros defined in that module.
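To illustrate the equivalence, here is a sketch with a made-up module `HypotheticalGreeter` that defines `__using__/1`. `ViaUse` relies on `use`, while `ViaRequire` spells out the `require` plus `__using__` pair by hand; all three module names and the `:name` option are assumptions of this example.

```elixir
# `HypotheticalGreeter` is a made-up module that injects a `hello/0` function.
defmodule HypotheticalGreeter do
  defmacro __using__(opts) do
    name = Keyword.get(opts, :name, "world")

    quote do
      def hello, do: "hello, " <> unquote(name)
    end
  end
end

# The short form.
defmodule ViaUse do
  use HypotheticalGreeter, name: "use"
end

# The hand-written equivalent: `require` first, then call the macro.
defmodule ViaRequire do
  require HypotheticalGreeter
  HypotheticalGreeter.__using__(name: "require")
end

IO.puts(ViaUse.hello())
IO.puts(ViaRequire.hello())
```

Both modules end up with the injected `hello/0` function, which is why a failing `require` also breaks `use`.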
You see, if the application `:kino` is optional and the user didn't list `:kino` in their `deps`, then `require` will surely fail because it cannot find the `Kino.JS` module, let alone the macros defined in `Kino.JS`.
And `require` will fail even if the `use`-statement (or the `require`-statement, after the expansion) is technically inside the `false` branch of an `if`-statement, and even when the condition is literally `false`:
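The sketch below isolates the `require` itself, assuming `NOT_EXISTS` names a module that does not exist anywhere; `RequireInDeadBranch` is a hypothetical module name. The source is compiled from a string so that the error can be rescued and inspected.

```elixir
# `NOT_EXISTS` is assumed to be a module that does not exist anywhere.
source = """
defmodule RequireInDeadBranch do
  if false do
    require NOT_EXISTS
  end
end
"""

result =
  try do
    Code.compile_string(source)
    :compiled
  rescue
    error in CompileError -> {:compile_error, Exception.message(error)}
  end

IO.inspect(result)
```

The exact wording of the message depends on the Elixir version, so the sketch only checks that a `CompileError` is raised.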
# Any Solution to This?
The simplest way is, of course, listing `:kino` as a required dependency, but I found another way to solve this issue while keeping `:kino` as an optional dependency.
We just saw that the following code would cause a compilation error,
but if we slightly re-arrange these three macros (yes, `defmodule`, `if` and `use` are all macros), we can achieve what we want:
As for `evision`, we can change the code from
to the following
And... problem solved! However, this post will not end here without explaining the reasons: why did this happen in the first place, and why does it work after re-arranging `if`, `defmodule` and `use`?
# The Reasons and Explanations
After digging into Elixir's source code for a few hours, I found some clues that might be related to this. Since we placed the `if`-statement first and it solved the problem, I was guessing that it might have something to do with the Elixir compiler and how the compiler behaves when traversing the AST (abstract syntax tree).
- The first clue is about macro expansion, i.e., when macros are expanded by the compiler.
- The second one is the `optimize_boolean/1` function in `build_if/2` in `kernel.ex`. Maybe the `if`-statement here didn't get optimised, which could cause the compilation error.
- The last clue I suspected is `allows_fast_compilation/1` for `defmodule`, in `elixir_compiler.erl`. The name of this function is kinda sus, and based on the name, it seems to allow the compiler to skip or defer the evaluation of `defmodule`.
Then I decided to ask @josevalim about this question, and he replied kindly, and his answers solved this puzzle. I'll now connect all the dots along with his answers and explain this mystery below.
A big thanks to José!
In Elixir's source code (v1.14.1), `defmodule` is defined in `kernel.ex`,
`use` is in `kernel.ex` too
and, of course, as spoiled above, `if` is also a macro
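Without opening `kernel.ex`, the same fact can be checked at runtime. This short sketch asks `Kernel` for its list of exported macros and looks up the three names with their arities.

```elixir
# All macros exported by `Kernel`, as a list of `{name, arity}` pairs.
macros = Kernel.__info__(:macros)

# `use` has a default argument, so it is exported with arity 1 and 2.
for name_and_arity <- [defmodule: 2, use: 1, use: 2, if: 2] do
  IO.inspect({name_and_arity, name_and_arity in macros})
end
```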
And macros are always expanded before the branches (e.g., an `if`-statement has two branches, `true` and `false`) are evaluated. That is to say, when the compiler sees the following code
It will expand the macros first:
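To look at one expansion step in isolation, the sketch below quotes a small `if false` expression (with a harmless body chosen for this example) and expands only the outermost macro with `Macro.expand_once/2`.

```elixir
quoted =
  quote do
    if false do
      :never
    end
  end

# Expand only the outermost macro call (`if`), leaving the rest untouched.
expanded = Macro.expand_once(quoted, __ENV__)
IO.puts(Macro.to_string(expanded))
```

The `if` turns into a `case` expression that still contains both branches, so any macro inside either branch is expanded as well in the next steps.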
But it will fail when evaluating `require NOT_EXISTS` because the module does not exist. We can see this based on the traced stack:
When the compiler sees an `if`-statement, it will go inside and expand macros (if any). That's why `Kernel.use/1` is on top of `Kernel.if/2` in the traced stack:
And then I asked why putting `if` before `defmodule` does not cause the same compilation error. And José answered:
Because there are a few things that delay macro expansion, such as defining new modules or defining functions. Because you can do this:
Therefore, to define the module, you must execute the line that defines it. So only after you execute the line, the macros are expanded. `defmodule A do ... end` compiles to something like this:
that AST will only be expanded if the module is in fact defined.
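The point that a quoted module body is inert data can be sketched as follows, again assuming `NOT_EXISTS` is a module that does not exist. Quoting does not expand macros, so nothing tries to load the module.

```elixir
# Quoting does not expand macros, so the nonexistent module is harmless here.
quoted_body =
  quote do
    use NOT_EXISTS
  end

# The result is a plain tuple describing the call, nothing more.
IO.inspect(quoted_body)
```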
Therefore, the following code
is first compiled to the following quoted code in `kernel.ex`
And this emitted code will be executed because the `defmodule` in this example is written unconditionally in the source code instead of sitting in the `false` branch of an `if`-statement. When `:elixir_module.compile/4` is invoked, the `if`-statement will be evaluated, and the `use` macro will be expanded during the evaluation of `if`.
Again, we can verify this by looking at the stacktrace:
Searching for `is not loaded and could not be found` in Elixir's source code will bring us to `format_error/1` in `elixir_aliases.erl`.
Since `unloaded_module` is an atom, we can search for it and trace back where it was produced. And it only appears in the same file, in `ensure_loaded/3`.
This function will first check if the module is already loaded. If not, it will invoke `wait_for_module/1` and wait for the module to be compiled, provided that the module exists and can be compiled.
Yet obviously module `NOT_EXISTS` does not exist, so `'Elixir.Kernel.ErrorHandler':ensure_compiled/3` will return `unloaded_module`, which will subsequently be passed to `elixir_errors:form_error/4` and become the compilation error message.
And to dig deeper, we can search for `:ensure_loaded` in the code base; the call in `elixir_expand.erl`, inside the function `expand/3`, is the one we are interested in.
And this is exactly where the compilation error happens.
And we can continue to search for `elixir_expand:expand` in the source code, and we will find the function `expand_quoted/7` in `elixir_dispatch.erl`.
Note that the `Info` on line 248 will become
in the stacktrace. One more step and we will find that `expand_quoted/7` is called in two dispatch functions: `dispatch_import/6` and `dispatch_require/7` in the same file.
As for `elixir_dispatch:dispatch_require/7`, it is invoked in two places: the first one is when processing remote calls in `elixir_expand:expand/3`. A remote call invokes a function in a module other than the current one, for example `String.upcase/1` called from your own module.
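In the AST, a remote call is easy to tell apart from a local one. The sketch below quotes one of each, using `String.upcase/1` purely as an illustration: the remote call carries a `:.` node with the target module, and the local call does not.

```elixir
# A remote call names the module; a local call does not.
remote = quote do: String.upcase("a")
local = quote do: upcase("a")

IO.inspect(remote)
IO.inspect(local)
```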
The second call to `elixir_dispatch:dispatch_require/7` can be found in `elixir_module:expand_callback/6`.
`elixir_module:expand_callback/6` is used in two places: the first one is in `Protocol.derive/5`, and as the module name suggests, it is related to deriving a protocol for a module.
We can go off on a tangent here and explore what will happen if we want to derive a protocol optionally. You can skip this and jump to the second call to `elixir_module:expand_callback/6`.
Let's say we have already defined a protocol `Derivable` as the following:
Then there are three ways to derive a protocol for a module, and we can start with the simplest one:
- Deriving a protocol using the `@derive` tag.
- The second way is using the `defimpl/3` macro in `kernel.ex`.
- The last way is to do this explicitly via the API, `Protocol.derive/3`.
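The three ways listed above can be sketched side by side. The `Derivable` protocol here is a stand-in written for this example: its single function `ok/1` and its `Any` implementation are assumptions, and the module names `ViaAttribute`, `ViaDefimpl` and `ViaApi` are hypothetical.

```elixir
# Stand-in protocol; its single function `ok/1` is an assumption of this sketch.
defprotocol Derivable do
  def ok(term)
end

# `@derive` and `Protocol.derive/3` fall back to the `Any` implementation.
defimpl Derivable, for: Any do
  def ok(_term), do: :ok
end

# 1. The `@derive` attribute, which must come before `defstruct/1`.
defmodule ViaAttribute do
  @derive Derivable
  defstruct [:a]
end

# 2. `defimpl` inside the module; `:for` defaults to the enclosing module.
defmodule ViaDefimpl do
  defstruct [:a]

  defimpl Derivable do
    def ok(_term), do: :ok
  end
end

# 3. `Protocol.derive/3`, called after the struct has been defined.
defmodule ViaApi do
  defstruct [:a]
end

require Protocol
Protocol.derive(Derivable, ViaApi)

results =
  for module <- [ViaAttribute, ViaDefimpl, ViaApi] do
    # `struct/1` builds the struct at runtime.
    Derivable.ok(struct(module))
  end

IO.inspect(results)
```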
What will happen if we'd like to derive a protocol from an optional module?
That's the first one, and the second one is shown below
And the last one
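As a sketch of the `@derive` variant, the module below guards the attribute with a check for `HypotheticalMissingProtocol`, a made-up protocol name assumed not to exist. `OptionallyDerived` is a hypothetical module name.

```elixir
# `HypotheticalMissingProtocol` is a made-up name that is assumed not to exist.
defmodule OptionallyDerived do
  if Code.ensure_loaded?(HypotheticalMissingProtocol) do
    @derive HypotheticalMissingProtocol
  end

  defstruct [:a]
end

# The module compiles and the struct is usable, just without the protocol.
IO.inspect(struct(OptionallyDerived))
```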
All three examples that optionally derive a protocol from an optional module will compile without any issues and behave as expected. Of course, we'd like to ask the question: why can we use `if` inside the `defmodule` to optionally derive a protocol?
Let me explain the reasons for the first way first. The `@derive` tag is a module attribute, and `@derive` tags in the same module will be collected accumulatively in a bag; simply put, they will be stored in a list. Then the `defstruct/1` macro will retrieve them from `Kernel.Utils.defstruct/3` and rewrite them with `Protocol.__derive__/3`. (This also explains why all `@derive` tags must be set before `defstruct/1`.)
Since `Protocol.__derive__/3` is a function instead of a macro, it will not be evaluated/expanded when processing the branches of an `if`-statement. Therefore, the following code will compile and run as expected.
As for the second way that uses the `defimpl/3` macro in `kernel.ex`: although `defimpl/3` is a macro, it simply rewrites the code to call the `Protocol.__impl__/4` function.
So, it's similar to the first one: the call to the function `Protocol.__impl__/4` will not be evaluated when processing the branches of an `if`-statement.
The same reasoning goes for the last one: `Protocol.derive/3` is itself a macro, but it only rewrites to a call to the function `Protocol.__derive__/3`, so that call will not be evaluated when processing the branches of an `if`-statement.
Let's get back on track. The second call to `elixir_module:expand_callback/6` can be found in `eval_callbacks/5` in `elixir_module.erl`.
`eval_callbacks/5` is called in the same file in two different functions (actually one function, depending on how you see this). The first place it appears in the file is inside the `compile/5` function (line 161). The second place is in the `eval_form/6` function (line 378).
However, `eval_form/6` is only called once, from the `compile/5` function. The obvious difference is that the third argument passed by `eval_form/6` is `before_compile` whereas `compile/5` passes `after_compile`. So I guess you could say that `eval_callbacks/5` is called in one function, `compile/5`.
Based on the value of the third argument, and given that what we have is a compilation error, we can confirm that the compilation error is thrown in `eval_form/6`, where the third argument is `before_compile`.
And we are very close to connecting all the dots: `eval_form/6` is called from `elixir_module:compile/5`, and `elixir_module:compile/5` is called in `elixir_module:compile/4`, which is exactly the code that `defmodule` rewrites to!
So, if `eval_callbacks/5` in `eval_form/6` is successfully executed, and there are no other errors, we can expect the `optimize_boolean/1` function in `build_if/2`
# Connecting the Dots
Let's first reproduce the whole process with the code that will cause a compilation error.
And we start from the shell command `mix run --no-mix-exs absolutely-false.exs`. The entry point for `mix run [argv...]` is `Mix.Tasks.Run.run/1`, and it will parse command line arguments and call `run/5`.
In this example, `{opts, head}` will be
In `run/5`, we can ignore other checks and focus on the call to the callback function `file_evaluator.(file)`. And since `file_evaluator` is just `&Code.require_file/1`, we can jump to the function `require_file/1` in the `code.ex` file.
`find_file!/2` expands the relative path to the absolute path of the file and reads the whole file into a char list. Then `:elixir_code_server.call/1` will check if the file is already required (compiled): if yes, we do nothing; if not, `:elixir_compiler.string/3` will be called to compile the code.
`elixir:'string_to_quoted!'/5` will pass the file content to the tokenizer and convert tokens to their quoted form.
The following AST will be emitted for this code:
Next, `elixir_compiler:quoted/3` will process the quoted form, i.e., evaluate or compile the code with the local lexical environment in `eval_or_compile/3`.
The implementation of `eval_or_compile/3` is:
`allows_fast_compilation/1` will check if this file always defines a module. If yes, we can skip some steps in `compile/3` and always define the module.
And the AST shown above indeed matches the clause in the middle; therefore, we can do `fast_compile/2`.
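The check described above can be paraphrased in Elixir. This is not the real Erlang source, only a sketch based on the description in this post: `FastCompileSketch` and `fast?/1` are hypothetical names, and the actual function may differ between Elixir versions.

```elixir
# A paraphrase in Elixir, not the real Erlang source: the check as described
# in the text. `FastCompileSketch` and `fast?/1` are hypothetical names.
defmodule FastCompileSketch do
  # A block is eligible only if every expression in it is eligible.
  def fast?({:__block__, _meta, exprs}), do: Enum.all?(exprs, &fast?/1)
  # A bare `defmodule ... do ... end` is eligible.
  def fast?({:defmodule, _meta, [_name, [do: _body]]}), do: true
  # Anything else (such as a top-level `if`) is not.
  def fast?(_other), do: false
end

# `defmodule` at the top level, `if` inside it.
module_first = Code.string_to_quoted!("""
defmodule A do
  if false do
    use NOT_EXISTS
  end
end
""")

# `if` at the top level, `defmodule` inside it.
guard_first = Code.string_to_quoted!("""
if false do
  defmodule A do
    use NOT_EXISTS
  end
end
""")

IO.inspect(FastCompileSketch.fast?(module_first))
IO.inspect(FastCompileSketch.fast?(guard_first))
```

Under these assumptions, only the file that starts with `defmodule` qualifies for the fast path.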
After the pattern matching, we get
Now we can jump right into `elixir_module:compile/4`, which will call `elixir_module:compile/5`. We can expect the compilation error right after calling `elixir_module:eval_form/6` in `elixir_module:compile/5`. And you know the rest of the story.
What about the code that can achieve our goal?
The AST of the above code is
And now the difference is that we cannot do `fast_compile/2` because the file may not define a module. Therefore, now we will take the `compile/3` route in `eval_or_compile/3`.
And the variable `Expanded` will be something like this
As we can see from this intermediate output, `optimize_boolean/1` has already been called and gives us the tuple `{:case, [line: 1, optimize_boolean: true], [false, [do: ...]]}`.
In `spawned_compile/2`, we will translate quoted Elixir expressions and variables into forms that Erlang can compile into a `.beam` binary.
In function `is_purgeable/2` we test if the beam binary has any labeled locals.
If not, we can evaluate the code and return the evaluated result as the compiled output (in function `dispatch/4`, and `dispatch/4` was called in the `compile/3` route).
In this example, `Res` will be `nil` because the condition used in the `if`-statement is `false`, so the `do` block is never taken (and can be purged); meanwhile, as there is no `else` block, the expression evaluates to the default value, `nil`.
Therefore, if we were using `Code.ensure_loaded?/1` as the condition, it would also be evaluated to either `true` or `false` at compile-time. If it is `false`, then everything in the `false` branch will be purged, so we won't encounter any compilation errors for the following code.
will be evaluated to
and the `false` branch will be purged, which leaves us only `nil`.
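This outcome can be observed directly. In the sketch below, the guarded source is evaluated from a string with `Code.eval_string/1`, assuming `NOT_EXISTS` does not exist; `GuardedModule` is a hypothetical module name.

```elixir
# `NOT_EXISTS` is assumed to be a module that does not exist anywhere.
source = """
if Code.ensure_loaded?(NOT_EXISTS) do
  defmodule GuardedModule do
    use NOT_EXISTS
  end
end
"""

# The whole expression evaluates to `nil` and no module gets defined.
{result, _bindings} = Code.eval_string(source)
IO.inspect(result)
IO.inspect(Code.ensure_loaded?(GuardedModule))
```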