Debug Erlang NIF library on macOS and Linux

Debug Erlang NIF library on macOS with Xcode

(Almost) Universal Steps

First, set CFLAGS so that the NIF library is compiled with debug info. (Your build script must, of course, pick up the CFLAGS environment variable and append or prepend it to its other compiler flags.)

# Compile NIF library with debug info (macOS and Linux)
export CFLAGS="-g3 -fno-omit-frame-pointer"

# Enable address sanitizer on Linux
#   We can use address sanitizer on macOS too,
#   but it's easier to enable it in Xcode; I'll get to that later.
#   Also, note that we cannot use the static libasan, i.e., "-static-libasan".
#   For some reason, the build fails if we specify that option.
export CFLAGS="${CFLAGS} -fsanitize=address"
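Because the second export appends to the existing value, running these lines twice in the same shell repeats the sanitizer flag. A minimal sketch of an idempotent alternative follows; add_cflag is a hypothetical helper name, not something the steps above define.

```shell
# Hypothetical helper: append a flag to CFLAGS only if it is not already there.
add_cflag() {
  case " ${CFLAGS:-} " in
    *" $1 "*) ;;                                  # already present, do nothing
    *) CFLAGS="${CFLAGS:+${CFLAGS} }$1" ;;        # append, adding a space if needed
  esac
  export CFLAGS
}

# Usage (same flags as above):
#   add_cflag -g3
#   add_cflag -fno-omit-frame-pointer
#   add_cflag -fsanitize=address
```

Calling add_cflag repeatedly with the same flag leaves CFLAGS unchanged after the first call.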

Next, get necessary info about the Erlang VM on your machine and set environment variables required by erlexec.

# Get full commands from mix
# It should be something like
#   "erl -pa ..."
export COMMANDS="$(ELIXIR_CLI_DRY_RUN=1 mix)"

# Only keep the arguments
# " -pa ..."
export CMD_ARGS="${COMMANDS:3}"

# Find the erl script
# The result could be something like
#   - If you/your package manager put erl in /usr/local
#     "/usr/local/bin/erl"
#   - If you installed erlang by asdf
#     "${HOME}/.asdf/shims/erl"
export ERL="$(which erl)"

# Either way, let's get the parent dir of the parent dir of the erl script
#   - "/usr/local"
#   - "${HOME}/.asdf"
export ERL_BASE="$(dirname "$(dirname "${ERL}")")"

# Find the erlexec binary
#   erl is just a shell script that sets up the environment vars
#     and then starts erlexec
export ERLEXEC="$(find "${ERL_BASE}" -name erlexec)"

# Set three required environment variables
#   1. BINDIR.
#      The directory that erlexec resides in
export BINDIR="$(dirname "${ERLEXEC}")"

#   2. ROOTDIR.
#      This one is a little tricky, as the layout differs
#        between the asdf version and others.
#      It is the parent of the directory that contains start.boot.
#      Note that we might find two start.boot files in ${ERL_BASE}:
#        one is used for releases and the other is not.
#      We need the one that is not (grep -v drops any path containing "release").
export START_BOOT="$(find "${ERL_BASE}" -name start.boot | grep -v release)"
export ROOTDIR="$(dirname "$(dirname "${START_BOOT}")")"

#   3. EMU
#      Should just be beam
export EMU=beam
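The find commands above can return nothing, or several paths when more than one Erlang version is installed, and erlexec will not start with wrong values. The following is a small sanity check, offered as a sketch: check_erl_env is a hypothetical helper, and it assumes the usual OTP layout in which start.boot sits in ${ROOTDIR}/bin.

```shell
# Hypothetical helper: verify the variables derived above before launching erlexec.
# Prints a message for each problem and returns non-zero if any check fails.
check_erl_env() {
  status=0
  # Fails when ERLEXEC is empty or holds several newline-separated matches.
  if [ ! -x "${ERLEXEC:-}" ]; then
    echo "ERLEXEC is not a single executable file: ${ERLEXEC:-<unset>}" >&2
    status=1
  fi
  if [ ! -x "${BINDIR:-}/erlexec" ]; then
    echo "BINDIR does not contain erlexec: ${BINDIR:-<unset>}" >&2
    status=1
  fi
  # Assumption: standard layout with start.boot in ROOTDIR/bin.
  if [ ! -f "${ROOTDIR:-}/bin/start.boot" ]; then
    echo "ROOTDIR/bin/start.boot not found: ${ROOTDIR:-<unset>}" >&2
    status=1
  fi
  if [ -z "${EMU:-}" ]; then
    echo "EMU is not set" >&2
    status=1
  fi
  return "${status}"
}

# Usage:
#   check_erl_env && echo "environment looks usable"
```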

Now you might need to make some minor changes to the Makefile to support debugging. Let's take a simple Makefile as an example.

PRIV_DIR = $(MIX_APP_PATH)/priv
NIF_SO = $(PRIV_DIR)/nif.so

C_SRC = $(shell pwd)/c_src
LIB_SRC = $(shell pwd)/lib
CFLAGS += -I$(ERTS_INCLUDE_DIR)
CPPFLAGS += $(CFLAGS) -std=c++14 -Wall -Wextra -pedantic
LDFLAGS += -shared

UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Darwin)
	LDFLAGS += -undefined dynamic_lookup -flat_namespace -undefined suppress
endif

.DEFAULT_GOAL := build

build: clean $(NIF_SO)

clean:
	@ if [ -z "${NOT_SKIP_COMPILE}" ]; then \
		rm -f $(NIF_SO) ; \
	fi

$(NIF_SO):
	@ mkdir -p $(PRIV_DIR)
	@ if [ -z "${NOT_SKIP_COMPILE}" ]; then \
		$(CC) $(CPPFLAGS) $(LDFLAGS) $(C_SRC)/nif.cpp -o $(NIF_SO) ; \
	fi

Note that the environment variable NOT_SKIP_COMPILE controls whether the shared library is recompiled. If NOT_SKIP_COMPILE is empty or unset, the shared library is deleted and recompiled each time we call mix test (or other commands that trigger the build). If it is set to any non-empty value, the existing library is kept.
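The guard in the Makefile recipes is an ordinary shell test, so its behaviour can be tried on its own. This sketch mirrors it; should_compile is a hypothetical name used only for illustration.

```shell
# Mirrors the Makefile guard: compile only when NOT_SKIP_COMPILE is empty or unset.
should_compile() {
  [ -z "${NOT_SKIP_COMPILE:-}" ]
}

if should_compile; then
  echo "nif.so would be removed and rebuilt"
else
  echo "the existing nif.so would be kept"
fi
```

Any non-empty value works for skipping; the Linux steps later use NOT_SKIP_COMPILE=skip.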

macOS

Debug with lldb in terminal

On macOS, we can use either lldb in a terminal or Xcode (which uses lldb, of course). To test the NIF library with lldb, run

# This is basically the full version of the "mix test" command
lldb -- ${ERLEXEC} ${CMD_ARGS} test
(lldb) run
(lldb) c

Debug with Xcode in GUI

IMHO, this approach should make things much easier.

First, open Xcode and select "Debug executable..." from the "Debug" menu in the menu bar. Then choose erlexec in Finder. You can get the path to erlexec with

echo ${ERLEXEC}

After choosing erlexec, we need to set a few things in Xcode.

Scheme - Run Debug

In Arguments Passed On Launch, add one entry and paste everything in ${CMD_ARGS} into it.

Add a second entry containing test. Together, the two entries are basically the full version of the mix test command.

We also need to set these environment variables: CFLAGS, BINDIR, ROOTDIR and EMU. You should have values similar to those shown in the screenshot below.

Fill in launch arguments and set environment variables
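To get the values to type into the scheme editor, it may help to print them from the shell in which they were exported. A minimal sketch, where print_scheme_env is a hypothetical helper name:

```shell
# Hypothetical helper: print the four variables as NAME=value lines,
# ready to be copied into the Environment Variables list in Xcode.
print_scheme_env() {
  for name in CFLAGS BINDIR ROOTDIR EMU; do
    # Indirect lookup of a fixed, known variable name; empty if unset.
    eval "value=\${${name}-}"
    printf '%s=%s\n' "${name}" "${value}"
  done
}

print_scheme_env
```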

To enable address sanitizer and/or enable other debug tools, you can click the Diagnostics tab and enable the ones you need.

Set working directory to the root directory of the mix project.

Also, we need to set the working directory in the Options tab to the root directory of the mix project.

With everything set correctly in the scheme panel, you can close it and click Run or press Command+R to debug the NIF library in Xcode.

Because the NIF library is compiled with debug info, Xcode can jump to the line that triggers a crash. You can also set breakpoints and rerun the test.

Various debug information is available

Furthermore, if you compiled Erlang locally and didn't delete the Erlang source code, then stepping into Erlang functions will also bring you to the corresponding line in the Erlang source code! (Note that this feature is available whether we debug the NIF library in the GUI or in a terminal, on macOS or on Linux. But to me, it's much easier to navigate and inspect variable values in Xcode.)

Step into Erlang functions and view variable values in Xcode.

Linux

To run address sanitizer on the NIF library, we need to do it in two steps. First, compile the shared library with the CFLAGS we just exported to the env. Second, preload libasan.so into erlexec. It sounds hard, but it is really simple.

# First, compile the shared library
#   Ignore the error message from libasan
bash -c "$ERLEXEC $CMD_ARGS test" || true

# Secondly, get the path to libasan.so and set LD_PRELOAD
#   so that it will be loaded before erlexec
export LIBASAN_SO="$(gcc -print-file-name=libasan.so)"
bash -c "LD_PRELOAD=${LIBASAN_SO} NOT_SKIP_COMPILE=skip $ERLEXEC $CMD_ARGS test"
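One thing worth checking before the second step: when gcc cannot locate a library, -print-file-name echoes the bare name back (here "libasan.so") instead of a full path, and LD_PRELOAD would then point at nothing useful. A small guard, sketched with a hypothetical helper name libasan_found:

```shell
# Hypothetical helper: succeed only if "$1" is an absolute path to an existing file.
libasan_found() {
  case "$1" in
    /*) [ -e "$1" ] ;;   # absolute path: it must exist
    *)  return 1 ;;      # bare name: gcc did not resolve it
  esac
}

# Usage:
#   LIBASAN_SO="$(gcc -print-file-name=libasan.so)"
#   libasan_found "${LIBASAN_SO}" || echo "libasan.so not found by gcc" >&2
```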

If everything goes smoothly, libasan will print diagnostic messages after the process exits, for example:

LD_PRELOAD=$(gcc -print-file-name=libasan.so) NOT_SKIP_COMPILE=skip /home/cocoa/.asdf/installs/erlang/24.2/erts-12.2/bin/erlexec -pa /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/eex/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/elixir/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/ex_unit/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/iex/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/logger/ebin /home/cocoa/.asdf/installs/elixir/1.13.2/bin/../lib/mix/ebin -elixir ansi_enabled true -noshell -s elixir start_cli -extra /home/cocoa/.asdf/installs/elixir/1.13.2/bin/mix test

=================================================================
==1663086==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x7f2f67180bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x55b4f9b53498 in my_malloc /home/cocoa/.asdf/plugins/erlang/kerl-home/builds/asdf_24.2/otp_src_24.2/erts/etc/common/inet_gethost.c:2656

SUMMARY: AddressSanitizer: 15 byte(s) leaked in 1 allocation(s).
make: Nothing to be done for 'build'.
Compiling 1 file (.ex)
.........

Finished in 0.1 seconds (0.00s async, 0.1s sync)
9 tests, 0 failures

Randomized with seed 139197

=================================================================
==1663020==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 344064 byte(s) in 42 object(s) allocated from:
    #0 0x7f125d2b9bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x559e2393f3a1 in sys_thread_init_signal_stack sys/unix/sys_signal_stack.c:278

Direct leak of 8192 byte(s) in 1 object(s) allocated from:
    #0 0x7f125d2b9bc8 in malloc (/usr/lib/gcc/x86_64-linux-gnu/9/libasan.so+0x10dbc8)
    #1 0x559e2393f3ef in sys_thread_init_signal_stack sys/unix/sys_signal_stack.c:278
    #2 0x559e2393f3ef in sys_init_signal_stack sys/unix/sys_signal_stack.c:292

SUMMARY: AddressSanitizer: 352256 byte(s) leaked in 43 allocation(s).
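The leaks in this sample report are attributed to frames in the Erlang runtime sources (inet_gethost.c and sys_signal_stack.c), not to the NIF. If that noise gets in the way, LeakSanitizer can read a suppressions file through LSAN_OPTIONS. The sketch below writes one using the function names from the report above; the helper name write_lsan_suppressions and the file name are assumptions, and the patterns should be adjusted to whatever your own report shows.

```shell
# Hypothetical helper: write a LeakSanitizer suppressions file to the path in "$1".
# Each line has the form "leak:<pattern>"; the patterns here are the function
# names that appear in the sample report.
write_lsan_suppressions() {
  cat > "$1" <<'EOF'
leak:sys_thread_init_signal_stack
leak:my_malloc
EOF
}

# Usage (lsan.supp is an arbitrary file name):
#   write_lsan_suppressions "${PWD}/lsan.supp"
#   export LSAN_OPTIONS="suppressions=${PWD}/lsan.supp"
```

With LSAN_OPTIONS exported before the LD_PRELOAD run, leaks whose stack matches a pattern should be left out of the report, so anything that remains is more likely to come from the NIF itself.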