New module loading #2229

Open
bettio wants to merge 3 commits into atomvm:release-0.7 from bettio:new-module-loading

Collaborator

bettio commented Mar 23, 2026

platform | size (bytes) | executable | description
x86_64 | 1018312 | src/AtomVM | stripped, -O3, OLD loader
x86_64 | 928200 | src/AtomVM | stripped, -O3, NEW loader

ESP32 | 1498203 | atomvm-esp32.bin | debug, -Og, OLD loader
ESP32 | 1418775 | atomvm-esp32.bin | debug, -Og, NEW loader

ESP32 | 1476281 | atomvm-esp32.bin | perf, -O2, OLD loader
ESP32 | 1402729 | atomvm-esp32.bin | perf, -O2, NEW loader

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

Using X macros allows us to turn the simple definitions in opcodes.h into an
opcode table with multiple columns.
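
As a rough illustration of the X-macro idea, the sketch below expands one opcode list into three "columns": an enum, a signature table, and a name table. Every identifier here (`EXAMPLE_OPCODES`, the `EX_OP_*` names, the signature strings) is made up for the example; the actual opcodes.def in this PR may use different columns and macro names.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical opcode list: X(name, number, signature).
// Names, numbers and signatures are illustrative only.
#define EXAMPLE_OPCODES(X)  \
    X(EX_OP_LABEL, 1, "I")  \
    X(EX_OP_JUMP, 61, "j")  \
    X(EX_OP_MOVE, 64, "sd")

// Expansion 1: an enum mapping each name to its number.
#define AS_ENUM(name, number, signature) name = number,
enum example_opcode
{
    EXAMPLE_OPCODES(AS_ENUM)
    EX_OP_MAX_NUMBER = 64
};
#undef AS_ENUM

// Expansion 2: a signature table indexed by opcode number.
// Entries that are not listed stay NULL.
#define AS_SIGNATURE(name, number, signature) [number] = signature,
static const char *const example_signatures[EX_OP_MAX_NUMBER + 1] = {
    EXAMPLE_OPCODES(AS_SIGNATURE)
};
#undef AS_SIGNATURE

// Expansion 3: printable names, e.g. for trace output.
#define AS_NAME(name, number, signature) [number] = #name,
static const char *const example_names[EX_OP_MAX_NUMBER + 1] = {
    EXAMPLE_OPCODES(AS_NAME)
};
#undef AS_NAME
```

Adding an opcode then means adding one line to the list, and every table derived from it stays in sync.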

Signed-off-by: Davide Bettio <davide@uninstall.it>
bettio force-pushed the new-module-loading branch 2 times, most recently from 146cf6d to 1f51a95 (March 25, 2026 13:38)
bettio added 2 commits March 25, 2026 19:04
Replace duplicated decode logic in opcodesswitch.h with a new code
loader in module.c driven by opcode signatures declared in opcodes.def.

Each opcode declares a compact signature string (e.g. "Af", "jtbssd")
that describes its argument encoding. The loader iterates the signature
and dispatches to the appropriate DECODE_* macro for each character,
removing per-opcode inline decode duplication and reducing binary size.
Variable-length arguments use a [X] loop syntax that handles extended
list tags automatically, eliminating most custom handler functions.

The signature alphabet is inspired by OTP's beam_makeops conventions
(s, d, f, j, t, I, A, Q, e, b, F, etc.) so signatures are readable
by anyone familiar with BEAM internals.
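
The signature-driven approach can be pictured with the toy decoder below. It is only a sketch under simplifying assumptions: each argument occupies exactly one byte (the real BEAM compact term encoding is variable length), only three signature characters are handled, and `decode_args`, `struct decoded_arg` and `enum arg_kind` are hypothetical names that do not appear in the PR.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical argument kinds for three signature characters.
enum arg_kind
{
    ARG_SOURCE, // 's'
    ARG_DEST,   // 'd'
    ARG_LABEL   // 'j'
};

struct decoded_arg
{
    enum arg_kind kind;
    uint32_t value;
};

// Walk the signature one character at a time and decode one argument
// per character. Returns the number of decoded arguments, or -1 on an
// unknown signature character, truncated input, or a full output array.
// Assumption: one byte per argument, unlike real BEAM bytecode.
static int decode_args(const char *signature, const uint8_t *pc,
    const uint8_t *code_end, struct decoded_arg *out, size_t out_len)
{
    size_t count = 0;
    for (const char *s = signature; *s != '\0'; s++) {
        if (pc >= code_end || count >= out_len) {
            return -1;
        }
        enum arg_kind kind;
        switch (*s) {
            case 's':
                kind = ARG_SOURCE;
                break;
            case 'd':
                kind = ARG_DEST;
                break;
            case 'j':
                kind = ARG_LABEL;
                break;
            default:
                return -1;
        }
        out[count].kind = kind;
        out[count].value = *pc++;
        count++;
    }
    return (int) count;
}
```

The point of the design is that the per-opcode knowledge lives in a short string, so a single loop replaces one hand-written decode sequence per opcode.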

Opcodes removed in commit 94f6f0b are marked with X_OPCODE_REMOVED.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Code loading is now handled by parse_core_chunk() in module.c using
opcode signatures. Remove the IMPL_CODE_LOADER compilation pass from
opcodesswitch.h, leaving it as execution-only.

- Remove all #ifdef IMPL_CODE_LOADER blocks (decode macros, TRACE,
  UNUSED, module_add_label, module_insert_line_ref_offset, entry
  points)
- Remove #ifdef IMPL_EXECUTE_LOOP guards (now always true)
- Remove redundant USED_BY_TRACE for variables already used in
  execution code

Signed-off-by: Davide Bettio <davide@uninstall.it>
bettio force-pushed the new-module-loading branch from 1f51a95 to 7ce8135 (March 25, 2026 18:07)
bettio changed the title from "WIP: New module loading" to "New module loading" (Mar 25, 2026)
bettio marked this pull request as ready for review (March 25, 2026 18:33)
Contributor

petermm commented Mar 25, 2026

Oracle Review of the 3 Commits (opcodes.def X-macro, signature-driven loader, IMPL_CODE_LOADER removal)

High Severity

  1. Unchecked opcode-table indexing (module.c:731-739) — no bounds check on opcodes_signatures[opcode] before indexing; opcode > 185 causes OOB read before the NULL check runs.
  2. No code-buffer end checks (module.c:724-895) — the loader walks pc without a code_end pointer, so truncated/corrupt bytecode can read past the CODE chunk.
  3. Label range not validated (module.c:679-685) — module_add_label() blindly writes mod->labels[index] with no check that index < num_labels.
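
A minimal sketch of the kind of guards the high-severity items ask for, assuming the table length and label count are available to the loader. The helper names `lookup_signature` and `add_label_checked` are hypothetical and are not functions from module.c.

```c
#include <stddef.h>
#include <stdint.h>

// Return the signature for an opcode, or NULL when the opcode is past
// the end of the table. The range check runs before the table is read.
static const char *lookup_signature(const char *const *table,
    size_t table_len, uint8_t opcode)
{
    if (opcode >= table_len) {
        return NULL;
    }
    return table[opcode];
}

// Store a label offset only when the index is inside the labels array.
// Returns 0 on success, -1 when the index is out of range.
static int add_label_checked(uint32_t *labels, size_t num_labels,
    size_t index, uint32_t offset)
{
    if (index >= num_labels) {
        return -1;
    }
    labels[index] = offset;
    return 0;
}
```

The same pattern applies to the code buffer: comparing `pc` against an end pointer before each read keeps truncated bytecode from being read past the chunk.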

Medium Severity

  1. FP register off-by-one (module.c:324-335) — accepts freg == MAX_REG which is out of range (valid: 0..MAX_REG-1).
  2. List signature parser has no nesting/missing ] guard (module.c:746-778) — walks to ] without bounds; missing ] in a signature would overrun.
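
The two medium-severity items could be addressed along these lines. This is a sketch only: `EXAMPLE_MAX_REG` is an arbitrary placeholder value, and `freg_is_valid` and `find_list_end` are hypothetical helpers rather than code from the PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Placeholder register count; the real MAX_REG value may differ.
#define EXAMPLE_MAX_REG 16

// Valid register indexes are 0..EXAMPLE_MAX_REG-1, so the comparison
// must be strict: an index equal to the count is already out of range.
static bool freg_is_valid(uint32_t freg)
{
    return freg < EXAMPLE_MAX_REG;
}

// Given a pointer to '[' inside a signature string, return a pointer to
// the matching ']'. Returns NULL when the string ends before ']' is
// found, when the input does not start with '[', or when a nested '['
// appears (nesting is treated as unsupported in this sketch).
static const char *find_list_end(const char *open_bracket)
{
    if (*open_bracket != '[') {
        return NULL;
    }
    for (const char *s = open_bracket + 1; *s != '\0'; s++) {
        if (*s == '[') {
            return NULL;
        }
        if (*s == ']') {
            return s;
        }
    }
    return NULL;
}
```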

Low Severity

  1. 'c'/'f' signature types not enforced — documented semantics differ from actual parser behavior.
  2. Uninitialized label in 'm' trace path — UB when tracing atom-or-label decodes.

What Looks Good

  • The core refactor is sound; all opcodes spot-checked have correct signatures matching the executor.
  • X-macro approach is a clear maintainability win (single source of truth).
  • Special handlers for OP_LABEL, OP_LINE, OP_FMOVE, OP_BS_MATCH are correct.

break;
default:
fprintf(stderr, "bs_match loader: unknown command %i\n", (int) term_to_atom_index(command));
abort();
Collaborator

We want AVM_ABORT() here.

if (opcode_signature == NULL) {
fprintf(stderr, "missing opcode: %i\n", (int) opcode);
fflush(stdout);
abort();
Collaborator

We want AVM_ABORT() here.

default: {
fprintf(stderr, "unknown signature: %c\n", opcode_signature[arg_index]);
fflush(stdout);
abort();
Collaborator

We want AVM_ABORT() here.


if (opcode_signature == NULL) {
fprintf(stderr, "missing opcode: %i\n", (int) opcode);
fflush(stdout);
Collaborator

This flushes stdout, but the fprintf writes to stderr.


default: {
fprintf(stderr, "unknown signature: %c\n", opcode_signature[arg_index]);
fflush(stdout);
Collaborator

This flushes stdout, but the fprintf writes to stderr.
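
One way to avoid the stream mismatch is to route the message through a small helper that prints and flushes the same stream. The sketch below is illustrative; `report_loader_error` is a hypothetical name, and the call that terminates the VM afterwards (AVM_ABORT() per the review) is left to the caller.

```c
#include <stdio.h>

// Print a loader diagnostic and flush the stream it was written to.
// Returns the fprintf result: the number of characters written, or a
// negative value on an output error.
static int report_loader_error(FILE *stream, const char *what, int value)
{
    int written = fprintf(stream, "%s: %i\n", what, value);
    fflush(stream);
    return written;
}
```

A call such as `report_loader_error(stderr, "missing opcode", (int) opcode);` keeps the print and the flush on stderr.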

} \
}

#define DECODE_DEST_REGISTER_GC_SAFE(dreg, decode_pc) \
Collaborator

It seems unused now.


#ifdef ENABLE_TRACE

// About X macro: https://en.wikipedia.org/wiki/X_macro
Collaborator

We don't need to repeat this.

module_insert_line_ref_offset(mod, line_refs, line_ref, offset);
}

// About X macro: https://en.wikipedia.org/wiki/X_macro
Collaborator

We don't need to repeat this.

const uint8_t *new_encoded = encoded;
uint32_t len;
DECODE_LITERAL(len, new_encoded);
// TODO: check this: actually should be enough: len = *(new_encoded)++ >> 4;
Collaborator

We have a lot of aborts, should we have one here?


union opcode_arg
{
uint32_t u32_arg;
Collaborator

Do we need a union for this?
