Skip to content

feat(cpp): add map type support#1590

Open
yordis wants to merge 9 commits intobytecodealliance:mainfrom
yordis:yordis/feat-cpp-map-support
Open

feat(cpp): add map type support#1590
yordis wants to merge 9 commits intobytecodealliance:mainfrom
yordis:yordis/feat-cpp-map-support

Conversation

@yordis
Copy link
Copy Markdown
Contributor

@yordis yordis commented Apr 14, 2026

This PR adds map type support to the C++ backend, following the same pattern established in #1562 (core), #1583 (Go), and #1584 (MoonBit).

yordis added 2 commits April 14, 2026 15:34
…ix rustfmt

wit::string lacked comparison operators, causing compilation failures
when used as a std::map key. Also fixes rustfmt formatting issues.
@yordis yordis marked this pull request as draft April 14, 2026 20:31
std::map keys are const, but the ABI lowering needs to call leak() on
string keys to transfer ownership to the flat buffer. Use const_cast
since the map is consumed during the lowering operation.
@alexcrichton
Copy link
Copy Markdown
Member

cc @cpetig and @TartanLlama

yordis added 2 commits April 14, 2026 18:04
const_cast fails when the map key type differs between contexts (e.g.
std::string_view in imports vs wit::string in exports). Copying by
value works universally and is safe since the map is consumed.
wasm-component-ld bundled with wasi-sdk 30 doesn't support the map
type encoding (0x63) in the component model binary format. The C++
map codegen is still validated by the codegen test. The runtime test
can be re-added when wasi-sdk ships a compatible wasm-component-ld.
@yordis yordis marked this pull request as ready for review April 14, 2026 23:20
Comment thread crates/cpp/src/lib.rs Outdated
let size = entry.size.format(POINTER_SIZE_EXPRESSION);
let align = entry.align.format(POINTER_SIZE_EXPRESSION);
self.push_str(&format!("auto&& {val} = {};\n", operands[0]));
self.push_str(&format!("auto {len} = (size_t)({val}.size());\n"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid the C-style cast here?

Comment thread crates/cpp/src/lib.rs Outdated
self.push_str(&format!("auto {len} = (size_t)({val}.size());\n"));
uwriteln!(
self.src,
"auto {ptr} = ({ptr_type})({len} > 0 ? cabi_realloc(nullptr, 0, {align}, {len} * {size}) : nullptr);",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid the C-style cast here?

Comment thread crates/cpp/src/lib.rs Outdated
let body_value = &body.1[1];
uwriteln!(
self.src,
"{result}.insert(std::make_pair({}, {}));",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .emplace would be better here

Comment thread crates/cpp/src/lib.rs Outdated
self.include("<bit>");
}
if self.dependencies.needs_map {
self.include("<map>");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be better to have some configurability here? As a user, I'd often want to avoid std::map for performance reasons, and if I'm really perf-focused, I'd also avoid std::unordered_map and use some custom cache-friendly hash map type. Do you think we can enable this use-case without mandating an additional copy of the data?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that all the allowed key types from the spec would be hashable, so at least it may be better to default to std::unordered_map

Comment thread crates/cpp/src/lib.rs Outdated
);
uwriteln!(self.src, "size_t {idx} = 0;");
uwriteln!(self.src, "for (auto& {entry_name} : {val}) {{");
uwriteln!(self.src, "auto iter_map_key = {entry_name}.first;");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we're copying the key and taking a reference to the value? Can we at least move the key?

Comment thread crates/cpp/src/lib.rs Outdated
uwriteln!(self.src, "++{idx};");
uwriteln!(self.src, "}}");
if realloc.is_some() {
// ownership transfers
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this comment means or what this block is for

Comment thread crates/cpp/src/lib.rs Outdated
let entry = self.r#gen.sizes.record([*key, *value]);
let size = entry.size.format(POINTER_SIZE_EXPRESSION);
uwriteln!(self.src, "uint8_t* base = {ptr} + {i} * {size};");
uwriteln!(self.src, "(void) base;");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this line? In case the body doesn't use base?

@yordis
Copy link
Copy Markdown
Contributor Author

yordis commented Apr 16, 2026

@TartanLlama addressing your comments, the "why" behind things, I am effectively Junior in C++, please be patience with me here 🙏🏻 I am try to understand the intent and I will give it a second path

@cpetig
Copy link
Copy Markdown
Collaborator

cpetig commented Apr 16, 2026

If I were to implement this feature I would introduce a wit::map type which has the same layout as the lowered data. This way the data doesn't need to be copied unless the user code requests it (making map a lower cost abstraction).

On the other hand I just ran into an adjacent problem with list lowering (and avoiding re-allocation) in #1592 , so avoiding a copy in case of complex types might still be impossible - due to C++ type layout restrictions.

A std::map is of course more easy to use if you don't care about allocation and copy costs.

@yordis yordis marked this pull request as draft April 17, 2026 14:58
yordis added 4 commits April 17, 2026 12:13
std::map requires per-entry rb-tree allocation during lift; wit::map
mirrors the canonical ABI layout (flat pair array) so lift is a single
allocation, matching the existing wit::vector / wit::string pattern.

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
wit::map::size() and std::span::size() already return size_t, so the
C-style cast was a no-op.

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
Avoids an unused `base` local and the `(void) base;` suppression when
the element has no nested heap-owned fields (e.g. list<u32>, map<char, u32>).

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
The cast converts void* (from cabi_realloc) to uint8_t*, which is a
well-defined static_cast and doesn't need a C-style cast.

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
@yordis yordis requested a review from TartanLlama April 17, 2026 18:10
@yordis yordis marked this pull request as ready for review April 17, 2026 18:10
@yordis
Copy link
Copy Markdown
Contributor Author

yordis commented Apr 17, 2026

@cpetig @TartanLlama can you give it a pass again? 🙏🏻

Ideally, please share the intent of what you would like to see, I can figure out eventually the implementation but I am not qualify to the best practices or insights of C++. I can figure that out for sure, it would help to declare the intent so I can go off, study, and figure out what you want 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants